  1. Data Management
  2. DM-13507

Add stable hash to SkyMap objects

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: skymap
    • Labels:
    • Story Points:
      2
    • Sprint:
      BG3_S18_02
    • Team:
      Data Release Production

      Description

      In the Gen3 butler, the tracts and patches defined by a SkyMap will be loaded into a database, and that will make it much more important to recognize when the same SkyMap has already been loaded.  While SkyMap objects already support equality comparison, it'd be nice if they could also produce a stable hash that can be used to uniquely label them.

      Since that basically amounts to being able to hash the SkyMap's configuration, I think it makes the most sense to actually add this hashing support directly to pex_config.  Being able to compare hashes to check for config equality seems like it'd be generally useful too.

      I'm currently planning to do this with hashlib.sha1, rather than just the hash builtin, because I want something that's guaranteed to be stable between Python versions.

      Note that these in-memory hashes will not be equivalent to hashes of the files in which these objects are stored.
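
      The approach described above can be sketched roughly as follows. This is only an illustration, assuming the configuration can be reduced to a dict of JSON-serializable values; stable_config_hash, the type name, and the parameter names are hypothetical and are not part of pex_config or skymap.

```python
import hashlib
import json


def stable_config_hash(type_name, params):
    """Return a SHA-1 hex digest that does not depend on dict ordering.

    Hypothetical helper: the inputs are serialized to canonical JSON
    (sorted keys, fixed separators) and the UTF-8 bytes are hashed.
    """
    payload = json.dumps(
        {"type": type_name, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Same values in a different insertion order give the same digest.
a = stable_config_hash("ExampleSkyMap", {"patch_size": 4000, "pixel_scale": 0.2})
b = stable_config_hash("ExampleSkyMap", {"pixel_scale": 0.2, "patch_size": 4000})
# A changed value gives a different digest.
c = stable_config_hash("ExampleSkyMap", {"patch_size": 2000, "pixel_scale": 0.2})
```

      Unlike the hash builtin, whose results for strings vary between interpreter runs unless PYTHONHASHSEED is fixed, the digest here depends only on the serialized bytes.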

            Activity

            pschella Pim Schellart [X] (Inactive) added a comment -

            +1 on this plan!

            jbosch Jim Bosch added a comment -

            I'm going to descope this to just adding hash support to SkyMap directly. That's what we need right now, and doing it more generally through config is more work than I want to do at the moment.  I'm also not sure that SkyMaps would always want to use the Config functionality directly, as it takes away their control over whether changes to the configuration interface (not the actual behavior of the SkyMap) affect their hash.

            jbosch Jim Bosch added a comment -

            Tim Jenness, mind taking a look at this?  For some context, this functionality is needed so we can easily tell when the skymap a user wants to add is the same as one we already have defined in a Registry.  We'll need that for conversion from Gen2 as well to avoid ingesting the same on-disk skymap twice.

            tjenness Tim Jenness added a comment -

            This looks okay to me although, given the discussion, I actually expected this to use __hash__ to define uniqueness of skymaps, and possibly hash() internally. Are you not doing this because you explicitly need SHA1 to work and skymaps are not read-only?

            One comment on PR about latin1.

            tjenness Tim Jenness added a comment -

            I now see the comment about hash() above, but couldn't you still do this with __hash__ defined on the objects, so that they can be used in sets and other places in Python where easily compared read-only objects are useful?
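
            The suggestion above might look something like the sketch below. It is not the merged implementation: HashedSkyMap, get_sha1, and the two attributes are hypothetical stand-ins, and it assumes instances are never modified after construction.

```python
import hashlib


class HashedSkyMap:
    """Hypothetical stand-in for a read-only skymap-like object."""

    def __init__(self, name, patch_size):
        self._name = name
        self._patch_size = patch_size

    def get_sha1(self):
        """Return a SHA-1 digest of the values that define this object."""
        h = hashlib.sha1()
        h.update(f"{self._name}:{self._patch_size}".encode("utf-8"))
        return h.digest()

    def __eq__(self, other):
        if not isinstance(other, HashedSkyMap):
            return NotImplemented
        return self.get_sha1() == other.get_sha1()

    def __hash__(self):
        # Derive the Python-level hash from the first bytes of the digest,
        # so equal objects always land in the same set or dict bucket.
        return int.from_bytes(self.get_sha1()[:8], "big")


unique = {HashedSkyMap("a", 4000), HashedSkyMap("a", 4000), HashedSkyMap("b", 4000)}
```

            Here unique ends up with two elements, because the two "a" objects compare equal and hash identically. This only holds while the objects stay unmodified inside the set, which is the read-only concern raised in the previous comment.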

            jbosch Jim Bosch added a comment -

            Merged to master.


              People

              • Assignee:
                jbosch Jim Bosch
                Reporter:
                jbosch Jim Bosch
                Reviewers:
                Tim Jenness
                Watchers:
                Jim Bosch, Pim Schellart [X] (Inactive), Tim Jenness