DM-3591 (Data Management)

Add support for registry-free repository

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: butler
    • Labels:
      None
    • Story Points:
      12
    • Sprint:
      DB_W16_09, DB_W16_10
    • Team:
      Data Access and Database

      Description

      Existing butler unit tests should run without an sqlite database registry.

            Activity

            price Paul Price added a comment -

            Could you explain what you mean, please? The butler already supports data repos without a registry.

            npease Nate Pease [X] (Inactive) added a comment -

            As I understand it, the Butler currently requires that alongside data there be an sqlite db that contains information about what data is there. To satisfy this story the Butler would look in the repository location, find the data sources by file name and use that found information as a registry (thus eliminating the need/requirement for an sqlite db).

            price Paul Price added a comment -

            I believe the registry is not strictly necessary, but only required for look-ups when there's insufficient information. For example, with HSC I often want to select CCDs by "field" and "filter", and the registry is used to provide the list of "visit", "ccd" and other information required to satisfy the filenames.

            I believe the LSST ImSim data doesn't use a registry, and the user just has to provide all the required information to satisfy the filenames.
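To make the lookup role of the registry concrete, here is a minimal sketch of expanding a partial dataId into complete ones. It uses an in-memory SQLite table as a stand-in for a registry; the table name `raw`, the column set, and the helper names `make_registry` and `expand_data_id` are assumptions for illustration, not the butler's actual schema or API.

```python
import sqlite3

# Hypothetical registry columns; a real obs_* package defines its own schema.
COLUMNS = ("field", "filter", "visit", "ccd")


def make_registry(rows):
    """Build an in-memory stand-in for a registry from (field, filter, visit, ccd) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE raw ("field" TEXT, "filter" TEXT, "visit" INTEGER, "ccd" INTEGER)'
    )
    conn.executemany("INSERT INTO raw VALUES (?, ?, ?, ?)", rows)
    return conn


def expand_data_id(conn, partial):
    """Return every complete dataId consistent with the keys given in `partial`."""
    unknown = [key for key in partial if key not in COLUMNS]
    if unknown:
        raise KeyError("unknown dataId keys: %s" % ", ".join(unknown))
    selected = ", ".join('"%s"' % name for name in COLUMNS)
    sql = "SELECT DISTINCT %s FROM raw" % selected
    if partial:
        # Column names come from the fixed COLUMNS tuple; values are bound as parameters.
        sql += " WHERE " + " AND ".join('"%s" = ?' % key for key in partial)
    sql += ' ORDER BY "visit", "ccd"'
    cursor = conn.execute(sql, tuple(partial.values()))
    return [dict(zip(COLUMNS, row)) for row in cursor]
```

Selecting by "field" and "filter" then yields the "visit" and "ccd" values needed to fill in the filenames. Without a registry, the caller would have to supply those keys directly.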

            npease Nate Pease [X] (Inactive) added a comment -

            One of the things KT and I discussed when he was giving me the crash course was using Python's glob function to add lookup, so maybe the point of this story is the wildcard lookup. Kian-Tat Lim should probably weigh in.
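The glob idea might look roughly like the sketch below: keys missing from the dataId become wildcards in the path template, and the filesystem answers the query instead of a registry. The template string and the helper names `template_to_glob` and `find_datasets` are hypothetical, not taken from the butler or any obs_* package.

```python
import glob
import os
import re

# Hypothetical %-style path template; real templates come from the mapper's policy.
TEMPLATE = "raw/v%(visit)d/c%(ccd)d.fits"

# Matches one "%(key)<flags><type>" placeholder and captures the key name.
_PLACEHOLDER = re.compile(r"%\((\w+)\)[-#0 +\d.]*[a-zA-Z]")


def template_to_glob(template, data_id):
    """Fill in the keys present in data_id and replace the missing ones with '*'."""
    def substitute(match):
        if match.group(1) in data_id:
            # Format just this placeholder, escaping any glob metacharacters.
            return glob.escape(match.group(0) % data_id)
        return "*"
    return _PLACEHOLDER.sub(substitute, template)


def find_datasets(root, template, data_id):
    """Return the sorted paths under root that match a possibly partial dataId."""
    pattern = os.path.join(glob.escape(root), template_to_glob(template, data_id))
    return sorted(glob.glob(pattern))
```

For example, `find_datasets(root, TEMPLATE, {"visit": 100})` searches for `raw/v100/c*.fits` under `root`. This only works for keys that appear in the path. Keys such as "field" that are not encoded in the filename would still need another source.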

            ktl Kian-Tat Lim added a comment -

            There are two aspects to this. The first is lookup of keys not provided in the dataId, a function which pertains to both get() calls with less-than-complete information and queryMetadata() calls that almost always have less-than-complete information. All obs_* packages, including obs_lsstSim, currently use the registry for that second purpose. The second aspect, which is more complex and which I hadn't yet sprung on Nate, is reading information (in particular the observation time and length) out of an input dataset's file representation in order to provide rendezvous with calibration data in another repository (that does have a registry). Today, that read is handled by genInputRegistry.py so that the butler doesn't need to look into the dataset itself. If there's no registry, such a read will be necessary.
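As a rough illustration of why the observation time and length matter for the calibration rendezvous, the sketch below picks the calibration whose validity interval contains the midpoint of the exposure. The validity-interval model, the midpoint rule, and the names `observation_midpoint` and `select_calibration` are assumptions made for this example. The thread does not say how the butler actually performs the match.

```python
from datetime import datetime, timedelta


def observation_midpoint(start, exposure_seconds):
    """Return the midpoint of an exposure that begins at `start`."""
    return start + timedelta(seconds=exposure_seconds / 2)


def select_calibration(calibrations, start, exposure_seconds):
    """Pick the single calibration valid at the exposure midpoint.

    `calibrations` is an iterable of (valid_from, valid_until, path) tuples,
    with valid_from inclusive and valid_until exclusive (an assumed convention).
    """
    midpoint = observation_midpoint(start, exposure_seconds)
    matches = [c for c in calibrations if c[0] <= midpoint < c[1]]
    if len(matches) != 1:
        raise LookupError(
            "expected exactly one calibration at %s, found %d" % (midpoint, len(matches))
        )
    return matches[0][2]
```

With a registry, the start time and exposure length would already be recorded as columns. Without one, they would have to be read from the input dataset's file before a selection like this can run.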

            npease Nate Pease [X] (Inactive) added a comment -

            The second part KT mentions is captured in DM-3765.

            npease Nate Pease [X] (Inactive) added a comment -

            Added Simon to review the pull request in obs_decam.

            ktl Kian-Tat Lim added a comment -

            Some comments in the PR in daf_butlerUtils. The obs_decam change looks OK to me. As I understand it, no change will be made to obs_test, and the spurious "DM-3591" branch in obs_decam will also be removed.


              People

              Assignee:
              npease Nate Pease [X] (Inactive)
              Reporter:
              fritzm Fritz Mueller
              Reviewers:
              Kian-Tat Lim, Simon Krughoff
              Watchers:
              Jacek Becla, Kian-Tat Lim, Nate Pease [X] (Inactive), Paul Price