Data Management / DM-26336

Prototype and design work for dimensions/queries system improvements

    Details

    • Story Points:
      6
    • Epic Link:
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      daf.butler.registry.queries is getting creaky and hard to work with: usage patterns and needs have slowly diverged from what it was originally designed to do, and it has accumulated a lot of bad encapsulation, tight coupling, and weird, hard-to-document method preconditions. This has gotten bad enough that it's very hard to add support for querying CALIBRATION collections on DM-24432 without major changes.

      We also have a number of related changes on the horizon that we should at least consider in any cleanup or refactoring we do now:

      • We need to make additive changes to the dimensions system (adding new dimensions, adding new metadata columns) less disruptive in terms of schema versioning.
      • We have discussed replacing the commonSkyPix/HTM-based spatial joins with a system where the user explicitly declares up front the combinations of skymap, instrument, and skypix systems they want to use together, and then we just instantiate those relationship tables up front. That should make for faster queries, simpler query code (no trimming of query rows in postprocessing), and perhaps more configurability in the future for the kinds of relationships that can exist.
      • We need to be able to give users more control over which variants of at least some spatial relationships are used in a query. The use case we have is "give me all visit+detector data IDs for which the visit overlaps the tract, even if the visit+detector doesn't, and all HTM IDs at some level that overlap that visit+detector even if they don't overlap the tract", but I'd like to frame that problem in a more abstract sense before trying to solve it.
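To make the last two bullets concrete, here is a toy sketch, not daf_butler code. It assumes 1-D intervals as stand-ins for sky regions and uses hypothetical names throughout (`REGIONS`, `materialize`, `query`, and the `via` argument are all invented for illustration). Overlap tables are built up front only for the declared pairs, and the caller then chooses which relationship constrains the query.

```python
from itertools import product

# Toy 1-D "regions": half-open (begin, end) intervals standing in for sky regions.
# All names and values here are hypothetical, not daf_butler data structures.
REGIONS = {
    "tract": {0: (0, 10)},
    "visit": {1: (8, 14)},
    "visit_detector": {(1, "a"): (8, 11), (1, "b"): (11, 14)},
    "htm": {100: (6, 9), 101: (9, 12), 102: (12, 15)},
}


def overlaps(a, b):
    """Return True if two half-open intervals intersect."""
    return a[0] < b[1] and b[0] < a[1]


def materialize(declared):
    """Build one overlap table per declared (left, right) pair, up front."""
    return {
        (left, right): {
            (lid, rid)
            for (lid, lreg), (rid, rreg) in product(
                REGIONS[left].items(), REGIONS[right].items()
            )
            if overlaps(lreg, rreg)
        }
        for left, right in declared
    }


def query(tables, tract, via):
    """Return (visit_detector, htm) rows for a tract.

    via="visit": keep every detector of any visit overlapping the tract,
    with every HTM ID overlapping that detector.
    via="visit_detector": require the detector and the HTM ID to each
    overlap the tract directly.
    """
    if via == "visit":
        visits = {v for t, v in tables["tract", "visit"] if t == tract}
        keep_vd = {vd for vd in REGIONS["visit_detector"] if vd[0] in visits}
        keep_htm = set(REGIONS["htm"])
    else:
        keep_vd = {vd for t, vd in tables["tract", "visit_detector"] if t == tract}
        keep_htm = {h for t, h in tables["tract", "htm"] if t == tract}
    return {
        (vd, h)
        for vd, h in tables["visit_detector", "htm"]
        if vd in keep_vd and h in keep_htm
    }


# The user declares the combinations they want; nothing else is materialized.
TABLES = materialize([
    ("tract", "visit"),
    ("tract", "visit_detector"),
    ("tract", "htm"),
    ("visit_detector", "htm"),
])
loose = query(TABLES, tract=0, via="visit")
strict = query(TABLES, tract=0, via="visit_detector")
```

With these toy intervals, the `via="visit"` variant also returns detector `(1, "b")` and HTM ID `102`, neither of which overlaps the tract, while the `via="visit_detector"` variant drops them. A real design would presumably express the same choice in terms of join tables in the database rather than Python sets.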

      This ticket is only for design/prototype work, not actually implementing all of the above changes on master.

            Activity

            jbosch Jim Bosch added a comment -

            Most of the prototyping work done here is captured in https://confluence.lsstcorp.org/display/DM/Conclusions+and+Proposals+from+DM-26336+Prototyping, which is a high-level summary of changes I'd like to make in the future. I think that's worth reviewing, even though there probably isn't enough detail there for anyone else to take those descriptions and do the work in a way that takes advantage of the prototyping I've done.

            The full prototyping branch remains at u/jbosch/DM-26336/prototyping, but it's really best considered just "notes to self" at this point. I can clean up and describe various bits of it more as needed in order for others to take over some of the work (e.g. DM-26407). But I don't think that's worth doing across the board given that I'll probably be doing a lot of it.

            There is also a tickets/DM-26336 branch with a PR (https://github.com/lsst/daf_butler/pull/368) that I'd like to get reviewed and merged now. It includes some non-disruptive, noncontroversial baby steps towards the prototyped vision as well as improvements to the named containers module that I've used in the prototype (but can stand on their own).

            salnikov Andy Salnikov added a comment -

            Looks good.


              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Andy Salnikov
              Watchers:
              Andy Salnikov, Jim Bosch
