Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Labels:
- Story Points: 6
- Epic Link:
- Team: Data Release Production
- Urgent?: No
Description
daf.butler.registry.queries is getting creaky and hard to work with; usage patterns and needs have slowly diverged from what it was originally designed to do, and it has accumulated a lot of bad encapsulation, tight coupling, and weird, hard-to-document method preconditions. This has gotten bad enough that it's very hard to include support for querying CALIBRATION collections on DM-24432 without major changes.
We also have a number of related changes on the horizon that we should at least consider in any cleanup or refactoring we do now:
- We need to make additive changes to the dimensions system (adding new dimensions, adding new metadata columns) less disruptive in terms of schema versioning.
- We have discussed replacing the commonSkyPix/HTM-based spatial joins with a system where the user explicitly declares up front the combinations of skymap, instrument, and skypix systems they want to use together, and then we just instantiate those relationship tables up front. That should make for faster queries, simpler query code (no trimming of query rows in postprocessing), and perhaps more configurable flexibility in the future for the kinds of relationships that can exist.
- We need to be able to give users more control over which variants of at least some spatial relationships are used in a query. The use case we have is "give me all visit+detector data IDs for which the visit overlaps the tract, even if the visit+detector doesn't, and all HTM IDs at some level that overlap that visit+detector even if they don't overlap the tract", but I'd like to frame that problem in a more abstract sense before trying to solve it.
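As a way of framing the last two points more concretely, here is a minimal toy sketch, not butler code: it assumes the spatial relationships have already been materialized as plain overlap tables, and it lets the caller pick which relationship variant connects visit+detector data IDs to a tract. All names here (`VISIT_TRACT`, `visit_detectors_for_tract`, the `via` argument, and the integer IDs) are hypothetical and exist only for illustration.

```python
# Toy, hypothetical overlap tables standing in for relationship tables that
# would be instantiated up front. None of this is real butler data or API.
VISIT_TRACT = {(1, 10), (2, 10), (2, 11)}  # (visit, tract)
VISIT_DETECTOR_TRACT = {((1, 0), 10), ((2, 1), 11)}  # ((visit, detector), tract)
VISIT_DETECTOR_HTM = {  # ((visit, detector), htm_id)
    ((1, 0), 100),
    ((1, 1), 101),
    ((2, 0), 102),
    ((2, 1), 103),
}
DETECTORS = {1: [0, 1], 2: [0, 1]}  # detectors present in each visit


def visit_detectors_for_tract(tract, via="visit"):
    """Return (visit, detector) pairs related to a tract.

    via="visit": keep every detector of any visit that overlaps the tract,
    even if that particular detector does not.
    via="visit_detector": keep only detectors that overlap the tract themselves.
    """
    if via == "visit":
        visits = {v for v, t in VISIT_TRACT if t == tract}
        return {(v, d) for v in visits for d in DETECTORS[v]}
    if via == "visit_detector":
        return {vd for vd, t in VISIT_DETECTOR_TRACT if t == tract}
    raise ValueError(f"unknown relationship variant: {via!r}")


def htm_for_visit_detectors(visit_detectors):
    """Return HTM IDs overlapping the given visit+detectors, ignoring the tract."""
    return {h for vd, h in VISIT_DETECTOR_HTM if vd in visit_detectors}


if __name__ == "__main__":
    wide = visit_detectors_for_tract(10, via="visit")
    narrow = visit_detectors_for_tract(10, via="visit_detector")
    print(sorted(wide), sorted(narrow), sorted(htm_for_visit_detectors(wide)))
```

The point of the sketch is only that the "which variant" choice becomes an explicit argument rather than something implied by the join logic; how that would be expressed in the real query system is exactly what this ticket is meant to explore.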
This ticket is only for design/prototype work, not actually implementing all of the above changes on master.
Issue Links
- blocks: DM-24432 Add CALIBRATION collections and remove the calibration_label dimension (Done)
Most of the prototyping work done here is captured in https://confluence.lsstcorp.org/display/DM/Conclusions+and+Proposals+from+DM-26336+Prototyping, which is a high-level summary of changes I'd like to make in the future. I think that's worth reviewing, even though there probably isn't enough there for anyone else who wants to take advantage of the prototyping I've done to take those descriptions and do the work.
The full prototyping branch remains at u/jbosch/DM-26336/prototyping, but it's really best considered just "notes to self" at this point. I can clean up and describe various bits of it more as needed in order for others to take over some of the work (e.g. DM-26407). But I don't think that's worth doing across the board given that I'll probably be doing a lot of it.
There is also a tickets/DM-26336 branch with a PR (https://github.com/lsst/daf_butler/pull/368) that I'd like to get reviewed and merged now. It includes some non-disruptive, noncontroversial baby steps towards the prototyped vision as well as improvements to the named containers module that I've used in the prototype (but can stand on their own).