Fix Version/s: None
Unlike gen2, gen3 can work with multiple instruments, and therefore generally requires that an instrument be specified in a dataId or where clause when selecting datasets.
This requirement to always specify an "unnecessary" instrument has generated some negative feedback.
The problem with globally declaring that a registry only has one instrument is that if you ever change your mind (say, by adding LATISS data to an LSSTCam registry), then all the code that assumes the instrument would default now breaks.
One proposal is to treat instrument as a special property (see also DM-27152) and allow for instruments to be associated with specific collections. If you are using collection "HSC/raw/all" then it is likely that a single instrument is relevant. If a collection happens to include data from multiple instruments, it would not be associated with a defaulted instrument, but that would be rare and in most cases would have been deliberately configured that way.
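A minimal sketch of that defaulting rule, assuming a plain mapping from collection name to the set of instruments whose data it holds. The function name `default_instrument` and the mapping are hypothetical illustrations, not part of daf_butler:

```python
from typing import Iterable, Mapping, Optional, Set


def default_instrument(
    collections: Iterable[str],
    instruments_by_collection: Mapping[str, Set[str]],
) -> Optional[str]:
    """Return the single instrument implied by the collections, if any.

    Hypothetical helper: a default exists only when every collection
    that declares an instrument agrees on the same one.
    """
    found: Set[str] = set()
    for name in collections:
        # Collections with no declared instrument contribute nothing.
        found |= set(instruments_by_collection.get(name, ()))
    if len(found) == 1:
        return next(iter(found))
    # Zero or several instruments: no default can be inferred.
    return None
```

With this rule, a search over "HSC/raw/all" alone would default to its one instrument, while a collection deliberately mixing instruments would yield no default, so the caller would still have to name one explicitly.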
I had some vague ideas for trying to do at least the schema-change part of this on DM-26692, involving a way to let dimensions be declared as special "marker" dimensions (in the market for a better name) that would let them be associated with collections and used to control which spatial overlap pairs are materialized. Instrument and skymap would both be this type.
I was separately wondering (when thinking about DM-27147) about making Instrument.register responsible for creating certain collections (but not populating them). That might be another piece of the puzzle.
I have a more concrete plan for doing this that starts with DM-27251 and runs through DM-24939. I'm not sure I'll get through it all before the middleware stable release, but it's conceivable I'll at least get the schema breakage parts in even if I don't land the functionality that would take advantage of it.
This looks like the next logical step in my query-system work, and of course it comes with useful functionality of its own. The necessary schema changes have indeed already landed.
As part of this, I also plan to make Registry methods pay attention to the collections the corresponding Butler was initialized with, as that will be a source of confusion until it's addressed. I may also remove the tags and chains arguments from Butler construction, as I think they're confusing ways to do things that are already possible with registry calls, and they may get more confusing after this change.
Happy new year, welcome back, here's a big review!
At least it isn't an enormous one, but there are a number of things going on here even though they're all related to the goal of the ticket. As usual, I've tried to make reviewing commit-by-commit make sense and at least a bit easier than looking at all of the changes together, not least because it'll make it clear which things are just moves and mechanical changes.
Changes are essentially all in daf_butler, with a little bit of adjustment downstream in ctrl_mpexec.
I have not thought through how we would modify collections to have this additional metadata or what would need to change to make use of this metadata.
Presumably raw ingest and calibration ingest would know how to declare the instrument for the collection (maybe complaining if a different instrument was already registered), and if we create an output collection it would have to be configured with the instrument from all the input collections.
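One way to picture that bookkeeping is the sketch below. The `CollectionInstruments` class and its methods are hypothetical and kept in memory only; how the real registry would store this metadata is exactly the open question here:

```python
from typing import Dict, Iterable, Optional


class CollectionInstruments:
    """Hypothetical in-memory record of one instrument per collection."""

    def __init__(self) -> None:
        self._by_collection: Dict[str, str] = {}

    def declare(self, collection: str, instrument: str) -> None:
        """Associate an instrument with a collection, as ingest might do."""
        existing = self._by_collection.get(collection)
        if existing is not None and existing != instrument:
            # Complain if a different instrument was already registered.
            raise ValueError(
                f"{collection!r} is already associated with {existing!r}, "
                f"not {instrument!r}"
            )
        self._by_collection[collection] = instrument

    def get(self, collection: str) -> Optional[str]:
        """Return the declared instrument, or None if there is none."""
        return self._by_collection.get(collection)

    def declare_output(self, output: str, inputs: Iterable[str]) -> Optional[str]:
        """Give an output collection the instrument shared by its inputs.

        Returns the inherited instrument, or None when the inputs declare
        no instrument or disagree (the output then gets no default).
        """
        instruments = {
            self._by_collection[name]
            for name in inputs
            if name in self._by_collection
        }
        if len(instruments) != 1:
            return None
        instrument = instruments.pop()
        self.declare(output, instrument)
        return instrument
```

Re-declaring the same instrument is treated as harmless here, so repeated ingests into one collection would not fail; only a conflicting declaration raises.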
The quantum graph builder and butler.get would have to look up the instrument associated with the input collection and add it to the dataId.
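A rough sketch of the dataId step, assuming the default instrument for the input collection has already been looked up. The helper name `with_default_instrument` is hypothetical, and the dataId is modelled as a plain mapping rather than a real daf_butler data ID object:

```python
from typing import Any, Dict, Mapping, Optional


def with_default_instrument(
    data_id: Mapping[str, Any],
    default: Optional[str],
) -> Dict[str, Any]:
    """Return a copy of data_id with the instrument filled in if missing.

    Hypothetical helper: an instrument given explicitly by the caller
    always wins, and nothing is added when no default is available.
    """
    filled = dict(data_id)
    if "instrument" not in filled and default is not None:
        filled["instrument"] = default
    return filled
```

Keeping the explicit value ahead of the default means existing code that already names an instrument would behave exactly as before.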