Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Labels:
- Story Points: 2
- Epic Link:
- Team: Data Release Production
- Urgent?: No
Description
Registry currently allows each child collection of a CHAINED collection to carry a restriction on the dataset types (and, since DM-27251, the governor dimension values) to search for within it. I don't think we actually need that, especially once DM-24939 and any follow-on tickets are done, and it adds a lot of complexity.
To say a bit more: I added that functionality when I introduced CHAINED collections because it was the only way to keep supporting pipetask's interface for setting which dataset types to search for in each input collection during QuantumGraph (QG) generation. That interface has never been used much, but I didn't want to remove it at the time because QG generation was still quite slow, and asking users to use that more cumbersome (but, from butler's perspective, more informative) interface seemed like a way to potentially write much smarter queries. Since then, other changes to the query system have made that less important, and DM-24939 (or a follow-on ticket, if DM-24939 ends up covering only the schema changes) will pretty much finish it off as a potential optimization by putting the information we would have gotten from the user into the database instead.
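To make the difference concrete, here is a toy sketch of the two lookup behaviours described above. It is not daf_butler code: the dict-based `REGISTRY`, the function names, and the dataset names are all hypothetical, and governor dimension restrictions are left out.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for a registry: collection name -> {dataset type -> dataset}.
REGISTRY: Dict[str, Dict[str, str]] = {
    "calib": {"bias": "bias@calib"},
    "run1": {"raw": "raw@run1", "bias": "bias@run1"},
}


def find_in_restricted_chain(
    registry: Dict[str, Dict[str, str]],
    chain: List[Tuple[str, Optional[Set[str]]]],
    dataset_type: str,
) -> Optional[str]:
    """Current behaviour (toy model): each child carries an optional
    set of dataset types that may be searched for within it."""
    for child, allowed in chain:
        if allowed is not None and dataset_type not in allowed:
            continue  # this child is not searched for this dataset type
        found = registry.get(child, {}).get(dataset_type)
        if found is not None:
            return found
    return None


def find_in_chain(
    registry: Dict[str, Dict[str, str]],
    chain: List[str],
    dataset_type: str,
) -> Optional[str]:
    """Proposed behaviour (toy model): the chain is just an ordered list
    of child collection names, and the first match wins."""
    for child in chain:
        found = registry.get(child, {}).get(dataset_type)
        if found is not None:
            return found
    return None
```

With the restriction removed, a chain definition reduces to an ordered list of names, which is the simplification this ticket is after.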
Issue Links
- relates to: DM-27033 Integration of pre-middleware-release dimensions changes (Done)
Andy Salnikov, this should be my last review request (of you) for the schema-stability tickets; it's mostly deletions. The daf_butler PR is here.
Nate Pease [X], could you take a look at the ctrl_mpexec changes, particularly with an eye towards whether the command-line changes reflect the fact that the -i argument(s) should now just form a list of str, with nothing colon-separated in play? And are there any other commands besides pipetask that would need to be updated?
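For reference, a minimal sketch of what "just a list of str" could look like on the command-line side. This is not the ctrl_mpexec implementation: `parse_inputs` and the `--input` long option are hypothetical, and splitting each value on commas is an assumption made for illustration.

```python
import argparse
from typing import List


def parse_inputs(argv: List[str]) -> List[str]:
    """Collect every -i value into one flat, ordered list of collection names."""
    parser = argparse.ArgumentParser(prog="example")
    # Each use of -i appends one raw string to the list.
    parser.add_argument("-i", "--input", action="append", default=[])
    args = parser.parse_args(argv)

    collections: List[str] = []
    for value in args.input:
        # Assumption: one -i value may hold several comma-separated names.
        # Colons get no special treatment; each name is kept as a plain str.
        collections.extend(name.strip() for name in value.split(",") if name.strip())
    return collections
```

For example, `parse_inputs(["-i", "a,b", "-i", "c"])` yields the names in the order given, with no per-collection dataset type attached to any of them.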