Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Story Points: 5
- Epic Link:
- Sprint: DB_F20_09
- Team: Data Access and Database
- Urgent?: No
Description
Create a command-line tool for Registry.queryDatasets as a butler subcommand.
This method has a large set of optional keyword arguments, and I'm not sure they all need to be included in the command-line interface, especially at first. I would certainly skip the dataId argument, as from the command-line it's completely redundant with the where argument. The dimensions argument is probably also less likely to be generally useful.
The other consideration when defining this command-line interface is that we also have many Registry and Butler methods whose primary inputs are the outputs of queryDatasets - in other words, anything that takes an iterable of DatasetRefs in Python will probably need a command-line script that first runs queryDatasets and forwards the DatasetRefs to the method being wrapped. That means all of those commands will probably need to have many of the same command-line options, but sometimes the names will need to change to avoid clashes. For example, Registry.associate takes a "collection" argument of its own, so we'd probably want to map that to something like --output-collection while calling the collection(s) passed to queryDatasets --input-collections.
Also, you (Nate Pease) are more familiar with the command-line argument names already in use than I am, so please don't take anything I've written above as authoritative; consistency across commands is more important than anything I've written here (as long as consistency doesn't lead to ambiguity, of course).
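To make the option-sharing idea above concrete, here is a minimal sketch using only the standard-library argparse module; it is not the actual butler command-line implementation. The helper names (add_query_options, build_parser, query_kwargs), the subcommand names, and the --collections spelling are all hypothetical; only --input-collections, --output-collection, and the where/collections arguments come from the description.

```python
import argparse


def add_query_options(parser, collections_flag="--collections"):
    """Attach the query options shared by every dataset-consuming subcommand.

    ``collections_flag`` lets a wrapping command rename the option so it does
    not clash with a collection argument of the wrapped method.
    (Hypothetical helper, for illustration only.)
    """
    parser.add_argument("dataset_type", help="Dataset type to search for.")
    parser.add_argument(
        collections_flag,
        dest="collections",  # always stored under the same attribute name
        nargs="+",
        default=[],
        help="Collections to search.",
    )
    # dataId is deliberately omitted: on the command line it is redundant
    # with the where expression.
    parser.add_argument("--where", default=None, help="Query expression string.")


def build_parser():
    parser = argparse.ArgumentParser(prog="butler")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Plain query command: the default option name is unambiguous here.
    query = subcommands.add_parser("query-datasets")
    add_query_options(query)

    # Wrapping command: rename the query option and add an output option.
    associate = subcommands.add_parser("associate")
    add_query_options(associate, collections_flag="--input-collections")
    associate.add_argument("--output-collection", required=True)
    return parser


def query_kwargs(args):
    """Collect the keyword arguments a command would forward to the query."""
    return {"collections": args.collections, "where": args.where}
```

Because both subcommands store the value under the same destination, the code that runs the query does not need to know which spelling the user typed. For example, `build_parser().parse_args(["associate", "calexp", "--input-collections", "a", "b", "--output-collection", "tagged"])` yields the same collections list that a query-only command would receive from `--collections a b`.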
Issue Links
- blocks
  - DM-26686 Add command-line tools for Registry.decertify. (To Do)
  - DM-26688 Add command-line tool for Registry.associate (Done)
  - DM-26689 Add command-line tool for Butler.pruneDatasets (Done)
- relates to
  - DM-21898 Create command-line tools for Gen3 repo administration (Won't Fix)
One of us should change the Registry API either on this ticket or very soon after, and I think "deduplicate" also appears in a few other places in lower-level code. I would be happy not to be the one to do it; in daf_butler, a straightforward grep for "deduplicate" should find all occurrences. But there may be some downstream breakage from that, too.