I've left a few minor comments on the PR. And I've got a few behavior requests, too:
If the user doesn't pass any of -purge, disassociate, or unstore, this doesn't actually do anything. -unstore seems like an okay default; and maybe we keep the option around so you can say "disassociate and also unstore"?
It'd be nice to have the dry-run functionality check for a few obvious ways things could go wrong:
- if the user passed --purge, is that collection actually a RUN-type collection, and are the DatasetRef.run attributes equal to that for all queried datasets?
- if the user passed --disassociate, is that collectoni actually a TAGGED-type collection?
It'd probably be best to perform those checks regardless of the dry-run state, to provide better error messages, but they're all things that would cause lower-level failures when not doing dry-run, so that's where it's most important.
I'm also a bit worried about possibly surprising behavior with unstore and disassociate, because there the actions are "do something to this dataset if there is something to do", and the dry-run outputs can't really tell us whether there is something to do. Maybe the best solution is to update the dry-run/confirm text according to the action, i.e: instead of "The following datasets will be removed.", we'd say:
- for disassociate, "The following datasets will be disassociated from <COLLECTION> if they are currently present in it (which is not checked):"
- for unstore, "The following datasets will be removed from any datastores in which they are present:"
Maybe someday we could make the query for datasets clever enough to only return datasets that actually would be affected, but we can't right now.
Anyhow, I may have had a lot to say, but this is actually really close, and I don't think I need to look again (unless there's something you want me to look at of course), so I'm hitting Reviewed and Approve now.