# sort collections before pruning in butler prune-collection

#### Details

• Type: Story
• Status: Reviewed
• Resolution: Unresolved
• Fix Version/s: None
• Component/s: None
• Labels: None
• Sprint: DB_S21_12
• Team: Data Access and Database
• Urgent?: No

#### Description

see discussion here

butler prune-collection should sort collections that match the expression it was given in parent->child dependency order before it tries to delete them.
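The ordering described above amounts to a topological sort of the matched collections. The sketch below is only an illustration and is not the actual butler implementation: `sort_parents_first` and the `children_of` mapping (chained collection name to its child names) are hypothetical stand-ins for whatever the registry provides, and it assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def sort_parents_first(matched, children_of):
    """Order collection names so every parent precedes its children.

    matched: iterable of collection names that matched the expression.
    children_of: hypothetical dict mapping a chained collection name to
        the list of its child collection names.
    """
    matched = set(matched)
    sorter = TopologicalSorter()
    for name in sorted(matched):
        sorter.add(name)
        for child in children_of.get(name, ()):
            if child in matched:
                # add(node, *predecessors): predecessors are emitted first,
                # so registering the parent as a predecessor of the child
                # puts the parent earlier in the deletion order.
                sorter.add(child, name)
    # static_order() raises graphlib.CycleError if the chains form a cycle.
    return list(sorter.static_order())


if __name__ == "__main__":
    chains = {
        "chain/all": ["chain/a", "run/1"],
        "chain/a": ["run/1", "run/2"],
    }
    print(sort_parents_first(["run/1", "run/2", "chain/a", "chain/all"], chains))
```

Deleting in this order means that, by the time a child collection is removed, any matched parent that referenced it is already gone. Parents that were not matched by the expression are not handled by this sketch.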

#### Activity

Nate Pease added a comment -

Jim Bosch, can we add an argument --unlink <parent> so that a chained collection is removed from its parent before being removed itself, with the caller providing the parent that it should be removed from? I'll include the help text below, and there's a PR you can look at.

    $ butler prune-collection --help
    Usage: butler prune-collection [OPTIONS] REPO COLLECTION

      Remove a collection and possibly prune datasets within it.

      REPO is the URI or path to an existing data repository root or
      configuration file.

      COLLECTION is the Name of the collection to remove. If this is a tagged or
      chained collection, datasets within the collection are not modified unless
      --unstore is passed. If this is a run collection, --purge and --unstore
      must be passed, and all datasets in it are fully removed from the data
      repository.

    Options:
      --purge                  Permit RUN collections to be removed, fully
                               removing datasets within them. Requires --unstore
                               as an added precaution against accidental deletion.
                               Must not be passed if the collection is not a RUN.

      --unstore                Remove all datasets in the collection from all
                               datastores in which they appear.

      --unlink TEXT            Unlink given child COLLECTION from this parent
                               collection.

      -@, --options-file TEXT  URI to YAML file containing overrides of command
                               line options. The YAML should be organized as a
                               hierarchy with subcommand names at the top level
                               options for that subcommand below.

      -h, --help               Show this message and exit.

      See 'butler --help' for more options.
Jim Bosch added a comment -

Good idea, this will give us the functionality we need via something simple enough to implement quickly.

Want to make --unlink a "multiple" arg to allow it to be passed multiple times? That will be rare, so I still think it's appropriate to tailor the interface towards single parents, but since it's possible for a collection to have multiple parents, and it seems like an easy extension, allowing multiple makes sense.

Nate Pease added a comment -

I changed --unlink to allow multiple parents. Please review when you have a chance, thx!
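As a rough model of the behaviour discussed in this thread (unlink the collection from each parent named by the caller, then remove it), here is a minimal sketch. It is not the code from the PR: `unlink_then_remove` and the `chains` dict (parent name to list of child names) are hypothetical, and the real command works against the butler registry rather than a plain dict.

```python
def unlink_then_remove(chains, collection, parents=()):
    """Unlink `collection` from each given parent, then remove it.

    chains: hypothetical dict mapping each chained (parent) collection
        name to the list of its child names; modified in place.
    parents: parent names, one per repeated --unlink value.
    """
    for parent in parents:
        if collection not in chains.get(parent, []):
            raise ValueError(f"{collection!r} is not a child of {parent!r}")
        chains[parent] = [c for c in chains[parent] if c != collection]

    # Refuse to remove a collection that some other parent still references.
    still_linked = sorted(p for p, kids in chains.items() if collection in kids)
    if still_linked:
        raise RuntimeError(f"{collection!r} is still linked from {still_linked}")

    # If the collection was itself a chain, drop its own entry.
    chains.pop(collection, None)


if __name__ == "__main__":
    chains = {"parent/a": ["child", "other"], "parent/b": ["child"]}
    unlink_then_remove(chains, "child", parents=["parent/a", "parent/b"])
    print(chains)
```

Passing a single parent covers the common case; the list form only matters for the rarer collection that has several parents.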

Jim Bosch added a comment -

A few minor comments on the PR, but overall looks good!


#### People

Assignee:
Nate Pease
Reporter:
Nate Pease
Reviewers:
Jim Bosch
Watchers:
Jim Bosch, Nate Pease