Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-26600

Add ability to prune erroneous datasetTypes from gen3 registry

    XMLWordPrintable

    Details

    • Story Points:
      2
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      During development of a pipelinetask it may happen that the developer will erroneously define an incorrect datasetType (with incorrect dimensions, or storageClass, etc.). At this point it is impossible to recover without blowing away the registry or sqlite hacking because any fix to the entry will raise a ValueError: Supplied dataset type inconsistent with registry definition error.

      We should not allow the ability to delete datasets if a dataset type is associated with them that is being deleted. This would imply that the dataset type has been used. Instead a user should first delete the relevant collection/dataset before deleting the dataset type.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment -

            This should probably be accompanied by a command-line tool (as a butler subcommand).

            Show
            jbosch Jim Bosch added a comment - This should probably be accompanied by a command-line tool (as a butler subcommand).
            Hide
            tjenness Tim Jenness added a comment -

            I was wondering about the user interface. How brave are we allowing anyone to force delete anything (by mistake using --force on a calexp dataset type for example). As opposed to having pipetask have an option for overriding the dataset type definition (by pruning and re-registering).

            Show
            tjenness Tim Jenness added a comment - I was wondering about the user interface. How brave are we allowing anyone to force delete anything (by mistake using --force on a calexp dataset type for example). As opposed to having pipetask have an option for overriding the dataset type definition (by pruning and re-registering).
            Hide
            jbosch Jim Bosch added a comment -

            The database constraints as defined will guarantee that we get an exception unless there are no datasets of this dataset type at all in the repo, and I was planning to embrace that and leave it to users to clean up datasets first.

            Show
            jbosch Jim Bosch added a comment - The database constraints as defined will guarantee that we get an exception unless there are no datasets of this dataset type at all in the repo, and I was planning to embrace that and leave it to users to clean up datasets first.
            Hide
            tjenness Tim Jenness added a comment -

            Ok, so you *don't* want the proposed force=True option then.

            Show
            tjenness Tim Jenness added a comment - Ok, so you * don't * want the proposed force=True option then.
            Hide
            jbosch Jim Bosch added a comment -

            I think that's where I stand, though I realize now I had only skimmed the original ticket description and hadn't noticed that part of it until now.

            Basically, it's really easy to write a super safe version of this without --force because the database constraint does all of the work, and it's a pain to write a version with --force that's any kind of safe because dataset deletion across Registry and Datastore is so fraught. So I think we need a very good reason to do --force.

            Show
            jbosch Jim Bosch added a comment - I think that's where I stand, though I realize now I had only skimmed the original ticket description and hadn't noticed that part of it until now. Basically, it's really easy to write a super safe version of this without --force because the database constraint does all of the work, and it's a pain to write a version with --force that's any kind of safe because dataset deletion across Registry and Datastore is so fraught. So I think we need a very good reason to do --force .
            Hide
            tjenness Tim Jenness added a comment -

            Hopefully a quick review. I've self-reviewed it on GitHub with all the things I'm uncertain about. Probably don't need two reviewers so whoever gets there first is fine.

            Show
            tjenness Tim Jenness added a comment - Hopefully a quick review. I've self-reviewed it on GitHub with all the things I'm uncertain about. Probably don't need two reviewers so whoever gets there first is fine.
            Hide
            jbosch Jim Bosch added a comment -

            Reviewed; mostly just answers to your questions.

            Show
            jbosch Jim Bosch added a comment - Reviewed; mostly just answers to your questions.
            Hide
            salnikov Andy Salnikov added a comment -

            I reviewed it too, looks good.

            Show
            salnikov Andy Salnikov added a comment - I reviewed it too, looks good.

              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              erykoff Eli Rykoff
              Reviewers:
              Andy Salnikov, Jim Bosch
              Watchers:
              Andy Salnikov, Eli Rykoff, Jim Bosch, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  CI Builds

                  No builds found.