Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-26600

Add ability to prune erroneous datasetTypes from gen3 registry

    XMLWordPrintable

    Details

    • Story Points:
      2
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      During development of a pipelinetask it may happen that the developer will erroneously define an incorrect datasetType (with incorrect dimensions, or storageClass, etc.). At this point it is impossible to recover without blowing away the registry or sqlite hacking because any fix to the entry will raise a ValueError: Supplied dataset type inconsistent with registry definition error.

      We should not allow the ability to delete datasets if a dataset type is associated with them that is being deleted. This would imply that the dataset type has been used. Instead a user should first delete the relevant collection/dataset before deleting the dataset type.

        Attachments

          Issue Links

            Activity

            No builds found.
            erykoff Eli Rykoff created issue -
            tjenness Tim Jenness made changes -
            Field Original Value New Value
            Labels gen3-middleware
            tjenness Tim Jenness made changes -
            Team Data Access and Database [ 10204 ]
            tjenness Tim Jenness made changes -
            Labels gen3-middleware gen2-deprecation-blocker gen3-middleware
            gruendl Robert Gruendl made changes -
            Remote Link This issue links to "Page (Confluence)" [ 25807 ]
            Hide
            jbosch Jim Bosch added a comment -

            This should probably be accompanied by a command-line tool (as a butler subcommand).

            Show
            jbosch Jim Bosch added a comment - This should probably be accompanied by a command-line tool (as a butler subcommand).
            Hide
            tjenness Tim Jenness added a comment -

            I was wondering about the user interface. How brave are we allowing anyone to force delete anything (by mistake using --force on a calexp dataset type for example). As opposed to having pipetask have an option for overriding the dataset type definition (by pruning and re-registering).

            Show
            tjenness Tim Jenness added a comment - I was wondering about the user interface. How brave are we allowing anyone to force delete anything (by mistake using --force on a calexp dataset type for example). As opposed to having pipetask have an option for overriding the dataset type definition (by pruning and re-registering).
            Hide
            jbosch Jim Bosch added a comment -

            The database constraints as defined will guarantee that we get an exception unless there are no datasets of this dataset type at all in the repo, and I was planning to embrace that and leave it to users to clean up datasets first.

            Show
            jbosch Jim Bosch added a comment - The database constraints as defined will guarantee that we get an exception unless there are no datasets of this dataset type at all in the repo, and I was planning to embrace that and leave it to users to clean up datasets first.
            Hide
            tjenness Tim Jenness added a comment -

            Ok, so you *don't* want the proposed force=True option then.

            Show
            tjenness Tim Jenness added a comment - Ok, so you * don't * want the proposed force=True option then.
            Hide
            jbosch Jim Bosch added a comment -

            I think that's where I stand, though I realize now I had only skimmed the original ticket description and hadn't noticed that part of it until now.

            Basically, it's really easy to write a super safe version of this without --force because the database constraint does all of the work, and it's a pain to write a version with --force that's any kind of safe because dataset deletion across Registry and Datastore is so fraught. So I think we need a very good reason to do --force.

            Show
            jbosch Jim Bosch added a comment - I think that's where I stand, though I realize now I had only skimmed the original ticket description and hadn't noticed that part of it until now. Basically, it's really easy to write a super safe version of this without --force because the database constraint does all of the work, and it's a pain to write a version with --force that's any kind of safe because dataset deletion across Registry and Datastore is so fraught. So I think we need a very good reason to do --force .
            nlust Nate Lust made changes -
            Link This issue relates to DM-21898 [ DM-21898 ]
            tjenness Tim Jenness made changes -
            Description During development of a {{pipelinetask}} it may happen that the developer will erroneously define an incorrect {{datasetType}} (with incorrect {{dimensions}}, or {{storageClass}}, etc.). At this point it is impossible to recover without blowing away the registry or sqlite hacking because any fix to the entry will raise a {{ValueError: Supplied dataset type inconsistent with registry definition}} error.

            It would be useful to have a {{pruneDatasetType}} command to simplify this, with an option {{force=True}} to remove all associated persisted datasets if any were {{put}} with that type.
            During development of a {{pipelinetask}} it may happen that the developer will erroneously define an incorrect {{datasetType}} (with incorrect {{dimensions}}, or {{storageClass}}, etc.). At this point it is impossible to recover without blowing away the registry or sqlite hacking because any fix to the entry will raise a {{ValueError: Supplied dataset type inconsistent with registry definition}} error.

            We should not allow the ability to delete datasets if a dataset type is associated with them that is being deleted. This would imply that the dataset type has been used. Instead a user should first delete the relevant collection/dataset before deleting the dataset type.
            tjenness Tim Jenness made changes -
            Link This issue blocks DM-26856 [ DM-26856 ]
            tjenness Tim Jenness made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            tjenness Tim Jenness made changes -
            Assignee Tim Jenness [ tjenness ]
            tjenness Tim Jenness made changes -
            Team Data Access and Database [ 10204 ] Architecture [ 10304 ]
            Hide
            tjenness Tim Jenness added a comment -

            Hopefully a quick review. I've self-reviewed it on GitHub with all the things I'm uncertain about. Probably don't need two reviewers so whoever gets there first is fine.

            Show
            tjenness Tim Jenness added a comment - Hopefully a quick review. I've self-reviewed it on GitHub with all the things I'm uncertain about. Probably don't need two reviewers so whoever gets there first is fine.
            tjenness Tim Jenness made changes -
            Reviewers Andy Salnikov, Jim Bosch [ salnikov, jbosch ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            Hide
            jbosch Jim Bosch added a comment -

            Reviewed; mostly just answers to your questions.

            Show
            jbosch Jim Bosch added a comment - Reviewed; mostly just answers to your questions.
            jbosch Jim Bosch made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            Hide
            salnikov Andy Salnikov added a comment -

            I reviewed it too, looks good.

            Show
            salnikov Andy Salnikov added a comment - I reviewed it too, looks good.
            tjenness Tim Jenness made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]
            tjenness Tim Jenness made changes -
            Story Points 2

              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              erykoff Eli Rykoff
              Reviewers:
              Andy Salnikov, Jim Bosch
              Watchers:
              Andy Salnikov, Eli Rykoff, Jim Bosch, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  CI Builds

                  No builds found.