  1. Data Management
  2. DM-21764

Better encapsulate dataset storage in Registry

    Details

      Description

      Move registry code involving datasets (and, as needed, runs and collections) into helper classes as the prototype. If possible, defer actually moving from monolithic to split dataset tables to DM-21766.

        Attachments

          Issue Links

            Activity

            jbosch Jim Bosch created issue -
            jbosch Jim Bosch made changes -
            Field Original Value New Value
            Epic Link DM-21254 [ 414685 ]
            jbosch Jim Bosch made changes -
            Link This issue is contained by DM-21231 [ DM-21231 ]
            jbosch Jim Bosch made changes -
            Link This issue blocks DM-21766 [ DM-21766 ]
            jbosch Jim Bosch made changes -
            Link This issue blocks DM-21794 [ DM-21794 ]
            jbosch Jim Bosch made changes -
            Labels gen3-middleware gen2-deprecation-blocker gen3-middleware
            yusra Yusra AlSayyad made changes -
            Epic Link DM-21254 [ 414685 ] DM-22586 [ 427653 ]
            jbosch Jim Bosch made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            jbosch Jim Bosch made changes -
            Description Following the pattern established in DM-17023, have Registry operations on datasets delegate to a polymorphic class hierarchy whose instances represent the storage for a single dataset type.

            This should include different subclasses for:
             * the current monolithic (one dataset table) approach
             * a hybrid with both a thin table for all datasets and wider tables (with dimension links) for each dataset type
             * (at prototype level) a chain of nested instances, for use with multi-user registries

            An important question is whether the new classes should:
             # just use SQLAlchemy objects directly, giving them a one-way composition relationship with Registry, but no way for Registry subclasses to specialize database operations;
             # have a back-pointer to their Registry, and delegate to it for all low-level database operations.

            {{DimensionRecordStorage}} currently does (1), but (2) would work better with the new insert-with-conflict-resolution method added on DM-21201, especially if that shakes up transaction handling.
            Move registry code involving datasets (and, as needed, runs and collections) into helper classes as [the prototype|per https://confluence.lsstcorp.org/display/DM/Architectural+Prototype+for+the+New+Gen3+Registry]. If possible, defer actually moving from monolithic to split dataset tables to DM-21766.
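The two options weighed in the superseded description above can be contrasted with a minimal sketch. It is illustrative only: the standard-library `sqlite3` module stands in for SQLAlchemy, and every class and method name (`DirectStorage`, `DelegatingStorage`, `insert_row`, `IgnoreConflictsRegistry`) is hypothetical rather than taken from daf_butler.

```python
import sqlite3


class Registry:
    """Toy registry that owns the connection and the low-level insert."""

    def __init__(self) -> None:
        self.connection = sqlite3.connect(":memory:")
        self.connection.execute("CREATE TABLE dataset (name TEXT PRIMARY KEY)")

    def insert_row(self, table: str, row: dict) -> None:
        # Low-level operation that subclasses are allowed to specialize.
        columns = ", ".join(row)
        placeholders = ", ".join("?" for _ in row)
        self.connection.execute(
            f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
            tuple(row.values()),
        )

    def count(self) -> int:
        return self.connection.execute("SELECT COUNT(*) FROM dataset").fetchone()[0]


class IgnoreConflictsRegistry(Registry):
    """Hypothetical subclass that silently skips rows that already exist."""

    def insert_row(self, table: str, row: dict) -> None:
        columns = ", ".join(row)
        placeholders = ", ".join("?" for _ in row)
        self.connection.execute(
            f"INSERT OR IGNORE INTO {table} ({columns}) VALUES ({placeholders})",
            tuple(row.values()),
        )


class DirectStorage:
    """Option (1): talks to the database itself and never sees Registry."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def insert(self, name: str) -> None:
        self._connection.execute("INSERT INTO dataset (name) VALUES (?)", (name,))


class DelegatingStorage:
    """Option (2): keeps a back-pointer and lets Registry run the statement."""

    def __init__(self, registry: Registry) -> None:
        self._registry = registry

    def insert(self, name: str) -> None:
        self._registry.insert_row("dataset", {"name": name})
```

With `DelegatingStorage`, a second insert of the same name through an `IgnoreConflictsRegistry` is skipped, because the subclass override is picked up automatically. `DirectStorage` bypasses that override, so the same duplicate raises `sqlite3.IntegrityError`. That is the trade-off the description names: option (1) is simpler, while option (2) lets Registry subclasses specialize database operations.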
            jbosch Jim Bosch made changes -
            Story Points 8 4
            yusra Yusra AlSayyad made changes -
            Epic Link DM-22586 [ 427653 ] DM-23737 [ 431393 ]
            tjenness Tim Jenness made changes -
            Labels gen2-deprecation-blocker gen3-middleware gen2-deprecation-blocker gen3-middleware gen3-registry-incompatibility
            jbosch Jim Bosch made changes -
            Link This issue blocks DM-24432 [ DM-24432 ]
            jbosch Jim Bosch made changes -
            Description Move registry code involving datasets (and, as needed, runs and collections) into helper classes as [the prototype|per https://confluence.lsstcorp.org/display/DM/Architectural+Prototype+for+the+New+Gen3+Registry]. If possible, defer actually moving from monolithic to split dataset tables to DM-21766. Move registry code involving datasets (and, as needed, runs and collections) into helper classes as [the prototype|https://confluence.lsstcorp.org/display/DM/Architectural+Prototype+for+the+New+Gen3+Registry]. If possible, defer actually moving from monolithic to split dataset tables to DM-21766.
            jbosch Jim Bosch added a comment -

            Restarting work on this now. I currently plan to do DM-21766 on the same branch (except they'll actually be per-sets-of-dimensions tables, not per-dataset-type), but not DM-21794 or DM-24432 (which currently seem no harder to do later).

            jbosch Jim Bosch made changes -
            Link This issue blocks DM-24612 [ DM-24612 ]
            jbosch Jim Bosch made changes -
            Link This issue blocks DM-24614 [ DM-24614 ]
            jbosch Jim Bosch added a comment -

            Andy Salnikov, here's another fairly large review for you, but I'm hoping it will at least be familiar given that it's a refactoring pattern I know I've asked you to review before. All changes are in daf_butler (so far; running Jenkins now but I don't expect other packages to break), and most commits are small cleanups or improvements that either set up or react to the three big ones, which:
             * add ABCs for helper classes for datasets in Registry;
             * provide default implementations for those;
             * switch Registry over to using them.

            Jira seems to have picked up an irrelevant PR as well as the right one, which is https://github.com/lsst/daf_butler/pull/266.
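The shape of the three large commits named in this comment (an abstract base class, a default implementation, and a Registry that delegates to them) can be sketched roughly as follows. This is an assumption-laden illustration, not the code in the pull request: the names `DatasetStorage`, `InMemoryDatasetStorage`, `register_dataset_type`, `insert_dataset`, and `find_dataset` are hypothetical, and a dictionary stands in for database tables.

```python
import itertools
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional, Tuple

# Hashable stand-in for a data ID, e.g. (("visit", 1),).
DataId = Tuple[Tuple[str, object], ...]


class DatasetStorage(ABC):
    """Hypothetical interface: storage for the datasets of one dataset type."""

    def __init__(self, dataset_type: str) -> None:
        self.dataset_type = dataset_type

    @abstractmethod
    def insert(self, run: str, data_id: DataId) -> int:
        """Add a dataset and return its new integer ID."""

    @abstractmethod
    def find(self, run: str, data_id: DataId) -> Optional[int]:
        """Return the ID of a matching dataset, or None if there is none."""


class InMemoryDatasetStorage(DatasetStorage):
    """Default implementation for this sketch; a real one would use tables."""

    def __init__(self, dataset_type: str, ids: Iterator[int]) -> None:
        super().__init__(dataset_type)
        self._ids = ids  # ID source shared by all storage instances
        self._rows: Dict[Tuple[str, DataId], int] = {}

    def insert(self, run: str, data_id: DataId) -> int:
        key = (run, data_id)
        if key in self._rows:
            raise ValueError(f"{self.dataset_type} {data_id} already in {run}")
        self._rows[key] = next(self._ids)
        return self._rows[key]

    def find(self, run: str, data_id: DataId) -> Optional[int]:
        return self._rows.get((run, data_id))


class Registry:
    """Keeps one storage object per dataset type and delegates to it."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._storage: Dict[str, DatasetStorage] = {}

    def register_dataset_type(self, name: str) -> None:
        if name not in self._storage:
            self._storage[name] = InMemoryDatasetStorage(name, self._ids)

    def insert_dataset(self, dataset_type: str, run: str, data_id: DataId) -> int:
        return self._storage[dataset_type].insert(run, data_id)

    def find_dataset(self, dataset_type: str, run: str, data_id: DataId) -> Optional[int]:
        return self._storage[dataset_type].find(run, data_id)
```

Because `Registry` only ever calls the abstract interface, a different `DatasetStorage` subclass (for example, one backed by per-dimension-set tables) could be substituted without changing the public `Registry` methods.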
            jbosch Jim Bosch made changes -
            Reviewers Andy Salnikov [ salnikov ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            salnikov Andy Salnikov added a comment -

            Looks OK, few comments on PR.

            salnikov Andy Salnikov made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            jbosch Jim Bosch made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Andy Salnikov
              Watchers:
              Andy Salnikov, Jim Bosch
              Votes:
              0

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.