Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-17023

Refactor the Dimensions and query system

    Details

      Description

      The dimension package introduced in DM-15034 and modified in DM-15675 are a big improvement on what we had before, but the logic for constructing and expanding DataIds in particular is getting quite complex and hard to follow.  Issues include:

      • Use of __new__ for DataId to avoid unnecessary copies
      • (Intentional) pointer aliasing in DataId.entries.
      • Upgrading and downgrading of implied-dependency links from DataId.entries to the main dictionary.
      • Lots of code to traverse DimensionGraphs in different ways in expandDataId, SqlPreFlight, and DimensionGraph itself, none of which can currently be expressed as an iterator.
      • Nomenclature is still confusing: because a Dimension instance represents a table, we're stuck with "entry" as a row in that table, and "link" as its primary key.

      These work well and solve real problems, but they also make it hard to reason about what's going on.

      There are also some problems on the horizon that we need to solve; any major refactor at this stage should at least consider these as well:

      • The ExposureRange Dimension is totally special-cased in preflight right now, because we need different entries for different DatasetTypes within the same Quantum. But this will also be true of other Dimensions in future PipelineTasks (especially if we ever unify Exposure and Visit into Observation).
      • We need to add a level of indirection between data ID keys and Dimension field names, to permit different Instruments to have different preferred observation IDs associated with different database fields.
      • We need to add support for per-Instrument Dimension metadata tables.
      • We'd like to remove the need for PipelineTasks to deal with both uppercase Dimension names and lower-case link names, since there's almost a one-to-one mapping between them (and there will be once ExposureRange is turned into a join for CalibIdentifier).

      This is not a high priority, as we don't need any of this to get the current set of PipelineTasks working (though it may let us avoid hacks in CPP PipelineTasks).

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment -

            The megaticket is almost passing Jenkins and is ready for final review.  Once again I'm trying to split it up between Tim Jenness and Andy Salnikov, and since there are a lot of packages but also a lot of code that has already been reviewed, I'll point out the commits that are new and who I'd like to look at them:

            It's also possible there will be an obs_lsst branch added later, if Tim lands Gen3 support there before I can break his branch .

            Show
            jbosch Jim Bosch added a comment - The megaticket is almost passing Jenkins and is ready for final review.  Once again I'm trying to split it up between Tim Jenness and Andy Salnikov , and since there are a lot of packages but also a lot of code that has already been reviewed, I'll point out the commits that are new and who I'd like to look at them: daf_butler : new commits for both; see comment on PR. obs_base : already fully reviewed (by Tim); only changes are addressing the last review. pipe_base , meas_base , pipe_tasks , skymap : Andy, all commits new. obs_subaru : mostly already reviewed by Tim; one new commit for Andy. obs_decam : Tim, all commits new. ctrl_mpexec : Andy, all commits new. ci_hsc_gen2 : already fully reviewed (by Tim); only changes are addressing the last review. ci_hsc_gen3 : Tim, all commits new. It's also possible there will be an obs_lsst branch added later, if Tim lands Gen3 support there before I can break his branch .
            Hide
            salnikov Andy Salnikov added a comment -

            I checked all PRs that are assigned to me, everything looks good. Removing myself from reviewers.

            Show
            salnikov Andy Salnikov added a comment - I checked all PRs that are assigned to me, everything looks good. Removing myself from reviewers.
            Hide
            tjenness Tim Jenness added a comment -

            I had some minor comments on the pull requests. This is an amazing piece of work.

            Show
            tjenness Tim Jenness added a comment - I had some minor comments on the pull requests. This is an amazing piece of work.
            Hide
            jbosch Jim Bosch added a comment - - edited

            Alrighty, three more packages with changes, so one more extra-final set of (trivial) reviews:

             

            Show
            jbosch Jim Bosch added a comment - - edited Alrighty, three more packages with changes, so one more extra-final set of (trivial) reviews: Andy Salnikov : obs_cfht , obs_ctio0m9 (more of the same config overrides for reference catalogs you saw before) Tim Jenness : obs_lsst (config overrides for reference catalogs and API change updates for the new Instrument classes)  
            Hide
            salnikov Andy Salnikov added a comment -

            I approved both PRs.

            Show
            salnikov Andy Salnikov added a comment - I approved both PRs.

              People

              • Assignee:
                jbosch Jim Bosch
                Reporter:
                jbosch Jim Bosch
                Reviewers:
                Tim Jenness
                Watchers:
                Andy Salnikov, Jim Bosch, Tim Jenness
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel