Data Management / DM-30487

Butler export from QG needs to include more relationship dimensions

    Details

    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      The QuantumGraph butler-export logic developed for the execution butler system has a subtle flaw: it exports the data IDs (and hence the DimensionRecords) of all datasets, but that can miss some "relationship" DimensionElements like "visit_definition". Records for those elements are only reached through data IDs that contain both of the dimensions they relate ("exposure" and "visit"), and a single dataset's data ID may contain only one of them.

      To fix this, it should be sufficient to extend each dataset's data ID with any key-value pairs also present in its quantum's data ID, and then export that.  Something like this:

      full = registry.expandDataId(dataset.dataId, **quantum.dataId.byName())
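
      As a rough illustration of why the merge matters, here is a toy model in which plain dicts stand in for data IDs. It does not use the daf_butler API; the RELATIONSHIP_ELEMENTS table, the helper names, and the "HypoCam" instrument are all hypothetical, and it assumes the dataset and quantum agree on any keys they share.

```python
# Toy model only: plain dicts stand in for data IDs; nothing here is daf_butler API.

# Hypothetical table: each relationship element and the dimensions it relates.
RELATIONSHIP_ELEMENTS = {
    "visit_definition": {"exposure", "visit"},
}


def covered_elements(data_id):
    """Return the relationship elements whose related dimensions all appear in data_id."""
    keys = set(data_id)
    return {
        name
        for name, required in RELATIONSHIP_ELEMENTS.items()
        if required <= keys
    }


def extend_with_quantum(dataset_data_id, quantum_data_id):
    """Add the quantum's key-value pairs to the dataset's data ID.

    Assumes both agree on shared keys; the dataset's values are kept on overlap.
    """
    return {**quantum_data_id, **dataset_data_id}


# A quantum identified by visit, consuming a dataset identified by exposure.
quantum_data_id = {"instrument": "HypoCam", "visit": 42, "detector": 7}
dataset_data_id = {"instrument": "HypoCam", "exposure": 4201, "detector": 7}

# The dataset's data ID alone never has both "exposure" and "visit".
missed = covered_elements(dataset_data_id)

# After merging in the quantum's keys, the relationship element is reachable.
extended = extend_with_quantum(dataset_data_id, quantum_data_id)
found = covered_elements(extended)

print("dataset only:", missed)
print("dataset + quantum:", found)
```

      In the real export, the merged data ID would still need to be expanded against the registry to fetch the records, which is where the cost comes from.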

      Unfortunately, that will be very slow to run on all of the quanta in a big graph.  We probably ought to think about ways to save these records in the QG itself to make it more self-sufficient; we get them (moderately efficiently, in bulk) at QG generation, and then throw them away.  It's either that or wait until we can get bulk data ID expansion working on DM-30438.


              People

              Assignee:
              Unassigned
              Reporter:
              Jim Bosch (jbosch)
              Watchers:
              Jim Bosch, Nate Lust