  1. Data Management
  2. DM-32725

Pipeline graph visualization can't handle metadata datasets

    Details

    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      When trying to run graph visualization on the pipeline in ap_verify_ci_cosmos_pdr2:*

      pipetask build --config diaPipe:apdb.db_url="sqlite:///foo/apdb.db" --pipeline-dot temp.dot -p pipelines/ApVerify.yaml
      dot temp.dot -Tpng > ApVerify.png
      

      I found that many of the metrics are displayed as disconnected pieces of the pipeline graph (see attached image). Specifically, datasets like imageDifference_metadata are shown as preexisting rather than as pipeline outputs.

      The desired behavior would be for metadata (and presumably config) datasets to be displayed as outputs of the appropriate task when they are used as inputs of another task. Presumably, they should still be omitted when they are not used by the pipeline.
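The desired behavior can be sketched roughly as follows. This is only an illustration, not the pipe_base implementation: the `Task` class and `graph_edges` function are hypothetical, and the `<label>_metadata` / `<label>_config` naming is assumed from the `imageDifference_metadata` example above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a pipeline task (not a real pipe_base class)."""
    label: str
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def graph_edges(tasks):
    """Return (edges, preexisting) for a list of Task objects.

    Implicit datasets named "<label>_metadata" or "<label>_config"
    (an assumed naming convention) are drawn as outputs of their task,
    but only when some task in the pipeline consumes them.
    """
    # Every dataset that any task reads.
    consumed = set().union(*(t.inputs for t in tasks))
    edges = []
    produced = set()
    for t in tasks:
        outputs = set(t.outputs)
        for suffix in ("metadata", "config"):
            implicit = f"{t.label}_{suffix}"
            if implicit in consumed:
                outputs.add(implicit)
        produced |= outputs
        edges += [(t.label, ds) for ds in sorted(outputs)]
        edges += [(ds, t.label) for ds in sorted(t.inputs)]
    # Anything read but never produced must already exist in the repository.
    preexisting = consumed - produced
    return edges, preexisting
```

With a metric task that reads `imageDifference_metadata`, the sketch yields an edge from `imageDifference` to that dataset instead of listing it as preexisting, while an unused `imageDifference_config` produces no node at all.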

      *This example cannot be reproduced with the "generic" pipelines in ap_verify because of a configuration bug (DM-32726).

        Attachments

        1. ApPipeWithFakes.png (429 kB)
        2. ApVerify.png (653 kB)
        3. ApVerify-Fixed.png (635 kB)

            Activity

            krzys Krzysztof Findeisen added a comment - edited

            A similar issue that I'm not sure should be a separate bug report: components are also treated as preexisting datasets; for example, calexp.bbox is shown as unrelated to calexp. The desired behavior would be for the main dataset to be presented as the "source" (in some sense) of its components.

            EDIT: this has since been re-reported as DM-34811.
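One possible way to express the component relationship is sketched below. It assumes component datasets are always written as `parent.component` (as in `calexp.bbox`); the helper names are hypothetical and this is not how DM-34811 was actually resolved.

```python
def split_component(name):
    """Split "parent.component" into (parent, component).

    Returns (name, None) when the dataset name has no component part.
    """
    parent, sep, component = name.partition(".")
    return parent, (component if sep else None)


def component_edges(dataset_names):
    """Return (parent, component_dataset) edges for every component name."""
    edges = []
    for name in sorted(dataset_names):
        parent, component = split_component(name)
        if component is not None:
            # Draw the parent dataset as the source of its component.
            edges.append((parent, name))
    return edges
```

Adding these edges to the graph would connect `calexp.bbox` to `calexp` rather than leaving it as a separate preexisting node.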

            nlust Nate Lust added a comment

            This was fixed in https://jira.lsstcorp.org/browse/DM-34811

            krzys Krzysztof Findeisen added a comment - edited

            Confirmed fixed, thanks for looking into this!


              People

              Assignee:
              Unassigned
              Reporter:
              krzys Krzysztof Findeisen
              Watchers:
              John Parejko, Krzysztof Findeisen, Nate Lust, Tim Jenness
              Votes:
              0
