Data Management / DM-8603

Expand the Pegasus-workflow-generating script to consider multiple patches

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      In DM-8339, a Pegasus workflow following ci_hsc was made. As in ci_hsc, it includes only one patch.

      In this story, make a more general workflow that is similar to ci_hsc but considers multiple patches. As in DM-8339, ignore orchestration details and the Executor.

        Attachments

        1. dax_files.png (6.85 MB)
        2. dax_jobs.png (2.11 MB)
        3. HSC-I.png (46 kB)
        4. HSC-R.png (50 kB)

            Activity

            hchiang2 Hsin-Fang Chiang added a comment -

            Uploading the dax graphs, with and without the files, as created in this ticket.

            hchiang2 Hsin-Fang Chiang added a comment -

            Rob Kooper:

            Previously in DM-8339, a workflow (named ciHsc) was made to reproduce the internal processing flow of the ci_hsc package. Used for CI tests in development and monitoring, ci_hsc contains all 10 major science processing tasks as of the end of 2016. It outputs all dataset types as noted here.

            However, because ci_hsc considers only one patch, it does not represent the workflow complexity of making coadds and the steps beyond. Only exposures that overlap a patch of the sky are needed as inputs for making the coadd of that patch, so some lookup or query is needed to determine the inputs of each coadd job. Understanding this complexity is one main motivation of this ticket.
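
To illustrate the kind of lookup meant here, the following is a minimal sketch that maps each patch to the exposures overlapping it. It assumes simple axis-aligned boxes in arbitrary units; the names `Box`, `overlaps`, `inputs_per_patch`, and all ids and coordinates are hypothetical and are not taken from the LSST stack or from this ticket's code.

```python
from collections import namedtuple

# Hypothetical axis-aligned sky box in arbitrary units (an assumption for
# illustration only; this is not the LSST stack geometry API).
Box = namedtuple("Box", ["x_min", "y_min", "x_max", "y_max"])


def overlaps(a, b):
    """Return True if the two boxes share any area."""
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def inputs_per_patch(exposures, patches):
    """Map each patch id to the sorted ids of the exposures overlapping it."""
    return {
        patch_id: sorted(
            exp_id for exp_id, exp_box in exposures.items()
            if overlaps(exp_box, patch_box)
        )
        for patch_id, patch_box in patches.items()
    }


# Made-up patches and exposures.
patches = {"0,0": Box(0, 0, 10, 10), "1,0": Box(10, 0, 20, 10)}
exposures = {
    "visit1-ccd1": Box(2, 2, 6, 6),    # lies inside patch 0,0 only
    "visit1-ccd2": Box(8, 2, 12, 6),   # straddles both patches
    "visit2-ccd1": Box(14, 2, 18, 6),  # lies inside patch 1,0 only
}
lookup = inputs_per_patch(exposures, patches)
```

With one patch every exposure is an input to the single coadd, so no such lookup is needed; with several patches the input list differs per coadd job.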

            This new workflow (named miniHscDrp) uses a subset of the ci_hsc data and the same 10 tasks but considers four patches (through a different skymap config), so it is more complex than the ciHsc workflow. In total this workflow has 321 files and 112 jobs:
            16 ProcessCcd (8 visits, 2 CCDs each),
            1 makeSkyMap,
            32 makeCoaddTempExp (8 visits x 4 patches),
            8 assembleCoadd (4 patches x 2 filters),
            8 detectCoaddSources,
            4 mergeCoaddDetections (4 patches),
            8 measureCoaddSources,
            4 mergeCoaddMeasurements,
            8 forcedPhotCoadd,
            16 forcedPhotCcd,
            and 7 pre-runs. Plots are attached.
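
The job total can be cross-checked with a short sketch. The comment states the multiplicities only for ProcessCcd, makeCoaddTempExp, assembleCoadd, and mergeCoaddDetections; the factors used for the other tasks are assumptions chosen to match the listed counts, and the variable names are hypothetical.

```python
# Dimensions stated in the comment above.
n_visits = 8
n_ccds_per_visit = 2
n_patches = 4
n_filters = 2
n_pre_runs = 7

jobs = {
    "ProcessCcd": n_visits * n_ccds_per_visit,       # stated: 8 visits, 2 CCDs each
    "makeSkyMap": 1,
    "makeCoaddTempExp": n_visits * n_patches,        # stated: 8 visits x 4 patches
    "assembleCoadd": n_patches * n_filters,          # stated: 4 patches x 2 filters
    "detectCoaddSources": n_patches * n_filters,     # assumed: per patch and filter
    "mergeCoaddDetections": n_patches,               # stated: 4 patches
    "measureCoaddSources": n_patches * n_filters,    # assumed: per patch and filter
    "mergeCoaddMeasurements": n_patches,             # assumed: per patch
    "forcedPhotCoadd": n_patches * n_filters,        # assumed: per patch and filter
    "forcedPhotCcd": n_visits * n_ccds_per_visit,    # assumed: per visit and CCD
    "pre-runs": n_pre_runs,
}

total_jobs = sum(jobs.values())
print(total_jobs)  # 112, matching the total quoted above
```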

            hchiang2 Hsin-Fang Chiang added a comment -

            My mistake messing up Jira. Thank you Mikolaj for reviewing.

            hchiang2 Hsin-Fang Chiang added a comment -

            Merged.

            Two known caveats at this moment:

            • There is a bit of circular logic in coming up with this workflow. Some of the late steps are not really fixed until some early steps are done. As discussed yesterday, that may mean the later part should only be "exploded" after all needed results are obtained from the early steps. Another idea (which wasn't brought up yesterday, but was a while back) is to make a supergraph of the actual graph.
            • Right now the input data file inputData.py contains code in addition to data. I created DM-9234 to fix miniHscDrp/inputData.py. For now I do not intend to change ciHsc/inputData.py, as it was copied from the original ci_hsc package, and leaving it unchanged may make comparison easier.
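
The first caveat, expanding the later part only after the early steps have produced their results, could look roughly like the sketch below. It is a deliberately simplified illustration and not the code merged in this ticket: the function names, the tuple representation of jobs, and the visit and patch ids are hypothetical, and filters are ignored.

```python
def early_jobs(visits):
    """Jobs whose inputs are known up front (one per visit in this toy case)."""
    return [("ProcessCcd", visit) for visit in visits]


def late_jobs(visits_per_patch):
    """Expand patch-level jobs once it is known which visits touch each patch.

    visits_per_patch stands in for a result that only becomes available
    after the early jobs have run (an assumption of this sketch).
    """
    jobs = []
    for patch, visits in sorted(visits_per_patch.items()):
        jobs.extend(("makeCoaddTempExp", visit, patch) for visit in visits)
        jobs.append(("assembleCoadd", patch))
    return jobs


# Phase 1: fixed from the input data alone.
phase1 = early_jobs([1, 2])

# Phase 2: generated only after phase 1 reports the overlaps (made-up values).
phase2 = late_jobs({"0,0": [1, 2], "0,1": [2]})
```

The supergraph idea would instead declare every possible visit and patch pairing up front and prune the ones that turn out to be unnecessary.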
            hchiang2 Hsin-Fang Chiang added a comment -

            Uploading a corrected graph. The previous version has a stupid error I introduced by hand.


              People

              • Assignee:
                hchiang2 Hsin-Fang Chiang
                Reporter:
                hchiang2 Hsin-Fang Chiang
                Reviewers:
                Mikolaj Kowalik
                Watchers:
                Hsin-Fang Chiang, Mikolaj Kowalik, Steve Pietrowicz
