Data Management / DM-12888

Refactor pipe_analysis scripts to be able to process data in parallel

    Details

    • Type: Story
    • Status: Won't Fix
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: pipe_analysis, QA
    • Labels:
      None
    • Team:
      Data Release Production

      Description

      As running pipe_analysis scripts on significant chunks of data becomes more common, it is increasingly desirable for the scripts to use cluster computing resources the same way other driver scripts do. It seems the tasks could be refactored to inherit from BatchParallelTask rather than just CmdLineTask. As of now, there doesn't seem to be a way to parallelize the computations or plot generation; Lauren MacArthur tested the -j flag, and that seemed to fail on the cluster at lsst-dev.

      Currently, since the processing happens in serial, whenever I want to run the scripts on a significant chunk of data, I end up launching a slurm job array with each subjob requesting a single core, which ends up clogging the cluster. If IHS-576 gets implemented, that will help, but perhaps this refactoring would not be difficult and could get done sooner?
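
      To illustrate the general idea only (this is not the pipe_analysis or BatchParallelTask API), here is a minimal Python sketch of mapping one analysis function over independent data IDs inside a single multi-core job, rather than submitting one single-core subjob per ID. The names analyze_patch and run_parallel and the (tract, patch) ID format are hypothetical placeholders.

```python
from concurrent.futures import ProcessPoolExecutor


def analyze_patch(data_id):
    """Hypothetical stand-in for one independent unit of serial analysis."""
    tract, patch = data_id
    # Placeholder result; a real script would load data and make plots here.
    return {"tract": tract, "patch": patch, "status": "done"}


def run_parallel(data_ids, worker=analyze_patch, max_workers=4,
                 executor_cls=ProcessPoolExecutor):
    """Apply `worker` to each data ID using a pool of `max_workers` workers."""
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map() returns results in the same order as the inputs.
        return list(pool.map(worker, data_ids))


if __name__ == "__main__":
    # Hypothetical (tract, patch) identifiers, one per unit of work.
    ids = [(0, "0,0"), (0, "0,1"), (0, "1,0")]
    for result in run_parallel(ids):
        print(result)
```

      With this pattern a single job could request several cores and work through many data IDs, instead of occupying one scheduler slot per ID. It only applies if each unit of work is independent of the others.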

            Activity

            yusra Yusra AlSayyad added a comment -

            After the gen3 migration, it'll be trivially parallelizable.


              People

              Assignee:
              Unassigned
              Reporter:
              tmorton Tim Morton [X] (Inactive)
              Watchers:
              Hsin-Fang Chiang, Jim Bosch, Lauren MacArthur, Paul Price, Tim Morton [X] (Inactive), Yusra AlSayyad

