Data Management / DM-5681

Provide single-visit processing capability as required by HSC

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: pipe_tasks
    • Labels:
      None

      Description

      In DM-3368, we provided a means of running multiple processCcd tasks across an exposure, but without performing the global calibration and related steps provided by HSC's ProcessExposureTask.

      Please augment this with whatever additional capability is required to enable HSC data release processing.

            Activity

            swinbank John Swinbank added a comment -

            Looking at HSC's ProcessExposureTask, I see four global tasks being performed after per-CCD processing:

            • focus
            • photometry
            • astrometry
            • curve of growth

            We know (DM-2913) that the curve of growth code won't be ported to LSST.

            At the Monday morning meeting of 4 April, we discussed the photometric & astrometric solutions and agreed that they did not need to be provided by LSST to regard the HSC port as complete. I think, therefore, that no work on them is required here. Paul Price, do you agree?

            Per Paul Price's clo posting of 4 April, "we can move [the focus solution] into its own job". Can you clarify exactly what this means? In order to close this issue, do we need to provide a new top-level task which provides distributed focus measurement?

            price Paul Price added a comment -

            I will:
            1. Move the focus measurement into its own script. We don't need to run it together with the science CCDs. Nate Lust can ignore the focus CCDs in his port.
            2. Change the singleFrameDriver to operate over CCDs instead of exposures. This will let us parallelise much more widely and remove inefficiency from blocking on the exposure gather/scatter.
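
To illustrate the second point, here is a minimal sketch of iterating over CCDs rather than exposures. It is not the singleFrameDriver code: `expand_to_ccds`, `process_ccd`, and `run_all` are hypothetical names, and the visit and CCD numbers are made up. The idea is only that every (visit, ccd) pair becomes an independent work item, so nothing waits on a per-exposure gather.

```python
from itertools import product


def expand_to_ccds(visits, ccds):
    """Flatten a list of visits into one (visit, ccd) work item per CCD."""
    return [{"visit": v, "ccd": c} for v, c in product(visits, ccds)]


def process_ccd(data_id):
    """Hypothetical stand-in for per-CCD processing; returns a label only."""
    return "processed visit=%d ccd=%d" % (data_id["visit"], data_id["ccd"])


def run_all(data_ids, mapper=map):
    """Apply process_ccd to every work item with no per-exposure gather step."""
    return list(mapper(process_ccd, data_ids))


# Two made-up visits with three CCDs each give six independent work items.
work = expand_to_ccds(visits=[1000, 1002], ccds=range(3))
results = run_all(work)
```

Because `run_all` accepts any map-like callable, the same flat list could in principle be handed to a process pool's `map` method instead of the built-in `map`.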

            price Paul Price added a comment -

            Moving the focus measurement to its own script has been deferred to DM-5904.

            price Paul Price added a comment -

            Thanks for volunteering to review this, Simon Krughoff! There is a single commit in each of ctrl_pool and pipe_drivers.

            pprice@tiger-sumire:/tigress/pprice/dm-5681/ctrl_pool (tickets/DM-5681=) $ git sub
            commit d00b4eb14b749a16b2a9748ed59f35f2c682c5ea
            Author: Paul Price <price@astro.princeton.edu>
            Date:   Fri Apr 29 14:21:48 2016 -0400
             
                add BatchParallelTask to provide simple iteration
                
                This new subclass of BatchCmdLineTask uses the MPI process pool as a
                mere iteration device, like a multi-node version of the '-j'
                command-line argument. Contrast this with BatchPoolTask, which allows
                the user to use the process pool directly (so they can do their own
                scatter/gather workflows).
             
             python/lsst/ctrl/pool/parallel.py | 73 +++++++++++++++++++++++++++++++++++++--
             1 file changed, 71 insertions(+), 2 deletions(-)
             
             
            pprice@tiger-sumire:/tigress/pprice/dm-5681/pipe_drivers (tickets/DM-5681=) $ git sub
            commit c2b27106ef3825ae907fe1de88761cf9ec2f1283
            Author: Paul Price <price@astro.princeton.edu>
            Date:   Fri Apr 29 14:24:44 2016 -0400
             
                singleFrameDriver: convert to using BatchParallelTask
                
                We don't do any scatter/gather workflows, so all we really want
                is a multi-node version of the multiprocessing.Pool, which
                BatchParallelTask provides.
             
             python/lsst/pipe/drivers/singleFrameDriver.py | 97 ++++-----------------------
             1 file changed, 13 insertions(+), 84 deletions(-)
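
The distinction drawn in the two commit messages can be sketched as follows. This is an illustration only and does not use the ctrl_pool API: `measure`, `iterate_only`, and `scatter_gather` are hypothetical names, the zero-point numbers are made up, and the thread-backed `multiprocessing.dummy.Pool` stands in for an MPI process pool because it offers the same `map` interface.

```python
from multiprocessing.dummy import Pool  # thread-backed pool with the multiprocessing.Pool API


def measure(ccd):
    """Hypothetical per-CCD measurement: returns a made-up zero point."""
    return 27.0 + 0.1 * ccd


def iterate_only(pool, ccds):
    """Use the pool as a mere iteration device: map and return."""
    return pool.map(measure, ccds)


def scatter_gather(pool, ccds):
    """Scatter, gather a global value, then scatter again using it."""
    zero_points = pool.map(measure, ccds)          # scatter, then gather
    mean_zp = sum(zero_points) / len(zero_points)  # global step waits on every CCD
    return pool.map(lambda zp: zp - mean_zp, zero_points)  # second scatter


with Pool(4) as pool:
    independent = iterate_only(pool, range(4))
    offsets = scatter_gather(pool, range(4))
```

In the first function no item depends on any other, which is the situation described for singleFrameDriver; the second needs a synchronisation point between the two `map` calls.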
            

            krughoff Simon Krughoff added a comment -

            This looks fine. Just a couple comments. I don't have any way to test this.

            price Paul Price added a comment -

            Thanks Simon Krughoff!

            I added some more documentation (parameters, outputs of the function you identified), and justified my use of super on the GitHub PR (it seems unavoidable). Then I gave it another test, and I'm happy to see that I can now parallelise wider than 112 cores in a single job.

            Merged to master.


              People

              • Assignee:
                price Paul Price
                Reporter:
                swinbank John Swinbank
                Reviewers:
                Simon Krughoff
                Watchers:
                John Swinbank, Paul Price, Simon Krughoff
