Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-7134

singleFrameDriver is only running with a single process

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: pipe_drivers
    • Labels:
      None
    • Story Points:
      0.5
    • Epic Link:
    • Sprint:
      DRP F16-3
    • Team:
      Data Release Production

      Description

      singleFrameDriver.py is only using a single process. The problem appears to be the change to use a ButlerInitializedTaskRunner, which doesn't inherit from BatchTaskRunner.

        Attachments

          Issue Links

            Activity

            Hide
            price Paul Price added a comment -

            Jim Bosch, would you please review this? Here's the patch:

            pprice@tiger-sumire:~/LSST/pipe/drivers (tickets/DM-7134=) $ git sub-patch
            commit a04ae120c7ae27811bebcf7e2b4036ea8c6c039c
            Author: Paul Price <price@astro.princeton.edu>
            Date:   Thu Aug 4 15:51:41 2016 -0400
             
                singleFrameDriver: restore lost parallelism
                
                The ButlerInitializedTaskRunner used to run the SingleFrameDriverTask
                doesn't inherit from the BatchTaskRunner, which is what provides
                parallelism. Made a new SingleFrameTaskRunner with the characteristics
                of both BatchTaskRunner and ButlerInitializedTaskRunner using
                multiple inheritance (safe without further work because the
                specialisations in those TaskRunners are orthogonal).
             
            diff --git a/python/lsst/pipe/drivers/singleFrameDriver.py b/python/lsst/pipe/drivers/singleFrameDriver.py
            index b13f64e..60257d2 100644
            --- a/python/lsst/pipe/drivers/singleFrameDriver.py
            +++ b/python/lsst/pipe/drivers/singleFrameDriver.py
            @@ -1,7 +1,7 @@
             from lsst.pipe.base import ArgumentParser, ButlerInitializedTaskRunner
             from lsst.pipe.tasks.processCcd import ProcessCcdTask
             from lsst.pex.config import Config, Field, ConfigurableField, ListField
            -from lsst.ctrl.pool.parallel import BatchParallelTask
            +from lsst.ctrl.pool.parallel import BatchParallelTask, BatchTaskRunner
             
             class SingleFrameDriverConfig(Config):
                 processCcd = ConfigurableField(target=ProcessCcdTask, doc="CCD processing task")
            @@ -9,12 +9,17 @@ class SingleFrameDriverConfig(Config):
                 ccdKey = Field(dtype=str, default="ccd", doc="DataId key corresponding to a single sensor")
             
             
            +class SingleFrameTaskRunner(BatchTaskRunner, ButlerInitializedTaskRunner):
            +    """Run batches, and initialize Task using a butler"""
            +    pass
            +
            +
             class SingleFrameDriverTask(BatchParallelTask):
                 """Process CCDs in parallel
                 """
                 ConfigClass = SingleFrameDriverConfig
                 _DefaultName = "singleFrameDriver"
            -    RunnerClass = ButlerInitializedTaskRunner
            +    RunnerClass = SingleFrameTaskRunner
             
                 def __init__(self, butler=None, refObjLoader=None, *args, **kwargs):
                     """!
            

            Show
            price Paul Price added a comment - Jim Bosch , would you please review this? Here's the patch: pprice@tiger-sumire:~/LSST/pipe/drivers (tickets/DM-7134=) $ git sub-patch commit a04ae120c7ae27811bebcf7e2b4036ea8c6c039c Author: Paul Price <price@astro.princeton.edu> Date: Thu Aug 4 15:51:41 2016 -0400   singleFrameDriver: restore lost parallelism The ButlerInitializedTaskRunner used to run the SingleFrameDriverTask doesn't inherit from the BatchTaskRunner, which is what provides parallelism. Made a new SingleFrameTaskRunner with the characteristics of both BatchTaskRunner and ButlerInitializedTaskRunner using multiple inheritance (safe without further work because the specialisations in those TaskRunners are orthogonal).   diff --git a/python/lsst/pipe/drivers/singleFrameDriver.py b/python/lsst/pipe/drivers/singleFrameDriver.py index b13f64e..60257d2 100644 --- a/python/lsst/pipe/drivers/singleFrameDriver.py +++ b/python/lsst/pipe/drivers/singleFrameDriver.py @@ -1,7 +1,7 @@ from lsst.pipe.base import ArgumentParser, ButlerInitializedTaskRunner from lsst.pipe.tasks.processCcd import ProcessCcdTask from lsst.pex.config import Config, Field, ConfigurableField, ListField -from lsst.ctrl.pool.parallel import BatchParallelTask +from lsst.ctrl.pool.parallel import BatchParallelTask, BatchTaskRunner class SingleFrameDriverConfig(Config): processCcd = ConfigurableField(target=ProcessCcdTask, doc="CCD processing task") @@ -9,12 +9,17 @@ class SingleFrameDriverConfig(Config): ccdKey = Field(dtype=str, default="ccd", doc="DataId key corresponding to a single sensor") +class SingleFrameTaskRunner(BatchTaskRunner, ButlerInitializedTaskRunner): + """Run batches, and initialize Task using a butler""" + pass + + class SingleFrameDriverTask(BatchParallelTask): """Process CCDs in parallel """ ConfigClass = SingleFrameDriverConfig _DefaultName = "singleFrameDriver" - RunnerClass = ButlerInitializedTaskRunner + RunnerClass = SingleFrameTaskRunner def __init__(self, butler=None, refObjLoader=None, *args, **kwargs): """!
            Hide
            jbosch Jim Bosch added a comment - - edited

            My only concern is that I think you might need to modify BatchTaskRunner.__init__ to use super to ensure that TaskRunner.__init__ is invoked properly; it'd certainly be good practice to do so, even if we're skating by on an edge case. In fact, I'd feel a bit safer if all three derived classes explicitly did that, but I've got to imagine Python also handles multiple inheritance correctly if you don't define __init__ at all.

            Show
            jbosch Jim Bosch added a comment - - edited My only concern is that I think you might need to modify BatchTaskRunner.__init__ to use super to ensure that TaskRunner.__init__ is invoked properly; it'd certainly be good practice to do so, even if we're skating by on an edge case. In fact, I'd feel a bit safer if all three derived classes explicitly did that, but I've got to imagine Python also handles multiple inheritance correctly if you don't define __init__ at all.
            Hide
            price Paul Price added a comment -

            I don't think that's necessary.

            • BatchTaskRunner overrides __init__, run and __call__.
            • ButlerInitializedTaskRunner overrides only makeTask.

            These lists are orthogonal, so the correct parent should be firing without having to worry about the MRO. Since you mentioned it explictly, take the example of __init__:

            • SingleFrameTaskRunner.__init__ isn't defined, so we go looking up the MRO.
            • BatchTaskRunner is next in the MRO, and it defines __init__, so that fires.
            • That explicitly calls TaskRunner.__init__ (not using super).
            • That doesn't call any other __init__ method.
              Note that nothing is calling ButlerInitializedTaskRunner.__init__, and that's good because it doesn't override that method. So I think everything fires that needs to, and nothing fires that shouldn't. If we used super somewhere but not everywhere, that would create trouble, but I think this does the right thing now without any changes required.

            Please let me know if you disagree or still have concerns. I'll wait on your final OK before merging.

            Show
            price Paul Price added a comment - I don't think that's necessary. BatchTaskRunner overrides __init__ , run and __call__ . ButlerInitializedTaskRunner overrides only makeTask . These lists are orthogonal, so the correct parent should be firing without having to worry about the MRO. Since you mentioned it explictly, take the example of __init__ : SingleFrameTaskRunner.__init__ isn't defined, so we go looking up the MRO. BatchTaskRunner is next in the MRO, and it defines __init__ , so that fires. That explicitly calls TaskRunner.__init__ (not using super ). That doesn't call any other __init__ method. Note that nothing is calling ButlerInitializedTaskRunner.__init__ , and that's good because it doesn't override that method. So I think everything fires that needs to, and nothing fires that shouldn't. If we used super somewhere but not everywhere, that would create trouble, but I think this does the right thing now without any changes required. Please let me know if you disagree or still have concerns. I'll wait on your final OK before merging.
            Hide
            jbosch Jim Bosch added a comment -

            I don't disagree with your analysis that the current code is safe, but I still worry that it's fragile w.r.t. the addition of new subclasses. That said, TaskRunner is perhaps not long for this world with SuperTask on the horizon, so it's probably not necessary to fix this now.

            Show
            jbosch Jim Bosch added a comment - I don't disagree with your analysis that the current code is safe, but I still worry that it's fragile w.r.t. the addition of new subclasses. That said, TaskRunner is perhaps not long for this world with SuperTask on the horizon, so it's probably not necessary to fix this now.
            Hide
            price Paul Price added a comment -

            Merged to master.

            Show
            price Paul Price added a comment - Merged to master.
            Hide
            price Paul Price added a comment -

            Thanks, I'll go ahead and merge now.

            I'm not sure how this is any more fragile than usual, nor how super would fix it (using super in a class hierarchy that doesn't all use super is dangerous, and I thought you were against using it for that reason); perhaps you could explain that offline for me?

            Show
            price Paul Price added a comment - Thanks, I'll go ahead and merge now. I'm not sure how this is any more fragile than usual, nor how super would fix it (using super in a class hierarchy that doesn't all use super is dangerous, and I thought you were against using it for that reason); perhaps you could explain that offline for me?
            Hide
            jbosch Jim Bosch added a comment -

            To be honest, while I do remember being unhappy with super in the past, I don't remember the arguments for and against anymore, and I don't think that unease extended to diamond inheritance when all classes have a nontrivial constructors; I think super is till the only way to handle that situation correctly.

            I wonder what the situation is in Python 3 (or futurized Python 2), where I hear super is better - Tim Jenness?

            Show
            jbosch Jim Bosch added a comment - To be honest, while I do remember being unhappy with super in the past, I don't remember the arguments for and against anymore, and I don't think that unease extended to diamond inheritance when all classes have a nontrivial constructors; I think super is till the only way to handle that situation correctly. I wonder what the situation is in Python 3 (or futurized Python 2), where I hear super is better - Tim Jenness ?
            Show
            tjenness Tim Jenness added a comment - - edited Python 3 super docs are at https://docs.python.org/3/library/functions.html?highlight=super#super which then tells you to read https://rhettinger.wordpress.com/2011/05/26/super-considered-super/

              People

              Assignee:
              price Paul Price
              Reporter:
              price Paul Price
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, Paul Price, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.