Data Management / DM-16077

Fix multi-process setup for CmdLineFwk

    Details

      Description

      I have not checked the multi-process option for some time, and it is now broken:

      Traceback (most recent call last):
        File "/project/salnikov/DM-15686/pipe_supertask/bin/stac", line 25, in <module>
          sys.exit(CmdLineFwk().parseAndRun())
        File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 226, in parseAndRun
          return self.runPipeline(qgraph, butler, args)
        File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 372, in runPipeline
          mapFunc(self._executePipelineTask, target_list)
        File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 96, in __call__
          return result.get(self.timeout)
        File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/pool.py", line 644, in get
          raise self._value
        File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
          put(task)
        File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/connection.py", line 206, in send
          self._send_bytes(_ForkingPickler.dumps(obj))
        File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
          cls(buf, protocol).dump(obj)
      TypeError: can't pickle lsst.log.log.log.Log objects
      

      This needs to be fixed ASAP.
       

            Activity

            salnikov Andy Salnikov added a comment -

            The crash evidently happens when multiprocessing tries to pickle a Log object, which is not picklable. This probably means that more is being pickled than I expected; I need to understand what is being passed to the forked method execution.
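
            A minimal sketch of this failure mode, under stated assumptions: `Framework` and its `resource` attribute are hypothetical stand-ins (a `threading.Lock` plays the role of the unpicklable Log object), not the real CmdLineFwk code. When a bound method is handed to a multiprocessing pool, pickle has to serialize the instance it is bound to, so everything stored on `self` travels with it.

```python
import pickle
import threading


class Framework:
    """Hypothetical stand-in for a class that owns an unpicklable resource."""

    def __init__(self):
        # A lock plays the role of the Log object: pickle refuses to serialize it.
        self.resource = threading.Lock()

    def execute(self, item):
        return item * 2


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


fwk = Framework()

# Pickling a bound method means pickling `self` and all of its attributes.
bound_ok = can_pickle(fwk.execute)

# Plain data describing the work to do carries no hidden state.
data_ok = can_pickle({"task": "isr", "visit": 42})

print("bound method picklable:", bound_ok)
print("plain payload picklable:", data_ok)
```

            Passing a small, explicit payload to the worker instead of a method bound to a large object is one way to limit what gets pickled.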

            salnikov Andy Salnikov added a comment -

            After a few trivial improvements that reduce the number of things passed to the multiprocess target method, I get this exception:

            Traceback (most recent call last):
              File "/project/salnikov/DM-15686/pipe_supertask/bin/stac", line 25, in <module>
                sys.exit(CmdLineFwk().parseAndRun())
              File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 226, in parseAndRun
                return self.runPipeline(qgraph, butler, args)
              File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 372, in runPipeline
                mapFunc(self._executePipelineTask, target_list)
              File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 96, in __call__
                return result.get(self.timeout)
              File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/pool.py", line 644, in get
                raise self._value
              File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
                put(task)
              File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/connection.py", line 206, in send
                self._send_bytes(_ForkingPickler.dumps(obj))
              File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
                cls(buf, protocol).dump(obj)
            _pickle.PicklingError: Can't pickle <class 'lsst.daf.butler.core.storageClass.StorageClassDecoratedImageU'>: attribute lookup StorageClassDecoratedImageU on lsst.daf.butler.core.storageClass failed
            

            Something is going on with storage classes which pickle does not like.

            salnikov Andy Salnikov added a comment -

            Our storage classes are generated dynamically from YAML configuration, which is why pickle cannot find them in the module. That makes all quanta non-picklable, because a Quantum holds a DatasetType instance with a StorageClass in it. I guess we need to do something non-trivial to make quanta picklable; maybe provide specialized serialization for StorageClasses?
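
            The lookup failure can be reproduced in isolation. The following is a sketch only: `StorageClass` and `make_storage_class` are hypothetical simplifications, not the daf_butler implementation. Pickle stores a class by reference (module name plus class name), so a class created at runtime and never bound as a module attribute cannot be located.

```python
import pickle


class StorageClass:
    """Hypothetical base class; not the real daf_butler implementation."""

    name = None


def make_storage_class(name):
    # Build a subclass at runtime, the way a configuration-driven factory might.
    # The new class is never bound as an attribute of this module.
    return type("StorageClass" + name, (StorageClass,), {"name": name})


dynamic_cls = make_storage_class("DecoratedImageU")

try:
    pickle.dumps(dynamic_cls())
    error = None
except (pickle.PicklingError, AttributeError) as exc:
    # pickle looks up "StorageClassDecoratedImageU" in the defining module
    # and does not find it there.
    error = exc

print("pickling failed:", error is not None)
```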

            tjenness Tim Jenness added a comment -

            Since the only thing that matters is the name of the storage class, you could pickle just the name and, on unpickling, retrieve the storage class from StorageClassFactory.
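
            One way this suggestion could look, as a sketch under assumptions: the `StorageClassFactory` below is a hypothetical, simplified registry with invented `register` and `get` methods, and `_unpickle_storage_class` is an illustrative helper. None of it is the actual daf_butler API. The idea is that `__reduce__` emits only the name, and the receiving side asks the registry for the matching object.

```python
import pickle


class StorageClassFactory:
    """Hypothetical, simplified registry keyed by storage class name."""

    _registry = {}

    @classmethod
    def register(cls, storage_class):
        cls._registry[storage_class.name] = storage_class

    @classmethod
    def get(cls, name):
        return cls._registry[name]


def _unpickle_storage_class(name):
    # Runs on the receiving side: look the object up by name.
    return StorageClassFactory.get(name)


class StorageClass:
    """Hypothetical base class that pickles itself by name only."""

    name = None

    def __reduce__(self):
        # Serialize only the name; the dynamic class itself is never pickled.
        return (_unpickle_storage_class, (self.name,))


# A dynamically generated subclass, never bound as a module attribute.
dynamic_cls = type("StorageClassDecoratedImageU", (StorageClass,),
                   {"name": "DecoratedImageU"})
sc = dynamic_cls()
StorageClassFactory.register(sc)

restored = pickle.loads(pickle.dumps(sc))
print("round trip returned the registered object:", restored is sc)
```

            In this sketch the lookup raises `KeyError` if the name was never registered, so the approach only works when every storage class is registered before unpickling.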

            salnikov Andy Salnikov added a comment -

            Indeed, this is how I implemented pickling for the DatasetType class, and it seems to work OK, except that it requires the StorageClass to be registered in the factory. I guess all of our StorageClasses are supposed to be registered that way, but in unit tests, for example, we don't do that. More generally, I wonder whether we could simplify things by decoupling DatasetType from StorageClass and keeping the mapping between DatasetType names and StorageClasses somewhere in the Butler.

            I also had to modify the pickle support code in the Butler class to handle the case where "run" is missing from the butler config file (as is the case for ci_hsc).

            salnikov Andy Salnikov added a comment -

            Ready for review. Most changes are trivial (replacing lsst.log with logging), but there is a small piece of code that fixes pickle support, which I hope is now done right. Unit tests were added for it.
            The Jenkins build passes. There is no big rush with this; it can wait.

            salnikov Andy Salnikov added a comment -

            Tim Jenness, thanks for the flake8 fix. I have rebased the branch on the new master, and it now passes both Travis and Jenkins. Do you plan to review pipe_supertask too, or do you trust me on that? If the latter, you can just mark the ticket as reviewed and I'll merge both packages.

            tjenness Tim Jenness added a comment -

            OK, I had a quick look at pipe_supertask. It looks fine, although in daf_butler I have not been stripping "lsst." from the logger name when using the logging package. I hadn't thought about it.

            salnikov Andy Salnikov added a comment -

            Thanks, I'll merge things then. The developer guide says we are supposed to strip "lsst.": https://developer.lsst.io/stack/logging.html#logger-names
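
            A small sketch of what stripping the prefix might look like with the standard logging package. The `logger_name` helper is hypothetical and written here for illustration; it is not taken from the developer guide or from either package.

```python
import logging


def logger_name(module_name, prefix="lsst."):
    """Return module_name without a leading prefix (hypothetical helper)."""
    if module_name.startswith(prefix):
        return module_name[len(prefix):]
    return module_name


# In real code the argument would typically be __name__; a literal module
# name is used here so the example is self-contained.
log = logging.getLogger(logger_name("lsst.pipe.supertask.cmdLineFwk"))
print(log.name)
```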

            salnikov Andy Salnikov added a comment -

            Merged and done


              People

              Assignee:
              salnikov Andy Salnikov
              Reporter:
              salnikov Andy Salnikov
              Reviewers:
              Tim Jenness
              Watchers:
              Andy Salnikov, Jim Bosch, Tim Jenness, Vaikunth Thukral
