# Fix multi-process setup for CmdLineFwk

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points: 2
• Sprint: BG3_F18_10
• Team: Data Access and Database

#### Description

I have not checked the multi-process option for some time, and it is now broken:

    Traceback (most recent call last):
      File "/project/salnikov/DM-15686/pipe_supertask/bin/stac", line 25, in <module>
        sys.exit(CmdLineFwk().parseAndRun())
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 226, in parseAndRun
        return self.runPipeline(qgraph, butler, args)
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 372, in runPipeline
        mapFunc(self._executePipelineTask, target_list)
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 96, in __call__
        return result.get(self.timeout)
      File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/pool.py", line 644, in get
        raise self._value
      File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
        put(task)
      File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/connection.py", line 206, in send
        self._send_bytes(_ForkingPickler.dumps(obj))
      File "/software/lsstsw/stack3_20171023/python/miniconda3-4.3.21/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
        cls(buf, protocol).dump(obj)
    TypeError: can't pickle lsst.log.log.log.Log objects

This needs to be fixed ASAP.

#### Activity

Andy Salnikov added a comment - The crash obviously happens when multiprocessing tries to pickle a Log instance, which is not picklable. This probably means that more is being pickled than I expected; I need to understand what is being passed to the forked method execution.
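One way more can end up being pickled than expected: the first traceback shows a bound method (`self._executePipelineTask`) being handed to the pool, and pickling a bound method also pickles the instance it is bound to, with all of its attributes. The sketch below illustrates that behaviour only. The `Framework` class is a hypothetical stand-in, not the real CmdLineFwk, and a `threading.Lock` is used as an assumed substitute for an unpicklable logger attribute.

```python
import pickle
import threading


class Framework:
    """Hypothetical stand-in for a framework object; not the real CmdLineFwk."""

    def __init__(self):
        # A lock stands in for any unpicklable attribute, such as a
        # logger object that has no pickle support.
        self.log = threading.Lock()

    def execute(self, task):
        return task * 2


fwk = Framework()

try:
    # Pickling a bound method pickles the instance it is bound to,
    # including every attribute stored on that instance.
    pickle.dumps(fwk.execute)
    bound_method_error = None
except TypeError as exc:
    bound_method_error = exc


def execute(task):
    """Module-level function: pickled by reference, carries no instance state."""
    return task * 2


# A plain function is pickled as a module/name reference only.
payload = pickle.dumps(execute)
```

Moving the worker to a module-level function, or keeping unpicklable objects off the instance, are two possible ways to shrink what gets sent to the worker processes.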
    Traceback (most recent call last):
      File "/project/salnikov/DM-15686/pipe_supertask/bin/stac", line 25, in <module>
        sys.exit(CmdLineFwk().parseAndRun())
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 226, in parseAndRun
        return self.runPipeline(qgraph, butler, args)
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 372, in runPipeline
        mapFunc(self._executePipelineTask, target_list)
      File "/project/salnikov/DM-15686/pipe_supertask/python/lsst/pipe/supertask/cmdLineFwk.py", line 96, in __call__
        return result.get(self.timeout)
      File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/pool.py", line 644, in get
        raise self._value
      File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
        put(task)
      File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/connection.py", line 206, in send
        self._send_bytes(_ForkingPickler.dumps(obj))
      File "/software/lsstsw/stack_20181012/python/miniconda3-4.5.4/envs/lsst-scipipe/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
        cls(buf, protocol).dump(obj)
    _pickle.PicklingError: Can't pickle : attribute lookup StorageClassDecoratedImageU on lsst.daf.butler.core.storageClass failed

Something is going on with storage classes which pickle does not like.

Andy Salnikov added a comment - Our storage classes are generated dynamically from YAML configuration, which is why pickle cannot find them in the module. That makes a whole Quantum non-picklable, because a Quantum holds a DatasetType instance with a StorageClass in it. I guess we need to do something non-trivial to make Quanta picklable; maybe provide specialized serialization for StorageClasses?
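The lookup failure can be reproduced without any LSST code. The sketch below is a minimal illustration, assuming a hypothetical `make_storage_class` helper in place of the real YAML-driven generation: pickle stores classes by module and name, so a class that is never bound to a module-level attribute cannot be found again.

```python
import pickle


def make_storage_class(name):
    """Create a class at runtime; hypothetical stand-in for YAML-driven generation."""
    return type(name, (object,), {})


# The class exists only in this variable; no module attribute named
# "StorageClassDecoratedImageU" is ever created.
cls = make_storage_class("StorageClassDecoratedImageU")

try:
    # Pickling an instance first pickles its class by reference, which
    # requires looking the class up by name in its module.
    pickle.dumps(cls())
    lookup_error = None
except pickle.PicklingError as exc:
    lookup_error = exc
```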
Tim Jenness added a comment - Since the only thing that matters is the name of the storage class, you could pickle by pickling the name and unpickle by retrieving it from StorageClassFactory.
Andy Salnikov added a comment - Indeed, this is how I implemented the pickling code for the DatasetType class, and it seems to work OK, except that it requires the StorageClass to be registered in the factory. I guess all of our StorageClasses are supposed to be registered that way, but in unit tests, for example, we don't do that. In general, I wonder whether we could simplify things by decoupling DatasetType from StorageClass and keeping the mapping between DatasetType names and StorageClass somewhere in the Butler. I also had to modify the pickle support code in the Butler class to handle the case where "run" is missing from the butler config file (as is the case for ci_hsc).
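A rough sketch of the pickle-by-name idea discussed above, not the actual daf_butler code: the `_REGISTRY` dictionary, `register` function, and `_rebuild_dataset_type` helper are hypothetical stand-ins for StorageClassFactory and the real DatasetType pickle support. It also shows the limitation mentioned in the comment, since unpickling only works for names that were registered.

```python
import pickle

# Hypothetical registry standing in for StorageClassFactory.
_REGISTRY = {}


class StorageClass:
    """Base class; concrete subclasses are generated at runtime."""

    name = None


def register(name):
    """Generate a subclass dynamically and register one instance under its name."""
    cls = type("StorageClass" + name, (StorageClass,), {"name": name})
    _REGISTRY[name] = cls()
    return _REGISTRY[name]


def _rebuild_dataset_type(name, storage_class_name):
    # Raises KeyError if the storage class was never registered.
    return DatasetType(name, _REGISTRY[storage_class_name])


class DatasetType:
    def __init__(self, name, storage_class):
        self.name = name
        self.storage_class = storage_class

    def __reduce__(self):
        # Serialize the storage class by name only; the dynamically
        # generated class itself never goes through pickle.
        return (_rebuild_dataset_type, (self.name, self.storage_class.name))


original = DatasetType("calexp", register("DecoratedImageU"))
restored = pickle.loads(pickle.dumps(original))
```

After the round trip, `restored.storage_class` is the same registered object as `original.storage_class`, because unpickling retrieves it from the registry rather than reconstructing it.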
Andy Salnikov added a comment - Ready for review. Most changes are trivial (replacing lsst.log with logging), but there is a small piece of code that fixes pickle support, which I hope is now done correctly. Unit tests were added for that. The Jenkins build passes; there is no big rush, so this can wait.
Andy Salnikov added a comment - Tim Jenness, thanks for the flake8 fix. I have rebased the branch on the new master, and it now passes both Travis and Jenkins. Do you plan to review pipe_supertask too, or do you trust me on that? If the latter, you can just mark the ticket as reviewed and I'll merge both packages.
Tim Jenness added a comment - Ok. I had a quick look at pipe_supertask. Looks fine, although in daf_butler I have not been stripping lsst from the name when using the logging package. I hadn't thought about it.
Andy Salnikov added a comment - Thanks, I'll merge things then. The developer guide says we are supposed to strip "lsst.": https://developer.lsst.io/stack/logging.html#logger-names
Andy Salnikov added a comment - Merged and done

#### People

• Assignee: Andy Salnikov
• Reporter: Andy Salnikov
• Reviewers: Tim Jenness
• Watchers: Andy Salnikov, Jim Bosch, Tim Jenness, Vaikunth Thukral