Status: Won't Fix
Fix Version/s: None
Component/s: pipe_analysis, QA
Team: Data Release Production
As running pipe_analysis scripts on significant chunks of data becomes more common, it is increasingly desirable for the scripts to use cluster computing resources in the same way that other driver scripts do. It seems like the tasks could be refactored to inherit from BatchParallelTask rather than just CmdLineTask. As of now, there doesn't seem to be a way to parallelize the computations or plot generation; Lauren MacArthur tested the -j flag, and that seemed to bomb on the cluster at lsst-dev.
Currently, since the processing happens serially, whenever I want to run the scripts on a significant chunk of data, I end up launching a slurm job array in which each subjob requests a single core, which ends up clogging the cluster. If IHS-576 gets implemented, that will help, but perhaps this refactoring would not be difficult and could be done sooner?
After the gen3 migration, it'll be trivially parallelizable.