Details
Type: RFC
Status: Implemented
Resolution: Done
Component/s: DM
Labels: None
Location: this issue page
Description
As part of porting pipeline code from the HSC fork of the stack, we'll need more parallelization features than are currently available in the LSST codebase. Our requirements and the HSC solutions are described here:
https://confluence.lsstcorp.org/display/DM/S15+Short-Term+Parallelization+Middleware+Requirements
I propose that we simply bring the HSC framework over into a new LSST package with very little modification (essentially nomenclature and code-cleanup changes only). This new package would depend on mpi4py, which would be a new third-party package included in the stack. I propose we call the new package ctrl_pool, but I'd be very happy to hear other suggestions.
New parallel driver scripts that rely on ctrl_pool would not go in pipe_tasks; we'd add a new package for these as well (pipe_drivers?). For the most part, these would delegate their work to conventional CmdLineTasks in pipe_tasks that could be run manually on smaller scales, but this may not be entirely possible for all pipelines.
We do not anticipate this being the long-term solution for our parallel execution framework, but we believe the concepts are general enough and the interface abstract enough that it should be fairly easy to modify pipeline code to adapt to a new framework in the future.
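To make the "abstract interface" point concrete, here is a minimal sketch of a driver that only talks to a pool through a `map` call, so the backend can be swapped without touching pipeline code. It is not the HSC or ctrl_pool API: `SerialPool`, `ThreadPool`, `process_ccd`, and `run_driver` are hypothetical names, and standard-library threads stand in for an MPI-backed pool.

```python
from concurrent.futures import ThreadPoolExecutor


def process_ccd(ccd_id):
    """Hypothetical stand-in for the per-unit work a CmdLineTask would do."""
    return ccd_id, ccd_id * ccd_id


class SerialPool:
    """Hypothetical pool that runs everything in the calling process."""

    def map(self, func, items):
        return [func(item) for item in items]


class ThreadPool:
    """Hypothetical pool backed by standard-library threads (not MPI)."""

    def __init__(self, num_workers=4):
        self.num_workers = num_workers

    def map(self, func, items):
        with ThreadPoolExecutor(max_workers=self.num_workers) as executor:
            # Executor.map returns results in the same order as the inputs.
            return list(executor.map(func, items))


def run_driver(pool, ccd_ids):
    """Hypothetical driver: scatter per-CCD work, then gather the results."""
    return dict(pool.map(process_ccd, ccd_ids))


if __name__ == "__main__":
    # The driver code is identical for both backends.
    print(run_driver(SerialPool(), [1, 2, 3]))
    print(run_driver(ThreadPool(num_workers=2), [1, 2, 3]))
```

Under this assumption, moving to a different framework later would mean supplying another object with the same `map` method, while the per-unit function stays runnable on its own at small scales.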
So far, this RFC is all crickets, and I'd like to get started on the implementation. But it's a pretty broad change, and while we've discussed it vaguely a lot, we haven't discussed the details at all. Kian-Tat Lim, should I go ahead and accept this, and assume we'll discuss changes after the HSC prototype is on master, or try to find some RFD time for it before Bremerton?