When more people start using Gen3, we want them to find pipeline definitions in a long-term stable location, to avoid a lot of bad cargo-culting from the beginning. At present, the only truly usable pipeline definition (in the sense that it includes all necessary overloads) is in ci_hsc_gen3, while pipe_tasks includes many pipeline definitions that look usable but actually aren't.
Here's one proposal for how to fix that, involving only naming conventions and not new code:
- Move/split ci_hsc_gen3/pipelines/CiHsc.yaml to obs_subaru/pipelines/DataReleaseProduction.yaml
- Create similar definitions in obs_lsst (for at least imSim) and obs_decam
- Rename pipe_tasks/pipelines/DataReleaseProduction.yaml to give it a leading underscore, indicating that it is intended only for inheritance or composition in other pipelines, not direct execution.
- Remove ProcessCcd.yaml, Forced.yaml, Multiband.yaml, and Coaddition.yaml from pipe_tasks (merging them into DataReleaseProduction.yaml), or rename them with an underscore prefixes as well. Since I don't think we want to also make obs_* versions of these, I would prefer to just remove them.
- Consider renaming any [_]DataReleaseProduction.yaml to [_]DRP.yaml; given how often this will be typed on the command line, I suspect this move would be appreciated.
This of course raises (at least) two questions about some functionality changes:
- Should we (also) have a non-convention way to mark a pipeline definition as "abstract", and not directly runnable, to further prevent devs from thinking they can run a pipe_tasks pipeline just by passing --instrument to the pipetask invocation? I think we should at least use an underscore-prefix convention (that's very visible, and visibility is good here), but if we can easily do more than that, it could prevent some cases of silently incorrect processing (in the sense that important configs could be missed), and that's a sufficiently serious issue that we should guard carefully against it.
- Should we have a way to label sub-pipelines within a Pipeline definition? I think we clearly need one; the original intent of the sub-pipelines in pipe_tasks was good, even if it didn't work out.
- Should we remove the instrument command-line argument from pipetask entirely? If we are encouraging pipeline yaml files to contain per-instrument config overrides directly ever (and I think Nate Lust is on record saying we should prefer that to config-file overrides in obs_ packages, in the long-term) then it is never safe to assume that just passing --instrument to pipetask is sufficient.
I am open to solving these problems via functionality changes instead of just moving pipelines around, if it's easy enough, but we need to do something before Nov. 1.