Details
-
Type:
Story
-
Status: Done
-
Resolution: Done
-
Fix Version/s: None
-
Component/s: obs_decam, obs_lsst, obs_subaru
-
Labels:
-
Story Points:2
-
Epic Link:
-
Team:Data Release Production
-
Urgent?:No
Description
When more people start using Gen3, we want them to find pipeline definitions in a long-term stable location, to avoid a lot of bad cargo-culting from the beginning. At present, the only truly usable pipeline definition (in the sense that it includes all necessary overloads) is in ci_hsc_gen3, while pipe_tasks includes many pipeline definitions that look usable but actually aren't.
Here's one proposal for how to fix that, involving only naming conventions and not new code:
- Move/split ci_hsc_gen3/pipelines/CiHsc.yaml to obs_subaru/pipelines/DataReleaseProduction.yaml
- Create similar definitions in obs_lsst (for at least imSim) and obs_decam
- Rename pipe_tasks/pipelines/DataReleaseProduction.yaml to give it a leading underscore, indicating that it is intended only for inheritance or composition in other pipelines, not direct execution.
- Remove ProcessCcd.yaml, Forced.yaml, Multiband.yaml, and Coaddition.yaml from pipe_tasks (merging them into DataReleaseProduction.yaml), or rename them with an underscore prefixes as well. Since I don't think we want to also make obs_* versions of these, I would prefer to just remove them.
- Consider renaming any [_]DataReleaseProduction.yaml to [_]DRP.yaml; given how often this will be typed on the command line, I suspect this move would be appreciated.
This of course raises (at least) two questions about some functionality changes:
- Should we (also) have a non-convention way to mark a pipeline definition as "abstract", and not directly runnable, to further prevent devs from thinking they can run a pipe_tasks pipeline just by passing --instrument to the pipetask invocation? I think we should at least use an underscore-prefix convention (that's very visible, and visibility is good here), but if we can easily do more than that, it could prevent some cases of silently incorrect processing (in the sense that important configs could be missed), and that's a sufficiently serious issue that we should guard carefully against it.
- Should we have a way to label sub-pipelines within a Pipeline definition? I think we clearly need one; the original intent of the sub-pipelines in pipe_tasks was good, even if it didn't work out.
- Should we remove the instrument command-line argument from pipetask entirely? If we are encouraging pipeline yaml files to contain per-instrument config overrides directly ever (and I think Nate Lust is on record saying we should prefer that to config-file overrides in obs_ packages, in the long-term) then it is never safe to assume that just passing --instrument to pipetask is sufficient.
I am open to solving these problems via functionality changes instead of just moving pipelines around, if it's easy enough, but we need to do something before Nov. 1.
Attachments
Issue Links
- mentioned in
-
Page Loading...
I don't think that we should tell people to never use the pipe_task pipelines. They can be useful when wanting to build your own, or to not bring in any overrides when trying to run something. I do think the _ convention does convey "use only if you know what you are doing"
What do you have against sub pipelines defined that then get integrated into a final one? I think it will make it MUCH more readable that having long files. I really don't see the advantage to changing this to labels within a longer document. I do not know why we would not make obs_* versions of these too. It is also possible that different obs packages will chose to make them, and some may bundle them differently.
I know I would prefer to drop the instrument label, but I know that some people who have poked at gen3 do like it (and specifying the task with -t) I suspect that is because pipelines need time to become familiar.
I like the name DRP, but in general we were encouraging people to use class name conventions with pipelines to get some sort of uniformity in naming. We could make an exception for DRP because it is so commonly used, but I fear the wild west if there is no convention.