As a growing number of science users outside DM have begun using the Science Pipelines, the way we store our pipeline definition YAML files has come under increasing scrutiny. A new structure was proposed and implemented in RFC-775, and then further clarified on Community following initial testing.
The end result of these changes in drp_pipe was a directory structure in which pipeline YAMLs are split between a pipelines directory and a separate ingredients directory.
The pipeline YAMLs in the pipelines directory tend to import those in the ingredients directory. A YAML file in the pipelines directory may then be expanded for the purposes of visualization using pipetask build -p on the command line. This structure works well for the most part in DRP; however, it was not fully rolled out to ap_pipe and cp_pipe at that time.
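As a rough illustration of the expansion step, the sketch below only assembles the command line; it does not run it. The `--show pipeline` option is my assumption about how to print the expanded pipeline, and the helper name and example path are hypothetical.

```python
def expand_pipeline_command(pipeline_yaml: str) -> list[str]:
    """Build the argv for expanding a pipeline YAML with pipetask.

    Only "pipetask build -p" comes from the text above; the
    "--show pipeline" option is an assumption about how to print
    the expanded result.
    """
    return ["pipetask", "build", "-p", pipeline_yaml, "--show", "pipeline"]


# Hypothetical camera-specific pipeline path, for illustration only.
cmd = expand_pipeline_command("drp_pipe/pipelines/ExampleCam/DRP.yaml")
print(" ".join(cmd))
```

In an environment where the Science Pipelines are set up, the returned list could be passed to `subprocess.run(cmd, check=True)`.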
In recent months we've seen an increasing number of pipeline usage issues from science users in the wider community, which seem to fall into one of two camps:
- Directly using YAMLs in the ingredients directory instead of one in the pipelines directory
- Using a generic pipeline YAML in the pipelines directory instead of a camera-specific pipeline YAML
The purpose of this RFC is to agree upon a directory structure which helps to prevent these two failure modes.
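The two failure modes can be expressed as a simple path check. This is a minimal sketch, not an existing tool: the function name is hypothetical, and it assumes that camera-specific pipelines live in a per-camera subdirectory of pipelines (e.g. pipelines/ExampleCam/), which is a layout I am assuming for illustration.

```python
from pathlib import PurePosixPath


def classify_pipeline_path(path: str) -> str:
    """Classify a pipeline YAML path by where it lives in a *_pipe repo.

    Returns "ingredient", "generic", "camera-specific" or "unknown".
    Assumes camera-specific YAMLs sit in a subdirectory of pipelines/.
    """
    parts = PurePosixPath(path).parts
    # Failure mode 1: a YAML taken directly from an ingredients directory.
    if "ingredients" in parts:
        return "ingredient"
    if "pipelines" in parts:
        remainder = parts[parts.index("pipelines") + 1:]
        # Failure mode 2: a YAML sitting directly inside pipelines/.
        if len(remainder) == 1:
            return "generic"
        if len(remainder) >= 2:
            return "camera-specific"
    return "unknown"


for example in (
    "drp_pipe/ingredients/DRP.yaml",             # hypothetical file name
    "ap_pipe/pipelines/ApPipe.yaml",
    "ap_pipe/pipelines/ExampleCam/ApPipe.yaml",  # hypothetical camera
):
    print(example, "->", classify_pipeline_path(example))
```

Only the "camera-specific" result corresponds to the intended usage; the other two map onto the failure modes listed above.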
I would like to propose the following three changes:
Proposal 1: The pipeline YAML storage structure implemented in drp_pipe should also be rolled out to both ap_pipe and cp_pipe (EDIT 2023-05-16: and also cp_verify). Pipeline YAMLs which are not intended for end users to use directly should be moved to a base/template style "ingredients" directory. Conversely, all science-ready pipeline YAMLs which are designed for direct use when reducing data should live in the pipelines directory.
This means that current files such as ap_pipe/pipelines/ApPipe.yaml and cp_pipe/pipelines/cpFlat.yaml will be moved into a separate "ingredients" directory. Moving these files from pipelines into ingredients will help guide the user to a camera-specific implementation of these pipelines instead, helping to resolve failure mode #2.
Proposal 2: In order to clearly identify that "ingredients" YAMLs are "pipeline ingredients", I propose that the ingredients directories be moved to live underneath the pipelines directory, i.e. pipelines/ingredients.
In terms of discoverability, I think this would help users who are navigating through any *_pipe repo to find the pipelines directory first and use that, rather than stumbling across the "ingredients" directory and using it directly.
Proposal 3: To emphasize that the ingredients directory should not be used directly, its name should be prefixed with an underscore, i.e. _ingredients. This would make it obvious at a glance that the directory is not intended for direct use, helping to guide users to the pipelines directory instead. NB: This proposal also works well with Proposal 2 above, ensuring that the ingredients directory will be sorted to the top of the list in a standard file browser.
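Taken together, the relocation and the underscore prefix amount to a path rewrite for existing ingredients YAMLs. The sketch below is illustrative only: the function name and the DRP.yaml file name are hypothetical, and it does not decide which generic YAMLs in pipelines should become ingredients, since that remains a per-file judgment.

```python
from pathlib import PurePosixPath


def proposed_location(path: str) -> str:
    """Map an ingredients YAML path onto the proposed layout.

    <pkg>/ingredients/X.yaml           -> <pkg>/pipelines/_ingredients/X.yaml
    <pkg>/pipelines/ingredients/X.yaml -> <pkg>/pipelines/_ingredients/X.yaml
    Paths with no "ingredients" component are returned unchanged.
    """
    parts = list(PurePosixPath(path).parts)
    if "ingredients" not in parts:
        return path
    i = parts.index("ingredients")
    if i > 0 and parts[i - 1] == "pipelines":
        # Already underneath pipelines/: only the underscore is missing.
        parts[i] = "_ingredients"
    else:
        # Sibling layout: move underneath pipelines/ and add the underscore.
        parts[i:i + 1] = ["pipelines", "_ingredients"]
    return str(PurePosixPath(*parts))


# Hypothetical file name, for illustration only.
print(proposed_location("drp_pipe/ingredients/DRP.yaml"))
```

Any `imports` entries in the pipelines YAMLs that point at the old locations would need the same rewrite.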
I've set the planned end date for this RFC to 1 week from today, which may be extended depending on how much discussion is generated. As the proposed changes are not user-facing (or shouldn't be, unless a user is using an "ingredients" YAML), end users should hopefully not notice any difference in operations following these changes. Keen to hear your thoughts.