Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-927

Clarify pipeline definition YAML directory structures

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      As a broader number of science users from outside of DM have begun using the Science Pipelines, our pipeline definition YAML storage structure has been under increasing scrutiny. A new structure was suggested and implemented in RFC-775, and then further clarified on Community following initial testing:

      The end result of these changes in drp_pipe was a directory structure that looks like this:

      drp_pipe/
        ingredients/
          base_generic1.yaml
          HSC/
            base_specific1.yaml
          DECam/
            base_specific2.yaml
        pipelines/
          generic1.yaml
          HSC/
            specific1.yaml
          DECam/
            specific2.yaml
      

      The pipeline YAMLs in the pipelines directory tend to import those in the ingredients directory. The pipelines directory YAML files may then be expanded for the purposes of visualization using pipetask build -p on the command line. The structure outlined above works well for the most part in DRP, however, this structure was not fully rolled out to both ap_pipe and cp_pipe at that time.

      In recent months we've seen an increasing number of pipeline usage issues crop up from science users in the wider community which seem to fall into one of two camps:

      1. Directly using YAMLs in the ingredients directory instead of one in the pipelines directory
      2. Using a generic pipeline YAML in the pipelines directory instead of a camera-specific pipeline YAML

      The purpose of this RFC is to agree upon a directory structure which aims to help mitigate these two failure modes from cropping up.

      I would like to propose the following three changes:

      Proposal 1

      The pipeline YAML storage structure implemented in DRP_PIPE should also be rolled out to both AP_PIPE and CP_PIPE (EDIT 2023-05-16: and also CP_VERIFY). Pipeline YAMLs which are not intended for end-users to directly use should be moved to a base/template style "ingredients" directory. Conversely, all science ready pipeline YAMLs which are designed for direct use when reducing data should live in the pipelines directory.

      This means that current files such as ap_pipe/pipelines/ApPipe.yaml and cp_pipe/pipelines/cpFlat.yaml will be moved into a separate "ingredients" directory. Removing these files from pipelines into ingredients will instead help guide the user to a camera-specific implementation of the above, helping to resolve failure mode #2.

      Proposal 2

      In order to clearly identify that "ingredients" YAMLs are "pipeline ingredients", I propose that the ingredients directories be moved to live underneath the pipelines directory. I.e.:

      drp_pipe/
        pipelines/
          ingredients/  # (or '_ingredients', see Proposal 3 below)
            base_generic1.yaml
            HSC/
              base_specific1.yaml
            DECam/
              base_specific2.yaml
          generic1.yaml
          HSC/
            specific1.yaml
          DECam/
            specific2.yaml
      

      In terms of discoverability, I think this would help users who are navigating through any *_pipe repo to not first stumble across the "ingredients" directory and begin using that, rather than finding the pipelines directory and use that instead.

      Proposal 3

      To emphasize that the ingredients directory should not be used directly, the name of this directory should be prefixed with a leading underscore, i.e.: _ingredients. This would make it obvious at a glance that this should not be used, helping to guide users to the pipelines directory instead. NB: This proposal also syncs well with Proposal 2 above, ensuring that the ingredients directory will be sorted to the top of the list in a standard file browser.

      I've set the planned end date for this RFC 1 week from today, which may be extended depending on how much discussion is generated. As the proposed changes discussed here are not user-facing (or shouldn't be, unless a user is using an "ingredients" YAML) then hopefully end users should not notice any difference in operations following these changes. Keen to hear your thoughts.

        Attachments

          Issue Links

            Activity

            Hide
            lskelvin Lee Kelvin added a comment -

            Thanks for catching my typo Krzysztof Findeisen, much appreciated. Fixed in both the description and my above comment.

            Happy to also include AP_VERIFY here as well, which I think probably makes sense.

            Show
            lskelvin Lee Kelvin added a comment - Thanks for catching my typo Krzysztof Findeisen , much appreciated. Fixed in both the description and my above comment. Happy to also include AP_VERIFY here as well, which I think probably makes sense.
            Hide
            czw Christopher Waters added a comment -

            If I had to choose, I think I'd slightly prefer Proposal 3, but I'm happy with whatever the consensus decision is.  I'll update the PRs for DM-39212 when we reach that decision.

            Show
            czw Christopher Waters added a comment - If I had to choose, I think I'd slightly prefer Proposal 3, but I'm happy with whatever the consensus decision is.  I'll update the PRs for DM-39212 when we reach that decision.
            Hide
            lskelvin Lee Kelvin added a comment -

            Apologies if I didn't make this clear in the original description: proposals 1, 2 and 3 above are not mutually exclusive. We can choose to adopt all three, none of them, or any combination between.

            Show
            lskelvin Lee Kelvin added a comment - Apologies if I didn't make this clear in the original description: proposals 1, 2 and 3 above are not mutually exclusive. We can choose to adopt all three, none of them, or any combination between.
            Hide
            lskelvin Lee Kelvin added a comment - - edited

            Thanks again all for the discussions here, and a follow-up discussion on Slack in #dm-naming-things. In summary, it looks like we have general support for all three proposals listed above, so I'll adopt this RFC on that basis. Going forward, this means:

            • drp_pipe, ap_pipe, cp_pipe, ap_verify and cp_verify will adopt the new directory hierarchy. Pipelines not intended to be directly used by end-users will be stored in an ingredients directory located at , e.g., *_PIPE/pipelines/_ingredients.
            • All user-facing pipelines will live in the, e.g., *_PIPE/pipelines directory. Those pipelines which have a camera-specific implementation will live in a camera-name sub-directory and are generally preferred (and their generic equivalents moved to the ingredients directory to help prevent their direct use).
            • The current top-level ingredients directories (where they exist) will be renamed with a leading underscore and moved underneath, e.g., *_PIPE/pipelines. New ingredients directories will be constructed where none currently exist.

            There are three implementation tickets for this RFC:

            Thanks again all for your thoughts, much appreciated!

            Show
            lskelvin Lee Kelvin added a comment - - edited Thanks again all for the discussions here, and a follow-up discussion on Slack in #dm-naming-things . In summary, it looks like we have general support for all three proposals listed above, so I'll adopt this RFC on that basis. Going forward, this means: drp_pipe, ap_pipe, cp_pipe, ap_verify and cp_verify will adopt the new directory hierarchy. Pipelines not intended to be directly used by end-users will be stored in an ingredients directory located at , e.g., *_PIPE/pipelines/_ingredients . All user-facing pipelines will live in the, e.g., *_PIPE/pipelines directory. Those pipelines which have a camera-specific implementation will live in a camera-name sub-directory and are generally preferred (and their generic equivalents moved to the ingredients directory to help prevent their direct use). The current top-level ingredients directories (where they exist) will be renamed with a leading underscore and moved underneath, e.g., *_PIPE/pipelines . New ingredients directories will be constructed where none currently exist. There are three implementation tickets for this RFC: cp_pipe & cp_verify: DM-39212 ap_pipe & ap_verify: DM-39214 drp_pipe: DM-39333 Thanks again all for your thoughts, much appreciated!
            Hide
            lskelvin Lee Kelvin added a comment - - edited

            All implementation tickets now merged and branches deleted. An additional README file cleanup ticket has also been merged on DM-39465. Thank you all for your thoughts on this RFC and to the reviewers for checking these edits. Marking this RFC as implemented. Also leaving a comment on RFC-775 linking back to this RFC for further information.

            Show
            lskelvin Lee Kelvin added a comment - - edited All implementation tickets now merged and branches deleted. An additional README file cleanup ticket has also been merged on DM-39465 . Thank you all for your thoughts on this RFC and to the reviewers for checking these edits. Marking this RFC as implemented. Also leaving a comment on RFC-775 linking back to this RFC for further information.

              People

              Assignee:
              lskelvin Lee Kelvin
              Reporter:
              lskelvin Lee Kelvin
              Watchers:
              Christopher Waters, Eli Rykoff, Eric Bellm, Ian Sullivan, Jim Bosch, Krzysztof Findeisen, Lee Kelvin, Meredith Rawls
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.