DM-15257 (Data Management)

Refactor argument parsing to work with drivers that aren't CmdLineTask

    Details

    • Story Points:
      12
    • Team:
      Data Access and Database

      Description

      I'd like to start looking at refactoring the argument parsing infrastructure in pipe_base to be more broadly useful as we transition away from CmdLineTask.  That includes (but is not limited to) making it useful for PipelineTask execution: there will be some command-line driver scripts in Gen3 (such as those for the ingest tasks I'm working on now on DM-15189) that will want a Gen3 Butler and/or to parse config overrides without being PipelineTasks themselves.

      My specific goal is a library of argument parser components that let users do any combination of the following, without assuming that any of them necessarily go together:

      • Construct a Gen2 Butler.
      • Construct a Gen3 Butler.
      • Accept a Gen2 Data ID expression.
      • Accept a Gen3 Data ID expression.
      • Accept pex_config overrides for both Pipelines and Tasks.
      • Set up and control loggers and diagnostic displays/outputs.

      Not all of these components need to go in pipe_base (the Gen3 Data ID parser, at least, absolutely should not), but I'd like them to be usable (and mix-and-match-able) in roughly the same way.  It's also worth pointing out that Andy Salnikov already has some command-line parsing code for configs in pipe_supertask right now that is at least heavily derived from what's in pipe_base, and that it may not be possible for that to use the exact same command-line syntax we use for CmdLineTasks today (because there is no single top-level Task/Config in the PipelineTask world).  I'd like for this ticket to include refactoring that code as well (while consulting with Andy Salnikov to make sure it still meets his needs and does not disrupt his other work, of course).
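
The mix-and-match goal described above could be sketched with the standard library's argparse "parents" mechanism. This is only an illustration under stated assumptions, not the pipe_base or pipe_supertask API: the component functions, the option names (--butler-config, --log-level, -c/--config) and the [LABEL:]FIELD=VALUE override syntax are all hypothetical, and no Butler is actually constructed.

```python
import argparse
from typing import List, Optional, Tuple


def logging_options() -> argparse.ArgumentParser:
    """Hypothetical component: logger control only."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    return parser


def gen3_butler_options() -> argparse.ArgumentParser:
    """Hypothetical component: where to find a Gen3 repository."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--butler-config", required=True, metavar="PATH")
    return parser


def parse_override(text: str) -> Tuple[Optional[str], str, str]:
    """Split '[LABEL:]FIELD=VALUE' into (label, field, value).

    The optional label is an assumed way to name one task in a pipeline,
    since there is no single top-level Task/Config to apply overrides to.
    """
    target, sep, value = text.partition("=")
    label, colon, field = target.rpartition(":")
    if not sep or not field:
        raise argparse.ArgumentTypeError(
            f"expected [LABEL:]FIELD=VALUE, got {text!r}")
    return (label if colon else None, field, value)


def config_override_options() -> argparse.ArgumentParser:
    """Hypothetical component: repeatable config overrides."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-c", "--config", dest="overrides", action="append",
                        default=[], type=parse_override, metavar="OVERRIDE")
    return parser


def build_parser(*components, **kwargs) -> argparse.ArgumentParser:
    """Combine any subset of components into one driver parser."""
    return argparse.ArgumentParser(parents=[c() for c in components], **kwargs)


# A driver that wants a Gen3 repository and logging, but no config overrides.
ingest_parser = build_parser(gen3_butler_options, logging_options)
ingest_args = ingest_parser.parse_args(["--butler-config", "repo/butler.yaml"])

# A driver that wants config overrides and logging, but no Butler at all.
config_parser = build_parser(config_override_options, logging_options)
config_args = config_parser.parse_args(
    ["-c", "isr:doBias=False", "-c", "doWrite=True", "--log-level", "DEBUG"])
overrides: List[Tuple[Optional[str], str, str]] = config_args.overrides
```

Each component is built with add_help=False so that several can be combined without conflicting -h options, and a driver that never asks for a component never sees its arguments. Values are left as strings here; turning them into pex_config overrides is deliberately out of scope for the sketch.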

      A major part of this ticket is extracting information on the kinds of things that need to be command-line configurable from each of the above.  Please ask on #gen3-middleware on Slack if you're not sure who the right person is to extract any of that information from.

            Activity

            jbosch Jim Bosch added a comment -

            > In the process, I'd love to set up a couple of systems parallel to config in order to configure operational parameters (e.g., overwrite/clobber)

            There have been some discussions of how these kinds of configuration will work in Gen3 (see e.g. https://confluence.lsstcorp.org/display/DM/SuperTask+Execution+Steps+and+Shared+Components) that haven't really converged, mostly because no one has had time to make them a top priority.  For this ticket, whatever general system for argument parsing we come up with absolutely needs to be extensible to whatever that system is, but I'm not ready to try to say what that system is yet.

            > and debugging (replacing or supplementing lsstDebug).

            I'm expecting the QAWG to say something about this.  This is another area where we need to be extensible, but I don't know what we'll do yet.

            tjenness Tim Jenness added a comment -

            This ticket has not been commented on for 18 months. How much of this is still critical for Gen2 deprecation?

            jbosch Jim Bosch added a comment (edited) -

            What this ticket needs to do is totally driven by what DM-21898 needs from it. We should certainly consider just writing a new thing that doesn't do anything with Gen2 butlers or data IDs, and not worry about code duplication with the existing argument parser, on the basis that we will just drop it eventually.

            tjenness Tim Jenness added a comment -

            Should we close this ticket? In some sense, it might be nice to move some of the click options that are in ctrl_mpexec down into pipe_base to allow broader usage of things like config overrides, but on the other hand people can just use ctrl_mpexec.

            jbosch Jim Bosch added a comment -

            Yeah, I don't think we have any kind of established guidelines for what goes in pipe_base vs. ctrl_mpexec - ctrl_mpexec now brings in no new DM stack code or (significant) third-party Python packages, and if anything I'd say we should just move it all to pipe_base (while still splitting bare Task off to task_base, with no daf_butler dependency). But even that's a lot of churn for not much gain. In any case, I don't think there's anything sufficiently obviously good to repurpose this ticket to mean, so I'm just closing it as Won't Fix.


              People

              Assignee:
              Unassigned
              Reporter:
              jbosch Jim Bosch
              Watchers:
              Andy Salnikov, Christopher Waters, Jim Bosch, John Parejko, Krzysztof Findeisen, Paul Price, Tim Jenness
              Votes:
              0
