Status: Won't Fix
Fix Version/s: None
Team: Data Access and Database
I'd like to start refactoring the argument parsing infrastructure in pipe_base to make it more broadly useful as we transition away from CmdLineTask. That includes (but is not limited to) making it useful for PipelineTask execution: there will be some command-line driver scripts in Gen3 (such as those for the ingest tasks I'm working on now on DM-15189) that will want a Gen3 Butler and/or to parse config overrides without being PipelineTasks themselves.
My specific goal is a library of argument parser components that let users do any combination of the following, without assuming that any of them necessarily go together:
- Construct a Gen2 Butler.
- Construct a Gen3 Butler.
- Accept a Gen2 Data ID expression.
- Accept a Gen3 Data ID expression.
- Accept pex_config overrides for both Pipelines and Tasks.
- Set up and control loggers and diagnostic displays/outputs.
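One way the mix-and-match idea in the list above could look is sketched below using the standard library's `argparse` "parent parser" mechanism. This is only an illustration under assumptions: the function names (`make_repo_parser`, `make_config_override_parser`, `make_logging_parser`, `build_parser`) and the option names are hypothetical and are not part of pipe_base or any existing package.

```python
import argparse


def make_repo_parser():
    """Hypothetical component: arguments needed to locate a data repository."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("repo", help="path or URI of the data repository")
    return parser


def make_config_override_parser():
    """Hypothetical component: repeatable config overrides, kept as raw strings."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument(
        "-c", "--config", action="append", default=[], metavar="NAME=VALUE",
        help="config override; may be given more than once",
    )
    return parser


def make_logging_parser():
    """Hypothetical component: logger control."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument(
        "--log-level", default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
    )
    return parser


def build_parser(*components, description=None):
    """Combine any subset of component factories into one parser."""
    return argparse.ArgumentParser(
        description=description,
        parents=[make() for make in components],
    )


# A driver script that wants a repository and config overrides, but no
# logging options and no data ID expression.
ingest_parser = build_parser(
    make_repo_parser, make_config_override_parser,
    description="hypothetical ingest driver",
)
args = ingest_parser.parse_args(["/path/to/repo", "-c", "a=1", "-c", "b=2"])
```

Each component is built with `add_help=False` so that only the combined parser defines `-h`, and each factory returns a fresh parser so that two drivers never share argument state.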
Not all of these components need to go in pipe_base (the Gen3 Data ID parser, at least, absolutely should not), but I'd like them to be usable (and mix-and-match-able) in roughly the same way. It's also worth pointing out that Andy Salnikov already has some command-line parsing code for configs in pipe_supertask right now that is at least heavily derived from what's in pipe_base, and that it may not be possible for that to use the exact same command-line syntax we use for CmdLineTasks today (because there is no single top-level Task/Config in the PipelineTask world). I'd like for this ticket to include refactoring that code as well (while consulting with Andy Salnikov to make sure it still meets his needs and does not disrupt his other work, of course).
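To illustrate why overrides need different handling when there is no single top-level Task/Config, here is a minimal sketch that scopes each override to a task label. The `label:field=value` syntax and the function names are assumptions made for this example; they are not taken from pipe_base or pipe_supertask.

```python
def parse_override(text):
    """Split a hypothetical 'label:field=value' override into its three parts.

    The value is returned as an uninterpreted string; only the first '='
    and the first ':' before it are treated as separators.
    """
    target, eq, value = text.partition("=")
    if not eq:
        raise ValueError(f"missing '=' in override: {text!r}")
    label, colon, field = target.partition(":")
    if not colon or not label or not field:
        raise ValueError(f"expected 'label:field=value', got: {text!r}")
    return label, field, value


def group_overrides(texts):
    """Group overrides by task label, preserving command-line order."""
    grouped = {}
    for text in texts:
        label, field, value = parse_override(text)
        grouped.setdefault(label, []).append((field, value))
    return grouped


overrides = group_overrides([
    "taskA:doThing=False",
    "taskB:sub.threshold=5.0",
    "taskA:name=x:y",
])
```

Keeping values as strings defers interpretation to whatever config system eventually applies them, so the parsing step does not need to know the field types.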
A major part of this ticket is extracting information on the kinds of things that need to be command-line configurable from each of the above. Please ask on #gen3-middleware on Slack if you're not sure who the right person is to extract any of that information from.
This ticket has not been commented on for 18 months. How much of this is still critical for gen2 deprecation?
What this ticket needs to do is totally driven by what DM-21898 needs from it. We should certainly consider just writing a new thing that doesn't do anything with Gen2 butlers or data IDs, and not worry about code duplication with the existing argument parser, on the basis that we will just drop it eventually.
Should we close this ticket? In some sense, it might be nice to move some of the click options that are in ctrl_mpexec down into pipe_base to allow broader usage of things like config overrides, but on the other hand people can just use ctrl_mpexec.
Yeah, I don't think we have any kind of established guidelines for what goes in pipe_base vs. ctrl_mpexec - ctrl_mpexec now brings in no new DM stack code or (significant) third-party Python packages, and if anything I'd say we should just move it all to pipe_base (while still splitting bare Task off to task_base, with no daf_butler dependency). But even that's a lot of churn for not much gain. In any case, I don't think there's anything sufficiently obviously good to repurpose this ticket to mean, so I'm just closing it as Won't Fix.
There have been some discussions of how these kinds of configuration will work in Gen3 (see e.g. https://confluence.lsstcorp.org/display/DM/SuperTask+Execution+Steps+and+Shared+Components) that haven't really converged mostly because no one has had time to make them their top priorities. For this ticket, whatever general system for argument parsing we come up with absolutely needs to be extensible to whatever that system is, but I'm not ready to try to say what that system is yet.
I'm expecting the QAWG to say something about this. Another area where we need to be extensible but I don't know what we'll do yet.