Two nearly-complete tickets, DM-33027 and DM-38498, will overhaul our code for building QuantumGraphs, and I'd like to deprecate some of the interfaces they obsolete (or nearly obsolete, with the rest of the work to be done while implementing this RFC).
This includes (all symbols in lsst.pipe.base):
- The TaskDef class in its entirety: a sequence of TaskDef objects is currently how we represent a fully-configured, topologically ordered pipeline, and DM-33027's PipelineGraph will do a much better job of that.
- The Pipeline.toExpandedPipeline and Pipeline.__iter__ methods: these return those sequences/iterators of TaskDef. These are effectively being replaced by Pipeline.to_graph, which returns a PipelineGraph.
- The PipelineDatasetTypes and TaskDatasetTypes classes, in their entirety: these are how we currently extract the dataset types used by a pipeline, but they've actually been pretty fundamentally broken since we introduced the concept of storage class overrides during execution (a problem we've been hacking around in the QG generation algorithm ever since). PipelineGraph will replace these as well.
- The BaseConnection.makeDatasetType method: this is now only used by PipelineDatasetTypes and TaskDatasetTypes.
- The pipeTools module: this contains free functions for sorting pipelines topologically, which is also now a PipelineGraph responsibility.
- The GraphBuilder class and graphBuilder module: DM-38498 will introduce a new, more extensible QuantumGraphBuilder class with a slightly different interface (to support that extensibility), and make the old GraphBuilder delegate to it, but there's no reason to keep GraphBuilder around long after that.
- The QuantumGraph constructor will be modified to take a PipelineGraph instance as a new required argument, and the quanta argument will become a mapping with str (task label) keys instead of TaskDef keys. The QuantumGraph.taskGraph property will be replaced with a QuantumGraph.pipeline_graph property. QuantumGraph currently has a simpler pipeline-graph-like object embedded within it, and it makes sense to use the full PipelineGraph there now that we have it. The findTasksWithInput, findTasksWithOutput, tasksWithDSType, findTaskDefByName, and findTaskDefByLabel methods will also be removed as redundant with functionality provided by PipelineGraph.
- The QuantumGraph getQuantaForTask, getNumberOfQuantaForTask, and getNodesForTask methods will be modified to take a str task label instead of a TaskDef instance.
- The QuantumGraph constructor will also be modified by removing the pruneRefs argument, and the QuantumGraph.pruneGraphFromRefs method will be removed as well. The new QuantumGraphBuilder algorithm will take care of pruning the graph according to adjustQuantum calls as it builds the graph (which is more efficient and much simpler than doing it later), and I believe future use cases for pruning already-built graphs will involve using failed quanta (not failed or nonexistent datasets) as the primary input, so we'll want to rework those interfaces anyway. These deprecations may be deferred until their long-term replacements are more clear.
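To make the shape of the QuantumGraph changes concrete, here is a minimal sketch of label-keyed quanta. Everything in it is a stand-in I made up for illustration (StandInQuantum, StandInQuantumGraph, the task labels, and the data ID strings); it does not use or reproduce the real lsst.pipe.base classes, and only the method names and the str-label keys come from the proposal above.

```python
from __future__ import annotations

from collections.abc import Mapping, Set
from dataclasses import dataclass


@dataclass(frozen=True)
class StandInQuantum:
    """Hypothetical placeholder for a quantum; it only carries a data ID string."""

    data_id: str


@dataclass
class StandInQuantumGraph:
    """Hypothetical stand-in for a graph whose quanta are keyed by task label."""

    # In the proposal this would be a PipelineGraph; any object works in this sketch.
    pipeline_graph: object
    # Keys are str task labels rather than TaskDef instances.
    quanta: Mapping[str, Set[StandInQuantum]]

    def getQuantaForTask(self, task_label: str) -> frozenset[StandInQuantum]:
        # Look up by label; an unknown label yields an empty set in this sketch.
        return frozenset(self.quanta.get(task_label, ()))

    def getNumberOfQuantaForTask(self, task_label: str) -> int:
        return len(self.quanta.get(task_label, ()))


# Example usage with made-up labels and data IDs.
qg = StandInQuantumGraph(
    pipeline_graph=None,
    quanta={
        "isr": {StandInQuantum("visit=1"), StandInQuantum("visit=2")},
        "calibrate": {StandInQuantum("visit=1")},
    },
)
n_isr = qg.getNumberOfQuantaForTask("isr")
calibrate_quanta = qg.getQuantaForTask("calibrate")
```

The point is only that call sites pass a label string where they previously passed a TaskDef; how the real methods treat unknown labels is not specified here.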
And in lsst.ctrl.mpexec:
- All QuantumExecutor, SingleQuantumExecutor, and TaskFactory methods that take TaskDef will be modified to take lsst.pipe.base.pipeline_graph.TaskNode instances instead, with both supported during deprecation, and the keyword argument name changed from taskDef to task_node.
- SeparablePipelineExecutor.make_quantum_graph will need to deprecate and replace its builder argument. I think it's clear that accepting an arbitrary subclass of QuantumGraphBuilder instead would maintain the original intent, but we can't carry the interface over directly: QuantumGraphBuilder instances are given all of their inputs at construction, so we can't pass in a fully-constructed instance if we want to keep pipeline as a separate argument. I think it also needs to drop its where argument, as the base class no longer assumes that there is a (single) where expression for the full graph, so it'd be more appropriate to roll this into the replacement for builder as well, and this is just one example of a subclass-dependent constructor argument. So I think the replacement would be either a factory callback that binds any subclass-specific arguments inside it, or a QuantumGraphBuilder type plus arbitrary kwargs to pass to its constructor in addition to the base-class arguments.
- SimplePipelineExecutor.from_pipeline will be modified to accept a PipelineGraph as well as a Pipeline for its first argument, and support for Sequence[TaskDef] there will be deprecated.
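The two candidate replacements for the builder argument can be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed signature: StandInBuilder, WhereBuilder, StandInExecutor, both method names, and the (pipeline, butler) base-class arguments are all hypothetical, and plain strings stand in for the real pipeline and butler objects.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


class StandInBuilder:
    """Hypothetical stand-in for a builder that gets all inputs at construction."""

    def __init__(self, pipeline: str, butler: str) -> None:
        self.pipeline = pipeline
        self.butler = butler

    def build(self) -> dict[str, Any]:
        # A real builder would return a QuantumGraph; a dict is enough here.
        return {"pipeline": self.pipeline, "butler": self.butler}


class WhereBuilder(StandInBuilder):
    """Hypothetical subclass with a subclass-specific ``where`` argument."""

    def __init__(self, pipeline: str, butler: str, *, where: str = "") -> None:
        super().__init__(pipeline, butler)
        self.where = where

    def build(self) -> dict[str, Any]:
        result = super().build()
        result["where"] = self.where
        return result


class StandInExecutor:
    """Hypothetical stand-in for the executor that owns the butler."""

    def __init__(self, butler: str) -> None:
        self.butler = butler

    def make_quantum_graph_with_factory(
        self, pipeline: str, builder_factory: Callable[[str, str], StandInBuilder]
    ) -> dict[str, Any]:
        # Option A: the caller binds subclass-specific arguments inside the callback.
        return builder_factory(pipeline, self.butler).build()

    def make_quantum_graph_with_type(
        self,
        pipeline: str,
        builder_class: type[StandInBuilder] = StandInBuilder,
        **builder_kwargs: Any,
    ) -> dict[str, Any]:
        # Option B: the caller passes a type plus extra constructor kwargs.
        return builder_class(pipeline, self.butler, **builder_kwargs).build()


executor = StandInExecutor(butler="repo")
graph_a = executor.make_quantum_graph_with_factory(
    "my_pipeline",
    lambda pipeline, butler: WhereBuilder(pipeline, butler, where="visit = 42"),
)
graph_b = executor.make_quantum_graph_with_type(
    "my_pipeline", WhereBuilder, where="visit = 42"
)
```

Both options keep pipeline as a separate argument and move where out of the executor's signature. The callback form keeps the executor's signature fixed, while the type-plus-kwargs form is shorter at call sites but forwards arbitrary keywords through the executor.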
These interfaces are mostly used internally by other middleware code that will be straightforward to update, but there is some direct usage of some of these in tests and test utility code (which should also be easy to fix, but more visible to non-middleware developers). Outside lsst_distrib I would guess that only the Prompt Processing system is using any of these, and I don't anticipate any real difficulty updating code there, either.