Tasks form the reusable building blocks of algorithmic code. While they are often accessed from the command line, by executing a CmdLineTask, they may also be used directly by the Python programmer from within a script or notebook.
Across the stack and its documentation, there are a few different conventions for the names given to methods used as points of entry to the Task logic:
- Sometimes, we use a run method which takes a list of explicit data types;
- Sometimes, we use a run method which takes a Butler dataRef;
- Sometimes, we provide a runDataRef method which takes a dataRef, unpacks it into its constituent parts, and calls run appropriately;
- Sometimes, rather than providing run as the primary point of entry to the task, we use some other name which reflects its functionality (e.g. CharacterizeImageTask.characterize, CalibrateTask.calibrate).
For the convenience of Python programmers, who would like to be able to call Tasks in a uniform way, and to ensure that the arguments to every Task are explicit (rather than an opaque blob), I suggest that we standardize our approach as follows:
- Every Task (including CmdLineTasks) must provide a run method as its primary point of entry. This must take as explicit arguments everything the Task needs to get its job done (i.e., not a dataRef).
- Tasks may provide a runDataRef method which accepts a dataRef as input, extracts the necessary information from that dataRef, and then calls run.
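To make the proposed convention concrete, here is a minimal sketch in plain Python. It does not import anything from the stack: ExampleTask, FakeDataRef, Result, and the dataset names "pixels" and "background" are all hypothetical, invented purely for illustration, and FakeDataRef only imitates the idea of fetching a dataset by name from a dataRef.

```python
from dataclasses import dataclass


class FakeDataRef:
    """Hypothetical stand-in for a Butler dataRef (not the real class)."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def get(self, name):
        # Look up a dataset by name, as a dataRef-like object might.
        return self._datasets[name]


@dataclass
class Result:
    """Hypothetical result container returned by run."""
    total: float


class ExampleTask:
    """Hypothetical Task laid out according to the proposed convention."""

    def run(self, pixels, background):
        # Primary point of entry: every input is an explicit argument.
        return Result(total=sum(p - background for p in pixels))

    def runDataRef(self, dataRef):
        # Optional convenience: unpack the dataRef, then delegate to run.
        return self.run(
            pixels=dataRef.get("pixels"),
            background=dataRef.get("background"),
        )


task = ExampleTask()

# Called directly from a script or notebook with explicit arguments.
direct = task.run(pixels=[3.0, 4.0, 5.0], background=1.0)

# Called through the dataRef wrapper; the result should be identical.
via_ref = task.runDataRef(
    FakeDataRef({"pixels": [3.0, 4.0, 5.0], "background": 1.0})
)
```

In this layout all of the algorithmic work lives in run, so a caller never needs a dataRef to use the Task, and runDataRef stays a thin adapter with no logic of its own.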
Note that adopting this RFC will involve not just changing our documentation and updating existing Tasks to follow the new convention, but also changing the logic in pipe_base which calls CmdLineTask.run with a dataRef.