While trying to figure out pipetask run and a new pipeline simultaneously, I've encountered the following error messages that I believe could be improved:
- If I omit the --butler-config argument, I get sqlite3.OperationalError: no such table: collection. It would be more helpful to warn that no Butler repository has been found/selected.
- If I have a mismatch between the dimensions of a dataset as listed in PipelineTask connections and in the registry, I get ValueError: Supplied dataset type [dataset] inconsistent with registry definition [dataset]. It would help to also mention which task is the problem; the stack trace is all from general-purpose code.
- If I can't account for a particular dataset, I get RuntimeError: Expected exactly one instance of input [dataset]. As above, it would help for debugging to know which task is being processed.
- If I omit the --register-dataset-types argument, I get KeyError: "DatasetType [dataset] could not be found.", which implies a naming mismatch in the pipeline. Since omitting this flag is an extremely common user error, it would be helpful for the message to suggest that the dataset type may simply not be registered.
- If I forget to provide an --input when it's mandatory, I get an Expected exactly one instance of input error with a seemingly arbitrary data ID. This message misleadingly suggests the problem is with the dataset, not the collection.
- An invalid storage class raises a bare KeyError. It would be more helpful to report which connection and/or which dataset type caused the error.
- If I pass a --data-query argument that does not match any IDs, I get UnboundLocalError: local variable 'n' referenced before assignment. An empty query should be handled explicitly by the code.
- If I'm running on a repository with an out-of-date schema, I get sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database. It would be more helpful to say that the schema is out of date.
- Debug log messages need a bit of work to make them usable: ctrl.mpexec.mpGraphExecutor DEBUG: Executing QuantumIterData(index=16, quantum=<lsst.daf.butler.core.quantum.Quantum object at 0x7faeb68908c0>, taskDef=<lsst.pipe.base.pipeline.TaskDef object at 0x7faeb2e05090>, dependencies=frozenset()) identifies the quantum and task only by memory address, which tells the reader nothing about what is actually being executed.
- If --extend-run is given with an output chain that does not exist, we get an IndexError with no explanation. Instead we should report that --extend-run should not be used on the first run (or use this as an opportunity to ignore --extend-run and proceed as if it had not been given).
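For the UnboundLocalError on 'n' above, the underlying pattern is almost certainly a variable bound only inside a loop over query results, so an empty query leaves it undefined. A minimal sketch of the failure mode and the explicit empty-query guard it needs (function and variable names here are illustrative, not the actual pipetask code):

```python
def count_matches(results):
    """Count data IDs matched by a query, failing clearly when empty."""
    # Buggy pattern behind the reported UnboundLocalError:
    #
    #   for n, row in enumerate(results):
    #       process(row)
    #   return n + 1   # UnboundLocalError if results is empty
    #
    # Fixed pattern: bind before the loop and check the empty case
    # explicitly, so the user sees a diagnosis instead of a traceback.
    n = 0
    for row in results:
        n += 1
    if n == 0:
        raise ValueError(
            "--data-query matched no data IDs; check the query expression"
        )
    return n
```

The point is that an empty result set is an expected user-facing condition, not an internal invariant, so it deserves its own branch and message.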
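Several of the bullets (the storage-class KeyError, the inconsistent-dataset ValueError, the exactly-one-instance RuntimeError) share one fix: re-raise with the task and connection that triggered the lookup attached. A hedged sketch of that pattern, with made-up names (lookup_storage_class, KNOWN_STORAGE_CLASSES) rather than the real lsst.daf.butler API:

```python
# Illustrative registry of known storage classes; values are placeholders.
KNOWN_STORAGE_CLASSES = {"ExposureF": object, "SourceCatalog": object}

def lookup_storage_class(name, task_label, connection_name):
    """Resolve a storage class, attaching context on failure."""
    try:
        return KNOWN_STORAGE_CLASSES[name]
    except KeyError as err:
        # Instead of a bare KeyError: 'name', say which task and which
        # connection asked for the lookup, so the user can find the
        # offending line of their PipelineTask connections class.
        raise KeyError(
            f"Unknown storage class {name!r} for connection "
            f"{connection_name!r} of task {task_label!r}"
        ) from err
```

Chaining with "raise ... from err" preserves the original traceback for developers while the message itself points users at their own configuration.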