Fix Version/s: None
Sprint: AP S21-2 (January), AP S21-3 (February)
Attempting to run ap_verify twice in the same location gives the following error:
ValueError: Output run 'ap_verify-output' already exists, but --extend-run was not given.
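The error comes from pipetask run refusing to write into an existing run collection unless it is told to extend it. A minimal sketch in Python of how a driver might choose its collection arguments accordingly. The helper name `collection_args` and the `run_exists` parameter are hypothetical, introduced only for illustration; this is not ap_verify's actual code, and how the caller detects an existing run is left open.

```python
from typing import List


def collection_args(output_run: str, run_exists: bool) -> List[str]:
    """Return collection-related arguments for ``pipetask run``.

    Hypothetical helper: ``run_exists`` must be determined by the caller,
    for example by querying the repository's registry.
    """
    args = ["--output-run", output_run]
    if run_exists:
        # Without this flag, pipetask reports that the output run
        # already exists but --extend-run was not given.
        args.append("--extend-run")
    return args
```

For a first run this yields only `--output-run ap_verify-output`; on a repeat run in the same location it adds `--extend-run`.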
Reconfigure ap_verify to use current pipetask idioms.
Per discussion on #dm-alert-prod, rerun support is mainly desirable for restarting interrupted pipelines. I'm therefore not going to try to make APDB operations (the last step in the pipeline) fully rerunnable, as pipeline failures generally happen before this step, and ensuring self-consistent handling of the database is hard to do at a high level in both Gen 2 and Gen 3. It looks like this case was not supported in the original Gen 2 ap_verify, either.
Full rerun capabilities under all circumstances are blocked by a bug in pipetask run --clobber-partial-outputs; see DM-27492 for details.
I've got support for repeating ap_verify runs that fail before they touch the APDB, which I think was the original level of rerunnability. I've also implemented an analogue of DM-18715 for Gen 3 (the --clean-run flag); by default, Gen 3 processing will reuse old runs, which requires the config(s) to match.
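As a rough illustration of the default-reuse behavior, here is a sketch in Python of how a --clean-run switch could gate run reuse. Only the flag name comes from this ticket; the parser, the `should_reuse` helper, and the `old_runs` list are assumptions made for the example and do not reflect the real ap_verify command line.

```python
import argparse
from typing import List, Sequence


def make_parser() -> argparse.ArgumentParser:
    """Build a toy parser with only the flag discussed here."""
    parser = argparse.ArgumentParser(prog="ap_verify_sketch")
    parser.add_argument(
        "--clean-run",
        action="store_true",
        help="Start a fresh run even if an old one exists (assumed semantics).",
    )
    return parser


def should_reuse(argv: Sequence[str], old_runs: List[str]) -> bool:
    """Reuse an old run only if one exists and --clean-run was not given."""
    args = make_parser().parse_args(argv)
    return bool(old_runs) and not args.clean_run
```

Under these assumptions, reuse is the default whenever an earlier run is present, and --clean-run opts out of it.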
Looks good. Assuming it's clearing Jenkins and running on the current HSC cosmos dataset, go for the merge.
See DMTN-167, as updated by RFC-741, for a discussion of how --output, --output-run, and --extend-run were meant to be used.
DM-18715, should also allow for config changes. Currently this means using --output, possibly with --replace-run. This, in turn, would mean either that ap_verify needs to know whether it's already been run in the same workspace, or that it needs to autonomously set up the same type of chained collection as expected by pipetask run.
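One way to picture the "needs to know whether it's already been run" option is a sketch in Python that compares a digest of the current config with one saved from the previous run. Everything here is hypothetical: the helper names, the idea of storing a digest in the workspace, and the policy of replacing the run on a config change are illustrative assumptions, not a proposal for the actual implementation. Only the --output, --extend-run, and --replace-run flags come from the discussion above.

```python
import hashlib
from typing import List, Optional


def config_digest(config_text: str) -> str:
    """Return a stable fingerprint of the config contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def chained_output_args(
    output: str, previous_digest: Optional[str], current_digest: str
) -> List[str]:
    """Pick pipetask flags for a chained output collection.

    ``previous_digest`` is None when the workspace has no earlier run
    (how that is recorded is left to the caller; an assumption here).
    """
    args = ["--output", output]
    if previous_digest is None:
        # First run in this workspace: nothing to extend or replace.
        return args
    if previous_digest == current_digest:
        # Same config: continue the interrupted run.
        args.append("--extend-run")
    else:
        # Config changed: the old run cannot be extended consistently.
        args.append("--replace-run")
    return args
```

Whether a changed config should replace the old run or simply add a new run to the chain is a policy choice; the sketch picks replacement only to show where the decision would sit.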