As discussed in DM-14761, ApPipeTask creates a new copy of an association database whenever a rerun is used, even if a database already exists in one or more parent repositories. This causes partial reprocessing of a data set to return invalid associations because DIAObjects are not handled consistently.
The root cause is that ApPipeTask always treats the association database as if it were part of the current run's output repository. This may be consistent with future Butler support for database operations, but it is unlikely to be consistent with how ap_pipe will be run in commissioning/operations at NCSA.
Based on the discussion at the July 12 AP group meeting, ApPipeTask's behavior will be changed as follows:
- ApPipeTask will no longer attempt to overwrite its own database config. However, if the config contains the AssociationTask default (a temporary, in-memory database), it will print a warning that any association results will be lost.
- The ap_pipe documentation will recommend that users always override the database location to a file location that works for them. (This is a deliberate violation of the sensible-defaults rule for configs, but better than not letting it be configured at all.)
- ap_verify will give ApPipeTask a config that places the database in ap_verify's workspace (note: not in the output repository itself, which is also in said workspace). Config overrides in obs packages or datasets will take precedence over this setting.
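
As a rough illustration of the intended behavior, the sketch below shows the two pieces of logic: a check that warns (without touching the config) when the database is still the in-memory default, and an ap_verify-style default that puts the database in the workspace unless an override is supplied. This is not the actual ap_pipe or ap_verify code. `DbConfig`, `db_name`, `warn_if_in_memory`, `ap_verify_db_config`, the `":memory:"` sentinel, and the `association.db` file name are all hypothetical names chosen for the example.

```python
import logging
import os
from dataclasses import dataclass

# Assumption: ":memory:" stands in for the AssociationTask default
# (a temporary, in-memory database). The real default may differ.
IN_MEMORY = ":memory:"


@dataclass
class DbConfig:
    """Hypothetical stand-in for the association database config."""
    db_name: str = IN_MEMORY


def warn_if_in_memory(config, log):
    """Warn if the database is in-memory; never modify the config.

    Returns True if a warning was issued.
    """
    if config.db_name == IN_MEMORY:
        log.warning("Association database is in-memory; "
                    "association results will be lost when the run ends.")
        return True
    return False


def ap_verify_db_config(workspace, override=None):
    """Build the config ap_verify would pass to the pipeline.

    The database goes in the workspace itself (not the output repository).
    An override, e.g. from an obs package or dataset, takes precedence.
    """
    default = os.path.join(workspace, "association.db")
    return DbConfig(db_name=override if override is not None else default)
```

With these definitions, `warn_if_in_memory(DbConfig(), log)` issues the warning, while a config built by `ap_verify_db_config` points at a file and does not.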