Details
-
Type: Story
-
Status: Done
-
Resolution: Done
-
Fix Version/s: None
-
Story Points: 3
-
Epic Link:
-
Sprint: AP F19-6 (November)
-
Team: Alert Production
Description
Discussing the Gen 3 migration of lsst.ap.pipe.ApPipeTask and lsst.verify.tasks.PpdbMetricTask on #dm-science-pipelines, Jim Bosch and I concluded that the cleanest trigger for running a PpdbMetricTask would be to provide a dataset that indicates that the database contains all information for a particular data ID. I believe this can also simplify the problem of providing Butler-like input to PpdbMetricTask.
Prototype this system in a Gen 2 pipeline by doing the following:
- Create a new dataset type (ppdbConfig?), with visit+ccd granularity (matching the units of work of ApPipeTask).
- Make ApPipeTask dump its PPDB config at the end of DB-related processing for a particular data ID. Note that while the DB config is fixed, we should still write one dataset per data ID so that Gen 3 code can know which units of processing are "DB complete".
- Create a new Task for creating a PPDB from a ppdbConfig, then configure PpdbMetricTask to use it in ap_verify runs.
These steps should test all aspects of the system except for tracking dependencies between pipeline and metric tasks, which is a Gen 3-only feature.
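A toy illustration of the three steps above, in plain Python with no Butler or Task code. Everything here is a hypothetical stand-in: ToyRepository, DbConfig, process_data_id, db_complete_ids and load_db are not existing LSST names, and the real dataset would be persisted through the Butler rather than a dict. The point is only to show how a per-data-ID config dataset doubles as a "DB complete" marker and as the input from which a database handle is rebuilt.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DataId = Tuple[int, int]  # (visit, ccd), matching ApPipeTask's unit of work


@dataclass(frozen=True)
class DbConfig:
    """Hypothetical stand-in for the PPDB config; holds only a DB location."""
    db_url: str


class ToyRepository:
    """In-memory stand-in for a repository keyed by dataset type and data ID."""

    def __init__(self) -> None:
        self._datasets: Dict[Tuple[str, DataId], object] = {}

    def put(self, dataset_type: str, data_id: DataId, obj: object) -> None:
        self._datasets[(dataset_type, data_id)] = obj

    def get(self, dataset_type: str, data_id: DataId) -> object:
        return self._datasets[(dataset_type, data_id)]

    def exists(self, dataset_type: str, data_id: DataId) -> bool:
        return (dataset_type, data_id) in self._datasets


def process_data_id(repo: ToyRepository, config: DbConfig, data_id: DataId) -> None:
    """Stand-in for the pipeline: do DB work, then write the marker dataset."""
    # ... DB-related processing for this data ID would happen here ...
    # The config is the same for every data ID; writing it per data ID is
    # what records that this unit of work has finished touching the database.
    repo.put("ppdbConfig", data_id, config)


def db_complete_ids(repo: ToyRepository, candidates: List[DataId]) -> List[DataId]:
    """Return the data IDs whose marker dataset exists."""
    return [d for d in candidates if repo.exists("ppdbConfig", d)]


def load_db(repo: ToyRepository, data_id: DataId) -> dict:
    """Stand-in for the proposed loader Task: build a DB handle from the config."""
    config = repo.get("ppdbConfig", data_id)
    return {"url": config.db_url}  # a real loader would open a connection


repo = ToyRepository()
config = DbConfig(db_url="sqlite:///association.db")  # illustrative value
process_data_id(repo, config, (411420, 10))
complete = db_complete_ids(repo, [(411420, 10), (411420, 11)])
db = load_db(repo, (411420, 10))
```

In this sketch only (411420, 10) is reported as complete, so a metric task keyed on the marker would not run for (411420, 11) until the pipeline had finished that data ID.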
This ticket should also review the dimensionality of lsst.verify.tasks.PpdbMetricConnections. The only extant concrete PpdbMetricTask, lsst.ap.association.metrics.TotalUnassociatedDiaObjectsMetricTask, runs at collection granularity and produces a collection-grained measurement, but finer-grained database metrics are possible. Adding fine-grained inputs, as proposed above, is a good chance to make sure the outputs do everything we want.
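To make the granularity question concrete, the sketch below contrasts a collection-grained count with a per-data-ID breakdown over the same toy records. It is not the existing metric code: the DiaObject record, its last_data_id field, and the rule that "unassociated" means exactly one source are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple

DataId = Tuple[int, int]  # (visit, ccd)


class DiaObject(NamedTuple):
    """Hypothetical record; the field names are not the real schema."""
    object_id: int
    n_sources: int
    last_data_id: DataId  # assumed: data ID that last touched this object


def count_unassociated_total(objects: Iterable[DiaObject]) -> int:
    """Collection-grained: one number for the whole database."""
    return sum(1 for o in objects if o.n_sources == 1)


def count_unassociated_by_data_id(objects: Iterable[DiaObject]) -> Dict[DataId, int]:
    """Finer-grained: one number per (visit, ccd)."""
    return dict(Counter(o.last_data_id for o in objects if o.n_sources == 1))


objects = [
    DiaObject(1, 1, (100, 1)),
    DiaObject(2, 3, (100, 1)),
    DiaObject(3, 1, (100, 2)),
    DiaObject(4, 1, (100, 2)),
]
total = count_unassociated_total(objects)
per_id = count_unassociated_by_data_id(objects)
```

The two outputs have different keys (none versus visit+ccd), which is the kind of difference the output dimensions of the connections class would need to accommodate.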