Discussed this in Slack, and then I took a closer look at the code. We do still have work to do, because the selections we do in Gen2 can't just be written as data query expressions in Gen3. In particular:
- AP uses BestSeeingWcsSelectImagesTask, which loads PSF models from calexps and computes sizes from them on the fly. We might eventually start recording that as per-DatasetType metadata in the Registry, and then we might be able to use data query expressions, but not yet (this is in part DM-21774).
- DRP usually uses PsfWcsSelectImagesTask, which computes a rather complex PSF residual metric from src catalogs. It's less obvious that this is something we should put in general-purpose calexp metadata, though that is a possibility. Another option might be to convert this calculation into a true lsst.verify metric Measurement, and (perhaps on DM-21875) use a special Datastore to put those into the same database that holds the Registry, and then add Registry support for including those external metric tables in queries. That's worth doing for its own sake eventually, but it's a lot of work just to get coaddition working as it does in Gen2.
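To make the first bullet concrete, here is a minimal sketch of a seeing-based selection in plain Python. It is not the BestSeeingWcsSelectImagesTask code: `CalexpSummary`, `select_best_seeing`, and the `max_fwhm_arcsec` / `n_max` parameters are hypothetical stand-ins, and the PSF size is assumed to be available as a Gaussian-equivalent sigma in pixels. The point is only that the quantity being cut on is derived from the calexp's PSF model, so it cannot exist before the calexp does.

```python
import math
from dataclasses import dataclass

# Gaussian sigma -> FWHM conversion factor: FWHM = 2 * sqrt(2 * ln 2) * sigma.
SIGMA_TO_FWHM = 2.0 * math.sqrt(2.0 * math.log(2.0))


@dataclass(frozen=True)
class CalexpSummary:
    """Hypothetical stand-in for what a selector would derive from one calexp."""

    visit: int
    detector: int
    psf_sigma_pixels: float    # assumed: Gaussian-equivalent PSF sigma, in pixels
    pixel_scale_arcsec: float  # assumed: pixel scale, in arcsec per pixel

    @property
    def psf_fwhm_arcsec(self) -> float:
        return self.psf_sigma_pixels * SIGMA_TO_FWHM * self.pixel_scale_arcsec


def select_best_seeing(summaries, max_fwhm_arcsec, n_max=None):
    """Keep images at or below a FWHM cut, best seeing first, optionally capped at n_max."""
    kept = [s for s in summaries if s.psf_fwhm_arcsec <= max_fwhm_arcsec]
    kept.sort(key=lambda s: s.psf_fwhm_arcsec)
    return kept if n_max is None else kept[:n_max]


summaries = [
    CalexpSummary(visit=3, detector=0, psf_sigma_pixels=3.0, pixel_scale_arcsec=0.2),
    CalexpSummary(visit=2, detector=0, psf_sigma_pixels=2.0, pixel_scale_arcsec=0.2),
    CalexpSummary(visit=1, detector=0, psf_sigma_pixels=1.5, pixel_scale_arcsec=0.2),
]
selected = select_best_seeing(summaries, max_fwhm_arcsec=1.0)
```

If sizes like these were recorded as per-DatasetType metadata in the Registry, the same cut could in principle become a data query expression, but only for calexps that already exist.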
More importantly, trying to use data query expressions to solve either of these has a much bigger problem: the information needed to make the selection is fundamentally not available until after calexps are created, but we may want to generate a QuantumGraph that starts from raws, and in that case we'd need to evaluate the data query expression before the fields it uses exist (unless we base the selection on a different run, which seems both complex and dangerous).
The alternative is to continue to make selection for coaddition something that happens in Python. The SelectImageTasks are (in Gen2) subtasks that nevertheless use the Butler, but this is problematic in Gen3, especially because they don't necessarily all use the same inputs (BestSeeingWcsSelectImagesTask uses only calexp, while PsfWcsSelectImagesTask also uses src). We don't currently have any way for a PipelineTask to pull connections from a subtask, and I'm not sure we should invent one.
Instead, I think it'd be best to make the Gen3 equivalents of these tasks PipelineTasks in their own right that would take whatever inputs they wished and output an ExposureCatalog of selected images to be (perhaps optionally) consumed by AssembleCoadd and perhaps MakeWarpTask.
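A rough sketch of the data flow I have in mind, in plain Python rather than real PipelineTask code. Every name here (`SelectedImage`, `run_selection`, `assemble_coadd`) is hypothetical, the list of `SelectedImage` records stands in for the ExposureCatalog, and the "coadd" is just a mean so the example stays self-contained. It only illustrates that selection becomes its own step with its own inputs, and that the coadd step consumes the selection optionally.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

DataId = Tuple[int, int]  # hypothetical (visit, detector) key


@dataclass(frozen=True)
class SelectedImage:
    """Hypothetical record standing in for one row of the output catalog."""

    data_id: DataId


def run_selection(
    inputs: Dict[DataId, dict], keep: Callable[[dict], bool]
) -> List[SelectedImage]:
    """Selection step: reads whatever per-image inputs it needs, emits the selection."""
    return [SelectedImage(data_id) for data_id, info in sorted(inputs.items()) if keep(info)]


def assemble_coadd(
    warps: Dict[DataId, float], selected: Optional[Iterable[SelectedImage]] = None
) -> float:
    """Toy coadd step: averages warp values, restricted to the selection if one is given."""
    if selected is None:
        ids = sorted(warps)  # no selection provided: use every available warp
    else:
        ids = [s.data_id for s in selected if s.data_id in warps]
    if not ids:
        raise ValueError("no input images selected for coaddition")
    return sum(warps[i] for i in ids) / len(ids)


per_image_inputs = {(1, 0): {"fwhm": 0.7}, (2, 0): {"fwhm": 1.4}}
selection = run_selection(per_image_inputs, keep=lambda info: info["fwhm"] < 1.0)
warps = {(1, 0): 10.0, (2, 0): 20.0}
coadd_selected = assemble_coadd(warps, selection)
coadd_all = assemble_coadd(warps)
```

Because the selection is an ordinary output dataset, two selectors with different inputs (calexp only, or calexp plus src) can feed the same coadd step without the coadd step needing to know which one ran.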
Finally, it's worth pointing out that the "Wcs" part of these selection tasks, which checks for overlap between the detector-level image and the patch, should be unnecessary in Gen3, because the middleware will already perform a logically equivalent operation. The calculation may be subtly different, though, because the Gen2 version does the overlap in pixel coordinates while the Gen3 version operates in spherical coordinates. Both are checked/refined later when warping, but we may want to retain the Gen2-style check in Gen3 during the middleware transition to make it easier to compare results, and then drop it after the Gen3 tasks are fully validated and the Gen2 ones are retired.