Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: obs_subaru
- Labels: None
- Story Points: 1
- Epic Link:
- Sprint: DRP F20-3 (Aug)
- Team: Data Release Production
- Urgent?: No
Description
In DM-22266, the pipe_analysis scripts are being converted to have the option of reading in the parquet files now being persisted via pipe_tasks's postprocessing.py script. For the coadds, we are reading in the deepCoadd_obj datasets (added in DM-13770), which consist of a single parquet file per patch that is a merge of the deepCoadd_meas, deepCoadd_forced_src, and deepCoadd_ref tables for all filters for a given patch. As such, these all follow the same column naming conventions as their afwTable equivalents (i.e. they do not follow the DPDD-ified naming conventions; those are persisted as the objectTable dataset). For the visit-level QA analysis in pipe_analysis, we would like to follow a similar pattern, but using the parquet equivalent of the src catalogs. The dataset for these is source (added as part of DM-24062). However, the default in singleFrameDriver.py is to persist neither these nor the DPDD-ified sourceTable versions. The latter got a config override in obs_subaru, so they do get persisted for HSC processing. While we may eventually move to using only the DPDD-ified tables, in the interim, having the parquet tables with the original column names is desired (especially for maintaining the ability to read in the afwTables for older repos that didn't get the postprocessing.py step and so have no parquet output).
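For reference, reading these parquet datasets through the Gen2 Butler looks roughly like the sketch below. The repository path, data IDs, and column names are placeholders (not from this ticket), and the deepCoadd_obj read assumes the multilevel parquet interface from pipe_tasks.

```python
# Rough sketch only; repo path, data IDs, and column names are illustrative.
from lsst.daf.persistence import Butler

butler = Butler("/path/to/rerun")  # hypothetical rerun

# Visit-level parquet with the original afwTable-style column names
# (the "source" dataset).
source = butler.get("source", visit=1228, ccd=49)
sourceDf = source.toDataFrame(columns=["coord_ra", "coord_dec", "base_PsfFlux_instFlux"])

# Coadd-level parquet ("deepCoadd_obj"): one file per patch, with a multilevel
# column index over dataset (meas/forced_src/ref), filter, and column.
obj = butler.get("deepCoadd_obj", tract=9813, patch="4,4")
measDf = obj.toDataFrame(columns={"dataset": "meas", "filter": "HSC-I"})
```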
As such, the doSaveWideSourceTable config in singleFrameDriver.py, which currently defaults to False, will also be overridden to True in obs_subaru so that these visit-level src-like parquet files will be available for future RC2 processing runs.
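A sketch of what that override looks like (the exact file location within obs_subaru is assumed here, following the usual per-obs-package config layout):

```python
# obs_subaru: config/singleFrameDriver.py (assumed location of the override)
# Persist the visit-level src-like parquet tables (the "source" dataset).
config.doSaveWideSourceTable = True
```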
It will be left to a future decision (and RFC!) whether any of the defaults should be changed in singleFrameDriver.py itself and/or whether we move to using only the DPDD-ified versions.
This is just a config change, right?