Fix Version/s: None
Sprint: DRP S19-6b, DRP F19-1, DRP F19-2, DRP F19-4
Team: Data Release Production
The QA scripts in pipe_analysis were written and tuned entirely on HSC-SSP data. Now that we are generalizing the scripts to work on any dataset produced by the LSST stack, certain hard-coded values need to be adapted to suit the dataset of interest. As Eli Rykoff points out in DM-18635, one such setting is the minimum flux limit used for computing the basic statistics (mean and stddev) for the various metrics. While it is currently a config parameter (whose default was set for HSC and specified by RHL), it would be even better to set this limit based on a S/N cut (e.g. S/N > 100, which would also match what validate_drp is using). Once the new setting has been tested and shown to be the better option, make it the default, but allow for an override option in the configs to force a fixed mag cut (i.e. the current behavior).
- is blocked by
DM-19828 Investigate oddness in error propagation with meas_mosaic's photoCalib
While I was on the S/N train, I also added some plots that proved very useful in a previous analysis (DM-17043) showing flux and S/N histograms (for raw and calibrated fluxes) for various sub-selections. Some examples:
Might you be able to give this a look? I'll remind you that since pipe_analysis still lives in lsst-dm, it is not currently officially subject to the strict code standards of lsst, so you can feel free to give the code only a cursory look (i.e. if you are satisfied that all is functioning as advertised from the plot examples), but I will do my best to address any and all comments, so also feel free to nit-pick to your heart's desire.
It all looks like a very good improvement! My primary comment is that it needs to be made clear that the S/N in the configuration is being scaled by the number of visits in the coadds. I think that's why these plots show S/N > 800 for the coadds? And the idea is that this will give a threshold similar to the one used for the visit analysis?
Thanks Eli! The answer is yes to both of your questions above and I have added comments to clarify this in the docstrings. I also addressed your comments on the PR (and added all-but-one of the docstrings requested, providing my justification). Merged and done!
The cut based on S/N has been implemented and tested on various datasets:
HSC, DC2, DECam, and CFHT at the visit level, and HSC & DC2 at the coadd level. The basic functionality is described in the commit message, but I include it here as well:
Two sets of stats are computed based on "high" and "lower" S/N threshold
levels that are both config parameters. By default, these are set to
500 and 100, respectively. These thresholds apply directly for visits,
but for coadds, the threshold gets scaled by the square root of the number
of visits. In either case, if too few objects classified as stars exceed
the configured value, the S/N threshold is decreased in steps of 10 until
a sample with N > config.minHighSampleN (default is 20) is achieved.
The threshold values used are printed to the logs via the Stats object.
An "effective" mag corresponding to the S/N cut (computed as the mean
magnitude of the lower 5% of the S/N > signalToNoiseThreshold subsample)
is also printed on the plots.
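For anyone skimming the ticket, the selection logic described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the pipe_analysis code: the function names (select_high_sn_stars, effective_mag) and all argument names are hypothetical, the "lower 5%" is assumed to mean the lowest 5% in S/N within the selected subsample, and the loop is assumed to stop once the threshold can no longer be lowered while staying positive.

```python
import numpy as np


def select_high_sn_stars(sn, is_star, threshold=500.0, n_visits=1,
                         min_sample_n=20, step=10.0):
    """Return (mask, threshold_used) for the S/N-selected star sample.

    Hypothetical helper: names and signature are illustrative only.
    """
    sn = np.asarray(sn, dtype=float)
    is_star = np.asarray(is_star, dtype=bool)
    # For visits n_visits is 1, so the configured threshold applies directly;
    # for coadds it is scaled by the square root of the number of visits.
    thresh = threshold * np.sqrt(n_visits)
    mask = is_star & (sn > thresh)
    # Lower the threshold in fixed steps until the sample has N > min_sample_n
    # (assumption: give up before the threshold would drop to zero or below).
    while mask.sum() <= min_sample_n and thresh - step > 0:
        thresh -= step
        mask = is_star & (sn > thresh)
    return mask, thresh


def effective_mag(mag, sn, mask, fraction=0.05):
    """Mean magnitude of the lowest-S/N `fraction` of the selected sample.

    Assumption: "lower 5%" is taken to mean lowest in S/N.
    """
    mag_sel = np.asarray(mag, dtype=float)[mask]
    sn_sel = np.asarray(sn, dtype=float)[mask]
    if sn_sel.size == 0:
        return float("nan")
    n_low = max(1, int(fraction * sn_sel.size))
    lowest = np.argsort(sn_sel)[:n_low]
    return float(mag_sel[lowest].mean())
```

Calling select_high_sn_stars once with the "high" threshold and once with the "lower" one would give the two samples used for the two sets of stats, and the returned threshold_used is the value one would expect to see in the logs after any dynamic lowering.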
Examples can be perused at https://lsst-web.ncsa.illinois.edu/~lauren/lauren/DM-19189/ and https://lsst-web.ncsa.illinois.edu/~lauren/DC2/DM-19189/ for DC2. The default is now to select on S/N as described above, but there is still the option to select on mag by setting the config overrides:
Follow the "magThresh" subdirectories to see a comparison with a mag threshold set at < 21.
Here are a few examples:
DC2 (example where the S/N threshold was dynamically lowered to get the minimum number of 20 points into the stats):