I have compared the gen2 vs. gen3 outputs for "[almost] all the things"** as far as SFM data products go for every single exposure in the RC2. As in DM-29819 and DM-28858, this includes:
image arrays (image, variance, mask planes)
photoCalib objects
PSFs
WCSs
every column in every row in the source tables
[**at least one exception is the {{srcMatch}} catalogs, which have not been explicitly checked, but the matching itself is effectively checked in the {{src}} catalog check via the {{calib_*}} flags. Another is that I only specifically compared the afwTable {{src}} catalogs, but not the postprocess parquet source tables. I can create another ticket specifically for that if desired.]
In case anyone wants to know exactly what these comparisons comprised, I have attached the (somewhat hacky and not meant for consumption) script I ran (and will also run on the DC2 w_2021_24 runs of DM-30730 & DM-30674).
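For readers who just want the flavour of such a check, here is a minimal sketch of an exact-equality comparison for pixel planes and table columns. It is not the attached script: it assumes the data have already been read into plain NumPy arrays, and the names `differing_planes`, `differing_columns`, and the example column names are hypothetical.

```python
import numpy as np


def _same(a, b):
    """True if two arrays match exactly; NaNs in the same position count as equal."""
    if a.shape != b.shape:
        return False
    if np.issubdtype(a.dtype, np.floating) and np.issubdtype(b.dtype, np.floating):
        return bool(np.array_equal(a, b, equal_nan=True))
    return bool(np.array_equal(a, b))


def differing_planes(planes_a, planes_b):
    """Return the names of planes that differ between two exposures.

    Each argument is a (hypothetical) dict mapping a plane name such as
    "image", "variance" or "mask" to a NumPy array.
    """
    differing = []
    for name in sorted(set(planes_a) | set(planes_b)):
        a, b = planes_a.get(name), planes_b.get(name)
        if a is None or b is None or not _same(a, b):
            differing.append(name)
    return differing


def differing_columns(table_a, table_b):
    """Return the names of columns that differ between two structured arrays."""
    names_a, names_b = table_a.dtype.names, table_b.dtype.names
    differing = []
    for name in sorted(set(names_a) | set(names_b)):
        if name not in names_a or name not in names_b:
            differing.append(name)
        elif not _same(table_a[name], table_b[name]):
            differing.append(name)
    return differing
```

An empty list from both functions for a given visit/ccd would correspond to "no differences" in the sense used here; anything else names the plane or column to look at.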
The only outstanding differences are as follows (full logs from my script are in /datasets/hsc/repo/rerun/private/lauren/w22_gen2_vs_gen3/logs):
The gen3 run had 15 instances of failed SFM:
compareSfd_gen2_vs_gen3_RC2_GAMA_G.sh:
No gen3 calexp found for HSC-G 26036 30
No gen3 calexp found for HSC-G 26048 69
compareSfd_gen2_vs_gen3_RC2_GAMA_I.sh:
No gen3 calexp found for HSC-I 1290 17
compareSfd_gen2_vs_gen3_RC2_GAMA_Y.sh:
No gen3 calexp found for HSC-Y 27032 41
compareSfd_gen2_vs_gen3_RC2_VVDS_R.sh:
No gen3 calexp found for HSC-R 34640 69
compareSfd_gen2_vs_gen3_RC2_VVDS_Z.sh:
No gen3 calexp found for HSC-Z 36498 47
compareSfd_gen2_vs_gen3_RC2_COSMOS_G.sh:
No gen3 calexp found for HSC-G 11692 48
No gen3 calexp found for HSC-G 11706 97
compareSfd_gen2_vs_gen3_RC2_COSMOS_Z.sh:
No gen3 calexp found for HSC-Z 17944 27
compareSfd_gen2_vs_gen3_RC2_COSMOS_Y.log:
No gen3 calexp found for HSC-Y 354 95
No gen3 calexp found for HSC-Y 356 92
No gen3 calexp found for HSC-Y 1868 77
No gen3 calexp found for HSC-Y 22662 88
compareSfd_gen2_vs_gen3_RC2_COSMOS_NB0921.log:
No gen3 calexp found for NB0921 23038 55
No gen3 calexp found for NB0921 23602 25
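A tally like the one above can be rebuilt from the log text, and compared against another failure list, with something along these lines. This is only a sketch: it assumes the three fields after "found for" are filter, visit and detector (ccd), and `parse_missing` and `unexplained` are hypothetical helper names, not part of the attached script.

```python
import re

# Matches e.g. "No gen3 calexp found for HSC-G 26036 30"; stray spaces
# around the hyphen in the filter name are tolerated.
_MISSING = re.compile(
    r"No gen3 (calexp|src catalog) found for (.+?)\s+(\d+)\s+(\d+)\s*$"
)


def parse_missing(lines):
    """Return (product, filter, visit, detector) for each 'No gen3 ... found' line."""
    found = []
    for line in lines:
        match = _MISSING.search(line)
        if match is None:
            continue  # script-name headers and other log lines are skipped
        product, filt, visit, detector = match.groups()
        filt = re.sub(r"\s*-\s*", "-", filt.strip())
        found.append((product, filt, int(visit), int(detector)))
    return found


def unexplained(missing, known_failures):
    """Return the (visit, detector) pairs in `missing` absent from `known_failures`.

    `known_failures` is a (hypothetical) iterable of (visit, detector) pairs,
    e.g. transcribed from someone else's list of failed quanta.
    """
    known = set(known_failures)
    return sorted({(v, d) for _, _, v, d in missing} - known)
```

Running `len(parse_missing(open(path)))` over each log and summing gives the total count; `unexplained` then leaves only the cases that need a closer look.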
I cross-matched these against Monika Adamow's list of 17 failed quanta on DM-30365. Most of these are explained by the HSM issue of DM-30426 (the branches with the fix were set up for the gen2 run, so they don't fail there), except for the following:
The following case did have a calexp, but no src catalog:
compareSfd_gen2_vs_gen3_RC2_VVDS_R.sh:
No gen3 src catalog found for HSC-R 34714 22
The log tells me:
$ more /scratch/madamow/gen3_bps/submit/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/jobs/calibrate/97965_calibrate_34714_22.2366851.err
raise RuntimeError(f"Registry inconsistency while checking for existing outputs:"
RuntimeError: Registry inconsistency while checking for existing outputs: collection=HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z existingRefs=[DatasetRef(DatasetType('srcMatc
hFull', {band, instrument, detector, physical_filter, visit_system, visit}, Catalog), {instrument: 'HSC', detector: 22, visit: 34714, ...}, id=16996657, run='HSC/runs/RC2/w_2021_2
2/DM-30365/20210527T164119Z'), DatasetRef(DatasetType('calexp', {band, instrument, detector, physical_filter, visit_system, visit}, ExposureF), {instrument: 'HSC', detector: 22, v
isit: 34714, ...}, id=16996663, run='HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z')] missingRefs=[DatasetRef(DatasetType('calexpBackground', {band, instrument, detector, physi
cal_filter, visit_system, visit}, Background), {instrument: 'HSC', detector: 22, visit: 34714, ...}), DatasetRef(DatasetType('srcMatch', {band, instrument, detector, physical_filt
er, visit_system, visit}, Catalog), {instrument: 'HSC', detector: 22, visit: 34714, ...}), DatasetRef(DatasetType('src', {band, instrument, detector, physical_filter, visit_system
, visit}, SourceCatalog), {instrument: 'HSC', detector: 22, visit: 34714, ...}), DatasetRef(DatasetType('calibrate_metadata', {band, instrument, detector, physical_filter, visit_s
ystem, visit}, PropertySet), {instrument: 'HSC', detector: 22, visit: 34714, ...})]
but there's also this log there saying:
$ more /scratch/madamow/gen3_bps/submit/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/jobs/calibrate/97965_calibrate_34714_22.2239756.err
lsst::afw::fits::FitsError: 'cfitsio error: couldn't create the named file (105) : Opening file '/repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatch/20150715/r/HS
C-R/34714/srcMatch_HSC_r_HSC-R_34714_0_27_HSC_runs_RC2_w_2021_22_DM-30365_20210527T164119Z.fits' with mode 'w'
cfitsio error stack:
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
failed to create new file (already exists?):
/repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatch/20150715/r/
HSC-R/34714/srcMatch_HSC_r_HSC-R_34714_0_27_HSC_runs_RC2_w_2021_22_DM-30365_2021
0527T164119Z.fits
...
raise RuntimeError(f"Failed to serialize dataset {ref} of type {type(inMemoryDataset)} "
RuntimeError: Failed to serialize dataset srcMatch@{instrument: 'HSC', detector: 22, visit: 34714, ...}, sc=Catalog] (id=16996706) of type <class 'lsst.afw.table.BaseCatalog'> to
location file:///repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatch/20150715/r/HSC-R/34714/srcMatch_HSC_r_HSC-R_34714_0_27_HSC_runs_RC2_w_2021_22_DM-30365_2021052
7T164119Z.fits
Finally, the following didn't turn up as "missing" for me, but was in Monika's list of failures:
$ /scratch/madamow/gen3_bps/submit/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/jobs/calibrate/128553_calibrate_26032_52.2239777.err
lsst::afw::fits::FitsError: 'cfitsio error: couldn't create the named file (105) : Opening file '/repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatchFull/20150325/
g/HSC-G/26032/srcMatchFull_HSC_g_HSC-G_26032_1_14_HSC_runs_RC2_w_2021_22_DM-30365_20210527T164119Z.fits' with mode 'w'
cfitsio error stack:
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
Warning: the following keyword does not conform to the HIERARCH convention
HIERARCH AFW_TABLE_VERSION = 3
failed to create new file (already exists?):
/repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatchFull/2015032
5/g/HSC-G/26032/srcMatchFull_HSC_g_HSC-G_26032_1_14_HSC_runs_RC2_w_2021_22_DM-30
365_20210527T164119Z.fits
raise RuntimeError(f"Failed to serialize dataset {ref} of type {type(inMemoryDataset)} "
RuntimeError: Failed to serialize dataset srcMatchFull@{instrument: 'HSC', detector: 52, visit: 26032, ...}, sc=Catalog] (id=16996708) of type <class 'lsst.afw.table.BaseCatalog'>
to location file:///repo/main/HSC/runs/RC2/w_2021_22/DM-30365/20210527T164119Z/srcMatchFull/20150325/g/HSC-G/26032/srcMatchFull_HSC_g_HSC-G_26032_1_14_HSC_runs_RC2_w_2021_22_DM-3
0365_20210527T164119Z.fits
I suspect those latter two boil down to some issue with certain quanta having to be reprocessed?
Beyond that, the offset in deblend_peakId, first noted in DM-28858, still persists in each and every src catalog. I think Jim Bosch thought this was fixed, so it may be significant that it is not. However, I don't think this value is used in any downstream processing, so it is unlikely to cause any issues, except perhaps in the rare case of someone doing their own analysis that makes use of this column.
So, from my perspective, gen2/gen3 parity is essentially achieved for all visit/ccd exposures comprising the RC2 dataset.
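For anyone who wants to characterise the deblend_peakId difference in their own catalogs, a quick check of whether two ID columns differ by a single constant is sketched below. Whether the offset really is constant within a catalog is an assumption here, not something established above, and `constant_offset` is a hypothetical helper operating on plain integer arrays.

```python
import numpy as np


def constant_offset(ids_a, ids_b):
    """Return d if ids_b == ids_a + d for every row, otherwise None.

    Assumes (hypothetically) that both columns are row-matched integer IDs.
    """
    a = np.asarray(ids_a, dtype=np.int64)
    b = np.asarray(ids_b, dtype=np.int64)
    if a.shape != b.shape or a.size == 0:
        return None
    diff = b - a
    if np.all(diff == diff[0]):
        return int(diff[0])
    return None
```

A non-`None` result means the two columns carry the same information up to a shift; `None` means the difference is row-dependent and worth a closer look.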