I have rerun my scripts looking for parity between the gen3 & gen2 SFM outputs using this new run (/datasets/DC2/repoRun2.2i/rerun/w_2021_25/DM-30812). We are SOOOOO CLOSE, but I have finally encountered some examples of the case of incomplete reference catalog loading due to the 0 padding for the visit definition of this repo (see, e.g. DM-30030 and this community post for details). To illustrate, the following shows the full loaded reference sample (silver circles), selected (i.e. trimmed and passing the reference source selector criterion) reference sample (orange x's) and sources actually used in the astrometric fit (stars) for a given case (visit 193888, detector=126):
Gen3:

and a zoom in:

So, you can see that this detector lines up pretty closely with an edge of this shard and ends up missing out on some of the reference sources that would (should) be included with the 250 pixel padding to the raw WCS when doing the ref cat trimming. The following is the gen2 version:

Note that, for gen2, the selected ref sample had 283 objects, whereas gen3 had only 268. Even so, the source matches that got included in the astrometric fit are actually identical in both cases, so the astrometry is only just barely affected here (but my parity testing is sensitive enough to pick this up). Given that I'm seeing 4 cases of this in just the DC2 dataset (and only a very incomplete one at that, as I can only compare the detectors that actually got ingested into the /repo/dc2 repo), this situation is perhaps less rare than we had anticipated/hoped, so updating the visit definition is certainly something to consider (although the partial ingest issues are definitely more urgent...and resolving that will likely result in the visit definition update by default?!)
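For anyone wanting a feel for the mechanism, here is a minimal toy sketch of how an incomplete reference load can shrink the trimmed sample. It is not the stack's actual trimming code: it works purely in detector pixel coordinates rather than on the sky through the raw WCS, and the function name, the detector dimensions, and the reference positions are all made up for illustration. Only the 250 pixel padding comes from the discussion above.

```python
import numpy as np


def trim_to_padded_bbox(x, y, width, height, pad=250):
    """Boolean mask of points inside the detector bbox grown by `pad` pixels.

    Hypothetical helper: pixel-space stand-in for the ref cat trimming step.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (x >= -pad) & (x < width + pad) & (y >= -pad) & (y < height + pad)


# Made-up reference positions in detector pixel coordinates.
ref_x = np.array([-300.0, -100.0, 50.0, 4100.0, 4400.0])
ref_y = np.array([10.0, 10.0, 10.0, 10.0, 10.0])

# Case 1: all needed shards were loaded (detector size is an assumption).
full = trim_to_padded_bbox(ref_x, ref_y, width=4072, height=4000)

# Case 2: simulate a shard boundary along the detector edge at x = 0,
# with the neighbouring shard (x < 0) never loaded.
loaded = ref_x >= 0
partial = full & loaded

n_full = int(full.sum())
n_partial = int(partial.sum())
print(f"selected with complete load: {n_full}, with incomplete load: {n_partial}")
```

The source at x = -100 sits inside the padded region but off the detector, so it is only lost when the neighbouring shard is missing, which is the same flavour of shortfall as the 283 vs. 268 counts above.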
All four cases here only just barely affect the SFM WCS, so I would have comfortably gone on to the coadd parity comparisons for DC2...but this is not feasible in our current situation of very different visit/detector inputs from gen2 & gen3 repos.
The "good" news is that, as of w_2021_25 and the updated BF kernels for the gen2 repo (DM-30738), and modulo the above and the pesky (but likely insignificant) deblend_peakId offsets, we now seem to be at gen2 vs. gen3 parity for all visit/detector combos of that DC2 dataset that are in common between the gen2 & gen3 repos.
I've run the script attached to DM-30647 on the gen2 vs. gen3 w_2021_24 runs. Differences are as follows:
There are huge differences in the number and subset of {{calexp}}s produced by the two runs, which stem from at least two causes:
DM-30747)
DM-30426 is present for both runs, but is particularly bad for gen2 as it brings any singleFrameDriver job to a halt, so there are many, many visit/detectors missing from that rerun. I will kick off a run on w_2021_25 (i.e. after the fix went in) to make sure no "other" SFM failures occur in gen2 that do not happen in gen3.
For the visit/detector combos with successful processing in both runs:
Each and every catalog had differences between gen2 & gen3 of the following type:
Single Frame Processing for DC2 174534 i log:
And many also had:
(I do realize the absolute metric here is not the most useful...but there was some other column value for which it was and I failed to adapt based on column as I loop through them all. I'll change this on future runs, but these take days to run, so I'll just leave these as revealing "a difference" between gen2 & gen3 measurements.)
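As a rough idea of what "adapting based on column" could look like, the sketch below picks the metric from the column dtype: exact comparison for integer-like columns (ids, parents, flags) and a relative difference for floating-point measurements. This is not the script attached to the ticket; the function name, the tolerance, and the sample values are assumptions for illustration only.

```python
import numpy as np


def compare_column(gen2, gen3, rtol=1e-6):
    """Summarize differences in one column, choosing the metric by dtype.

    Hypothetical helper; `rtol` is an arbitrary illustrative tolerance.
    """
    gen2 = np.asarray(gen2)
    gen3 = np.asarray(gen3)
    if not np.issubdtype(gen2.dtype, np.floating):
        # ids, parents, flags: any mismatch counts, no tolerance applies.
        return {"metric": "exact", "n_diff": int(np.sum(gen2 != gen3))}

    gen3 = gen3.astype(float)
    both_nan = np.isnan(gen2) & np.isnan(gen3)
    abs_diff = np.abs(gen2 - gen3)
    scale = np.maximum(np.abs(gen2), np.abs(gen3))
    # A row differs unless both values are NaN or they agree to within rtol.
    differs = ~both_nan & ~(abs_diff <= rtol * scale)
    rel = np.zeros_like(abs_diff)
    usable = differs & (scale > 0)
    rel[usable] = abs_diff[usable] / scale[usable]
    max_rel = float(rel.max()) if rel.size else 0.0
    return {"metric": "relative", "n_diff": int(differs.sum()), "max_rel": max_rel}


# Made-up values standing in for a measurement column and an id column.
flux_result = compare_column([1.0, 2.0, np.nan, 100.0], [1.0, 2.0, np.nan, 100.1])
id_result = compare_column([1, 2, 3], [1, 2, 4])
print(flux_result)
print(id_result)
```

Rows where both runs give NaN are treated as agreeing, while a NaN in only one run counts as a difference without contributing to the maximum relative difference.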
Other than that, everything looks identical.
So one question is: do we care about the parent and id differences?
Another is how to go about getting to the bottom of the gen2 vs. gen3 shapeHSM differences (paging Joshua Meyers on this one!)?