  1. Data Management
  2. DM-29819

Compare the data products of the gen2 vs. gen3 w_2021_14 RC2 runs up to Single Frame Processing


    Details

    • Story Points:
      5
    • Epic Link:
    • Sprint:
      DRP S21a (Dec Jan), DRP S21b
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      On DM-28858 and the follow-up DM-28936, it was established that the gen2 and gen3 middleware platforms were producing bitwise-identical data products up to the end of Single Frame Processing.  This was based on the ci_hsc_gen2 & ci_hsc_gen3 package outputs and a single full visit run (1228 in COSMOS HSC-I). This ticket is to perform the same comparison on our first full RC2 dataset run on both middleware platforms (performed with w_2021_14).

            Activity

            lauren Lauren MacArthur added a comment -

            Doh...hit a snag with HSC-Y, so can't claim bitwise identical just yet.  See DM-29881 for details (and solution, hopefully!)

            Comparing gen2 vs. gen3 Single Frame Processing for COSMOS 318 HSC-Y
            Comparing gen2 vs. gen3 calexp image/mask/variance arrays and photoCalib's
            ...Sum of diff of Image ArrayDiff 3251694.25
            ...Sum of diff of Variance ArrayDiff -9713581.0
            ...Sum of diff of Mask ArrayDiff 2782252
            ...photoCalib diff: calMean: 9.516e-06 calErr: -9.235e-08 instFluxAtMag0: -6.266e+07
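A summed difference like the ones logged above can be zero even when individual pixels differ, because positive and negative differences cancel, so a byte-level check is a stricter test of "bitwise identical". The following is a minimal sketch of such a check, not the script used on this ticket. It assumes the image, mask and variance planes have already been extracted as NumPy arrays; the function name `compare_planes` and the dict layout are hypothetical.

```python
import numpy as np


def compare_planes(gen2, gen3):
    """Compare image/mask/variance planes of two exposures.

    `gen2` and `gen3` are dicts mapping a plane name to a NumPy array
    (a hypothetical layout chosen for this sketch).
    """
    report = {}
    for name in ("image", "mask", "variance"):
        a, b = gen2[name], gen3[name]
        # Bitwise identity: same shape, same dtype, same raw bytes.
        identical = (
            a.shape == b.shape
            and a.dtype == b.dtype
            and a.tobytes() == b.tobytes()
        )
        # Summed difference, as in the log output; this can be 0.0 even
        # when the arrays differ, so it is reported only as a diagnostic.
        sum_of_diff = float(np.sum(a.astype(np.float64) - b.astype(np.float64)))
        report[name] = {"identical": identical, "sum_of_diff": sum_of_diff}
    return report


if __name__ == "__main__":
    gen2 = {
        "image": np.array([[1.0, 2.0]], dtype=np.float32),
        "mask": np.array([[0, 4]], dtype=np.int32),
        "variance": np.array([[0.5, 0.5]], dtype=np.float32),
    }
    gen3 = {
        "image": np.array([[2.0, 1.0]], dtype=np.float32),  # swapped pixels
        "mask": np.array([[0, 4]], dtype=np.int32),
        "variance": np.array([[0.5, 0.5]], dtype=np.float32),
    }
    for plane, result in compare_planes(gen2, gen3).items():
        print(plane, result)
```

In the toy data the two image planes have swapped pixel values, so the summed difference is zero while the byte comparison still flags them as different.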

            lauren Lauren MacArthur added a comment -

            Also noticed that the NB0921 datasets in the gen3 w14 processing are missing, e.g. with:

            butler query-data-ids /repo/main --collections HSC/runs/RC2/w_2021_14/DM-29528 --datasets "calexp" --where "tract=9813 and detector=49 and instrument='HSC' and skymap='hsc_rings_v1'"
            

            I only see grizy entries for band. Jim Bosch is looking into it (slack discussion here).
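A check like the one above can be automated once the query results are in hand. The sketch below assumes the data IDs have been collected into plain dicts with a `band` key; the helper name `missing_bands` is hypothetical, and the label "NB0921" is taken from this comment, so the band string actually stored in the registry may differ.

```python
def missing_bands(data_ids, expected_bands):
    """Return the expected bands that appear in none of the data IDs.

    `data_ids` is an iterable of dict-like data IDs with a "band" key
    (a hypothetical stand-in for the rows returned by the query).
    """
    found = {data_id["band"] for data_id in data_ids}
    return sorted(set(expected_bands) - found)


if __name__ == "__main__":
    # Toy rows mimicking the situation described: only grizy are present.
    rows = [{"band": band, "detector": 49} for band in "grizy"]
    expected = ["g", "r", "i", "z", "y", "NB0921"]  # label is an assumption
    print(missing_bands(rows, expected))
```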

            lauren Lauren MacArthur added a comment -

            Another snag that the limited ci_hsc datasets shielded us from.  There are cases in RC2 where slightly different reference catalogs are loaded in gen2 vs. gen3, leading to slight differences in the WCS solutions (which then percolate...)  This is related to the issue on DM-28936, which got us most of the way there, but I now see that the change of the visit padding (from 4000 to 250 pixels) on DM-24024 should have been accompanied by a matching change in the pixelMargin config of LoadReferenceObjectsConfig. I think the logic should be that pixelMargin <= computeVisitRegions["single-raw-wcs"].padding. However, as it stands, the default for pixelMargin is 300 and, in cases we were unlucky not to encounter in ci_hsc, this can lead to a smaller loaded region in gen3 because of the smaller padded visit definition (i.e. if a shard edge lies close to the padded visit edge). As an example, here are the trimmed reference catalogs for gen2 vs. gen3:
            and this includes the full loaded catalogs (note that the blue gen3 shard doesn't quite cover all of the red x's of the filtered gen2 catalog, whose full loaded catalog (purple) is distributed very differently from the gen3 version...):

            I have created DM-30030 to put in a fix for this.
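The constraint proposed in this comment (pixelMargin <= visit padding) can be written as a simple consistency check. This is only an illustrative sketch using the numbers quoted above (pixelMargin default of 300, padding of 4000 before and 250 after DM-24024); the function names are hypothetical and it does not read any real config objects.

```python
def margin_is_safe(pixel_margin, visit_padding):
    """True if the reference-loading margin fits inside the visit padding."""
    return pixel_margin <= visit_padding


def margin_shortfall(pixel_margin, visit_padding):
    """Pixels by which the margin extends beyond the padded visit region."""
    return max(0, pixel_margin - visit_padding)


if __name__ == "__main__":
    # Values quoted in the comment above.
    for padding in (4000, 250):
        print(
            f"padding={padding}: safe={margin_is_safe(300, padding)}, "
            f"shortfall={margin_shortfall(300, padding)} px"
        )
```

With the old padding of 4000 pixels the default margin fits comfortably; with 250 it overshoots by 50 pixels, which is the regime where a shard near the padded visit edge can be missed.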

            lauren Lauren MacArthur added a comment - edited

            With the resolutions of DM-29881 and DM-30030, I'm inclined to call this one done.  I'm also inclined to push the next "up-to-SFP-parity" comparison to the w_2021_22 runs (i.e. skipping over the now "old and problematic" w_2021_18 runs – both gen2 & gen3 had "issues"). Let me know if you agree and I will close this out and make a ticket for the w_2021_22 runs.

            jbosch Jim Bosch added a comment -

            Your resolution proposal works for me.

            lauren Lauren MacArthur added a comment -

            Thanks, Jim.  DM-30647 created.


              People

              Assignee:
              lauren Lauren MacArthur
              Reporter:
              lauren Lauren MacArthur
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, Lauren MacArthur
