Data Management / DM-27868

Stop erroneously adjusting the raw WCS in assembleCcdTask


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ip_isr
    • Labels:
      None
    • Story Points:
      8
    • Sprint:
      DRP F20-5 (Oct)
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Long story short, I believe we are performing an erroneous “adjustment” to the raw WCS in isrTask. The effect is a shift (away from our actual best guess) in the center of the reference-object-loading sky circle. This is clearly not ideal even when a decent fit is still found, but it is also leading to terrible WCS solutions and/or outright failures. Since, if true, this is a fairly significant bug (and bugfix), I will post a detailed accounting of the trip I took to reach this realization and conclusion (thanks largely to Jim Bosch's intuition!) in a comment.

        Attachments

          Issue Links

            Activity

            lauren Lauren MacArthur added a comment - - edited

            In working on DM-24024, I came across several examples of cases where we were not getting a good WCS fit in single frame measurement (SFM) despite the image looking perfectly fine.  As an example, the PDR2 processing has the following in the logs for HSC-R2 visit 94862 ccd 16:

            .processCcd.calibrate.astromRefObjLoader ({'ccd': 16, 'visit': 94862, 'field': 'SSP_WIDE', 'dateObs': '2017-01-01', 'pointing': 1827, 'filter': 'HSC-R2', 'taiObs': '2017-01-01', 'expTime': 150.0})(loadReferenceObjects.py:943)- Loading reference objects using center (31.172326, +2.555393) and radius 0.12441559659765129 deg
            ...
            processCcd.calibrate.astrometry ({'ccd': 16, 'visit': 94862, 'field': 'SSP_WIDE', 'dateObs': '2017-01-01', 'pointing': 1827, 'filter': 'HSC-R2', 'taiObs': '2017-01-01', 'expTime': 150.0})(astrometry.py:268)- Matched and fit WCS in 3 iterations; found 126 matches with scatter = 4.444 +- 2.130 arcsec
            

            I was alerted to this when looking at the SFM WCS fits compared with the raw and jointcal WCSs. Note the position of the teal outline for ccd 16 here:

            Note also that jointcal recovers...and to a WCS very close to the raw one, indicating the SFM solution is indeed terrible. Looking at the image, there seems to be no good reason for the SFM astrometry to get things so wrong.

            It is also mysterious that some of the CCDs failed altogether (as they look as "should-be-easy" to calibrate as the others).

            Playing around and digging into things [long story omitted to spare the reader the gory details and dead-end first-guess detours!], it became clear that the centering of the loadSkyCircle for the reference catalog loading was off. This would have to happen in isrTask, as that search sky circle is derived from the icExp exposure that the astrometry task works with. These are not persisted by default, so I couldn't confirm this from the (now quite old) PDR2 processing; instead I reran SFM on this visit and, indeed, the center one gets from icExp does not match that of the raw:

            In [63]: rawWcs.pixelToSky(rawBbox.getCenter())                                                                   
            Out[63]: SpherePoint(31.173018555309092*geom.degrees, 2.5536144340233475*geom.degrees)
            In [64]: icWcs.pixelToSky(icBbox.getCenter())                                                                     
            Out[64]: SpherePoint(31.17232581107402*geom.degrees, 2.555393143156531*geom.degrees)
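The offset between those two centers can be computed with a stdlib-only sketch (no lsst.geom, which provides SpherePoint.separation for this); the haversine formula below is a standard stand-in:

```python
import math

def ang_sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation between two sky points (degrees in, arcsec out)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: robust for small separations like this one.
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + math.cos(dec1) * math.cos(dec2) * sin_dra**2
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0

# The raw and icExp bbox-center sky positions printed above.
sep = ang_sep_arcsec(31.173018555309092, 2.5536144340233475,
                     31.17232581107402, 2.555393143156531)
print(f"raw vs. icExp center offset: {sep:.2f} arcsec")
```

This lands near the ~8 arcsec HSC-scale shift quoted later in this comment.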
            

            In a fun twist, it turns out that current master fails on this ccd (and a few others) with:

            processCcd.calibrate.astromRefObjLoader INFO: Loading reference objects using center (31.172326, +2.555393) and radius 0.12441559659765129 deg
            ...
            processCcd.calibrate.astrometry INFO: Matched and fit WCS in 3 iterations; found 131 matches with scatter = 4.539 +- 2.203 arcsec
            processCcd.calibrate.photoCal.match.sourceSelection INFO: Selected 345/1934 sources
            processCcd.calibrate.photoCal.match.referenceSelection INFO: Selected 169/1803 references
            processCcd.calibrate.photoCal.match INFO: Matched 0 from 345/1934 input and 169/1803 reference sources
            processCcd.calibrate.photoCal.reserve INFO: Reserved 0/0 sources
            processCcd FATAL: Failed on dataId={'visit': 94862, 'filter': 'HSC-R2', 'field': 'SSP_WIDE', 'dateObs': '2017-01-01', 'pointing': 1827, 'ccd': 16, 'taiObs': '2017-01-01', 'expTime': 150.0}: RuntimeError: No matches to use for photocal
            

            (as a side note: calling this a fail is probably the right way to go, except that, with no calexp produced, jointcal is prevented from working its magic...)
            The raw and SFM (a.k.a. calexp) WCS outlines for current master look like (no jointcal as I didn't do a full PDR2 reprocessing!):

            so the bad fits from the PDR2 run are now failures.
            As Jim Bosch speculated, there is a step in ip_isr's assembleCcdTask that is "adjusting" the "raw" WCS to account for the trimming, flipping, etc. that would need to be done for a header-only WCS on amp images. However, the "raw" WCS is now (as of DM-20154) actually a best-guess raw WCS accounting for boresight + rotation + cameraGeom (distortion). As such, the WCS attached to the "inExposure" in assembleCcdTask is already the "adjusted"/best-guess WCS, and the current call to cameraGeomUtils.prepareWcsData() is performing an erroneous adjustment.
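The shape of the fix, as a toy sketch (the Exposure stand-in and function below are illustrative only, not the real afw/ip_isr API): propagate the input WCS unchanged instead of re-deriving it via prepareWcsData().

```python
from dataclasses import dataclass

# Toy stand-in for an afw Exposure; names are illustrative only.
@dataclass
class Exposure:
    wcs: object  # already the best-guess raw WCS (boresight + rotation + cameraGeom)

def assemble_ccd(in_exposure: Exposure) -> Exposure:
    """Assemble amps into a CCD image (trimming/flipping elided).

    The fix: do NOT apply a cameraGeomUtils.prepareWcsData()-style
    adjustment here. Since DM-20154 the input WCS is already the full
    best-guess raw WCS, so a further trim/flip adjustment double-counts
    those offsets and shifts the solution by tens of pixels.
    """
    return Exposure(wcs=in_exposure.wcs)  # propagate unchanged
```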

            This has an effect downstream in the astrometry task as the center sky coordinate used to define the reference object loading sky circle is now shifted away from our best guess. For the most part, at least in our HSC RC2 processing, the matching and fitting (with the current pixelMargin padding setting) can handle this and a decent WCS fit is found in single frame processing. However, when probing the full PDR2 dataset, many cases were revealed where this resulted in a terrible WCS fit (with scatter on the order of 5 arcsec) and many others where the ccd failed to process (often because, with the bad WCS fit, no matches could be found for the photometry).
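For intuition on why matching usually still limped through: the loader pads the detector bbox by pixelMargin pixels before converting it to a sky-circle radius, so a modest center shift stays inside the padded circle. A toy comparison (the 300-pixel margin and 0.168 arcsec/pixel HSC plate scale are my assumptions, not values from this ticket):

```python
plate_scale = 0.168            # arcsec/pixel, approximate HSC value (assumption)
pixel_margin = 300             # assumed default loader padding (assumption)
shift_px = 8.0 / plate_scale   # the ~8 arcsec erroneous shift, in pixels

# The shift eats into the padding; matching degrades badly only once the
# shift becomes a sizable fraction of the margin (and of maxOffsetPix).
print(f"shift ≈ {shift_px:.0f} px vs. margin {pixel_margin} px")
```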

            Indeed, if I skip this WCS "adjusting" step, in SFM, all ccds in that visit now pass with good-looking WCSs.

            The associated ccd 16 log entries for the above run are (note the different search center...and much lower scatter in the fit!):

            processCcd.calibrate.astromRefObjLoader INFO: Loading reference objects using center (31.174511, +2.555801) and radius 0.12439255153514307 deg
            processCcd.calibrate.astromRefObjLoader INFO: Loaded 1431 reference objects
            processCcd.calibrate.astrometry.referenceSelector INFO: Selected 1431/1431 references
            processCcd.calibrate.astrometry.matcher INFO: Matched 175 sources
            processCcd.calibrate.astrometry.matcher INFO: Matched 174 sources
            processCcd.calibrate.astrometry.matcher INFO: Matched 174 sources
            processCcd.calibrate.astrometry INFO: Matched and fit WCS in 3 iterations; found 174 matches with scatter = 0.051 +- 0.036 arcsec
            

            I printed the angular distance between the raw and adjusted WCS at the ccd centers and, for HSC, it was of order 8 arcsec.

            To further test the hypothesis here, I tried to recover some iffy DECam fits that were showing similarly large scatter (this was for the crowded fields Ian Sullivan has been working on). Indeed, for the few I tested, I did recover a much better fit with the extra shifting (of order 20 arcsec in that case) turned off.

            jbosch Jim Bosch added a comment -

            Thanks for the careful investigation! I've also added Chris Morrison [X] as a watcher.

            I think we've got two options for how to fix this:

            • Add a config option to AssembleCcdTask that disables setting the WCS, and disable it by default for all cameras. We would document that as something that should only be enabled by cameras that cannot use the new from-boresight raw WCS for some reason.
            • Just delete all of the WCS-setting code in AssembleCcdTask, on the basis that it's more likely that someone will accidentally use it (or be confused just by its existence) than actually need it.

            I'm leaning towards the latter, given we can always recover the old code via Git if it turns out we need it. Does that make sense to you?

            In either case, I think the investigation you've done is sufficiently exhaustive that we should feel fine just making this change with no further testing of its implications - it's an obvious bug, and fixing it does not seem to do any unexpected harm. Keeping an eye out for unexpected regressions in the next regularly-scheduled processing runs seems like the appropriate level of caution for that case.

            lauren Lauren MacArthur added a comment -

            Thanks, Jim! Either option is easy to implement, so if there’s preference for the latter, I’ll go that route (I think John Parejko may be the best person to comment on any conceivable case where the former would be required...but, as you said, we can always add it back if the need arises).

            I’m going to take this one and will try running it on any other camera data I can think of (DC2/imsim & CFHT come to mind) before putting it out for review.

            Parejkoj John Parejko added a comment -

            I vote just delete the offending code. Less code is better, and I can't think of an obvious reason why we might want it.

            lauren Lauren MacArthur added a comment - - edited

            Ok, I've done several more tests and the results have consistently been either better or identical.  This includes testing this branch on CFHT and DC2/imsim data (in addition to the HSC and DECam data already tested).  I think this is as far as I can go in terms of small-scale validation and am now pretty confident this fix is correct, so should get merged.

            For the record, the typical shifts were of order:
            ~8 arcsec = 48 pixels for HSC
            ~19.8 arcsec = 75 pixels for DECam
            ~5.9 arcsec = 32 pixels for CFHT
            ~6.4 arcsec = 32 pixels for DC2/imsim
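The arcsec-to-pixel conversions above follow from each camera's plate scale; a quick sanity check (the plate-scale values below are my approximations, not from this ticket):

```python
# Approximate plate scales in arcsec/pixel (my assumptions): HSC ~0.168,
# DECam ~0.263, CFHT/MegaCam ~0.185, DC2/imsim (LSSTCam-like) ~0.2.
shifts_arcsec = {"HSC": 8.0, "DECam": 19.8, "CFHT": 5.9, "DC2/imsim": 6.4}
plate_scale = {"HSC": 0.168, "DECam": 0.263, "CFHT": 0.185, "DC2/imsim": 0.2}

for cam, shift in shifts_arcsec.items():
    print(f"{cam}: {shift} arcsec ≈ {round(shift / plate_scale[cam])} pixels")
```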

            [Side note: one of the PDR2 CCDs that had a wonky SFM WCS only got slightly better with this fix...but got "good" with a further decrease of the configs to {{-c calibrate.astrometry.matcher.maxOffsetPix=200 calibrate.astromRefObjLoader.pixelMargin=200}}; I will leave discussion of potentially resetting those defaults for another time. It was CCD 100, an odd-rotation edge CCD whose invalid region due to vignetting covers about half the CCD area.]

            lauren Lauren MacArthur added a comment - - edited

            Jim, when you have a minute, would you mind looking this over?  My original Jenkins run (usual products + ci_hsc) passed on centos-8 & macos, but failed on centos-7.  Kian-Tat Lim diagnosed this as a race condition in obs_lsst (see DM-27883), so I think I can consider that an effective pass. Ah...my second try just passed.

            jbosch Jim Bosch added a comment -

            Looks good!

            Could you open a ticket for revisiting the matcher/loader configs, and make it blocked on DM-24024? It seems like whatever maximum offset/padding you find there for Gen3 region-making should be equally applicable to the matcher/loader configs.

            lauren Lauren MacArthur added a comment -

            Thanks Jim!  I've created DM-27921 for the pixel padding exploration.  Merged and done.


              People

              Assignee:
              lauren Lauren MacArthur
              Reporter:
              lauren Lauren MacArthur
              Reviewers:
              Jim Bosch
              Watchers:
              Chris Morrison [X] (Inactive), Eli Rykoff, Ian Sullivan, Jim Bosch, John Parejko, Lauren MacArthur

                Dates

                Created:
                Updated:
                Resolved:
