  1. Data Management
  2. DM-24024

Revisit region padding in HSC Gen3 ingest or visit definition

    Details

    • Story Points:
      14
    • Epic Link:
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      DM-19731 looked at the difference between raw WCSs and fitted ones to guess how much padding we needed to add to the raw-WCS-generated regions created when defining visits in Gen3.  Since then, we've started using cameraGeom to improve the raw WCSs quite a bit, and may hence be able to lower the amount of padding.
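
      As a minimal sketch of what "padding" means here, assume a detector bounding box in pixel coordinates that is grown by a fixed number of pixels on every side before its corners are mapped to the sky with the raw WCS. The function name, the box representation, and the detector dimensions below are hypothetical and are not the obs_subaru or visit-definition API.

```python
def padded_corners(x_min, y_min, x_max, y_max, padding):
    """Return the four corners of a pixel bounding box grown by `padding` pixels per side."""
    return [
        (x_min - padding, y_min - padding),  # lower left
        (x_max + padding, y_min - padding),  # lower right
        (x_max + padding, y_max + padding),  # upper right
        (x_min - padding, y_max + padding),  # upper left
    ]


# Illustrative detector size only; the real bounding box comes from the camera geometry.
corners = padded_corners(0, 0, 2047, 4175, padding=250)
print(corners)
```

      A larger padding makes the region more tolerant of a poor raw WCS, at the cost of spurious overlaps with neighbouring sky regions.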

            Activity

            jbosch Jim Bosch added a comment -

            Thanks for the very detailed investigation! I do have a few follow-up questions, and I'll try to answer all of your (possibly implied) questions below. First, there's one code change I'd like to also include on this ticket: changing the 4000 here to whatever we think the new value should be (probably 200, unless I've misunderstood something).

            our worst case of maximum pixel offset between the individual CCD corners from the raw, calexp and jointcal-derived WCSs in all of RC2 was 191.45 pixels

            Looking at the plot, it looks like the raw-jointcal diff is about the same as the raw-calexp difference, right? I think we've learned that the raw-calexp difference is basically irrelevant, except maybe in the case where jointcal failed entirely (I assume that didn't ever happen in PDR2?), because our primary goal is to find places where the raw WCSs are bad.

            The narrow-band is not surprisingly our worst for this comparison (nor that it is for a highly vignetted edge CCD)...but not by all that much.

            I actually am surprised by this, because I don't see why those raw WCSs would be any worse, and I also wouldn't have expected jointcal WCS inaccuracies (or non-catastrophic calexp WCS inaccuracies) anything close to a single pixel. So if there is something systematic here, I guess it's some combination of:

            • a (weird) chromatic term in the distortion?
            • chromatic residuals (e.g. from DCR) in the jointcal WCSs?
            • something about the observing conditions and/or location on the sky correlated with when we use the narrow bands?

            If the narrow-bands aren't significantly worse, then I suppose there's not much of a mystery, and I don't think I care to investigate it even if it's a real mystery. Or have I just missed a more obvious explanation for why this makes sense?

            Indeed, there seems to be a mild increasing trend with airmass, but the magnitude is pretty small (of order 20 pixels from airmass 1.0 to 2.2). Q: do we think this is significant enough to add an accounting for in our distortion model?

            Not anytime soon, and maybe not ever for HSC. That may change when we start using the distortion model as more than just a starting position, but it's sufficiently hard to do that I'd probably look for other solutions first, if we find that we need improvements in the distortion model.

            Q: might a simple adjustment (based on average values per CCD in the above plot) of our focal plane positions (offset_x and offset_y) in camera.py remedy this somewhat? If my speculations ring true, this could further reduce the amount of pixel padding necessary as the shifts seem to be dominating the "real" cases of maximum corner offsets.

            I first have a follow-up question for this: was this an actual systematic shift for each CCD? Or is it just that the scatter was a function of CCD position in the focal plane? Put another way, I assume the "max WCS corners diff" that's your color axis on the last plot has got to be some kind of RMS or absolute-value difference, because the actual differences are a vector field, and you've only got positive values. What happens if we average the vectors per-CCD?

            In any case, feel free to defer this to another ticket; while a coherent offset would indicate that we probably should (and easily could) update the camera geometry, I don't think that's a super high priority (see previous comment about the accuracy of the cameraGeom not being super important right now). There are also potentially some confounding factors in play, involving what we consider the camera boresight location on the focal plane, whether that's a function of some physical configuration parameters (e.g. zenith angle), and where we put the center of the TAN projection that's at least theoretically part of the WCS, too ("theoretical" because I'm not sure we ever have enough information to disentangle that from the distortions).
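
            The distinction between a coherent per-CCD shift and pure scatter can be sketched as follows. This is an illustration only: the input layout `(ccd, dx, dy)`, the function name, and all numbers are made up for the example and do not come from the analysis on this ticket. A coherent shift survives vector averaging, while scatter averages toward zero even though the mean offset length stays large.

```python
from collections import defaultdict
from math import hypot


def per_ccd_summary(offsets):
    """Map each CCD to (mean dx, mean dy, mean offset length).

    `offsets` is an iterable of (ccd, dx, dy) tuples, one per visit.
    """
    grouped = defaultdict(list)
    for ccd, dx, dy in offsets:
        grouped[ccd].append((dx, dy))

    summary = {}
    for ccd, vectors in grouped.items():
        n = len(vectors)
        mean_dx = sum(dx for dx, _ in vectors) / n
        mean_dy = sum(dy for _, dy in vectors) / n
        mean_length = sum(hypot(dx, dy) for dx, dy in vectors) / n
        summary[ccd] = (mean_dx, mean_dy, mean_length)
    return summary


# Made-up offsets in pixels: CCD 22 shifts coherently, CCD 50 only scatters.
offsets = [
    (22, -150.0, 10.0), (22, -160.0, -10.0),
    (50, 30.0, 0.0), (50, -30.0, 0.0),
]
summary = per_ccd_summary(offsets)
print(summary[22])  # mean vector stays far from zero
print(summary[50])  # mean vector cancels, mean length does not
```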

            lauren Lauren MacArthur added a comment -

            With apologies for taking so long to get to this...I think I have finally settled on "the number" (for now). I'll also try to address your comments/questions above in the following.

            I think we've learned that the raw-calexp difference is basically irrelevant, except maybe in the case where jointcal failed entirely (I assume that didn't ever happen in PDR2?), because our primary goal is to find places where the raw WCSs are bad.

            Ah, great. At first I was considering calexp WCSs too, thinking that some processing runs will not include jointcal solutions...but it's true that the latter is closer to truth and we do have it for all of our SSP-HSC runs, so that's what should be used! This simplifies this analysis a lot. So, going forward, I will only quote numbers based on jointcal WCS results.  I have modified the code to fall back to calexp WCSs if no jointcal outputs exist – which is the case for the DECam data I have also looked at – but this is a per-analysis choice, as opposed to a per-CCD one. If we're "on" jointcal and the jointcal solution doesn't exist, that CCD is skipped. And I do think that you are correct that no CCDs that passed SFM failed in jointcal (certainly not in RC2) but that statement is based on spot checking...will confirm on DM-28011 (the issue with PDR2 was calexps that should have been reasonably easy actually failing or just getting really bad SFM fits. Updated matchers and fitters since that run as well as DM-27868 should improve things there quite a bit.)

            Ok, so one tailspin I went through was trying to move from doing this analysis in tract coordinates back to focal plane coordinates...I quickly got lost in the sea of getting back to the distorted FP pixel from an RA/Dec for a given WCS. I think I'm close, but am going to punt as it turns out my motivation for doing this does not apply for the RC2 dataset. The issue is that FP x/y doesn't necessarily map onto tract x/y if, e.g., the camera rotator angle is not constant over all visits. This is certainly the case for the full PDR2, which is why I just went with the maximum absolute value of either x or y. However, for RC2, the ROTANG is constant at 270deg, so the mapping is consistent:

            ROTANG 270:
            +TractX == +FocalPlaneY
            +TractY == -FocalPlaneX
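
            Read as a function, the ROTANG 270 mapping above converts an offset measured along the tract axes into focal-plane axes. This is only a sketch of the stated axis correspondence; the function name is hypothetical and the mapping is not valid for any other rotator angle.

```python
def tract_to_focal_plane(d_tract_x, d_tract_y):
    """Convert a tract-axis offset to focal-plane axes, valid for ROTANG = 270 deg only.

    Uses +TractX == +FocalPlaneY and +TractY == -FocalPlaneX.
    """
    d_fp_x = -d_tract_y
    d_fp_y = d_tract_x
    return d_fp_x, d_fp_y


# Illustrative offset in pixels.
print(tract_to_focal_plane(10.0, -5.0))
```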
            

            So, in this context, the sign-retained numbers for this analysis divided by tract X and Y are:
            Max Corner Offset (tract pixels)

            Filter   tractOffsetX  [ccd]  [visit]  [tract]   tractOffsetY  [ccd]  [visit]  [tract]
            HSC-G        -156.41   [ 22]  [11698]  [9813]        -109.07   [  2]  [34464]  [9697]
            HSC-R        -152.42   [ 22]  [34672]  [9697]        -116.73   [  3]  [34758]  [9697]
            HSC-I        -150.76   [ 22]  [36114]  [9697]        -116.36   [  3]  [35950]  [9697]
            HSC-Z        -173.23   [ 22]  [36404]  [9697]        -130.92   [  3]  [17900]  [9813]
            HSC-Y        -181.65   [ 70]  [27034]  [9615]        -134.89   [  3]  [36818]  [9697]
            NB0921       -190.66   [ 22]  [23038]  [9813]        -129.62   [  3]  [25816]  [9813]

            So, our maximum absolute offset is 191 pixels. It is also of note that the CCDs showing the maximum x and y offsets are pretty consistent (more below).
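
            The 191-pixel figure can be checked directly from the table values. The dictionary below simply transcribes the tractOffsetX and tractOffsetY columns; the variable names are chosen for this example.

```python
# (tractOffsetX, tractOffsetY) per filter, in tract pixels, copied from the table.
max_corner_offsets = {
    "HSC-G": (-156.41, -109.07),
    "HSC-R": (-152.42, -116.73),
    "HSC-I": (-150.76, -116.36),
    "HSC-Z": (-173.23, -130.92),
    "HSC-Y": (-181.65, -134.89),
    "NB0921": (-190.66, -129.62),
}

# Largest magnitude over both axes and all filters.
max_abs_offset = max(abs(v) for pair in max_corner_offsets.values() for v in pair)

# Filter in which that largest magnitude occurs.
worst_filter = max(max_corner_offsets, key=lambda f: max(abs(v) for v in max_corner_offsets[f]))

print(worst_filter, max_abs_offset, round(max_abs_offset))
```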

            If the narrow-bands aren't significantly worse, then I suppose there's not much of a mystery, and I don't think I care to investigate it even if it's a real mystery. Or have I just missed a more obvious explanation for why this makes sense?

            I think I may have misled you here. My "not surprisingly" comment was more based on the thought that we may have fewer sources in the narrow-band images so some may have a less-well constrained fit (and this would be way less of an issue for jointcal). And, indeed, the difference I noted was perhaps small enough that it may not even be real (let alone something that needs further investigation). You can see the difference with filter in the table above (superficially, it looks like GRI have smaller differences than ZYN921, of order 30 pixels). I have plotted distributions for different CCD groupings and per filter and there are not huge differences (more on this and some example figures below). I will keep this in mind on DM-28011 and let you know if any worry-worthy trends emerge.

            I first have a follow-up question for this: was this an actual systematic shift for each CCD? Or is it just that the scatter was a function of CCD position in the focal plane? Put another way, I assume the "max WCS corners diff" that's your color axis on the last plot has got to be some kind of RMS or absolute-value difference, because the actual differences are a vector field, and you've only got positive values. What happens if we average the vectors per-CCD?

            Ok, I have not yet gone all the way to vectors (will do on a subsequent ticket if we choose to pursue this), but I have made plots that retain the directionality. So, in the following plots, each visit gets a point in the CCD bounding box that represents the maximum jointcal - raw corner shift for that CCD (each entry is shifted randomly within the bbox for visibility). The maximum is of the absolute value of the shift, but the sign when plotted is retained (and I'm leaving it up to the eye to assess the RMS...for now!). This plot shows the difference in xTract (== yFocalPlane):

            and this one is for yTract (== -xFocalPlane)

            I see clear gradients in each direction and I'm thinking these might be removed if we shifted the offset_x & offset_y in the camera.py definitions for HSC. I may be overlooking something deeper that could fall out of such a change (e.g. would our distortion coeffs also need re-calibrating? No thank you...at least not the way they are represented now!!) that would make this an undesirable change at present (but I'm in a good position to give it a go if you think it's worth a try!). Also note that it was CCD 22 that was most often associated with the max X difference (here at the max -ve X shift) and similarly in the Y offset for CCD 3...it feels like something systematic could be removed.
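
            The quantity plotted per visit and CCD, as described above, is the corner shift with the largest magnitude, with its sign kept, drawn at a random position inside the CCD bounding box. A minimal sketch of those two steps, with hypothetical function names and made-up shift values, is:

```python
import random


def signed_max_abs(corner_shifts):
    """Return the shift with the largest magnitude, keeping its sign."""
    return max(corner_shifts, key=abs)


def jittered_position(x_min, y_min, x_max, y_max, rng):
    """Pick a random plotting position inside a CCD bounding box (for visibility only)."""
    return rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)


# Made-up jointcal - raw shifts (pixels) at the four corners of one CCD in one visit.
shift = signed_max_abs([12.5, -150.8, 40.0, -3.2])

rng = random.Random(0)  # seeded so the jitter is reproducible
x, y = jittered_position(0.0, 0.0, 2048.0, 4176.0, rng)
print(shift, x, y)
```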

            The maximum pointing offset (distance in arcsec...haven't done the vector plot yet) for jointcal now looks like:

            This looks much better than the calexp-based version, largely because the jointcal representation of the WCS is way less "local". I'm not picking up any trends there and the maximum pointing offset is of order 20 arcsec (120 pixels).
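
            The arcsec-to-pixel conversion quoted above is consistent with an HSC pixel scale of roughly 0.168 arcsec per pixel. That scale is an assumption of this sketch, not a number stated on the ticket.

```python
# Assumed nominal HSC pixel scale; not taken from this ticket.
PIXEL_SCALE_ARCSEC = 0.168


def arcsec_to_pixels(offset_arcsec, pixel_scale=PIXEL_SCALE_ARCSEC):
    """Convert an angular offset in arcsec to pixels at the given pixel scale."""
    return offset_arcsec / pixel_scale


pointing_offset_pixels = arcsec_to_pixels(20.0)
print(round(pointing_offset_pixels))
```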

            As mentioned above, I have also looked into the offset distributions as a function of various CCD groupings (CORE, INNER ANNULUS, MID ANNULUS, OUTER ANNULUS (omitting most highly vignetted CCDs), MOST VIGNETTED, and OUTER ODD) and separated by filter. As an example, I just show two here for HSC-I:

            and NB0921:

            (plots for the other filters can be found here).  Ok...so the tail in x could be a result of some of your colour-related speculations above (also seen in ZY filters).  I'll leave it to you to decide whether this seems significant enough to try to "deal" with.

            Ok, so, bottom line. I have submitted a PR on obs_subaru to set the padding to 250 pixels. As indicated in the commit message, initial indications for the full PDR2 dataset point to some slightly higher offsets: the largest seen thus far is ~220 pixels. Thus, to be somewhat conservative, I am suggesting we adopt a padding of 250 pixels for now (to be reconsidered in the context of the full PDR2 analysis being conducted on DM-28011). Let me know what you think!
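
            One hypothetical way to turn a largest observed offset into a conservative padding is to round it up to the next multiple of a fixed step. This is not the rule stated on the ticket, which only says 250 was chosen to be somewhat conservative; the function name and the 50-pixel step are assumptions. The rule happens to reproduce both the 200 suggested for the RC2 maximum and the 250 adopted given the PDR2 indications.

```python
import math


def round_up_padding(max_offset_pixels, step=50):
    """Round the largest observed offset up to the next multiple of `step` pixels."""
    return int(math.ceil(max_offset_pixels / step) * step)


print(round_up_padding(191.45))  # RC2 worst case quoted earlier in the thread
print(round_up_padding(220))     # approximate largest PDR2 offset seen so far
```

            Note that an offset sitting exactly on a multiple of the step gets no margin under this rule, so a real choice would still need a judgment call.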

            [oh, and I included in the PR a commit to add the datasets I created to facilitate this analysis...hopefully this won't be controversial seeing as that file is going to disappear soon(?!), but I'm also happy to leave it out...]

            lauren Lauren MacArthur added a comment - - edited

            Quick and minor update...I got the "sky-to-distorted-focal-plane-pixel" transformation working (at least I'm reasonably sure it's correct!), so I made the "vectorized" plot of the CCD center shifts (raw vs. jointcal) based on these coords:

            jbosch Jim Bosch added a comment -

            Sorry it took me so long to get back to this.  I can't say I've thought deeply enough about all of the plots to identify what does and doesn't make sense (particularly in how much to worry about the discrepancies between the calexp and jointcal differences), but I think there's more than enough here to back the change to the parameter we're interested in on this ticket.  I'm more than happy to defer thinking about changing the saved distortion until we've got a natural way to utilize astrometry fits for that.

            No objection from me to adding more pipe_analysis datasets here; that's still the best of the (bad) options, though it's one more reminder that we want to start doing new analyses in the new analysis_drp instead as soon as we can get it to a level of minimum usability.

            lauren Lauren MacArthur added a comment -

            Thanks Jim.  Out of an abundance of caution, I kicked off another Jenkins run...which failed in ci_hsc_gen2 for unrelated issues.  Thanks to an expedient fix from Kian-Tat Lim & Tim Jenness, a fresh run passed. Merged & done.


              People

              Assignee:
              lauren Lauren MacArthur
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, John Parejko, Lauren MacArthur, Tim Jenness
