# NaNs in measurePhotonTransferCurve.py causing failures


#### Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels: None
• Story Points: 3
• Team: Data Release Production
• Urgent?: No

#### Description

When running the PTC analysis on BOT run 12606, many amps failed to return valid PTCs. I traced this to saturated images in the flat pairs, which caused NaNs in the mean/variance values. When the code runs the _getInitialGoodPoints routine, the medianRatio parameter becomes NaN, and then all points fail. I was able to fix this by changing the medianRatio calculation from np.median to np.nanmedian; the PTC fits then ran OK, but the plotPtc.py routine failed to plot the PTCs. As a workaround, I just eliminated the saturated flat pairs from the input deck, but long term the code needs to be robust to saturated inputs. FWIW, I swear that this problem was not there a few weeks ago.
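To illustrate why a single NaN makes every point fail, here is a minimal NumPy sketch. The `ratios` array and the 0.1 cut are made-up stand-ins, not the actual data or the actual selection logic in _getInitialGoodPoints; only the behaviour of np.median versus np.nanmedian is the point.

```python
import numpy as np

# Hypothetical variance/mean ratios for one amp. The NaN stands in for a
# saturated flat pair; the numbers are illustrative only.
ratios = np.array([0.98, 1.02, np.nan, 1.00, 0.99])

median_plain = np.median(ratios)        # NaN: one bad pair poisons the result
median_nan_safe = np.nanmedian(ratios)  # NaN entries are ignored

# Any comparison against NaN is False, so a cut around a NaN median
# rejects every point, including the good ones.
good_plain = np.abs(ratios - median_plain) < 0.1
good_nan_safe = np.abs(ratios - median_nan_safe) < 0.1

print("np.median:   ", median_plain, "points kept:", good_plain.sum())
print("np.nanmedian:", median_nan_safe, "points kept:", good_nan_safe.sum())
```

With np.nanmedian the NaN entry itself is still rejected by the cut, but the finite points survive.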

#### Attachments

1. PTC_det36.pdf
81 kB

#### Activity

Andrés Alejandro Plazas Malagón added a comment - edited

I replaced np.median with np.nanmedian in _getInitialGoodPoints in ptc.py and, similarly, switched to np.nanmin and np.nanmax to calculate limits in the plotting routine (the NaN limits were what was causing it to fail in this case).

With this, we still keep the NaNs in the raw vectors. It wasn't happening before because the raw vectors were being filled after the NaNs were discarded.
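A small sketch of the limit calculation, assuming a raw vector that still contains a NaN as described above. The `means` array is hypothetical; it only shows how np.min/np.max differ from np.nanmin/np.nanmax when NaNs are kept in the data.

```python
import numpy as np

# Hypothetical raw mean-signal vector that still carries a NaN entry.
means = np.array([1.0e3, 5.0e3, np.nan, 2.0e4])

# Plain reductions propagate the NaN, so both limits are unusable.
lo_plain, hi_plain = np.min(means), np.max(means)

# NaN-aware reductions skip the NaN and return finite limits.
lo_safe, hi_safe = np.nanmin(means), np.nanmax(means)

print("np.min / np.max:      ", lo_plain, hi_plain)
print("np.nanmin / np.nanmax:", lo_safe, hi_safe)
```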

Commands: (w_2020_41)

 measurePhotonTransferCurve.py /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606 --rerun plazas/PTC_LSSTCAM_New_12606/2020OCT14 --id detector=36 expId=3020100800155^3020100800156^3020100800158^3020100800159^3020100800185^3020100800186^3020100800161^3020100800162^3020100800188^3020100800189^3020100800164^3020100800165^3020100800191^3020100800192^3020100800167^3020100800168^3020100800194^3020100800195^3020100800170^3020100800171^3020100800197^3020100800198^3020100800173^3020100800174^3020100800200^3020100800201^3020100800176^3020100800177^3020100800203^3020100800204^3020100800179^3020100800180^3020100800206^3020100800207^3020100800182^3020100800183^3020100800209^3020100800210^3020100800212^3020100800213^3020100800215^3020100800216^3020100800218^3020100800219^3020100800221^3020100800222 -c maxMeanSignal=100000 ptcFitType=EXPAPPROXIMATION doPhotodiode=False sigmaCutPtcOutliers=5.0 initialNonLinearityExclusionThresholdPositive=0.25 --clobber-config --clobber-version -j 1 

 plotPhotonTransferCurve.py /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606 --rerun /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606/rerun/plazas/PTC_LSSTCAM_New_12606/2020OCT14 --id detector=36 -c datasetFileName=/project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606/rerun/plazas/PTC_LSSTCAM_New_12606/2020OCT14/calibrations/ptc/ptcDataset-det036.fits --clobber-versions --clobber-config -j 1 

Plots: PTC_det36.pdf

Merlin Fisher-Levine added a comment -

Small comments on the docs, but great other than that.

Craig Lage added a comment -

I really don't understand where these NaNs are coming from. Eliminating the saturated images removed most of the issue, but there are still a few amps that are returning NaN for no apparent reason. I whittled it down to this simple command line, which runs very fast:

 measurePhotonTransferCurve.py /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606 --rerun /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606 --id detector=180 expId=3020100800155^3020100800156 -c maxMeanSignal=100000 ptcFitType=EXPAPPROXIMATION initialNonLinearityExclusionThresholdPositive=0.25 doPhotodiode=False --clobber-versions -j

Then I added these print statements in measureMeanVarCov in ptc.py:

    mu1 = afwMath.makeStatistics(im1Area, afwMath.MEANCLIP, im1StatsCtrl).getValue()
    mu2 = afwMath.makeStatistics(im2Area, afwMath.MEANCLIP, im2StatsCtrl).getValue()
    print("In measureMeanVarCov, amp = %s, expId = %s" % (ampName, exposure1.getInfo().getVisitInfo().getExposureId()))
    print("im1Area.image.array min and max:", im1Area.image.array.min(), im1Area.image.array.max())
    print("im1Area mean (mu1) as calculated by afwMath.makeStatistics", mu1)
    print()

When I run this, amps C02 and C07 return NaN for the mean, even though the images look fine and I can print out the min and max of the array data.  Other amps look OK. Something with the mask???

    In measureMeanVarCov, amp = C10, expId = 3020100800155180
    im1Area.image.array min and max: 36.060616 171.45981
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 118.79587169182115

    In measureMeanVarCov, amp = C11, expId = 3020100800155180
    im1Area.image.array min and max: 35.585648 186.14578
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 115.47448062124805

    In measureMeanVarCov, amp = C12, expId = 3020100800155180
    im1Area.image.array min and max: 35.35383 1805.6526
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 116.74028567317016

    In measureMeanVarCov, amp = C13, expId = 3020100800155180
    im1Area.image.array min and max: 37.272396 202.33514
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 115.1495522146488

    In measureMeanVarCov, amp = C14, expId = 3020100800155180
    im1Area.image.array min and max: 35.201366 168.98125
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 114.183253251471

    In measureMeanVarCov, amp = C15, expId = 3020100800155180
    im1Area.image.array min and max: -12621.5 26990.564
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 114.66133031705989

    In measureMeanVarCov, amp = C16, expId = 3020100800155180
    im1Area.image.array min and max: 37.31251 169.0054
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 113.54425356970359

    In measureMeanVarCov, amp = C17, expId = 3020100800155180
    im1Area.image.array min and max: 35.534927 179.67491
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 114.70147974456593

    In measureMeanVarCov, amp = C07, expId = 3020100800155180
    im1Area.image.array min and max: 107.59257 123.71165
    im1Area mean (mu1) as calculated by afwMath.makeStatistics nan

    measurePhotonTransferCurve WARN: NaN mean or var, or None cov in amp C07 in exposure pair 3020100800155180, 3020100800156180 of detector 180.

    In measureMeanVarCov, amp = C06, expId = 3020100800155180
    im1Area.image.array min and max: 42.44537 173.70656
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 120.40967085989031

    In measureMeanVarCov, amp = C05, expId = 3020100800155180
    im1Area.image.array min and max: -1976.951 181.91766
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 124.48124170966868

    In measureMeanVarCov, amp = C04, expId = 3020100800155180
    im1Area.image.array min and max: 43.84583 200.44437
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 124.45436966906746

    In measureMeanVarCov, amp = C03, expId = 3020100800155180
    im1Area.image.array min and max: 39.0246 222.63539
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 123.44538765460254

    In measureMeanVarCov, amp = C02, expId = 3020100800155180
    im1Area.image.array min and max: 75.340385 146.52866
    im1Area mean (mu1) as calculated by afwMath.makeStatistics nan

    measurePhotonTransferCurve WARN: NaN mean or var, or None cov in amp C02 in exposure pair 3020100800155180, 3020100800156180 of detector 180.

    In measureMeanVarCov, amp = C01, expId = 3020100800155180
    im1Area.image.array min and max: 5.981539 185.96135
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 126.83769476779912

    In measureMeanVarCov, amp = C00, expId = 3020100800155180
    im1Area.image.array min and max: 47.930855 185.4127
    im1Area mean (mu1) as calculated by afwMath.makeStatistics 127.40075437510123

Craig Lage added a comment -

Andrés and I think that the reason that these amps are returning NaN is that the defect code has decided to mask out the entire amp for some reason. Below are the first 15 pixels of row 100 in the mask plane. Most of the amps have a value of 128 for the first 10 pixels (this is the edge masking) and 0 after that. But amps C02 and C07 have a non-zero value (263) in the interior. Now the question is why they were masked out. The images look reasonable, and the EOTest code returned valid gain values.

    C10 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C11 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C12 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C13 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C14 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C15 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C16 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C17 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C07 [391 391 391 391 391 391 391 391 391 391 263 263 263 263 263]
    C06 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C05 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C04 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C03 [390 390 390 390 390 128 128 128 128 128 0 0 0 0 0]
    C02 [391 391 391 391 391 391 391 391 391 391 263 263 263 263 263]
    C01 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
    C00 [128 128 128 128 128 128 128 128 128 128 0 0 0 0 0]
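The mask values above can be read as bit sums, and a fully flagged amp explains a NaN mean even when the pixel values themselves are finite. The sketch below is a plain NumPy analogy, not the afwMath implementation: `set_bits` and `mean_of_unmasked` are hypothetical helpers, and treating every set bit as "bad" is an assumption made only for illustration.

```python
import numpy as np

def set_bits(value):
    """Return the powers of two that are set in an integer mask value."""
    value = int(value)
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]

def mean_of_unmasked(image, mask, bad_bits):
    """Mean over pixels with none of bad_bits set; NaN if no pixel is left."""
    good = (mask & bad_bits) == 0
    return float(image[good].mean()) if good.any() else float("nan")

for value in (128, 263, 390, 391):
    print(value, "=", " + ".join(str(b) for b in set_bits(value)))

# Toy 4x15 "amp" with finite pixel values everywhere.
image = np.full((4, 15), 115.0)

# Row patterns copied from the table above: edge-only versus fully flagged.
mask_ok = np.tile(np.array([128] * 10 + [0] * 5), (4, 1))
mask_bad = np.tile(np.array([391] * 10 + [263] * 5), (4, 1))

ALL_BITS = 0x1FF  # assumption: any of the nine low bits excludes a pixel

mean_ok = mean_of_unmasked(image, mask_ok, ALL_BITS)
mean_bad = mean_of_unmasked(image, mask_bad, ALL_BITS)
print("edge-only mask:", mean_ok)
print("fully flagged: ", mean_bad, "(min/max of pixels are still finite)")
```

In this toy case the interior value 263 leaves no usable pixel, so the mean is undefined while `image.min()` and `image.max()` still print normally, matching the symptom seen for C02 and C07.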

Andrés Alejandro Plazas Malagón added a comment -

With the help of Chris, we traced the problem to the fact that amps C02 and C07 have negative saturation levels: https://github.com/lsst/obs_lsst/blob/master/policy/lsstCam/R43.yaml#L64 and that's why they were being masked/declared as bad.

For the moment, setting isr.doSaturation=False or setting isr.saturation to some high level during ISR would help, but we are consulting (#dm-lsstcam) to see what the proper fix is.

Andrés Alejandro Plazas Malagón added a comment -

I'll set the negative values to zero for now:

    grep "saturation : -" *.yaml
    R43.yaml: C07 : { gain : 1.348025, readNoise : 6.527312, saturation : -4376471.000000 }
    R43.yaml: C02 : { gain : 1.367428, readNoise : 6.830328, saturation : -5319002.500000 }
    R43.yaml: C06 : { gain : 1.396356, readNoise : 6.672390, saturation : -122039.617188 }
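The same check can be sketched in Python for use in a script. This is only an illustration that mirrors the grep pattern: `AMP_LINE` and `negative_saturations` are hypothetical names, the regex assumes the one-line-per-amp layout shown in the grep output, and the C05 entry in `sample` is invented to show a non-matching case.

```python
import re

# Matches one-line amp entries such as
#   C07 : { gain : ..., readNoise : ..., saturation : -4376471.000000 }
AMP_LINE = re.compile(
    r"(?P<amp>C\d\d)\s*:\s*\{.*?saturation\s*:\s*(?P<sat>-?\d+(?:\.\d+)?)"
)

def negative_saturations(text):
    """Return {amp name: saturation} for entries with a negative saturation."""
    found = {}
    for match in AMP_LINE.finditer(text):
        saturation = float(match.group("sat"))
        if saturation < 0:
            found[match.group("amp")] = saturation
    return found

sample = """\
C07 : { gain : 1.348025, readNoise : 6.527312, saturation : -4376471.000000 }
C02 : { gain : 1.367428, readNoise : 6.830328, saturation : -5319002.500000 }
C06 : { gain : 1.396356, readNoise : 6.672390, saturation : -122039.617188 }
C05 : { gain : 1.4, readNoise : 6.5, saturation : 150000.0 }
"""  # the C05 line is invented for illustration

bad = negative_saturations(sample)
for amp, saturation in bad.items():
    print(amp, saturation)
```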

Discussion in Slack about the topic: https://lsstc.slack.com/archives/CBE964PR8/p1603131198011000?thread_ts=1602885539.009400&cid=CBE964PR8
Andrés Alejandro Plazas Malagón added a comment -

https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/32879/pipeline

#### People

Assignee:
Andrés Alejandro Plazas Malagón
Reporter:
Craig Lage
Reviewers:
Merlin Fisher-Levine
Watchers:
Andrés Alejandro Plazas Malagón, Christopher Waters, Craig Lage, Merlin Fisher-Levine