Data Management / DM-27458

FULLCOVARIANCE in PTC task is rejecting more points than it should for some BOT data detectors

    Details

    • Type: Improvement
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When running (w_45)

      measurePhotonTransferCurve.py /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_New_12606 --rerun plazas/DM-27185 --id detector=183 expIdc ptcFitType=FULLCOVARIANCE doPhotodiode=False maxIterFullFitCovariancesAstier=10 sigmaClipFullFitCovariancesAstier=15 --clobber-config --clobber-version -j 1
      

      many points are discarded:

      Using the EXPAPPROXIMATION model, the data seems fine:

        Attachments

        1. Discard_Debug_120K_S5_C13_1_09Nov20.png (90 kB)
        2. Discard_Debug_80K_S10_C13_1_09Nov20.png (94 kB)
        3. Discard_Debug_80K_S20_C13_1_09Nov20.png (94 kB)
        4. Gain_Differences_GT_5Pct_12Nov20.pdf (77 kB)
        5. Gain_Histograms_12673_NewCode_12Nov20.pdf (18 kB)
        6. Gain_Summary_12673_FullCov_12Nov20.pdf (50 kB)
        7. Gain_Summary_12673_NewCode_12Nov20.pdf (48 kB)
        8. image-2020-11-06-13-51-39-148.png (357 kB)
        9. image-2020-11-06-13-53-13-765.png (514 kB)
        10. PTC_det183_80K_S10.pdf (227 kB)
        11. PTC_det183_80K_S20.pdf (224 kB)
        12. PTC_det183_FULLCOV_2020NOV10_EXPMASKID.pdf (228 kB)
        13. PTC_det94.pdf (222 kB)
        14. PTC_Eotest_Gains_12673_FullCov_12Nov20.pdf (62 kB)
        15. PTC_Eotest_Gains_12673_NewCode_12Nov20.pdf (64 kB)
        16. Rejection.png (13 kB)
        17. screenshot-1.png (320 kB)
        18. screenshot-2.png (440 kB)
        19. screenshot-3.png (845 kB)
        20. screenshot-4.png (836 kB)

            Activity

            czw Christopher Waters added a comment -

            I'm concerned that the PTC code is becoming increasingly difficult to understand. I'm hopeful that the gen3 rewrite will help somewhat.
            I'd also suggest rebasing the existing commits into a simpler set that clearly defines what changes have been made. It might also be good to move the travis/lint change to either the beginning or the end of the commit chain so that the PTC commits are together.

            cslage Craig Lage added a comment -

            Chris,  I share your concerns, but I think functionality has to take precedence over simplicity.  There are over 3000 amplifiers in the focal plane, and each one is a little bit different.  I certainly applaud any efforts to simplify the code, but the fact is that it has to return sensible results for the entire focal plane, which it doesn't do yet.  I think we need to keep fixing the things that fail and get it all working before we try to simplify it.

            czw Christopher Waters added a comment -

            I think we agree.  Once we have a fully functioning PTC, we'll have a defined set of inputs and a known target output, and can then refactor the code to be more maintainable.

            cslage Craig Lage added a comment -

            I was able to get good results on basically the whole focal plane with FULLCOVARIANCE. All CCDs passed and only 3 amps fell out, the same three that failed with EXPAPPROX, including the two known dead amps. These results and plots are in /project/shared/BOT/rerun/cslage/PTC_LSSTCAM_FullCov_12673A. Below is a list of proposed changes. I think some of these should go in, but others we need to discuss:

            (1) Most of the failures were caused by the following error. We were already testing for np.sum(w) == 0 at this step, where np.sum(w) is basically the number of good pixels in the flat pair: https://github.com/lsst/cp_pipe/blob/f4cdebacd2b778bffd6c5ba5ba4fb438b4052ce2/python/lsst/cp/pipe/ptc.py#L683

            But it turns out that if the number of good pixels is small (less than a few hundred or maybe a few thousand), the FFT routine that calculates the covariances fails because some of the nPix values come out zero. So I changed the np.sum(w) == 0 test to np.sum(w) < 10000. This number could perhaps be smaller, but if a flat pair has fewer than 10,000 good pixels, we probably don't want to include it anyway. This fixed most of the failures.
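            As a rough sketch of the test in (1): the helper name has_enough_good_pixels and the constant MIN_GOOD_PIXELS below are hypothetical and are not part of cp_pipe, and w is assumed to be a 0/1 array that marks the good pixels of a flat pair.

```python
import numpy as np

# Hypothetical constant; 10000 is the threshold proposed in (1).
MIN_GOOD_PIXELS = 10000


def has_enough_good_pixels(w, min_good=MIN_GOOD_PIXELS):
    """Return True if the 0/1 weight array w marks at least min_good good pixels."""
    return int(np.sum(w)) >= min_good


# Toy amplifier segment with 20000 pixels, of which 10000 are marked good.
w = np.zeros((100, 200), dtype=int)
w[:50, :] = 1
keep_pair = has_enough_good_pixels(w)
```

            A flat pair for which this returns False would be skipped, in the same way the np.sum(w) == 0 case was skipped before.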

            (2) I realized that when we put in limits for the EXPAPPROX fit, we used +/-100 for the 3rd parameter because I thought it was the noise. It turns out that this parameter is the noise^2. Since the noise can be as high as ~40, I changed the limit to +/-2000.
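            The arithmetic behind (2), as a small check. The variable names are illustrative only; the values are the ones quoted above.

```python
import math

max_noise = 40.0               # approximate largest noise value quoted in (2)
max_noise_sq = max_noise ** 2  # the 3rd fit parameter holds noise^2

old_limit = 100.0              # previous bound on the 3rd parameter
new_limit = 2000.0             # revised bound

fits_old_limit = max_noise_sq <= old_limit   # old bound is too tight
fits_new_limit = max_noise_sq <= new_limit   # new bound leaves headroom
largest_noise_old = math.sqrt(old_limit)     # largest noise the old bound allowed
```

            With the old bound, any amplifier with noise above 10 would have hit the limit of the fit.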

            (3) One of the CCDs was still failing in the wres routine at this step: https://github.com/lsst/cp_pipe/blob/f4cdebacd2b778bffd6c5ba5ba4fb438b4052ce2/python/lsst/cp/pipe/astierCovPtcFit.py#L528. The failure was caused by self.maskMu being all NaNs. I put in a try/except, as follows. This clearly isn't a long-term fix, but it fixed the problem. We need to understand how the mask gets to be all NaNs.

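            try:
                # Normal case: keep only the residuals selected by the mask.
                maskedWeightedRes = weightedRes[self.maskMu]
            except Exception:
                # Stopgap: if the mask is unusable (e.g. all NaNs), return zero residuals.
                maskedWeightedRes = weightedRes * 0.0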
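            One possible alternative to the try/except in (3) is to test the mask explicitly. This is only a sketch: the function name masked_weighted_residuals is hypothetical, and it assumes the mask is meant to be a boolean array, which I have not confirmed against the cp_pipe code.

```python
import numpy as np


def masked_weighted_residuals(weighted_res, mask):
    """Apply mask if it is boolean; otherwise fall back to zero residuals.

    Assumption: a valid mask is a boolean array. A mask full of NaNs has a
    float dtype, so it is caught by the dtype check instead of an exception.
    """
    mask = np.asarray(mask)
    if mask.dtype != bool:
        return weighted_res * 0.0
    return weighted_res[mask]


res = np.array([1.0, 2.0, 3.0])
selected = masked_weighted_residuals(res, [True, False, True])
fallback = masked_weighted_residuals(res, [np.nan, np.nan, np.nan])
```

            This keeps the same fallback behaviour but makes the failing condition visible, which may help when tracking down where the NaNs come from.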

            (4) Many of the single-amp failures were caused by the code rejecting too many points in the iterative outlier rejection routine. I don't think this routine makes sense as written: once it has rejected a point, that point can never be recovered. What happens is illustrated in the following sketch. The first iteration, in red, rejects Pt 6 (correctly), but also rejects Pt 4 and Pt 5. The second iteration is in green, but Pt 4 and Pt 5, which are good fits, have already been rejected and can't be recovered. So I propose changing it so that points can be recovered if they are good fits in subsequent iterations. We need to discuss this.
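            A minimal sketch of the idea in (4), not the cp_pipe routine. A straight-line np.polyfit stands in for the covariance fit, and the function name, the parameters, and the toy data are all hypothetical. The point is that the mask is rebuilt from the residuals of all points on every iteration, so a point rejected earlier can come back.

```python
import numpy as np


def fit_with_recoverable_rejection(x, y, degree=1, n_sigma=3.0,
                                   max_iter=10, initial_good=None):
    """Iterative polynomial fit whose good-point mask is rebuilt each iteration."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if initial_good is None:
        good = np.ones(x.size, dtype=bool)
    else:
        good = np.asarray(initial_good, dtype=bool)
    for _ in range(max_iter):
        coeffs = np.polyfit(x[good], y[good], degree)
        resid = y - np.polyval(coeffs, x)   # residuals of ALL points
        sigma = np.std(resid[good])         # scatter of the points used in the fit
        if sigma == 0:
            break
        # Rebuild the mask from scratch: rejected points may be recovered.
        new_good = np.abs(resid) < n_sigma * sigma
        if new_good.sum() <= degree or np.array_equal(new_good, good):
            break
        good = new_good
    return coeffs, good


# Toy data: a line with small alternating scatter and one outlier at the end.
x = np.arange(20)
y = 2.0 * x + 1.0 + 0.5 * (-1.0) ** x
y[19] += 100.0

# Start with points 17 and 18 wrongly rejected to show that they come back.
start = np.ones(20, dtype=bool)
start[[17, 18]] = False
coeffs, good = fit_with_recoverable_rejection(x, y, initial_good=start)
```

            In this toy case the outlier at index 19 ends up rejected while points 17 and 18 are restored, which is the behaviour proposed above.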

            cslage Craig Lage added a comment -

            Thinking more about this, we need to understand why we have flat pairs with zero or a small number of good pixels. I don't think any of the amplifiers are so defective as to justify this, so it seems that something must have gone wrong in the defect-finding routine. I'm going to try to dig into this today.


              People

              Assignee:
              plazas Andrés Alejandro Plazas Malagón
              Reporter:
              plazas Andrés Alejandro Plazas Malagón
              Reviewers:
              Christopher Waters
              Watchers:
              Andrés Alejandro Plazas Malagón, Christopher Waters, Craig Lage
