DM-15117

processCcd passed in w_2018_26 but failed in w_2018_27 for HSC visit=36234 ccd=24

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      1
    • Sprint:
      DRP F18-3
    • Team:
      Data Release Production

      Description

      To reproduce, run:

      processCcd.py /datasets/hsc/repo/  --rerun private/user/name --id visit=36234 ccd=24
      

      It finished successfully using w_2018_26, but failed using w_2018_27 or w_2018_28 with the following:

      processCcd FATAL: Failed on dataId={'visit': 36234, 'ccd': 24, 'field': 'SSP_WIDE', 'dateObs': '2015-07-21', 'pointing': 1297, 'filter': 'HSC-I', 'taiObs': '2015-07-21', 'expTime': 200.0}: RuntimeError: Unable to measure aperture correction for required algorithm 'modelfit_CModel_exp': only 1 sources, but require at least 2.
      Traceback (most recent call last):
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/16.0-2-g852da13+6/python/lsst/pipe/base/cmdLineTask.py", line 392, in __call__
          result = task.run(dataRef, **kwargs)
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/16.0-2-g852da13+6/python/lsst/pipe/base/timer.py", line 150, in wrapper
          res = func(self, *args, **keyArgs)
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0-4-g08dccf71+3/python/lsst/pipe/tasks/processCcd.py", line 188, in run
          doUnpersist=False,
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/16.0-2-g852da13+6/python/lsst/pipe/base/timer.py", line 150, in wrapper
          res = func(self, *args, **keyArgs)
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0-4-g08dccf71+3/python/lsst/pipe/tasks/characterizeImage.py", line 349, in run
          background=background,
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/16.0-2-g852da13+6/python/lsst/pipe/base/timer.py", line 150, in wrapper
          res = func(self, *args, **keyArgs)
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_tasks/16.0-4-g08dccf71+3/python/lsst/pipe/tasks/characterizeImage.py", line 428, in characterize
          apCorrMap = self.measureApCorr.run(exposure=dmeRes.exposure, catalog=dmeRes.sourceCat).apCorrMap
        File "/ssd/lsstsw/stack3_20171021/stack/miniconda3-4.3.21-10a4fa6/Linux64/meas_algorithms/16.0-6-g2dd73041+3/python/lsst/meas/algorithms/measureApCorr.py", line 245, in run
          (name, len(subset2), self.config.minDegreesOfFreedom+1))
      RuntimeError: Unable to measure aperture correction for required algorithm 'modelfit_CModel_exp': only 1 sources, but require at least 2.
      

            Activity

            hchiang2 Hsin-Fang Chiang added a comment -

            On Slack, Jim Bosch mentioned that DM-14172 made changes to CModel, but we then realized DM-14172 was already in w_2018_26. Jim also noted that "DM-15023 might have done something, but should have only have made changes within round-off error (so could only be in play if this image was already very marginal)."

            swinbank John Swinbank added a comment -

            Jim Bosch, Yusra AlSayyad — do you want to schedule somebody to take a look into this?

            jbosch Jim Bosch added a comment -

            Let's find a pair of victims during pair coding this Thursday.

            dtaranu Dan Taranu added a comment -

            I ran processCcd.py as above with both w_2018_26 and w_2018_27, setting a pdb trace in measureApCorr.py after apCorrMap = ApCorrMap() but before the per-algorithm aperture correction loop, then a conditional breakpoint inside the loop for the failing algorithm: break 236, name == "modelfit_CModel_exp". As Jim suspected, these measurements were already marginal: all but two of the sources ([3, 9]) have NaN flux. The only difference is that in w_2018_27 source 9 has modelfit_CModel_exp_flag_maxIter: 1, set after modelfit_CModel_exp_nIter reached 267 (vs. nIter: 38 in w_2018_26). That leaves only one good source and triggers the error above.
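
To illustrate why a single flag flip is enough to trip the error, here is a minimal, self-contained sketch of the kind of check involved. It is not the measureApCorr.py code: the dictionary-based catalog, the helper names, the catalog size, and the flux values are all hypothetical, and the assumption that a source is rejected when its flux is NaN or its maxIter flag is set is a simplification of what is described above. The threshold of 2 follows from the error message (minDegreesOfFreedom + 1).

```python
import math

NAME = "modelfit_CModel_exp"
MIN_DEGREES_OF_FREEDOM = 1  # inferred from "require at least 2" in the error message


def usable_sources(sources, name):
    """Keep sources with a finite flux and no maxIter flag (simplified selection)."""
    return [
        s for s in sources
        if math.isfinite(s[name + "_flux"]) and not s[name + "_flag_maxIter"]
    ]


def check_ap_corr_inputs(sources, name, min_dof=MIN_DEGREES_OF_FREEDOM):
    """Raise if too few usable sources remain for the named algorithm."""
    good = usable_sources(sources, name)
    if len(good) < min_dof + 1:
        raise RuntimeError(
            "Unable to measure aperture correction for required algorithm "
            "'%s': only %d sources, but require at least %d."
            % (name, len(good), min_dof + 1)
        )
    return good


def make_catalog(source9_hit_max_iter):
    """Build a toy catalog: only sources 3 and 9 have a finite flux.

    The catalog size (12) and the flux value are made up for illustration.
    """
    catalog = []
    for source_id in range(12):
        finite = source_id in (3, 9)
        catalog.append({
            "id": source_id,
            NAME + "_flux": 1000.0 if finite else float("nan"),
            NAME + "_flag_maxIter": source_id == 9 and source9_hit_max_iter,
        })
    return catalog


if __name__ == "__main__":
    # w_2018_26-like case: two usable sources, so the check passes.
    print([s["id"] for s in check_ap_corr_inputs(make_catalog(False), NAME)])
    # w_2018_27-like case: source 9 is flagged, one usable source, so it raises.
    try:
        check_ap_corr_inputs(make_catalog(True), NAME)
    except RuntimeError as exc:
        print(exc)
```

With only two candidate sources to begin with, any change that pushes one of them over the iteration limit leaves the check with nothing to spare.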

            yusra Yusra AlSayyad added a comment -

            Do you know which ticket triggered it? https://lsst-web.ncsa.illinois.edu/~swinbank/changelog_weekly.html

            Hide
            dtaranu Dan Taranu added a comment - edited

            edit: DM-15023 and DM-14305 are the likely culprits, as there are no differences in the output between w_2018_26 and w_2018_27 up to that point (to the precision in the logs).

            Here's an image of the field. It has two bright stars, but it doesn't look bad enough that CModel should have failed on all but two sources. That's a separate issue, though. For what it's worth, the actual measurements of the source with its flag set in 27 but not 26 (#9) changed by < 0.1%.
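
As a rough sketch of that comparison, the snippet below diffs one source's fields between two runs and reports only those whose relative change exceeds a tolerance. The function name, the plain-dict records, and the flux values are hypothetical (the flux pair is simply chosen to differ by under 0.1%); the nIter and flag values are the ones quoted above for source 9.

```python
def changed_fields(before, after, rel_tol=1e-3):
    """Return {field: (old, new)} for fields whose relative change exceeds rel_tol."""
    changed = {}
    for key in sorted(before.keys() & after.keys()):
        old, new = before[key], after[key]
        scale = max(abs(old), abs(new))
        # Skip fields that are zero in both runs; otherwise compare relative change.
        if scale > 0 and abs(new - old) / scale > rel_tol:
            changed[key] = (old, new)
    return changed


# Source 9 in each weekly. Flux values are invented for illustration.
w26 = {"flux": 1000.0, "nIter": 38, "flag_maxIter": 0}
w27 = {"flux": 1000.5, "nIter": 267, "flag_maxIter": 1}

if __name__ == "__main__":
    print(changed_fields(w26, w27))
```

In this toy case the flux falls inside the 0.1% tolerance and is not reported, while nIter and the maxIter flag are, which mirrors the situation described: a near-identical measurement that nonetheless crossed the iteration limit.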


              People

              Assignee:
              dtaranu Dan Taranu
              Reporter:
              hchiang2 Hsin-Fang Chiang
              Watchers:
              Dan Taranu, Hsin-Fang Chiang, Jim Bosch, John Swinbank, Lauren MacArthur, Yusra AlSayyad
              Votes:
              2
