  1. Data Management
  2. DM-20024

BackgroundList.readFits doesn't close fits files

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Story Points: 3
    • Sprint: DRP S19-6b
    • Team: Data Release Production

      Description

      Both Hiroyuki Ikeda and I have encountered some hard-to-reproduce errors in the background application stage of coaddDriver.py, which look like:

      [31] Traceback (most recent call last):
      [31]   File "/ana/products.7.4/stack/miniconda3-4.5.12-1172c30/Linux64/ctrl_pool/7.0-hsc/python/lsst/ctrl/pool/parallel.py", line 509, in logOperation
      [31]     yield
      [31]   File "/ana/products.7.4/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_drivers/7.4-hsc/python/lsst/pipe/drivers/coaddDriver.py", line 262, in warp
      [31]     self.makeCoaddTempExp.runDataRef(patchRef, selectDataList)
      [31]   File "/ana/products.7.4/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_base/7.0-hsc/python/lsst/pipe/base/timer.py", line 150, in wrapper
      [31]     res = func(self, *args, **keyArgs)
      [31]   File "/ana/products.7.4/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/7.0-hsc/python/lsst/pipe/tasks/makeCoaddTempExp.py", line 345, in runDataRef
      [31]     self.applySkyCorr(calExpRef, calExp)
      [31]   File "/ana/products.7.4/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/7.0-hsc/python/lsst/pipe/tasks/makeCoaddTempExp.py", line 555, in applySkyCorr
      [31]     calexp -= bg.getImage()
      [31] TypeError: __isub__(): incompatible function arguments. The following argument types are supported:
      [31]     1. (self: lsst.afw.image.maskedImage.maskedImage.MaskedImageF, arg0: float) -> lsst.afw.image.maskedImage.maskedImage.MaskedImageF
      [31]     2. (self: lsst.afw.image.maskedImage.maskedImage.MaskedImageF, arg0: lsst.afw.image.maskedImage.maskedImage.MaskedImageF) -> lsst.afw.image.maskedImage.maskedImage.MaskedImageF
      [31]     3. (self: lsst.afw.image.maskedImage.maskedImage.MaskedImageF, arg0: lsst.afw.image.image.image.ImageF) -> lsst.afw.image.maskedImage.maskedImage.MaskedImageF
      [31]     4. (self: lsst.afw.image.maskedImage.maskedImage.MaskedImageF, arg0: lsst::afw::math::Function2<double>) -> lsst.afw.image.maskedImage.maskedImage.MaskedImageF
      [31]
      

      When I isolated the patch that failed and reran it, it would then infuriatingly succeed. So at first I thought these were transient GPFS errors, but they only appear when reading backgrounds.

      Jim Bosch pointed me to the line that eats the FitsError: https://github.com/lsst/afw/blob/master/python/lsst/afw/math/backgroundList.py#L185
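To illustrate why a swallowed error is so hard to diagnose, here is a minimal sketch of the general pattern: a read loop that treats every reader error as "end of file". It is not the afw code. All names (`ToyFitsError`, `ToyEndOfHdus`, `read_hdu`, and the two `read_all_*` functions) are hypothetical, and whether the real library distinguishes an end-of-HDUs error from other errors is an assumption made only for this example.

```python
class ToyFitsError(Exception):
    """Stand-in for any error raised by a FITS reader (hypothetical)."""


class ToyEndOfHdus(ToyFitsError):
    """Stand-in for the expected 'no more HDUs' condition (hypothetical)."""


def read_hdu(hdus, index, broken=False):
    """Return hdus[index], or raise the way a FITS reader might."""
    if broken:
        # Simulates an unrelated failure such as "too many open files".
        raise ToyFitsError("attempt to open too many files")
    if index >= len(hdus):
        raise ToyEndOfHdus(f"no HDU {index}")
    return hdus[index]


def read_all_swallowing(hdus, broken=False):
    """Treat every reader error as end-of-file (the masking pattern)."""
    result = []
    while True:
        try:
            result.append(read_hdu(hdus, len(result), broken))
        except ToyFitsError:
            break
    return result


def read_all_strict(hdus, broken=False):
    """Stop only on the expected end-of-file error; let others propagate."""
    result = []
    while True:
        try:
            result.append(read_hdu(hdus, len(result), broken))
        except ToyEndOfHdus:
            break
    return result
```

With `broken=True`, the swallowing reader returns an empty list and reports nothing, so the failure only surfaces later in whatever code consumes the result. The strict reader raises `ToyFitsError` at the point of failure.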

      Setting up a loop to read background files and re-raising the FitsError eventually yielded:

      > /home/yusra/lsst_devel/LSST/DMS/afw/python/lsst/afw/math/backgroundList.py(191)readFits()
      -> break
      (Pdb) e
      FitsError('cfitsio error: attempt to open too many files (103) : Opening file '/datasets/hsc/repo/rerun/DM-13666/WIDE/01052/HSC-G/corr/BKGD-0011602-073.fits' with mode 'r'
      cfitsio error stack:
        failed to find or open the following file: (ffopen)
        /datasets/hsc/repo/rerun/DM-13666/WIDE/01052/HSC-G/corr/BKGD-0011602-073.fits
      ')
      

      Bingo.
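The intermittent nature of the failure can be modeled with a toy library that keeps a fixed-size table of open files. This is only a sketch under stated assumptions: `ToyFitsLibrary`, `leaky_read`, `first_failure`, and the limit of 4 are hypothetical and do not reflect the cfitsio or afw APIs or their actual limits.

```python
class ToyFitsError(Exception):
    """Stand-in for a FITS-library error (hypothetical)."""


class ToyFitsLibrary:
    """Toy library with a fixed-size table of open files (hypothetical)."""

    def __init__(self, max_open):
        self.max_open = max_open
        self.open_handles = set()
        self._next_handle = 0

    def open(self, path):
        if len(self.open_handles) >= self.max_open:
            raise ToyFitsError(f"attempt to open too many files: {path}")
        handle = self._next_handle
        self._next_handle += 1
        self.open_handles.add(handle)
        return handle

    def close(self, handle):
        self.open_handles.discard(handle)


def leaky_read(lib, path):
    """Open a file, pretend to read it, and never close it."""
    lib.open(path)
    return f"background from {path}"


def first_failure(lib, paths):
    """Read paths in order; return the index of the first failing read, or None."""
    for index, path in enumerate(paths):
        try:
            leaky_read(lib, path)
        except ToyFitsError:
            return index
    return None


paths = [f"BKGD-{i}.fits" for i in range(10)]

# One long-lived process: leaked handles accumulate until the table is full.
long_running = ToyFitsLibrary(max_open=4)
failed_at = first_failure(long_running, paths)

# A fresh process starts with an empty table, so the same file reads fine.
fresh = ToyFitsLibrary(max_open=4)
rerun_result = first_failure(fresh, [paths[failed_at]])
```

In this model the read fails only after enough earlier reads have leaked handles in the same process, which is consistent with a failed patch succeeding when rerun in isolation.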

      BackgroundList.readFits needs to close its FITS files after reading them and constructing the BackgroundList.
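One common way to guarantee that a file is closed, even when reading raises, is a context manager. The sketch below uses a toy handle table rather than the real library. `ToyFitsLibrary`, `opened`, and `read_background_list` are hypothetical names, and this is not necessarily how the fix was implemented in afw.

```python
from contextlib import contextmanager


class ToyFitsError(Exception):
    """Stand-in for a FITS-library error (hypothetical)."""


class ToyFitsLibrary:
    """Toy library with a fixed-size table of open files (hypothetical)."""

    def __init__(self, max_open):
        self.max_open = max_open
        self.open_handles = set()
        self._next_handle = 0

    def open(self, path):
        if len(self.open_handles) >= self.max_open:
            raise ToyFitsError(f"attempt to open too many files: {path}")
        handle = self._next_handle
        self._next_handle += 1
        self.open_handles.add(handle)
        return handle

    def close(self, handle):
        self.open_handles.discard(handle)


@contextmanager
def opened(lib, path):
    """Open path and always close it on exit, whether or not the body raises."""
    handle = lib.open(path)
    try:
        yield handle
    finally:
        lib.close(handle)


def read_background_list(lib, path, fail=False):
    """Build a toy 'background list' while the file is open, then release it."""
    with opened(lib, path):
        if fail:
            raise ToyFitsError(f"simulated read failure: {path}")
        return [f"background {i} from {path}" for i in range(2)]


lib = ToyFitsLibrary(max_open=2)
results = [read_background_list(lib, f"BKGD-{i}.fits") for i in range(100)]
```

Because each handle is released before the next file is opened, the loop can read far more files than the table holds, and a failed read does not leak its handle either.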

      (SPs include not only the time to fix but also the time spent scratching my head today and during the deblender sprint)

              People

              • Assignee: Yusra AlSayyad
              • Reporter: Yusra AlSayyad
              • Reviewers: Yusra AlSayyad
              • Watchers: Jim Bosch, Yusra AlSayyad