Data Management / DM-25210

Fix psfex regression in w18

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: psfex
    • Labels: None
    • Team: Architecture
    • Urgent?: No

      Description

      DRP metrics regressions in w19 turned out to be the result of multiple issues. After fixing the overscan issue, we noticed that the metrics still did not return to their w14 baseline.

      The lingering problem was isolated to an interaction between psfex and the new conda env. I don't understand exactly why, but stepping through pdb in both w18 and w17 showed that, despite giving psfex the same inputs, we were getting different answers here:

      > /software/lsstsw/stack_20200504/stack/miniconda3-4.7.12-2deae7a/Linux64/meas_extensions_psfex/19.0.0-2-gd82b0d5+3/python/lsst/meas/extensions/psfex/psfexPsfDeterminer.py(385)determinePsf()
      -> psfex.makeit(fields, sets)
      

      Inspecting fields in w18 also yielded segfaults.

      *To reproduce:*

        singleFrameDriver.py /datasets/hsc/repo --calib /datasets/hsc/repo/CALIB/ --rerun private/yusra/RC2/w22reg/hybrid --id visit=1208 ccd=89 --cores 1 --config processCcd.isr.doWrite=True
      

      As a success marker, iter 1 should log: singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 65/70 stars.

      [yusra@lsst-dev01 w22reg]$ diff w17_ccd89.log  w18_ccd89.log  | grep PSF
      < singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 65/70 stars.
      < singleFrameDriver.processCcd.charImage INFO: iter 1; PSF sigma=1.23, dimensions=(41, 41); median background=2387.68
      > singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 45/70 stars.
      > singleFrameDriver.processCcd.charImage INFO: iter 1; PSF sigma=1.11, dimensions=(41, 41); median background=2387.68
      < singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 64/74 stars.
      < singleFrameDriver.processCcd.charImage INFO: iter 2; PSF sigma=1.23, dimensions=(41, 41); median background=2387.75
      > singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 50/74 stars.
      > singleFrameDriver.processCcd.charImage INFO: iter 2; PSF sigma=1.11, dimensions=(41, 41); median background=2387.82
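The comparison in the diff above can also be done programmatically. The following is only a minimal sketch: the function names (parse_psf_lines, compare) and the regular expressions are illustrative assumptions based on the log lines quoted in this ticket, not part of the LSST stack. It extracts the star counts and PSF sigma per iteration and reports where two runs disagree.

```python
import re

# Matches lines such as:
#   "...measurePsf INFO: PSF determination using 65/70 stars."
STARS_RE = re.compile(r"PSF determination using (\d+)/(\d+) stars")
# Matches lines such as:
#   "...charImage INFO: iter 1; PSF sigma=1.23, dimensions=(41, 41); ..."
SIGMA_RE = re.compile(r"iter (\d+); PSF sigma=([\d.]+)")


def parse_psf_lines(lines):
    """Return {iteration: (stars_used, stars_total, sigma)} from log lines.

    Assumes each "PSF determination" line is followed by the "iter N" line
    reporting the sigma for that same iteration, as in the logs quoted above.
    """
    results = {}
    pending = None  # star counts waiting for their "iter N" line
    for line in lines:
        m = STARS_RE.search(line)
        if m:
            pending = (int(m.group(1)), int(m.group(2)))
            continue
        m = SIGMA_RE.search(line)
        if m and pending is not None:
            results[int(m.group(1))] = (*pending, float(m.group(2)))
            pending = None
    return results


def compare(good, bad):
    """Yield (iteration, good_entry, bad_entry) wherever the two runs differ."""
    for it in sorted(set(good) & set(bad)):
        if good[it] != bad[it]:
            yield it, good[it], bad[it]


# Sample lines copied from the diff in this ticket.
W17 = [
    "singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 65/70 stars.",
    "singleFrameDriver.processCcd.charImage INFO: iter 1; PSF sigma=1.23, dimensions=(41, 41); median background=2387.68",
    "singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 64/74 stars.",
    "singleFrameDriver.processCcd.charImage INFO: iter 2; PSF sigma=1.23, dimensions=(41, 41); median background=2387.75",
]
W18 = [
    "singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 45/70 stars.",
    "singleFrameDriver.processCcd.charImage INFO: iter 1; PSF sigma=1.11, dimensions=(41, 41); median background=2387.68",
    "singleFrameDriver.processCcd.charImage.measurePsf INFO: PSF determination using 50/74 stars.",
    "singleFrameDriver.processCcd.charImage INFO: iter 2; PSF sigma=1.11, dimensions=(41, 41); median background=2387.82",
]

for it, g, b in compare(parse_psf_lines(W17), parse_psf_lines(W18)):
    print(f"iter {it}: w17 {g[0]}/{g[1]} stars sigma={g[2]}; "
          f"w18 {b[0]}/{b[1]} stars sigma={b[2]}")
```

To use it on real logs, pass the opened log files instead of the sample lists, for example parse_psf_lines(open("w17_ccd89.log")).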
      

      For more info see team debugging June 2 2020 here: https://lsstc.slack.com/archives/C4JQP6FRS/p1591113943137300

            Activity

            yusra Yusra AlSayyad created issue -
            ktl Kian-Tat Lim made changes -
            Status: To Do [ 10001 ] → In Progress [ 3 ]
            ktl Kian-Tat Lim added a comment -

            I have changed the psfex build to allow it to auto-discover the conda fftw and gsl (via autoconf and sconsUtils), removing them from the table file. Unlike the psfex builds from w_18 through w_22, this does not try to use the new autoconf files from more recent versions of psfex in GitHub.

            It seems to cure the numerical differences. It also builds under Jenkins https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/32023/pipeline
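Since the fix changes how the build finds fftw and gsl, one way to see which copies a given build picked up is to inspect the dynamic links of the compiled psfex library. This is a rough sketch under stated assumptions: it is Linux-only and relies on the standard ldd tool. The helper names (linked_libs, ldd) are hypothetical, and the ticket does not say that this check was part of the actual debugging.

```python
import re
import subprocess

# An ldd line looks like:
#   "\tlibfftw3.so.3 => /some/prefix/lib/libfftw3.so.3 (0x00007f...)"
# or, for an unresolved library:
#   "\tlibgsl.so.25 => not found"
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def linked_libs(ldd_output, names=("fftw", "gsl")):
    """Return {soname: resolved_path} for libraries whose soname contains
    any of the substrings in `names`."""
    found = {}
    for line in ldd_output.splitlines():
        m = LDD_LINE.match(line)
        if m and any(n in m.group(1) for n in names):
            found[m.group(1)] = m.group(2)
    return found


def ldd(path):
    """Run `ldd` on a shared object and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling linked_libs(ldd(path)) with the path to the built psfex shared library (the path is not given in this ticket) lists the fftw and gsl libraries it resolves to, which shows whether they come from the conda environment or from elsewhere.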

            ktl Kian-Tat Lim added a comment -

            I haven't figured out why the newer config files don't work, but this seems to work OK.

            ktl Kian-Tat Lim made changes -
            Reviewers: Tim Jenness [ tjenness ]
            Status: In Progress [ 3 ] → In Review [ 10004 ]
            ktl Kian-Tat Lim added a comment -

            This should block the release candidate.

            ktl Kian-Tat Lim made changes -
            Link This issue blocks DM-24477 [ DM-24477 ]
            tjenness Tim Jenness added a comment -

            Looks okay (I assume it actually works).

            tjenness Tim Jenness made changes -
            Status: In Review [ 10004 ] → Reviewed [ 10101 ]
            ktl Kian-Tat Lim made changes -
            Resolution: Done [ 10000 ]
            Status: Reviewed [ 10101 ] → Done [ 10002 ]
            ktl Kian-Tat Lim made changes -
            Link This issue blocks DM-20564 [ DM-20564 ]
            ktl Kian-Tat Lim made changes -
            Link This issue blocks DM-24477 [ DM-24477 ]
            swinbank John Swinbank made changes -
            Component/s psfex [ 12876 ]

              People

              Assignee:
              ktl Kian-Tat Lim
              Reporter:
              yusra Yusra AlSayyad
              Reviewers:
              Tim Jenness
              Watchers:
              John Swinbank, Kian-Tat Lim, Tim Jenness, Yusra AlSayyad