Data Management / DM-23630

fgcmcal failure Ubuntu



    Description

      fgcmcal fails on master with the latest conda environment on Ubuntu 18.04.4. The failure is in one of the new tests, which compares `photoCalib.getCalibrationErr()` with `fgcmZptGrayErr`:

      tests/test_fgcmcal_hsc.py:154: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      tests/fgcmcalTestBase.py:407: in _testFgcmOutputProducts
          (np.log(10.0)/2.5)*testCal.getCalibrationMean()*fgcmZptGrayErr)
      ../../stack/Linux64/utils/19.0.0-7-g686a884+2/python/lsst/utils/tests.py:656: in assertFloatsAlmostEqual
          testCase.assertFalse(failed, msg="\n".join(errMsg))
      E   AssertionError: True is not false : 1/1 elements differ with rtol=2.220446049250313e-16, atol=2.220446049250313e-16
      E   0.00223844323512219 != 0.0022384434 (diff=2.3283064e-10/0.0022384434=1.0401453e-07)
      

      We are not seeing this on other machines (CentOS, macOS), so there's probably a "fun" interaction with system libraries going on here. Passing rtol=2e-7, atol=None to the assertion above makes the test pass, but that may not be the actual solution we want. A relative difference of about 1e-7 suggests to me that there's a float32 calculation going on under the hood somewhere.
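
A minimal sketch of why a float32 step would produce a mismatch of this size. It uses only NumPy, with `np.isclose` standing in for `assertFloatsAlmostEqual`, and it assumes (this is not confirmed by the traceback) that the value is narrowed to float32 once and then widened again. The variable names are illustrative.

```python
import numpy as np

# Value taken from the failing assertion in the report.
value64 = np.float64(0.00223844323512219)

# Assumption: somewhere along the way the value is narrowed to float32.
value32 = np.float32(value64)
roundtrip = np.float64(value32)

# Size of the error introduced by the narrowing.
abs_diff = abs(roundtrip - value64)
rel_diff = abs_diff / abs(value64)

eps64 = np.finfo(np.float64).eps  # the tolerance shown in the failure message
eps32 = np.finfo(np.float32).eps  # roughly 1.19e-7

# Comparison at float64 machine precision, as in the failing test.
strict_ok = bool(np.isclose(roundtrip, value64, rtol=eps64, atol=eps64))

# Comparison with the loosened relative tolerance from the report.
# np.isclose needs a number for atol, so 0.0 stands in for atol=None.
loose_ok = bool(np.isclose(roundtrip, value64, rtol=2e-7, atol=0.0))

print(f"relative difference: {rel_diff:.3e}")
print(f"passes at float64 eps: {strict_ok}, passes at rtol=2e-7: {loose_ok}")
```

The relative error from a single float32 rounding stays below the float32 machine epsilon, which is why rtol=2e-7 hides the symptom without addressing where the precision is lost.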

      I can provide an Ubuntu account to test on.

          Activity

            Parejkoj John Parejko created issue -
            erykoff Eli Rykoff made changes -
            Field Original Value New Value
            Status To Do [ 10001 ] In Progress [ 3 ]
            erykoff Eli Rykoff added a comment -

            I've updated the persisted zeropoint table to use float64 for all columns, which appears to fix the problem. (These values were stored internally as float64 but, for some reason, were not persisted with the butler as float64.)
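
The effect of the column type on a persisted value can be sketched as follows. This is not the butler or the actual zeropoint table schema: a NumPy structured array saved with `np.save` stands in for the persisted table, and the `roundtrip` helper and the use of `fgcmZptGrayErr` as a column name are illustrative assumptions.

```python
import io

import numpy as np

# Value taken from the failing assertion in the report.
value = 0.00223844323512219


def roundtrip(column_dtype):
    """Write a one-row table with the given column dtype and read it back."""
    # Hypothetical one-column schema; the real table has a different layout.
    table = np.zeros(1, dtype=[("fgcmZptGrayErr", column_dtype)])
    table["fgcmZptGrayErr"][0] = value

    # An in-memory buffer stands in for the persisted file.
    buffer = io.BytesIO()
    np.save(buffer, table)
    buffer.seek(0)

    return float(np.load(buffer)["fgcmZptGrayErr"][0])


lossy = roundtrip("f4")  # float32 column: value is rounded on write
exact = roundtrip("f8")  # float64 column: value survives unchanged

print(f"float32 column: {lossy!r}")
print(f"float64 column: {exact!r}")
```

With a float64 column the value read back is bit-for-bit identical to the in-memory one, so a comparison at float64 precision can succeed.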

            erykoff Eli Rykoff made changes -
            Reviewers John Parejko [ parejkoj ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            Parejkoj John Parejko added a comment -

            Thanks for the quick fix. The tests passed on my Ubuntu 18.04 machine.

            Parejkoj John Parejko made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            erykoff Eli Rykoff added a comment -

            And Jenkins passed in 23 minutes. I guess this is what happens when there haven't been any merges to master since the last weekly? https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/31283/pipeline

            erykoff Eli Rykoff made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]
            Parejkoj John Parejko made changes -
            Link This issue blocks DM-22655 [ DM-22655 ]

            People

              erykoff Eli Rykoff
              Parejkoj John Parejko
              John Parejko
              Eli Rykoff, John Parejko, John Swinbank, Krzysztof Findeisen