Data Management / DM-27883

obs_lsst has a race condition between tests and curated calibration ingestion

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: obs_lsst
    • Labels:
      None
    • Story Points:
      1
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      pytest, as run by the tests SConscript, attempts to scan all files, but that set can be in flux because the per-camera ingestCuratedCalibs.py executions may be running at the same time. There are no dependencies declared between the two.

      I'm not sure if the correct solution is to sequence one before the other or to exclude those directories from pytest.

          Activity

          rhl Robert Lupton added a comment -

          Christopher Waters, are these your tests? Thoughts?

           

          czw Christopher Waters added a comment -

          I don't believe so. My only thought is that if new calibrations were added, that might have lengthened the execution time enough for this to start being a problem, but I don't recall anything new being added in the past month.

          ktl Kian-Tat Lim added a comment - edited

          ingestCuratedCalibs.py is part of the build (SConscripts in the camera directories), not the tests.

          tjenness Tim Jenness added a comment -

          This comment:

          # Note the ordering here is critical. LATISS is put at the end here to ensure
          # that the tests are run first and version.py is created, because creation of
          # of the defect registry required the camera to be instantiated.
          # If other cameras add defect generation they should add their build to
          # the end of this list, along with LATISS
          

          in the SConstruct file implies that these targets explicitly run after the tests (they also hard-code a dependency on the python target), which suggests that the answer is to change the scanning code. Can someone point to an actual error report from this problem? Was there a Jenkins failure I can look at?

          ktl Kian-Tat Lim added a comment -

          https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix-test/detail/stack-os-matrix-test/15/tests is an example of a failure.

          My understanding from a quick glance at sconsUtils (https://github.com/lsst/sconsUtils/blob/master/python/lsst/sconsUtils/scripts.py#L208-L209) is that there is in fact no implied dependency ordering from the target list alone; any dependencies must be added separately.
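
To illustrate the point about ordering, here is a minimal sketch in plain Python. It is not SCons or sconsUtils code; it is a toy model built on the standard-library `graphlib` module, and the target names `tests` and `ingest_calibs` are hypothetical. It shows that a parallel scheduler may start two targets together unless a dependency edge between them is declared explicitly.

```python
from graphlib import TopologicalSorter


def first_ready(graph):
    """Return the set of targets a parallel scheduler could start immediately.

    ``graph`` maps each target name to the set of targets it depends on.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    return set(sorter.get_ready())


# Toy model only: both targets are listed, but no edge connects them,
# so nothing prevents them from running at the same time.
unordered = {"tests": set(), "ingest_calibs": set()}

# Same targets with an explicitly declared dependency:
# "ingest_calibs" may not start until "tests" has finished.
ordered = {"tests": set(), "ingest_calibs": {"tests"}}

ready_without_edge = first_ready(unordered)
ready_with_edge = first_ready(ordered)
```

In the first graph both targets are ready at once, which is the situation that allows a race. In the second, only the prerequisite is ready at the start.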

          tjenness Tim Jenness added a comment -

          In the end I added the directories to the ignore list when running pytest.

          There is also an extra, unrelated fix to header handling that I noticed whilst testing this, but I can remove that if needed. It's a gen2-specific problem: the curated calibs ingest calls the normal raw data ingest, which now calls fix_header, and that causes an extra log message because curated calibrations say they are LATISS but they aren't raw LATISS data, so the header fixup gets confused.
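
As a rough sketch of what excluding directories from pytest collection can look like (this is not the actual change made for this ticket), the helper below builds one `--ignore=<path>` option per directory. `--ignore` is a standard pytest command-line option; the function name `pytest_ignore_args` and the directory names are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath


def pytest_ignore_args(root, ignored_dirs):
    """Build one pytest ``--ignore=<path>`` option per directory under ``root``."""
    base = PurePosixPath(root)
    return [f"--ignore={base / name}" for name in ignored_dirs]


# Hypothetical directory names, for illustration only; the real
# directories written by the curated calibration ingest may differ.
args = pytest_ignore_args(".", ["cameraA/CALIB", "cameraB/CALIB"])
```

The resulting list can be appended to the pytest command line, so directories whose contents change during the build are never scanned.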

          ktl Kian-Tat Lim added a comment -

          Looks fine. Thanks for dealing with this.


            People

            Assignee:
            tjenness Tim Jenness
            Reporter:
            ktl Kian-Tat Lim
            Reviewers:
            Kian-Tat Lim
            Watchers:
            Christopher Waters, Kian-Tat Lim, Robert Lupton, Tim Jenness
