  Data Management / DM-2473

Tests in meas_astrom trigger race conditions in EUPS

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: eups, meas_algorithms
    • Labels: None
    • Team: Data Release Production

      Description

      I tried to build the stack from scratch, and got a failure in one of the meas_astrom tests (attached). It's similar to the DM-2303 failures, where Eups aborts because the cache pickle file is invalid.

      This happens because:
      a) tests in meas_astrom are run concurrently (good)
      b) at least 11 of them instantiate EUPS (ok)
      c) each time an instance of the Eups class is created, it checks whether its Products cache is valid; if it isn't, it rebuilds the cache and writes it out to disk for later reading. Because the write is non-atomic, this creates a race condition: if another instance of Eups tries to read the cache file before the write has finished, it will see the file as corrupted. And that is what happens here.
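
To illustrate point (c), here is a minimal sketch of what a reader sees when it opens a pickle file that a writer has only partly flushed. It does not use EUPS; the function name and the file name `cache.pickle` are made up for the example, and the partial write is simulated by truncating the pickled bytes rather than by running real concurrent processes.

```python
import os
import pickle
import tempfile


def read_while_partially_written(obj, fraction=0.5):
    """Simulate a reader opening the cache file while a writer is mid-write.

    Hypothetical helper for illustration only; not part of EUPS.
    Returns the unpickled object, or the exception raised by pickle.load.
    """
    data = pickle.dumps(obj)
    cut = max(1, int(len(data) * fraction))
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "cache.pickle")
        with open(path, "wb") as f:
            f.write(data[:cut])  # only part of the pickle has reached the file
        try:
            with open(path, "rb") as f:
                return pickle.load(f)
        except Exception as exc:  # unpickling bad data can raise several types
            return exc


if __name__ == "__main__":
    cache = {"product": "meas_astrom", "versions": list(range(100))}
    print(repr(read_while_partially_written(cache)))
```

With `fraction=1.0` the whole pickle is written and the load succeeds; any smaller fraction leaves the reader with a file it cannot unpickle, which is the failure mode described above.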

      While there are known issues with EUPS locking, I think this kind of race should not exist even in the absence of locking (i.e., readers should not have to lock). The fix is to make the write atomic, using the usual write-to-temporary-and-then-move pattern. In pseudocode:

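      import os
      import pickle
      from tempfile import NamedTemporaryFile
      # write to a temporary file in the cache directory, then rename it over the cache file
      fp = NamedTemporaryFile(delete=False, dir=cachedir)
      pickle.dump(cache, fp)
      fp.close()
      os.rename(fp.name, cacheFn)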
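
A slightly fuller sketch of the same write-to-temporary-and-then-move pattern, written as a standalone function. The function name and its parameters are assumptions made for the example, not EUPS code. It uses `tempfile.mkstemp` in place of `NamedTemporaryFile`, and `os.replace` (Python 3.3+) in place of `os.rename` so that an existing target is also overwritten on Windows; on POSIX the two behave the same when source and target are on the same filesystem.

```python
import os
import pickle
import tempfile


def write_cache_atomically(cache, cache_path):
    """Pickle `cache` to `cache_path` so readers never see a partial file.

    Hypothetical helper for illustration only; not part of EUPS.
    """
    cache_dir = os.path.dirname(os.path.abspath(cache_path))
    # The temporary file must live in the same directory (hence the same
    # filesystem) as the target, otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, prefix=".cache-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(cache, fp)
            fp.flush()
            os.fsync(fp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, cache_path)  # atomically swap in the new file
    except BaseException:
        # Do not leave a stray temporary file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A concurrent reader that opens `cache_path` gets either the old complete file or the new complete file, never a half-written one, so no reader-side locking is needed for this case.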

      Also, the caching code should not abort on an invalid cache, but fall back to rebuilding the cache instead (and maybe issue a warning).
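
The fallback could look roughly like the sketch below. The function name, the logger name, and the `rebuild` callable are all hypothetical stand-ins for whatever EUPS uses to regenerate its Products cache; the sketch only shows the control flow of warning and rebuilding instead of aborting.

```python
import logging
import pickle

logger = logging.getLogger("cache_example")  # hypothetical logger name


def load_cache_or_rebuild(cache_path, rebuild):
    """Return the cached object, or the result of rebuild() if it is unusable.

    Hypothetical helper for illustration only; not part of EUPS.
    `rebuild` is a zero-argument callable that recreates the cache contents.
    """
    try:
        with open(cache_path, "rb") as fp:
            return pickle.load(fp)
    except Exception as exc:
        # A missing, truncated, or otherwise corrupt file ends up here;
        # unpickling bad data can raise several different exception types.
        logger.warning("Cache %s is unusable (%r); rebuilding", cache_path, exc)
        return rebuild()
```

The rebuilt cache would then normally be written back to disk with an atomic write, so that later readers find a valid file.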

      Once this is implemented, I think it may be possible to remove the workaround developed for DM-2303.

            People

            • Assignee: Robert Lupton (rhl)
            • Reporter: Mario Juric (mjuric)
            • Reviewers: Robert Lupton
            • Watchers: Jim Bosch, Joshua Hoblitt, Mario Juric, Robert Lupton, Russell Owen