DM-33766 (Data Management)

Photodiode test depends on other tests having run


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: obs_lsst
    • Labels:
      None
    • Story Points:
      2
    • Epic Link:
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      The nightly build failed because of a photodiode test failure. Looking at the code again, it seems that the test assumes other tests have run first. This cannot be relied upon: some test environments randomize the order, and with multi-process testing the test can be allocated to a subprocess that runs none of the other relevant tests.

      The reason it was working at all is that the ingest tests set up one butler repo per test case class for efficiency and only change the output run per test. Since any ingest writes exposure records, those records already exist when the follow-up photodiode test runs.
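
The hidden dependency can be illustrated with a small stand-alone sketch. Nothing here uses the real butler: `FakeRepo`, its methods, and the exposure ID are hypothetical stand-ins, with the `exposures` set playing the role of the exposure records and the repo shared across tests through `setUpClass`.

```python
import unittest


class FakeRepo:
    """Hypothetical stand-in for a butler repo; not the real API."""

    def __init__(self):
        self.exposures = set()

    def ingest_raw(self, exposure_id):
        # Any raw ingest writes an exposure record as a side effect.
        self.exposures.add(exposure_id)

    def ingest_photodiode(self, exposure_id):
        # Photodiode data can only be attached to a known exposure.
        return exposure_id in self.exposures


class SharedRepoTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # One repo for the whole class: state leaks between tests.
        cls.repo = FakeRepo()

    def test_ingest(self):
        self.repo.ingest_raw(310)
        self.assertIn(310, self.repo.exposures)

    def test_photodiode(self):
        # Passes only if test_ingest already ran against the same repo.
        self.assertTrue(self.repo.ingest_photodiode(310))


def run(names):
    """Run the named tests in the given order and report success."""
    suite = unittest.TestSuite(SharedRepoTests(n) for n in names)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

In this sketch `run(["test_ingest", "test_photodiode"])` succeeds, while `run(["test_photodiode"])` fails, which mirrors what selecting a single test with `pytest -k` does.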

      You can reproduce the error using pytest -k testPhotodiode tests/test_ingest.py.

      In particular, the error is raised not by the command-line execution but by the butler get that runs afterwards. This suggests that the command line returns a good exit status even though it should complain that it could not ingest the file.

      Suggestions:

      • Rewrite the test to not inherit from ingest test base but to ingest the relevant raw file explicitly.
      • Add a second test where that raw file is not ingested that should fail because there is no matching exposure record.

      Note that raw-ingest does not fail immediately: it tries to ingest what it can first, collates all the failure reasons, and reports them at the end.
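
A minimal sketch of that collect-then-report pattern follows. It is not the obs_base implementation: `ingest_all` and the `ingest_one` callback are hypothetical names, and only the final error message is taken from the traceback in this ticket.

```python
def ingest_all(files, ingest_one):
    """Try every file, collect failures, and raise once at the end."""
    ingested, failures = [], []
    for path in files:
        try:
            ingest_one(path)
        except Exception as exc:
            # Remember the failure but keep going with the other files.
            failures.append((path, exc))
        else:
            ingested.append(path)
    for path, exc in failures:
        print(f"WARNING: could not ingest {path}: {exc}")
    if failures:
        # A single error at the end gives the caller a failing status.
        raise RuntimeError("Some failures encountered during ingestion")
    return ingested
```

The point of the design is that one bad file does not prevent the others from being ingested, yet the caller still sees an exception once everything has been attempted.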

      I'm sorry I didn't spot this issue in my review.

        Attachments

          Issue Links

            Activity

            czw Christopher Waters added a comment - edited

            https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35946/pipeline
            https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/35959/pipeline

            tjenness Tim Jenness added a comment -

            I will reiterate here what I wrote in this ticket: the test should fail earlier if the file cannot be ingested. It seems wrong that this fails at the butler.get stage and not at the ingest stage. Forcing a file ingest will fix the test, but it won't explain why the test went wrong the way it did. Since exposure records are cached, the only way to demonstrate the failure is to have a new registry. Inheriting from the ingest test base class makes that impossible because it reuses a registry (and also triggers tests that don't need to be repeated). Creating the repo in setUp rather than setUpClass will deal with that issue and allow a second test to run that should fail.
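
The setUp approach might look roughly like the sketch below. `FakeRepo` and its methods are hypothetical stand-ins rather than the butler API, and the sketch assumes the photodiode ingest raises when no exposure matches, which is the behaviour argued for here rather than what the command currently does.

```python
import unittest


class FakeRepo:
    """Hypothetical stand-in for a butler repo; not the real API."""

    def __init__(self):
        self.exposures = set()

    def ingest_raw(self, exposure_id):
        self.exposures.add(exposure_id)

    def ingest_photodiode(self, exposure_id):
        # Assumed behaviour: fail loudly when no exposure record matches.
        if exposure_id not in self.exposures:
            raise LookupError(f"no exposure record for {exposure_id}")


class PhotodiodeTests(unittest.TestCase):
    def setUp(self):
        # Fresh repo per test, so no records leak between tests.
        self.repo = FakeRepo()

    def test_with_raw(self):
        self.repo.ingest_raw(310)  # ingest the matching raw explicitly
        self.repo.ingest_photodiode(310)

    def test_without_raw(self):
        # With a new registry the missing exposure must be reported.
        with self.assertRaises(LookupError):
            self.repo.ingest_photodiode(310)


def run_all():
    """Load and run both tests, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PhotodiodeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because each test builds its own repo, both tests give the same result in any order or in isolation.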

            If I run a photodiode ingest without the exposure I get:

            $ butler ingest-photodiode tmp LSSTCam data/input/lsstCam/raw
            lsst.photodiodeIngest WARNING: Skipping instrument LSSTCam and dayObs/seqNum 20211212 310: no exposures found.
            $ echo $?                                                    
            0
            

            With ingest-raws, by contrast, you get a report of the failure and a non-zero exit status:

            lsst.ingest INFO: Successfully extracted metadata from 0 files with 1 failure
            lsst.ingest WARNING: Could not extract observation metadata from the following:
            lsst.ingest WARNING: - file:///Users/timj/work/lsstsw3/build/obs_lsst/data/input/latiss/raw/2018-09-20/3018092000065-det000.fits
            lsst.ingest INFO: Successfully processed data from 0 exposures with 0 failures from exposure registration and 0 failures from file ingest.
            lsst.ingest INFO: Ingested 0 distinct Butler datasets
            lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
            Traceback (most recent call last):
              File "/Users/timj/work/lsstsw3/stack/lsst-scipipe-1.0.0/Darwin/obs_base/gcaa7f91c06+fce0917272/python/lsst/obs/base/cli/cmd/commands.py", line 138, in ingest_raws
                script.ingestRaws(*args, **kwargs)
              File "/Users/timj/work/lsstsw3/stack/lsst-scipipe-1.0.0/Darwin/obs_base/gcaa7f91c06+fce0917272/python/lsst/obs/base/script/ingestRaws.py", line 85, in ingestRaws
                ingester.run(
              File "/Users/timj/work/lsstsw3/stack/lsst-scipipe-1.0.0/Darwin/utils/g63a1f4f1ec+0139a6650e/python/lsst/utils/timer.py", line 339, in timeMethod_wrapper
                res = func(self, *args, **keyArgs)
              File "/Users/timj/work/lsstsw3/stack/lsst-scipipe-1.0.0/Darwin/obs_base/gcaa7f91c06+fce0917272/python/lsst/obs/base/ingest.py", line 1227, in run
                raise RuntimeError("Some failures encountered during ingestion")
            RuntimeError: Some failures encountered during ingestion
            $ echo $?
            1
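
The difference between the two transcripts above comes down to whether a skipped file affects the exit status. The sketch below is purely illustrative: the function and its `strict` flag are hypothetical and are not options of `butler ingest-photodiode`.

```python
import logging

log = logging.getLogger("photodiode_sketch")


def ingest_photodiode(exposures, day_obs, seq_num, strict=False):
    """Return a process exit code for one hypothetical photodiode file."""
    if (day_obs, seq_num) not in exposures:
        log.warning(
            "Skipping dayObs/seqNum %s %s: no exposures found.", day_obs, seq_num
        )
        # Non-strict mode mirrors the photodiode transcript: warn, exit 0.
        # Strict mode mirrors ingest-raws: the skip becomes a failure.
        return 1 if strict else 0
    return 0
```

A test that only checks the exit code would pass in the non-strict case and then fail later, at the butler get, which matches the symptom described in this ticket.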
            

            tjenness Tim Jenness added a comment -

            Sorry I didn't get this review in before the weekly.

            Thanks for doing the reorganization. Minor comments on the PR.


              People

              Assignee:
              czw Christopher Waters
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Tim Jenness
              Watchers:
              Christopher Waters, Jim Bosch, Tim Jenness
