Data Management / DM-9588

validate_drp broken on cfht/hsc datasets

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: QA
    • Labels:
      None

      Description

      validate_drp has been failing since the 22nd. This coincides suspiciously with the merge of https://github.com/lsst-sqre/jenkins-dm-jobs/pull/58 . The first build failure appears to be shell-script related, but the most recent failures for both the cfht and hsc datasets look like they may be the result of a change in the stack.

      HSC

      https://ci.lsst.codes/job/validate_drp/835/dataset=hsc,label=centos-7,python=py2/console

      Traceback (most recent call last):
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/bin/validateDrp.py", line 97, in <module>
          validate.run(args.repo, **kwargs)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/validate.py", line 104, in run
          **kwargs)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/validate.py", line 217, in runOneFilter
          job=job, linkedBlobs=linkedBlobs, verbose=verbose)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/calcsrd/amx.py", line 159, in __init__
          verbose=verbose)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/calcsrd/amx.py", line 242, in calcRmsDistances
          visit[obj2], ra[obj2], dec[obj2])
        File "/home/jenkins-slave/workspace/validate_drp/dataset/hsc/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/calcsrd/amx.py", line 326, in matchVisitComputeDistance
          j = visit_obj2_idx[j_raw]
      IndexError: index 3 is out of bounds for axis 0 with size 3
      Build step 'Execute shell' marked build as failure
      [PostBuildScript] - Execution post build scripts.
      

      CFHT

      https://ci.lsst.codes/job/validate_drp/835/dataset=cfht,label=centos-7,python=py2/console

      Traceback (most recent call last):
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/bin/validateDrp.py", line 97, in <module>
          validate.run(args.repo, **kwargs)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/validate.py", line 104, in run
          **kwargs)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/validate.py", line 204, in runOneFilter
          verbose=verbose)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/matchreduce.py", line 147, in __init__
          repo, dataIds, matchRadius)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/validate_drp/master-g3511a1277e+1/python/lsst/validate/drp/matchreduce.py", line 229, in _loadAndMatchCatalogs
          oldSrc = butler.get('src', vId, immediate=True, flags=SOURCE_IO_NO_FOOTPRINTS)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/daf_persistence/12.1-19-gd507bfc/python/lsst/daf/persistence/butler.py", line 845, in get
          location = self._locate(datasetType, dataId, write=False)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/daf_persistence/12.1-19-gd507bfc/python/lsst/daf/persistence/butler.py", line 795, in _locate
          location = repoData.repo.map(datasetType, dataId, write=write)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/daf_persistence/12.1-19-gd507bfc/python/lsst/daf/persistence/repository.py", line 198, in map
          loc = self._mapper.map(*args, **kwargs)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/daf_persistence/12.1-19-gd507bfc/python/lsst/daf/persistence/mapper.py", line 144, in map
          return func(self.validate(dataId), write)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/obs_base/12.1-21-gbdb6c2a+2/python/lsst/obs/base/cameraMapper.py", line 379, in mapClosure
          return mapping.map(mapper, dataId, write)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/obs_base/12.1-21-gbdb6c2a+2/python/lsst/obs/base/mapping.py", line 124, in map
          actualId = self.need(iter(self.keyDict.keys()), dataId)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/obs_base/12.1-21-gbdb6c2a+2/python/lsst/obs/base/mapping.py", line 257, in need
          lookups = self.lookup(newProps, newId)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/obs_base/12.1-21-gbdb6c2a+2/python/lsst/obs/base/mapping.py", line 221, in lookup
          result = self.registry.lookup(properties, self.tables, lookupDataId, template=self.template)
        File "/home/jenkins-slave/workspace/validate_drp/dataset/cfht/label/centos-7/python/py2/lsstsw/stack/Linux64/daf_persistence/12.1-19-gd507bfc/python/lsst/daf/persistence/registries.py", line 330, in lookup
          c = self.conn.execute(cmd, valueList)
      sqlite3.OperationalError: no such column: flags
      

            Activity

            wmwood-vasey Michael Wood-Vasey added a comment -

            Thanks, Joshua Hoblitt

            These look like they're from DM-5819.

            1. The HSC error is my fault. I clearly made a mistake in:

                    while (visit_obj2[j] < visit_obj1[i]) and (j_raw < len(visit_obj2_idx)):
                        j_raw += 1
                        j = visit_obj2_idx[j_raw]
            

            Note that `j_raw` is incremented after the bounds check and then used as an index, so on the last pass it can equal `len(visit_obj2_idx)`. This triggers the first error, for HSC. I'll fix this.

            2. The CFHT error appears to be a change in the accepted syntax for the butler.get call. Nate Pease: has the flags option been removed from the butler.get functionality?
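
The HSC indexing mistake described above can be reproduced in isolation. The following is only a minimal sketch, not the actual patch to `amx.py`: the function names `advance_buggy` and `advance_fixed` and the sample data are hypothetical, and plain lists stand in for the arrays used in `matchVisitComputeDistance`. It shows one way to keep the bounds check ahead of every index operation.

```python
def advance_buggy(target_visit, visit_obj2, visit_obj2_idx, j_raw=0):
    """Mirror of the loop quoted above: bounds are checked before the increment."""
    j = visit_obj2_idx[j_raw]
    while (visit_obj2[j] < target_visit) and (j_raw < len(visit_obj2_idx)):
        j_raw += 1
        # When j_raw has just become len(visit_obj2_idx), this index fails.
        j = visit_obj2_idx[j_raw]
    return j_raw


def advance_fixed(target_visit, visit_obj2, visit_obj2_idx, j_raw=0):
    """Hypothetical fix: test the bound first, and only then index."""
    n = len(visit_obj2_idx)
    while j_raw < n and visit_obj2[visit_obj2_idx[j_raw]] < target_visit:
        j_raw += 1
    # A return value of n means every visit was smaller than target_visit.
    return j_raw


visits = [10, 20, 30]   # made-up visit numbers
order = [0, 1, 2]       # made-up sort order into `visits`

print(advance_fixed(20, visits, order))
print(advance_fixed(40, visits, order))
try:
    advance_buggy(40, visits, order)
except IndexError as exc:
    print("buggy loop raised IndexError:", exc)
```

The failure only appears when the target visit is larger than every entry in the second list, which may be why it surfaced on one dataset and not earlier.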

            jhoblitt Joshua Hoblitt added a comment -

            The build #831 failure is almost certainly caused by the change to lzma-compress the result files. #832 fails due to what look like butler issues. The version of daf_persistence is the same between both builds, so it might be related to afw 3ddd237.

            jhoblitt Joshua Hoblitt added a comment - edited

            I reverted the lzma compression commit since it was trivial: https://github.com/lsst-sqre/jenkins-dm-jobs/pull/60 . I will add it back/fix it after the plumbing is unplugged.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Fixes up errors introduced in DM-5819
            1. Fix indexing check error.
            2. Remove `flags=SOURCE_IO_NO_FOOTPRINTS` because it only works in `obs_subaru` and fails loudly for other cameras.
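
The `no such column: flags` message in the CFHT traceback is what SQLite reports when a query names a column the table lacks. The sketch below reproduces that class of error using only the standard `sqlite3` module. It rests on an assumption not confirmed in this ticket: that the extra `flags` keyword ended up being treated as a data ID key and was therefore placed in the registry's WHERE clause. The `lookup` helper, the table layout, and the values are all hypothetical and are not the `daf_persistence` implementation.

```python
import sqlite3


def lookup(conn, table, columns, data_id):
    """Run a SELECT with one WHERE term per data ID key (hypothetical helper)."""
    where = " AND ".join("{} = ?".format(key) for key in data_id)
    cmd = "SELECT DISTINCT {} FROM {} WHERE {}".format(
        ", ".join(columns), table, where)
    return conn.execute(cmd, list(data_id.values())).fetchall()


# Made-up registry table with no `flags` column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw (visit INTEGER, ccd INTEGER, filter_name TEXT)")
conn.execute("INSERT INTO raw VALUES (849375, 12, 'r')")

# A data ID that only uses real columns works.
ok = lookup(conn, "raw", ["filter_name"], {"visit": 849375, "ccd": 12})

# Assumption: an unrecognised keyword leaks into the data ID as an extra key.
try:
    lookup(conn, "raw", ["filter_name"],
           {"visit": 849375, "ccd": 12, "flags": 1})
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(ok)
print(error)
```

If that assumption holds, dropping the keyword from the `butler.get` call, as in fix 2 above, keeps the lookup restricted to columns the registry actually has.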

            krughoff Simon Krughoff added a comment -

            No comments. Merge away.


              People

              • Assignee:
                wmwood-vasey Michael Wood-Vasey
                Reporter:
                jhoblitt Joshua Hoblitt
                Reviewers:
                Simon Krughoff
                Watchers:
                Angelo Fausti, Jonathan Sick, Joshua Hoblitt, Michael Wood-Vasey, Simon Krughoff