Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-10340

parent repository properties are dropped when loaded via child repositories.

    Details

      Description

      When a user runs processCcd from the command line for ISR, if outputrepo was previously a destination for another processCcd run, an opaque butler error which sneakily originates in daf_persistence will result ("No locations for get").

      I have only tried this for DECam images; it's possible this may be a special case affecting only images ingested with obs_decam. I don't generally have a problem with the requirement for an empty destination directory for output from processCcd, but the Traceback should give the user a clue as to the source of the problem.

      For example, for the case where repo, calibrepo, and outputrepo are directories with ingested images, ingested calibration products, and past output from processCcd, respectively:

      processCcd.py repo --id visit=410878 ccdnum=54 --output outputrepo --calib calibrepo -C $OBS_DECAM_DIR/config/processCcdCpIsr.py --config calibrate.doAstrometry=False calibrate.doPhotoCal=False
      

      will return an error that appears to indicate a problem with accessing the bias:

      processCcd INFO: Processing {u'hdu': 55, 'object': 'Blind15A_39', 'visit': 410878, u'filter': 'r', 'ccdnum': 54, u'date': '2015-02-17'}
      processCcd.isr INFO: Performing ISR on sensor {u'hdu': 55, 'object': 'Blind15A_39', 'visit': 410878, u'filter': 'r', 'ccdnum': 54, u'date': '2015-02-17'}
      makeWcs WARN: Stripping PVi_j keys from projection RA---TPV/DEC--TPV
      processCcd FATAL: Failed on dataId={u'hdu': 55, 'object': 'Blind15A_39', 'visit': 410878, u'filter': 'r', 'ccdnum': 54, u'date': '2015-02-17'}: Unable to retrieve cpBias for {u'hdu': 55, 'object': 'Blind15A_39', 'visit': 410878, u'filter': 'r', 'ccdnum': 54, u'date': '2015-02-17'}: No locations for get: datasetType:cpBias dataId:DataId(initialdata={u'filter': 'r', u'hdu': 55, 'ccdnum': 54, u'date': '2015-02-17', 'object': 'Blind15A_39', 'visit': 410878}, tag=set([]))
      Traceback (most recent call last):
        File "/ssd/lsstsw/stack_20170409/Linux64/pipe_base/13.0-3-g7fa07e0/python/lsst/pipe/base/cmdLineTask.py", line 347, in __call__
          result = task.run(dataRef, **kwargs)
        File "/ssd/lsstsw/stack_20170409/Linux64/pipe_base/13.0-3-g7fa07e0/python/lsst/pipe/base/timer.py", line 121, in wrapper
          res = func(self, *args, **keyArgs)
        File "/home/mrawls/pipe_tasks/python/lsst/pipe/tasks/processCcd.py", line 183, in run
          exposure = self.isr.runDataRef(sensorRef).exposure
        File "/ssd/lsstsw/stack_20170409/Linux64/pipe_base/13.0-3-g7fa07e0/python/lsst/pipe/base/timer.py", line 121, in wrapper
          res = func(self, *args, **keyArgs)
        File "/ssd/lsstsw/stack_20170409/Linux64/ip_isr/13.0-9-g733b75c+2/python/lsst/ip/isr/isrTask.py", line 484, in runDataRef
          isrData = self.readIsrData(sensorRef, ccdExposure)
        File "/home/mrawls/obs_decam/python/lsst/obs/decam/decamCpIsr.py", line 77, in readIsrData
          biasExposure = self.getIsrExposure(dataRef, "cpBias") if self.config.doBias else None
        File "/ssd/lsstsw/stack_20170409/Linux64/ip_isr/13.0-9-g733b75c+2/python/lsst/ip/isr/isrTask.py", line 586, in getIsrExposure
          raise RuntimeError("Unable to retrieve %s for %s: %s" % (datasetType, dataRef.dataId, exc1))
      RuntimeError: Unable to retrieve cpBias for {u'hdu': 55, 'object': 'Blind15A_39', 'visit': 410878, u'filter': 'r', 'ccdnum': 54, u'date': '2015-02-17'}: No locations for get: datasetType:cpBias dataId:DataId(initialdata={u'filter': 'r', u'hdu': 55, 'ccdnum': 54, u'date': '2015-02-17', 'object': 'Blind15A_39', 'visit': 410878}, tag=set([]))
      

      but the same command runs successfully when outputrepo is an empty directory.

        Attachments

          Issue Links

            Activity

            Hide
            ktl Kian-Tat Lim added a comment -

            The redesign of the repository initialization as described in the docstring is reasonable except that I believe it still breaks the common pattern of creating an output repo with one task (e.g. generating a skymap for a coadd) and then reusing that output repo while adding more output datasets (e.g. generating warped images) with different input repos. In order to handle this case, we need to allow addition of new input repos (parents) to an existing output repo. This requires that the in-repo persisted configuration be updated, which in turn requires locking to serialize access to that configuration. This needs to be added before merge to master to ensure that this use case is not broken.

            (The use case can also be supported by insisting that repos not be reused; the skymap repo would then be an input/parent to the warp repo. This requires changes to both existing tasks and user habits which will take longer.)

            While the multi-step initialization process seems complex, each step is necessary (at least for now).

            There are a few typos, misspellings, and coding issues that should be fixed as well. For example, in line 302 of butler.py, the extra parentheses should be removed and the use of the external-to-the-function variable alreadyAdded should be documented. A separate Python-only code review is recommended.

            Show
            ktl Kian-Tat Lim added a comment - The redesign of the repository initialization as described in the docstring is reasonable except that I believe it still breaks the common pattern of creating an output repo with one task (e.g. generating a skymap for a coadd) and then reusing that output repo while adding more output datasets (e.g. generating warped images) with different input repos. In order to handle this case, we need to allow addition of new input repos (parents) to an existing output repo. This requires that the in-repo persisted configuration be updated, which in turn requires locking to serialize access to that configuration. This needs to be added before merge to master to ensure that this use case is not broken. (The use case can also be supported by insisting that repos not be reused; the skymap repo would then be an input/parent to the warp repo. This requires changes to both existing tasks and user habits which will take longer.) While the multi-step initialization process seems complex, each step is necessary (at least for now). There are a few typos, misspellings, and coding issues that should be fixed as well. For example, in line 302 of butler.py , the extra parentheses should be removed and the use of the external-to-the-function variable alreadyAdded should be documented. A separate Python-only code review is recommended.
            Hide
            npease Nate Pease added a comment -

            Andy and Kenny: this code has been reviewed for design by K-T. They still need code review. Can you please do that. (Andy Salnikov I owe you some big code reviews after all this, on SuperTask or something....)

            (Kian-Tat Lim you might look at the newer commits in daf_persistence if you are able; they add RepositoryCfg parent extendability).

            Show
            npease Nate Pease added a comment - Andy and Kenny: this code has been reviewed for design by K-T. They still need code review. Can you please do that. ( Andy Salnikov I owe you some big code reviews after all this, on SuperTask or something....) ( Kian-Tat Lim you might look at the newer commits in daf_persistence if you are able; they add RepositoryCfg parent extendability).
            Hide
            salnikov Andy Salnikov added a comment -

            Please check my comments on PR, I tried to understand as much as I could, cannot say that I succeeded. Removing myself from reviewer.

            Show
            salnikov Andy Salnikov added a comment - Please check my comments on PR, I tried to understand as much as I could, cannot say that I succeeded. Removing myself from reviewer.
            Hide
            npease Nate Pease added a comment -

            K-T has finished the design review and further discussions.
            Andy reviewed the code. Kenny Lo may want to look at the PR's still, I'd be glad to receive any feedback from him, probably via email or in person is best.

            Show
            npease Nate Pease added a comment - K-T has finished the design review and further discussions. Andy reviewed the code. Kenny Lo may want to look at the PR's still , I'd be glad to receive any feedback from him, probably via email or in person is best.
            Hide
            kennylo Kenny Lo added a comment -

            Nate Pease I did scan the file changes but quickly realized the scope is much wider than expected of a bug fix. I will email you my other general comments.

            Show
            kennylo Kenny Lo added a comment - Nate Pease I did scan the file changes but quickly realized the scope is much wider than expected of a bug fix. I will email you my other general comments.

              People

              • Assignee:
                npease Nate Pease
                Reporter:
                mrawls Meredith Rawls
                Reviewers:
                Kenny Lo, Kian-Tat Lim
                Watchers:
                Andy Salnikov, Colin Slater, Dominique Boutigny, Jonathan Sick, Kenny Lo, Kian-Tat Lim, Meredith Rawls, Merlin Fisher-Levine, Nate Pease, Simon Krughoff
              • Votes:
                0 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel