  1. Data Management
  2. DM-20695

Coadd in Gen3 doesn't have a Selector like the one in Gen2

    Details

    • Story Points:
      6
    • Epic Link:
    • Sprint:
      DRP S21a (Dec Jan)
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Two patches (patch=28 and patch=72) of HSC-RC2 tract=9615 hit the following error in Gen3 assembleCoadd:

        File "/software/lsstsw/stack_20190330/stack/miniconda3-4.5.12-1172c30/Linux64/meas_algorithms/18.0.0-5-ga38416e7+2/python/lsst/meas/algorithms/detection.py", line 777, in detectFootprints
          psf = self.getPsf(exposure, sigma=sigma)
        File "/software/lsstsw/stack_20190330/stack/miniconda3-4.5.12-1172c30/Linux64/meas_algorithms/18.0.0-5-ga38416e7+2/python/lsst/meas/algorithms/detection.py", line 503, in getPsf
          raise RuntimeError("Unable to determine PSF to use for detection: no sigma provided")
      RuntimeError: Unable to determine PSF to use for detection: no sigma provided
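
For readers unfamiliar with this error, the failure mode can be pictured with a small stand-in. This is only a sketch and not the code in detection.py: `FakeExposure` and `get_detection_sigma` are hypothetical names, and it assumes the error is raised when the exposure carries no PSF and the caller passes no explicit sigma.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeExposure:
    """Hypothetical stand-in for an exposure; psf_sigma is None when no PSF is attached."""
    psf_sigma: Optional[float] = None


def get_detection_sigma(exposure: FakeExposure, sigma: Optional[float] = None) -> float:
    """Return the smoothing sigma, preferring an explicit value over the exposure's PSF."""
    if sigma is not None:
        return sigma
    if exposure.psf_sigma is None:
        # Assumed failure mode: neither a PSF nor an explicit sigma is available.
        raise RuntimeError("Unable to determine PSF to use for detection: no sigma provided")
    return exposure.psf_sigma
```

Under that assumption, a coadd built without a usable PSF would fail at detection time exactly as in the traceback, unless a sigma were supplied by hand.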
      

      Yusra AlSayyad identified that the Gen3 version doesn't have a Selector like the one in Gen2. For patch=28 in Gen3 (patch '1,3' in Gen2), visit=27116 isn't selected for the coadd in Gen2, so that warp is never made there.
      From Slack:

      I’m 90% sure that it's the selector in `CoaddDriver` https://github.com/lsst/pipe_drivers/blob/master/python/lsst/pipe/drivers/coaddDriver.py#L268 (retargeted to `from lsst.pipe.tasks.selectImages import PsfWcsSelectImagesTask`), but am just shoving the dataRefs into the right dataId container that the selector likes… hang on

      Confirmed. `PsfWcsSelectImagesTask` filters out visit 27116 in `coaddDriver` and the Gen2 version of assembleCoadd if you ask it to use that selector.

       
      Her useful snippet from Slack is attached.

      One way to reproduce the error in Gen3 (though it will likely run into an Oracle permission issue) is:

      pipetask -b /project/hchiang2/gen3repos/w_2019_20/repo/butler.yaml -d "tract=9615 and patch=28 and abstract_filter='z'" -i G3M19c_000001_021 -o dummy run --skip-init-writes  -t "assembleCoadd.CompareWarpAssembleCoaddTask"
      

        Attachments

          Issue Links

            Activity

            jbosch Jim Bosch added a comment -

            Yusra AlSayyad, do you know if this was taken care of in your last round of changes to the PipelineTask version of AssembleCoadd et al?

            jbosch Jim Bosch added a comment -

            Discussed this in Slack, and then I took a closer look at the code.  We do still have work to do, because the selections we do in Gen2 can't just be written as data query expressions in Gen3.  In particular:

            • AP uses BestSeeingWcsSelectImagesTask, which loads PSF models from calexps and computes sizes from them on the fly.  We might eventually start recording that as per-DatasetType metadata in the Registry, and then we might be able to use data query expressions, but not yet (this is in part DM-21774).
            • DRP usually uses PsfWcsSelectImagesTask, which computes a rather complex PSF residual metric from src catalogs.  It's less obvious that this is something we should put in general-purpose calexp metadata, though that is a possibility.  Another option might be to convert this calculation into a true lsst.verify metric Measurement, and (perhaps on DM-21875) use a special Datastore to put those into the same database that holds the registry, and then add Registry support for including those metric external tables in queries.  That's worth doing for its own sake eventually, but it's a lot of work just to get coaddition working as it does in Gen2.
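
To make the two kinds of selection concrete, here is a minimal sketch in plain Python. It is not the code of either task: `ImageSummary`, `select_best_seeing` and `select_by_psf_quality` are hypothetical names, the PSF residual metric is reduced to a single number, and the values are invented for illustration. The real tasks derive these quantities from calexp PSF models and src catalogs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImageSummary:
    """Hypothetical per-image summary used only for this illustration."""
    visit: int
    detector: int
    psf_sigma: float      # PSF size in pixels (illustrative value)
    psf_residual: float   # scalar stand-in for the PSF residual metric


def select_best_seeing(images: List[ImageSummary], n_best: int) -> List[ImageSummary]:
    """Keep the n_best images with the smallest PSF size."""
    return sorted(images, key=lambda im: im.psf_sigma)[:n_best]


def select_by_psf_quality(images: List[ImageSummary], max_residual: float) -> List[ImageSummary]:
    """Keep images whose PSF residual metric does not exceed max_residual."""
    return [im for im in images if im.psf_residual <= max_residual]


images = [
    ImageSummary(visit=1, detector=10, psf_sigma=1.8, psf_residual=0.002),
    ImageSummary(visit=2, detector=10, psf_sigma=1.5, psf_residual=0.050),
    ImageSummary(visit=3, detector=10, psf_sigma=2.1, psf_residual=0.001),
]
best = select_best_seeing(images, n_best=2)
good = select_by_psf_quality(images, max_residual=0.01)
```

Both selections need per-image quantities that only exist once the calexps (and, for the second, the src catalogs) have been produced.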

            More importantly, trying to use data query expressions to solve either of these has a much bigger problem: the information needed to make the selection fundamentally is not available until after calexps are created, but we may want to generate a QuantumGraph that starts from raws, and in that case we'd need to evaluate the data query expression before the fields it uses will exist (unless we base the selection on a different run, which seems both complex and dangerous).

            The alternative is to continue to make selection for coaddition something that happens in Python.  The SelectImageTasks are (in Gen2) subtasks that nevertheless use the Butler, but this is problematic in Gen3, especially because they don't necessarily all use the same inputs (BestSeeingWcsSelectImagesTask uses only calexp, while PsfWcsSelectImagesTask also uses src).  We don't currently have any way for a PipelineTask to pull connections from a subtask, and I'm not sure we should invent one.

            Instead, I think it'd be best to make the Gen3 equivalents of these tasks PipelineTasks in their own right that would take whatever inputs they wished and output an ExposureCatalog of selected images to be (perhaps optionally) consumed by AssembleCoadd and perhaps MakeWarpTask.
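
The shape of that proposal can be sketched without any middleware. The following is an illustration only, with hypothetical function names and toy data: a selection step emits a list of selected visits, and the coaddition step consumes it optionally. It does not use the PipelineTask API, and the plain list stands in for the ExposureCatalog.

```python
from typing import Dict, Iterable, List, Optional, Set


def run_selection(psf_residual_by_visit: Dict[int, float], max_residual: float) -> List[int]:
    """Selection step: return the visits that pass a (hypothetical) quality cut."""
    return sorted(v for v, r in psf_residual_by_visit.items() if r <= max_residual)


def assemble_coadd(warps: Dict[int, List[float]],
                   selected: Optional[Iterable[int]] = None) -> List[float]:
    """Coaddition step: average warps pixel by pixel.

    If a selection is supplied, only the selected visits are used;
    otherwise every warp is coadded.
    """
    keep: Set[int] = set(warps) if selected is None else set(selected) & set(warps)
    if not keep:
        raise ValueError("no warps left to coadd")
    stack = [warps[v] for v in sorted(keep)]
    return [sum(pixels) / len(pixels) for pixels in zip(*stack)]


# Toy data: visit 3 has a poor PSF residual and very different pixel values.
warps = {1: [1.0, 2.0], 2: [3.0, 4.0], 3: [100.0, 100.0]}
selected = run_selection({1: 0.001, 2: 0.002, 3: 0.5}, max_residual=0.01)
coadd_all = assemble_coadd(warps)
coadd_selected = assemble_coadd(warps, selected)
```

Keeping the selection as a separate step means each selector can declare whatever inputs it needs, while the coaddition step only has to understand the list it is handed.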

            Finally, it's worth pointing out that the "Wcs" part of these selection tasks - which checks for overlap between the detector-level image and patch - should be unnecessary in Gen3 - the middleware will already perform a logically equivalent operation.  The calculation may be subtly different, though, because the Gen2 version does the overlap in pixel coordinates while the Gen3 version operates in spherical coordinates.  Both are checked/refined later when warping, but we may want to retain the Gen2-style check in Gen3 during the middleware transition to make it easier to compare results, and then drop it after the Gen3 tasks are fully validated and the Gen2 ones are retired.
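
As a rough picture of an overlap test in pixel coordinates, here is a simplified stand-in that compares axis-aligned boxes. `PixelBox` and `boxes_overlap` are hypothetical names, and treating the detector footprint as an axis-aligned box in the patch's pixel frame is an assumption made for brevity; the actual Gen2 check and the Gen3 spherical-geometry check are both more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelBox:
    """Axis-aligned box in pixel coordinates, inclusive of min and max (hypothetical helper)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def boxes_overlap(a: PixelBox, b: PixelBox) -> bool:
    """Return True if the two boxes share at least one pixel."""
    return (a.x_min <= b.x_max and b.x_min <= a.x_max
            and a.y_min <= b.y_max and b.y_min <= a.y_max)
```

Edge cases such as a detector that only just touches a patch boundary are where a pixel-space test and a spherical test could plausibly disagree, which is one reason to keep both during the transition.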

            yusra Yusra AlSayyad added a comment -

            OK, ready for review.
            Passed jenkins: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/33619/pipeline
            Hits pipe_tasks and adds this to the HSC pipeline. Needed for the w_2020_10 monthly run. w_2020_10 is tagged in a week.

            Meredith Rawls You came to mind first, but WARNING this isn't the BestSeeingWcsSelector that you've been waiting for.
            Because a makeWarp quantum is per-visit, it can only select at the CCD level: it operates on the list of CCDs for a single visit.

            I converted the BestSeeingWcsSelectImagesTask that was in there, but it does not make any sense at all in this context. Sorting the 20 CCDs in a single visit that might overlap a patch doesn't give you the best-seeing visits. That comes next.
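
A toy example of why this matters, with invented numbers and hypothetical names (`rank_visits_by_seeing`, the `(visit, ccd, psf_sigma)` tuples): ranking visits needs the CCDs of all visits at once, which a per-visit quantum never sees. Using the median PSF size per visit is an assumption made for the sketch, not necessarily what the visit-level selector will do.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple


def rank_visits_by_seeing(ccd_seeing: List[Tuple[int, int, float]]) -> List[int]:
    """Order visits from best to worst by the median PSF size of their CCDs.

    ccd_seeing holds hypothetical (visit, ccd, psf_sigma) tuples.
    """
    by_visit: Dict[int, List[float]] = defaultdict(list)
    for visit, _ccd, psf_sigma in ccd_seeing:
        by_visit[visit].append(psf_sigma)
    return sorted(by_visit, key=lambda v: median(by_visit[v]))


ccd_seeing = [
    (100, 1, 2.0), (100, 2, 2.1), (100, 3, 1.9),
    (200, 1, 1.4), (200, 2, 1.5), (200, 3, 1.6),
]
# Sorting the CCDs of visit 100 alone can only reorder 1.9, 2.0 and 2.1;
# it cannot reveal that visit 200 has better seeing overall.
best_ccd_in_visit_100 = min(s for v, _c, s in ccd_seeing if v == 100)
ranked = rank_visits_by_seeing(ccd_seeing)
```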

            I opened a new ticket for the visit-level selector that'll be run before coaddition: DM-28953.

            I'm going to ask you to review that future ticket, but this one is yours to review too if you want it. Let's talk tomorrow on the diffim standup.

            yusra Yusra AlSayyad added a comment -

            Meredith Rawls you're off the hook, Eric offered to review this during the diffim meeting. Thanks Eric!

            ebellm Eric Bellm added a comment -

            Looks good to me, Yusra AlSayyad--just one tiny typo I caught.

            yusra Yusra AlSayyad added a comment -

            And note to future self on testing:

            Using bps, ran 5 bands for tracts 9615 and 9697:
            /project/yusra/party/replicateRC2.yaml
            logs in /project/yusra/party/submit
            and confirmed that the directWarp counts for each tract/filter combination matched the Gen2 rerun converted to Gen3, e.g.:

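            # `butler` is an lsst.daf.butler.Butler for the repo (its construction is not shown here).
            dataId = {'skymap': 'hsc_rings_v1', 'tract': 9697, 'band': 'y'}
            # Count the distinct direct warps in each of the two collections being compared.
            print(len(set(butler.registry.queryDatasets("deepCoadd_directWarp",
                                                        collections=['u/yusra/DM-20695_NoEmpty'],
                                                        dataId=dataId))))
            print(len(set(butler.registry.queryDatasets("deepCoadd_directWarp",
                                                        collections=['HSC/runs/RC2/w_2021_02'],
                                                        dataId=dataId))))
            813
            813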
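
The same comparison could be repeated over every tract/band combination with a small helper. This is a sketch only: `find_count_mismatches` and the `count_datasets` callable are hypothetical and not part of the butler API. In practice `count_datasets` might wrap the `len(set(butler.registry.queryDatasets(...)))` call shown above.

```python
from typing import Callable, Dict, Iterable, List, Tuple

DataId = Dict[str, object]


def find_count_mismatches(
    count_datasets: Callable[[str, DataId], int],
    data_ids: Iterable[DataId],
    collection_a: str,
    collection_b: str,
) -> List[Tuple[DataId, int, int]]:
    """Return (data_id, count_a, count_b) for every data ID whose counts differ.

    count_datasets(collection, data_id) is a caller-supplied function that
    returns how many datasets the collection holds for that data ID.
    """
    mismatches = []
    for data_id in data_ids:
        n_a = count_datasets(collection_a, data_id)
        n_b = count_datasets(collection_b, data_id)
        if n_a != n_b:
            mismatches.append((data_id, n_a, n_b))
    return mismatches
```

An empty result means the two collections agree for every data ID that was checked.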
            

            Merged to master last week.


              People

              Assignee:
              yusra Yusra AlSayyad
              Reporter:
              hchiang2 Hsin-Fang Chiang
              Reviewers:
              Eric Bellm
              Watchers:
              Eric Bellm, Hsin-Fang Chiang, Jim Bosch, Yusra AlSayyad
