Data Management / DM-23616

Run converted ap_verify testdata through gen3 pipeline

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ap_verify, obs_decam
    • Labels:
      None
    • Story Points:
      2
    • Sprint:
      AP S20-3 (February), AP S20-4 (March)
    • Team:
      Alert Production
    • Urgent?:
      No

      Description

      To test DM-22655 more fully, we should run as much of the gen3 pipeline as we can on a converted ap_verify test dataset. As a first test, let's run on ap_verify_ci_hits2015 instead of the much larger non-CI dataset.

      I'll provide the command to run in a comment. You'll need three packages checked out on branch tickets/DM-22655 and scons'd: daf_butler, obs_base, and obs_decam. If you set up ap_verify_ci_hits2015 before sconsing obs_decam, it will run a few extra tests.

            Activity

            Parejkoj John Parejko added a comment -

            I'm having trouble with fgcmcal, but I think that's orthogonal to what this ticket would test.

            Once you have the above packages set up and scons'd, this command should convert an ingested set of ap_verify data (in $AP_VERIFY_GEN2) to a gen3 repo (in $AP_VERIFY_GEN3) that you can then run a gen3 pipeline on. Please let me know if you have any questions.

            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root $AP_VERIFY_GEN2/ingested --gen3root $AP_VERIFY_GEN3 --calibs $AP_VERIFY_GEN2/calibingested --calibFilterType abstract_filter
            

            krzys Krzysztof Findeisen added a comment - edited

            Using the ticket branches of obs_decam and ap_verify_ci_hits2015, I got the following error:

            Failed to build graph: DatasetType 'panstarrs' referenced by CalibrateConnections uses 'skypix' as a dimension placeholder, but does not already exist in the registry.  Note that reference catalog names are now used as the dataset type name instead of 'ref_cat'.
            Traceback (most recent call last):
              File "/software/lsstsw/stack_20200220/stack/miniconda3-4.7.12-984c9f7/Linux64/pipe_base/19.0.0-9-g0ae078d+2/python/lsst/pipe/base/pipeline.py", line 461, in makeDatasetTypesSet
                datasetType = registry.getDatasetType(c.name)
              File "/scratch/krzys001/daf_butler/python/lsst/daf/butler/registry/_registry.py", line 508, in getDatasetType
                raise KeyError("Could not find entry for datasetType {}".format(name))
            KeyError: 'Could not find entry for datasetType panstarrs'
            

            The panstarrs refcat is included in the Gen 2 repo, but missing from the Gen 3 repo.
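            One way to confirm which dataset types are missing before trying to build the graph is to probe the registry directly. The following is only a minimal sketch and not part of the ticket: it assumes nothing beyond what the traceback shows, namely that the registry exposes `getDatasetType(name)` and raises `KeyError` for unknown names. The helper `missing_dataset_types`, the stand-in class `FakeRegistry`, and the dataset type names other than `panstarrs` are hypothetical; against a real repo you would pass the registry of a Butler opened on the Gen 3 repo instead.

```python
from typing import Iterable, List


def missing_dataset_types(registry, names: Iterable[str]) -> List[str]:
    """Return the names for which registry.getDatasetType raises KeyError."""
    missing = []
    for name in names:
        try:
            registry.getDatasetType(name)
        except KeyError:
            missing.append(name)
    return missing


class FakeRegistry:
    """Hypothetical stand-in that mimics the KeyError seen in the traceback."""

    def __init__(self, known):
        self._known = set(known)

    def getDatasetType(self, name):
        if name not in self._known:
            raise KeyError("Could not find entry for datasetType {}".format(name))
        return name


if __name__ == "__main__":
    # Illustrative contents only: a repo that has calibs but no refcat.
    registry = FakeRegistry(["raw", "bias", "flat"])
    print(missing_dataset_types(registry, ["raw", "panstarrs"]))
```

            Running this kind of check right after conversion would flag a missing refcat before `pipetask` fails during graph building.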

            Parejkoj John Parejko added a comment -

            Looks like I need to test that the refcat got converted.

            Parejkoj John Parejko added a comment -

            I've added a test of refcat conversion: the problem was a missing config defining the refcats to be converted.

            While we figure out a more official home for that config, Krzysztof Findeisen: can you please try your test again, pulling down the latest obs_decam on that branch and adding -c $OBS_DECAM_DIR/tests/config/convert2to3Config.py to your command line?
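            For anyone scripting the conversion, the earlier command plus the extra `-c` option can be assembled as an argument list. This is a rough sketch, not something from the ticket: the function name `convert_command` and its parameters are hypothetical, while the script name, instrument class, and options are copied from the comments above. Environment variables are left unexpanded on the assumption that the printed line will be pasted into a shell.

```python
from typing import List, Optional


def convert_command(gen2_root: str, gen3_root: str,
                    config_file: Optional[str] = None) -> List[str]:
    """Build the argument list for the Gen 2 to Gen 3 conversion script."""
    args = [
        "convert_gen2_repo_to_gen3.py",
        "lsst.obs.decam.DarkEnergyCamera",
        "--gen2root", f"{gen2_root}/ingested",
        "--gen3root", gen3_root,
        "--calibs", f"{gen2_root}/calibingested",
        "--calibFilterType", "abstract_filter",
    ]
    # The refcat conversion config is optional here so that the original
    # (config-less) invocation can still be reproduced.
    if config_file is not None:
        args += ["-c", config_file]
    return args


if __name__ == "__main__":
    cmd = convert_command(
        "$AP_VERIFY_GEN2",
        "$AP_VERIFY_GEN3",
        config_file="$OBS_DECAM_DIR/tests/config/convert2to3Config.py",
    )
    print(" ".join(cmd))
```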

            Parejkoj John Parejko added a comment -

            I reversed the "blocking" direction on DM-22655, so that we could get it merged. Now you can run these tests against master, and don't have to worry about keeping up with my changes. Please continue to let me know if you find things that look like bugs in the conversion (instead of in the pipetasks or their configurations), and I'll fix them on this branch.

            krzys Krzysztof Findeisen added a comment - edited

            Using master versions of daf_butler, ip_isr, and obs_base, and tickets/DM-23616 versions of obs_decam and ap_verify_ci_hits2015, DecamCrosstalkTask fails because no crosstalk sources are passed to run. It appears that the code for providing these sources in Gen 3 is deliberately commented out, but it's not clear how to fix this or who is responsible for doing so.

            See DM-17169 and the Slack discussion for technical details.

            krzys Krzysztof Findeisen added a comment - edited

            If I keep running with --config isr:doCrosstalk=False, I next get a failure because the ISR's Linearizer object has a None table. Christopher Waters says we would need to create a linearizer dataset (like with defects) in any Gen 3 repo. John Parejko, can your converter turn the Gen 2 linearizer into a dataset, or else is it possible to set up the defer-to-camerageom system HSC has?

            tjenness Tim Jenness added a comment -

            I think DM-23044 is working on the persistence problem.

            krzys Krzysztof Findeisen added a comment - edited

            Current steps to reproduce on lsst-dev (does not include disabling any unsupported ISR steps):

            setup lsst_distrib
            setup -vkr /scratch/krzys001/obs_decam/  # DM-23616 ticket branch
             
            # Create small Gen 2 repositories
            setup -vkr /project/krzys001/ap_verify_ci_hits2015/
            ingest_dataset.py --dataset CI-HiTS2015 --output gen3demo
             
            # Convert to Gen 3
            cd gen3demo/
            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root `pwd`/ingested/ --gen3root `pwd`/gen3/ --calibs `pwd`/calibingested/ --calibFilterType abstract_filter --config config/convertRepo.py
             
            # Run ProcessCcd (first time)
            pipetask run --input raw/DECam,calib/DECam,refcats,skymaps --output processed --instrument lsst.obs.decam.DarkEnergyCamera --butler-config gen3/butler.yaml --pipeline ${PIPE_TASKS_DIR}/pipelines/ProcessCcd.yaml --configfile calibrate:config/calibrate.py --register-dataset-types
             
            # Run ProcessCcd (subsequently)
            pipetask run --output processed --instrument lsst.obs.decam.DarkEnergyCamera --butler-config gen3/butler.yaml --pipeline ${PIPE_TASKS_DIR}/pipelines/ProcessCcd.yaml --configfile calibrate:config/calibrate.py --register-dataset-types --replace-run
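
            The two `pipetask run` invocations differ only in that the first passes `--input` and later ones pass `--replace-run` instead. A small sketch of that distinction follows, assuming you want to generate the command from a script; the function name `pipetask_args` and the two module-level constants are hypothetical, and every option and value is copied from the commands above rather than checked against the `pipetask` documentation.

```python
from typing import List

INSTRUMENT = "lsst.obs.decam.DarkEnergyCamera"
INPUT_COLLECTIONS = "raw/DECam,calib/DECam,refcats,skymaps"


def pipetask_args(first_run: bool) -> List[str]:
    """Build the ProcessCcd command for a first run or a rerun."""
    args = ["pipetask", "run"]
    if first_run:
        # Input collections are only given when the output is first created.
        args += ["--input", INPUT_COLLECTIONS]
    args += [
        "--output", "processed",
        "--instrument", INSTRUMENT,
        "--butler-config", "gen3/butler.yaml",
        "--pipeline", "${PIPE_TASKS_DIR}/pipelines/ProcessCcd.yaml",
        "--configfile", "calibrate:config/calibrate.py",
        "--register-dataset-types",
    ]
    if not first_run:
        # Later runs replace the previous run in the existing output.
        args.append("--replace-run")
    return args


if __name__ == "__main__":
    print(" ".join(pipetask_args(first_run=True)))
    print(" ".join(pipetask_args(first_run=False)))
```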
            

            krzys Krzysztof Findeisen added a comment -

            I've managed to run the pipeline through to completion. In addition to the above command lines, I had to pass the following configuration flags:

            • --config isr:doCrosstalk=False: workaround for DM-23983
            • --config isr:doLinearize=False: workaround for DM-23985
            • --config calibrate:doAstrometry=False --config calibrate:doPhotoCal=False: workaround for DM-23992
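
            Since each workaround is tied to a ticket, it can help to keep that mapping explicit so that flags are dropped as the tickets are fixed. The sketch below is one hypothetical way to do so: the names `WORKAROUNDS` and `workaround_flags` are illustrative, and only the ticket numbers and config overrides come from the list above.

```python
from typing import Dict, Iterable, List

# Ticket -> config overrides needed until that ticket is resolved.
WORKAROUNDS: Dict[str, List[str]] = {
    "DM-23983": ["isr:doCrosstalk=False"],
    "DM-23985": ["isr:doLinearize=False"],
    "DM-23992": ["calibrate:doAstrometry=False", "calibrate:doPhotoCal=False"],
}


def workaround_flags(resolved: Iterable[str] = ()) -> List[str]:
    """Return --config flags for every workaround whose ticket is still open."""
    resolved = set(resolved)
    flags: List[str] = []
    for ticket, overrides in WORKAROUNDS.items():
        if ticket in resolved:
            continue
        for override in overrides:
            flags += ["--config", override]
    return flags


if __name__ == "__main__":
    # All workarounds active, as in the run described above.
    print(" ".join(workaround_flags()))
    # Example: what remains once the crosstalk ticket is fixed.
    print(" ".join(workaround_flags(resolved=["DM-23983"])))
```

            The resulting list is meant to be appended to the `pipetask run` command line.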
            krzys Krzysztof Findeisen added a comment - edited

            Code changes made on this ticket (there are enough of them that even I'm confused):

            • obs_decam (70 lines)
              • refactored processCcd.py into characterizeImage.py and calibrate.py, allowing these tasks to benefit from automatic overrides in Gen 3.
              • standardized the obs_decam ISR overrides, deprecating processCcdCpIsr.py in the process.
              • added connections overrides to isr.py
              • fixed various bugs
            • ap_verify_hits2015, ap_verify_ci_hits2015, ap_pipe_testdata (96 lines)
              • refactored apPipe.py into calibrate.py and imageDifference.py, allowing these tasks to be overridden when running a pipeline on the corresponding data set
              • added convertRepo.py for configuring conversion of the ingested dataset to Gen 3
            • pipe_tasks (188 lines)
              • fixed some I/O bugs in CalibrateTask for the case where doAstrometry=False
            • pipe_base (77 lines)
              • fixed some bugs in the PipelineTask unit test framework discovered while testing CalibrateTask
            krzys Krzysztof Findeisen added a comment -

            Hi John Parejko, would you be willing to review the code changes? I think they're a bit more than we expected from "a first test"...

            Parejkoj John Parejko added a comment -

            See comments on the PRs.


              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                Parejkoj John Parejko
                Reviewers:
                John Parejko
                Watchers:
                Christopher Waters, Eric Bellm, Ian Sullivan, John Parejko, Krzysztof Findeisen, Meredith Rawls, Tim Jenness