  1. Data Management
  2. DM-21939

Create Gen 3 AP Pipeline

    Details

    • Story Points:
      4
    • Sprint:
      AP F19-6 (November), AP S20-6 (May), AP F20-1 (June)
    • Team:
      Alert Production

      Description

      This is mostly an umbrella ticket for getting all current Gen 2 ap_pipe functionality working in Gen 3. The goal is to be able to run AP on HiTS 2015 using the CmdLineActivator. It includes:

      • creating scripts that ingest HiTS test data into a Gen3 repository (DM-21862)
      • conversion of all remaining ApPipeTask subtasks to Gen 3 (DM-21874, DM-21886)
      • creation of a YAML file configuring the pipeline (1 SP on this ticket)

      ApPipeTask has a lot of code for handling the case of calexp difference imaging templates. By general agreement within the AP and Middleware groups, we will not be porting this functionality to Gen 3 (see also DM-21874), and will reimplement it from scratch should we need it after migration.

            Activity

            krzys Krzysztof Findeisen added a comment -

            DM-21915 would be of interest to people trying to use ap_pipe in Gen 3, but is not actually necessary given DM-21862.

            swinbank John Swinbank added a comment -

            This ticket is cast in terms of support for obs_decam, and hence is blocked by DM-21862. However, an equally good success criterion would be to get the pipeline working with HSC, and hence DM-24260 is also a blocker. Resolving either one of those would be fine!

            krzys Krzysztof Findeisen added a comment - edited

            Steps to create Gen 3 test repo on lsst-dev:

            ingest_dataset.py --dataset CI-HiTS2015 --output hits2015_gen2
            # Create a convertRepo_standalone.py that combines CI dataset's convertRepo_calibs and convertRepo_copied (and does not exclude raws).
            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root `pwd`/hits2015_gen2/ingested/ --calibs `pwd`/hits2015_gen2/calibingested/ --gen3root `pwd`/hits2015_gen3/ --config `pwd`/hits2015_gen2/config/convertRepo_standalone.py
            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root ${AP_VERIFY_CI_HITS2015_DIR}/templates/ --gen3root `pwd`/hits2015_gen3/ --config `pwd`/hits2015_gen2/config/convertRepo_templates.py
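
Both conversion commands above read from directories under the current working directory and from `${AP_VERIFY_CI_HITS2015_DIR}`, which is presumably defined by setting up the dataset package. A minimal pre-flight sketch, assuming those inputs are the only prerequisites worth checking; `check_conversion_inputs` is a hypothetical helper, not part of ap_verify or the Butler tools:

```shell
#!/bin/sh
# Hypothetical pre-flight check for the conversion steps above.
# Assumption: the only prerequisites are the two Gen 2 directories used as
# --gen2root / --calibs and the dataset's templates directory.
check_conversion_inputs() {
    workdir=$1
    status=0

    # The templates repository is located through this environment variable.
    if [ -z "${AP_VERIFY_CI_HITS2015_DIR:-}" ]; then
        echo "AP_VERIFY_CI_HITS2015_DIR is not set" >&2
        status=1
    elif [ ! -d "${AP_VERIFY_CI_HITS2015_DIR}/templates" ]; then
        echo "missing directory: ${AP_VERIFY_CI_HITS2015_DIR}/templates" >&2
        status=1
    fi

    # Directories that the first conversion command reads from.
    for dir in "$workdir/hits2015_gen2/ingested" \
               "$workdir/hits2015_gen2/calibingested"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done

    return "$status"
}
```

Calling `check_conversion_inputs "$(pwd)"` before the first conversion command reports every missing input at once instead of failing partway through.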
            

            tjenness Tim Jenness added a comment -

            Note that the butler conversion tool is changing imminently. It will soon be `butler convert`.
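
Comparing the `convert_gen2_repo_to_gen3.py` calls earlier in this ticket with the `butler convert` calls used later suggests a mechanical mapping: the instrument class moves to `--instrument`, the Gen 3 root becomes a positional argument, and `--config` becomes `--config-file`. A sketch of that rewrite, inferred only from those two sets of commands; the helper name `to_butler_convert` is hypothetical, and other options may have changed in ways this ticket does not show. It prints the new command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: rewrite old-style conversion arguments into the
# `butler convert` form seen later in this ticket. Prints the command only.
# Assumption: paths contain no whitespace.
to_butler_convert() {
    instrument=$1
    shift
    gen2root="" calibs="" gen3root="" config=""

    while [ $# -gt 0 ]; do
        case $1 in
            --gen2root) gen2root=$2; shift 2 ;;
            --calibs)   calibs=$2;   shift 2 ;;
            --gen3root) gen3root=$2; shift 2 ;;
            --config)   config=$2;   shift 2 ;;
            *) echo "unknown option: $1" >&2; return 1 ;;
        esac
    done

    cmd="butler convert --instrument $instrument --gen2root $gen2root"
    if [ -n "$calibs" ]; then
        cmd="$cmd --calibs $calibs"
    fi
    # The Gen 3 root is positional in the new form.
    cmd="$cmd $gen3root"
    if [ -n "$config" ]; then
        cmd="$cmd --config-file $config"
    fi
    echo "$cmd"
}
```

Printing rather than executing makes it possible to compare the result against the tool's own help output before running anything.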

            krzys Krzysztof Findeisen added a comment - edited

            Progress report: I've run a basic pipeline on HSC (using the DM-23992 branch), and run as far as is possible on DECam given DM-23983, DM-23985, and DM-23992. While I can't run DECam end-to-end, I can't find any bugs with the pipeline or ap_pipe itself.

            Remaining work is to add more safety checks to the pipeline file, and to document how to run in Gen 3.

            krzys Krzysztof Findeisen added a comment -

            Here are the updated steps to test on lsst-dev. This requires a copy of ap_verify_ci_cosmos_pdr2, the DM-23992 branch of meas_algorithms, and the attached convertRepo_standalone.py:

            # Create a Gen 3 dataset with raws, calibs, and templates
            ingest_dataset.py --dataset CI-CosmosPDR2 --output hsc_pdr2_gen2
            butler convert --instrument lsst.obs.subaru.HyperSuprimeCam --gen2root `pwd`/hsc_pdr2_gen2/ingested/ --calibs `pwd`/hsc_pdr2_gen2/calibingested/ `pwd`/hsc_pdr2_gen3/ --config-file convertRepo_standalone.py
            butler convert --instrument lsst.obs.subaru.HyperSuprimeCam --gen2root ${AP_VERIFY_CI_COSMOS_PDR2_DIR}/templates/ `pwd`/hsc_pdr2_gen3/ --config-file `pwd`/hsc_pdr2_gen2/config/convertRepo_templates.py
             
            # Pipeline
            make_apdb.py --config diaPipe.apdb.isolation_level=READ_UNCOMMITTED --config diaPipe.apdb.db_url="sqlite:///apdb.db"
            pipetask run -p ap_pipe/pipelines/ApPipe.yaml --instrument lsst.obs.subaru.HyperSuprimeCam --register-dataset-types --config diaPipe:apdb.isolation_level=READ_UNCOMMITTED --config diaPipe:apdb.db_url="sqlite:///apdb.db" --configfile calibrate:hsc_pdr2_gen2/config/calibrate.py --configfile differencer:hsc_pdr2_gen2/config/imageDifference.py --butler-config hsc_pdr2_gen3/ --input "templates/deep,skymaps,raw/HSC,calib/HSC,refcats" --output experimental
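
The `make_apdb.py` and `pipetask run` commands above repeat the same two APDB settings, and the two presumably need to agree so that the pipeline writes to the database that was initialized. One way to keep them in sync is to define the values once. A sketch, assuming the option spellings exactly as written above (`diaPipe.apdb.` for `make_apdb.py`, `diaPipe:apdb.` for `pipetask`); `apdb_args` is a hypothetical helper that only prints the option strings:

```shell
#!/bin/sh
# Shared APDB settings, copied from the commands above.
APDB_ISOLATION="READ_UNCOMMITTED"
APDB_URL="sqlite:///apdb.db"

# Hypothetical helper: print the two --config options, using the given
# separator between the task label and the field name
# ("." as written for make_apdb.py, ":" as written for pipetask).
apdb_args() {
    sep=$1
    echo "--config diaPipe${sep}apdb.isolation_level=${APDB_ISOLATION} --config diaPipe${sep}apdb.db_url=${APDB_URL}"
}
```

With this in place, the two calls could be written as `make_apdb.py $(apdb_args .)` and `pipetask run ... $(apdb_args :) ...`, leaving the expansion unquoted so that the shell splits it into separate arguments.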
            

            krzys Krzysztof Findeisen added a comment -

            I had to make some changes to packages other than ap_pipe to make everything work in Gen 3 (HSC only; see comments above). I'd appreciate it if you could each review the package of your expertise:

            • pipe_tasks – Gabor Kovacs
            • ap_association – Chris Morrison
            • ap_pipe – Meredith Rawls

            For Meredith Rawls, a lot of the changes are documentation; I've attached a built version of the docs to this issue.

            cmorrison Chris Morrison added a comment -

            Hey Krzysztof, looks good to me. Glad to see that you didn't have to change too much to get everything running in Gen 3.

            gkovacs Gabor Kovacs added a comment -

            Reviewed - I'm happy with the imageDifference.py changes. I have one (minor?) note:

            What about self.metadata? What was the reason for the introduction of algMetadata? As far as I can see, self.metadata is defined in Task and persisted only under Gen 2 in CmdLineTask, but nothing happens under Gen 3 (at least not in PipelineTask). Are we ready to throw away this information from image difference? I think it's mostly used in connection with kernelCandidateQA, which must have been a study of its own, so it may be phased out.

            mrawls Meredith Rawls added a comment -

            I left straightforward comments on GitHub for the ap_pipe docs, and pending those, it looks good. I'll take the liberty of marking this as "reviewed" for all three of us.

            One somewhat larger question - is the Gen 3 implementation done here kind of an intermediate step while we still need to support Gen 2? It's my vague understanding that eventually, running the AP Pipeline will just be having a yaml file with "do ISR, do characterization, do calibration, do differencing, do association, the end" and we won't need an ap_pipe package to link those steps together. Maybe something to discuss at our weekly meeting if I'm off base. For now, I'm thrilled it works!

            krzys Krzysztof Findeisen added a comment - edited

            "What about self.metadata? What was the reason for the introduction of algMetadata?"

            I don't know anything about algMetadata, including whether or not it's related to metadata (though its attachment to source catalogs would make me think it's something different). Since Gen 3 tasks are supposed to be stateless, I assume metadata is going away at some point, but last I checked Middleware didn't have a plan for a replacement yet.

            krzys Krzysztof Findeisen added a comment - edited

            "It's my vague understanding that eventually, running the AP Pipeline will just be having a yaml file with 'do ISR, do characterization, do calibration, do differencing, do association, the end' and we won't need an ap_pipe package to link those steps together"

            I think as long as we have ApPipeTask, it would be confusing for users to need different packages depending on whether they're running the pipeline in Gen 2 or Gen 3. So I'd be opposed to moving to pipe_tasks before we phase out ApPipeTask. We'd also need to do something about make_apdb or its replacement (see DM-22663); maybe put it in ap_association?

            tjenness Tim Jenness added a comment -

            What is this metadata that is being discussed?

            krzys Krzysztof Findeisen added a comment - edited

            One is the standard Task.metadata. The other is ImageDifferenceTask.algMetadata, which seems to be associated with measurement algorithms in some way.


              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                krzys Krzysztof Findeisen
                Reviewers:
                Chris Morrison, Gabor Kovacs, Meredith Rawls
                Watchers:
                Chris Morrison, Gabor Kovacs, John Swinbank, Krzysztof Findeisen, Meredith Rawls, Tim Jenness