  1. Data Management
  2. DM-21939

Create Gen 3 AP Pipeline

    Details

    • Story Points:
      4
    • Sprint:
      AP F19-6 (November), AP S20-6 (May), AP F20-1 (June)
    • Team:
      Alert Production

      Description

      This is a mostly-umbrella ticket for getting all current Gen 2 ap_pipe functionality working in Gen 3. The goal is to be able to run AP on HiTS 2015 using the CmdLineActivator. It includes:

      • creating scripts that ingest HiTS test data into a Gen3 repository (DM-21862)
      • conversion of all remaining ApPipeTask subtasks to Gen 3 (DM-21874, DM-21886)
      • creation of a YAML file configuring the pipeline (1 SP on this ticket)

      ApPipeTask has a lot of code for handling the case of calexp difference imaging templates. By general agreement within the AP and Middleware groups, we will not be porting this functionality to Gen 3 (see also DM-21874), and will reimplement it from scratch should we need it after migration.

            Activity

            krzys Krzysztof Findeisen created issue -
            krzys Krzysztof Findeisen made changes -
            Field Original Value New Value
            Link This issue blocks DM-21888 [ DM-21888 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue blocks DM-21919 [ DM-21919 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-21862 [ DM-21862 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-21886 [ DM-21886 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-21874 [ DM-21874 ]
            krzys Krzysztof Findeisen added a comment -

            DM-21915 would be of interest to people trying to use ap_pipe in Gen 3, but is not actually necessary given DM-21862.

            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-21915 [ DM-21915 ]
            krzys Krzysztof Findeisen made changes -
            Labels gen2-deprecation-blocker gen3-middleware
            krzys Krzysztof Findeisen made changes -
            Link This issue blocks DM-21888 [ DM-21888 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue blocks DM-21888 [ DM-21888 ]
            krzys Krzysztof Findeisen made changes -
            Description (original value):
            This is a mostly-umbrella ticket for getting all current Gen2 ap_pipe functionality working in Gen3. The goal is being able to run AP on HiTS 2015 using the "laptop" executor. It includes:
            * creating scripts that ingest HiTS test data into a Gen3 repository (DM-21862)
            * conversion of all remaining {{ApPipeTask}} subtasks to Gen 3 (DM-21874, DM-21886)
            * creation of a YAML file configuring the pipeline (1 SP on this ticket)

            {{ApPipeTask}} has a lot of code for handling the case of calexp difference imaging templates. By general agreement within the AP and Middleware groups, we will *not* be porting this functionality to Gen 3 (see also DM-21874), and will reimplement it from scratch should we need it after migration.
            Description (new value):
            This is a mostly-umbrella ticket for getting all current Gen2 ap_pipe functionality working in Gen3. The goal is being able to run AP on HiTS 2015 using the {{CmdLineActivator}}. It includes:
            * creating scripts that ingest HiTS test data into a Gen3 repository (DM-21862)
            * conversion of all remaining {{ApPipeTask}} subtasks to Gen 3 (DM-21874, DM-21886)
            * creation of a YAML file configuring the pipeline (1 SP on this ticket)

            {{ApPipeTask}} has a lot of code for handling the case of calexp difference imaging templates. By general agreement within the AP and Middleware groups, we will *not* be porting this functionality to Gen 3 (see also DM-21874), and will reimplement it from scratch should we need it after migration.
            ebellm Eric Bellm made changes -
            Assignee Meredith Rawls [ mrawls ]
            ebellm Eric Bellm made changes -
            Epic Link DM-21442 [ 423049 ]
            ebellm Eric Bellm made changes -
            Sprint AP F19-6 (November) [ 958 ]
            ebellm Eric Bellm made changes -
            Rank Ranked higher
            swinbank John Swinbank made changes -
            Sprint AP F19-6 (November) [ 958 ] AP F19-6 (November), AP S20-1 (December) [ 958, 981 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-22599 [ DM-22599 ]
            swinbank John Swinbank made changes -
            Epic Link DM-21442 [ 423049 ] DM-22633 [ 427742 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-22663 [ DM-22663 ]
            cmorrison Chris Morrison made changes -
            Link This issue is blocked by DM-22741 [ DM-22741 ]
            swinbank John Swinbank made changes -
            Sprint AP F19-6 (November), AP S20-1 (December) [ 958, 981 ] AP F19-6 (November) [ 958 ]
            swinbank John Swinbank made changes -
            Link This issue is blocked by DM-24260 [ DM-24260 ]
            swinbank John Swinbank added a comment -

            This ticket is cast in terms of support for obs_decam, and hence is blocked by DM-21862. However, an equally-good success criterion would be to get the pipeline working with HSC, and hence DM-24260 is also a blocker. Either one of those would be fine!

            swinbank John Swinbank made changes -
            Assignee Meredith Rawls [ mrawls ] Krzysztof Findeisen [ krzys ]
            swinbank John Swinbank made changes -
            Sprint AP F19-6 (November) [ 958 ] AP F19-6 (November), AP S20-5 (April) [ 958, 986 ]
            swinbank John Swinbank made changes -
            Story Points 1 4
            swinbank John Swinbank made changes -
            Sprint AP F19-6 (November), AP S20-5 (April) [ 958, 986 ] AP F19-6 (November), AP S20-6 (May) [ 958, 987 ]
            swinbank John Swinbank made changes -
            Rank Ranked lower
            swinbank John Swinbank made changes -
            Epic Link DM-22633 [ 427742 ] DM-24341 [ 433028 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-12549 [ DM-12549 ]
            krzys Krzysztof Findeisen made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            krzys Krzysztof Findeisen added a comment - - edited

            Steps to create Gen 3 test repo on lsst-dev:

            ingest_dataset.py --dataset CI-HiTS2015 --output hits2015_gen2
            # Create a convertRepo_standalone.py that combines CI dataset's convertRepo_calibs and convertRepo_copied (and does not exclude raws).
            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root `pwd`/hits2015_gen2/ingested/ --calibs `pwd`/hits2015_gen2/calibingested/ --gen3root `pwd`/hits2015_gen3/ --config `pwd`/hits2015_gen2/config/convertRepo_standalone.py
            convert_gen2_repo_to_gen3.py lsst.obs.decam.DarkEnergyCamera --gen2root ${AP_VERIFY_CI_HITS2015_DIR}/templates/ --gen3root `pwd`/hits2015_gen3/ --config `pwd`/hits2015_gen2/config/convertRepo_templates.py
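The backtick `pwd` expansions in the steps above can be made explicit by building the two conversion command lines from a working directory. This is only a sketch: the function name `conversion_commands` and the fallback path are hypothetical, and the code only assembles and prints the arguments shown in the comment; it does not run the conversion tool.

```python
import os
import shlex
from pathlib import Path


def conversion_commands(workdir, templates_root):
    """Return the two conversion command lines from the steps above as argument lists.

    `workdir` plays the role of `pwd`; `templates_root` plays the role of
    ${AP_VERIFY_CI_HITS2015_DIR}. Both parameter names are illustrative.
    """
    workdir = Path(workdir).resolve()
    gen2 = workdir / "hits2015_gen2"
    gen3 = workdir / "hits2015_gen3"
    tool = "convert_gen2_repo_to_gen3.py"
    instrument = "lsst.obs.decam.DarkEnergyCamera"

    # First call: raws and calibs, using the combined standalone config.
    raws_and_calibs = [
        tool, instrument,
        "--gen2root", str(gen2 / "ingested"),
        "--calibs", str(gen2 / "calibingested"),
        "--gen3root", str(gen3),
        "--config", str(gen2 / "config" / "convertRepo_standalone.py"),
    ]
    # Second call: templates, written into the same Gen 3 repository.
    templates = [
        tool, instrument,
        "--gen2root", str(Path(templates_root) / "templates"),
        "--gen3root", str(gen3),
        "--config", str(gen2 / "config" / "convertRepo_templates.py"),
    ]
    return [raws_and_calibs, templates]


if __name__ == "__main__":
    # The fallback path is a placeholder, not a real location.
    root = os.environ.get("AP_VERIFY_CI_HITS2015_DIR", "/path/to/ap_verify_ci_hits2015")
    for cmd in conversion_commands(Path.cwd(), root):
        print(shlex.join(cmd))
```

Keeping both calls in one place makes it harder for the two conversions to end up pointing at different Gen 3 roots.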
            

            tjenness Tim Jenness added a comment -

            Note that the butler conversion tool is changing imminently. It will soon be `butler convert`.

            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-25014 [ DM-25014 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-23992 [ DM-23992 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-25040 [ DM-25040 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue blocks DM-23983 [ DM-23983 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue blocks DM-23983 [ DM-23983 ]
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-23983 [ DM-23983 ]
            krzys Krzysztof Findeisen added a comment - - edited

            Progress report: I've run a basic pipeline on HSC (using the DM-23992 branch), and run as far as is possible on DECam given DM-23983, DM-23985, and DM-23992. While I can't run DECam end-to-end, I can't find any bugs with the pipeline or ap_pipe itself.

            Remaining work is to add more safety checks to the pipeline file, and to document how to run in Gen 3.

            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-23983 [ DM-23983 ]
            swinbank John Swinbank made changes -
            Epic Link DM-24341 [ 433028 ] DM-25145 [ 435263 ]
            swinbank John Swinbank made changes -
            Sprint AP F19-6 (November), AP S20-6 (May) [ 958, 987 ] AP F19-6 (November), AP S20-6 (May), AP F20-1 (June) [ 958, 987, 1019 ]
            krzys Krzysztof Findeisen made changes -
            Attachment convertRepo_standalone.py [ 44305 ]
            krzys Krzysztof Findeisen added a comment -

            Here are the updated steps to test on lsst-dev. This requires a copy of ap_verify_ci_cosmos_pdr2, the DM-23992 branch of meas_algorithms, and the attached convertRepo_standalone.py:

            # Create a Gen 3 dataset with raws, calibs, and templates
            ingest_dataset.py --dataset CI-CosmosPDR2 --output hsc_pdr2_gen2
            butler convert --instrument lsst.obs.subaru.HyperSuprimeCam --gen2root `pwd`/hsc_pdr2_gen2/ingested/ --calibs `pwd`/hsc_pdr2_gen2/calibingested/ `pwd`/hsc_pdr2_gen3/ --config-file convertRepo_standalone.py
            butler convert --instrument lsst.obs.subaru.HyperSuprimeCam --gen2root ${AP_VERIFY_CI_COSMOS_PDR2_DIR}/templates/ `pwd`/hsc_pdr2_gen3/ --config-file `pwd`/hsc_pdr2_gen2/config/convertRepo_templates.py
             
            # Pipeline
            make_apdb.py --config diaPipe.apdb.isolation_level=READ_UNCOMMITTED --config diaPipe.apdb.db_url="sqlite:///apdb.db"
            pipetask run -p ap_pipe/pipelines/ApPipe.yaml --instrument lsst.obs.subaru.HyperSuprimeCam --register-dataset-types --config diaPipe:apdb.isolation_level=READ_UNCOMMITTED --config diaPipe:apdb.db_url="sqlite:///apdb.db" --configfile calibrate:hsc_pdr2_gen2/config/calibrate.py --configfile differencer:hsc_pdr2_gen2/config/imageDifference.py --butler-config hsc_pdr2_gen3/ --input "templates/deep,skymaps,raw/HSC,calib/HSC,refcats" --output experimental
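The commands above pass the same APDB settings twice: in dotted form to make_apdb.py (`diaPipe.apdb.db_url=...`) and in `label:field=value` form to `pipetask run` (`diaPipe:apdb.db_url=...`). A minimal sketch of keeping the two in sync follows. The helper names `parse_override` and `as_make_apdb_arg` are hypothetical and not part of any LSST package, and the parsing rule is inferred only from the two commands shown in this comment.

```python
from typing import NamedTuple


class Override(NamedTuple):
    label: str  # task label in the pipeline, e.g. "diaPipe"
    field: str  # dotted config field, e.g. "apdb.db_url"
    value: str


def parse_override(text):
    """Split a "label:field=value" string of the kind passed to `pipetask run --config`.

    Only the first ":" and the first "=" after it are treated as separators,
    so a value such as "sqlite:///apdb.db" survives intact.
    """
    label, colon, rest = text.partition(":")
    field, equals, value = rest.partition("=")
    if not (colon and equals and label and field):
        raise ValueError(f"expected 'label:field=value', got {text!r}")
    return Override(label, field, value)


def as_make_apdb_arg(override):
    """Render the same setting in the dotted form used with make_apdb.py above."""
    return f"{override.label}.{override.field}={override.value}"


# The two APDB settings that appear in both commands.
APDB_SETTINGS = [
    "diaPipe:apdb.isolation_level=READ_UNCOMMITTED",
    "diaPipe:apdb.db_url=sqlite:///apdb.db",
]

if __name__ == "__main__":
    for text in APDB_SETTINGS:
        print("--config", as_make_apdb_arg(parse_override(text)))
```

Deriving both forms from one list avoids creating the database with one URL and then running the pipeline against another.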
            

            krzys Krzysztof Findeisen made changes -
            Attachment ap_pipe_docs.tar.gz [ 44307 ]
            krzys Krzysztof Findeisen added a comment -

            I had to make some changes to packages other than ap_pipe to make everything work in Gen 3 (HSC only; see comments above). I'd appreciate it if you could each review the package in your area of expertise:

            • pipe_tasks – Gabor Kovacs
            • ap_association – Chris Morrison
            • ap_pipe – Meredith Rawls

            For Meredith Rawls, a lot of the changes are documentation; I've attached a built version of the docs to this issue.

            krzys Krzysztof Findeisen made changes -
            Reviewers Chris Morrison, Gabor Kovacs, Meredith Rawls [ cmorrison, gkovacs, mrawls ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            cmorrison Chris Morrison added a comment -

            Hey Krzysztof, looks good to me. Glad to see that you didn't have to change too much to get everything running in Gen 3.

            gkovacs Gabor Kovacs added a comment -

            Reviewed. I'm happy with the imageDifference.py changes. I have one (minor?) note:

            What about self.metadata? What was the reason for the introduction of algMetadata? As far as I can see, self.metadata is defined in Task and persisted only under Gen 2 in CmdLineTask, but nothing happens to it under Gen 3 (at least not in PipelineTask). Are we ready to throw away this information from image difference? I think it's mostly used in connection with kernelCandidateQA, which must have been a study of its own, so it may be phased out.

            mrawls Meredith Rawls added a comment -

            I left straightforward comments on GitHub for the ap_pipe docs, and pending those, it looks good. I'll take the liberty of marking this as "reviewed" for all three of us.

            One somewhat larger question - is the Gen 3 implementation done here kind of an intermediate step while we still need to support Gen 2? It's my vague understanding that eventually, running the AP Pipeline will just be having a yaml file with "do ISR, do characterization, do calibration, do differencing, do association, the end" and we won't need an ap_pipe package to link those steps together. Maybe something to discuss at our weekly meeting if I'm off base. For now, I'm thrilled it works!
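One way to picture the "just a YAML file" idea raised above is a declarative list of steps whose run order is worked out from what each step consumes and produces. The sketch below assumes that an executor orders steps by their declared inputs and outputs. Every step label and dataset name in it is hypothetical, chosen to mirror the quoted list of steps rather than the contents of any real pipeline file.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step labels and dataset names (not taken from ApPipe.yaml).
STEPS = {
    "isr": {"inputs": ["raw"], "outputs": ["post_isr"]},
    "characterize": {"inputs": ["post_isr"], "outputs": ["characterized"]},
    "calibrate": {"inputs": ["characterized"], "outputs": ["calibrated"]},
    "difference": {"inputs": ["calibrated", "template"], "outputs": ["difference_image"]},
    "associate": {"inputs": ["difference_image"], "outputs": ["associated_sources"]},
}


def execution_order(steps):
    """Order step labels so that every step runs after the steps producing its inputs."""
    # Map each dataset name to the step that produces it.
    producers = {
        name: label for label, spec in steps.items() for name in spec["outputs"]
    }
    # A step depends on the producers of its inputs; inputs with no producer
    # ("raw", "template") are assumed to already exist in the repository.
    graph = {
        label: {producers[name] for name in spec["inputs"] if name in producers}
        for label, spec in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(STEPS)))
```

In this picture no wrapper task has to chain the steps together, because the declarations alone determine the order.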

            mrawls Meredith Rawls made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            krzys Krzysztof Findeisen added a comment - - edited

            What about self.metadata? What was the reason for the introduction of algMetadata?

            I don't know anything about algMetadata, including whether or not it's related to metadata (though its attachment to source catalogs would make me think it's something different). Since Gen 3 tasks are supposed to be stateless, I assume metadata is going away at some point, but last I checked Middleware didn't have a plan for a replacement yet.
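To illustrate what "stateless" means here: one way a task could still report per-run information without keeping it on `self` is to return it alongside its outputs. The sketch below is a toy, not an LSST class and not a Middleware design (the comment above notes that no replacement plan existed yet). `RunResult` and `StatelessDifferencer` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunResult:
    """Hypothetical return type: the outputs plus metadata describing this run."""
    outputs: dict
    metadata: dict = field(default_factory=dict)


class StatelessDifferencer:
    """Toy task: run() depends only on its arguments and never writes to self."""

    def run(self, science, template):
        # Pixel-wise subtraction on plain lists stands in for image differencing.
        difference = [s - t for s, t in zip(science, template)]
        # Per-run information travels with the result instead of living on the task.
        metadata = {"n_pixels": len(difference)}
        return RunResult(outputs={"difference": difference}, metadata=metadata)


if __name__ == "__main__":
    task = StatelessDifferencer()
    result = task.run([3, 5], [1, 2])
    print(result.outputs, result.metadata)
```

Because nothing is stored on the task, the same instance can be reused for many inputs without one run's metadata leaking into the next.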

            krzys Krzysztof Findeisen added a comment - - edited

            It's my vague understanding that eventually, running the AP Pipeline will just be having a yaml file with "do ISR, do characterization, do calibration, do differencing, do association, the end" and we won't need an ap_pipe package to link those steps together

            I think as long as we have ApPipeTask, it would be confusing for users to need different packages depending on whether they're running the pipeline in Gen 2 or Gen 3. So I'd be opposed to moving to pipe_tasks before we phase out ApPipeTask. We'd also need to do something about make_apdb or its replacement (see DM-22663); maybe put it in ap_association?

            tjenness Tim Jenness added a comment -

            What is this metadata that is being discussed?

            krzys Krzysztof Findeisen added a comment - - edited

            One is the standard Task.metadata. The other is ImageDifferenceTask.algMetadata, which seems to be associated with measurement algorithms in some way.

            krzys Krzysztof Findeisen made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]

              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                krzys Krzysztof Findeisen
                Reviewers:
                Chris Morrison, Gabor Kovacs, Meredith Rawls
                Watchers:
                Chris Morrison, Gabor Kovacs, John Swinbank, Krzysztof Findeisen, Meredith Rawls, Tim Jenness
              • Votes:
                0
                Watchers:
                6
