DM-24262

Run HSC AP processing in CI using Gen 3

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ap_verify, jenkins
    • Labels:
      None
    • Story Points:
      8
    • Sprint:
      AP S20-6 (May), AP F20-1 (June), AP F20-2 (July), AP F20-3 (August)
    • Team:
      Alert Production
    • Urgent?:
      No

      Description

      Add an HSC Gen 3 dataset to the scipipe/ap_verify Jenkins job. The job should already be designed to let us plug in more datasets, so the main work would be specifying a Gen 3 run (requires adding an internal flag to the Jenkins job?) and working with Groovy.

            Activity

            swinbank John Swinbank created issue -
            swinbank John Swinbank made changes -
            Field Original Value New Value
            Epic Link DM-22633 [ 427742 ]
            swinbank John Swinbank made changes -
            Link This issue is blocked by DM-21919 [ DM-21919 ]
            swinbank John Swinbank made changes -
            Epic Link: DM-22633 -> DM-24341
            krzys Krzysztof Findeisen made changes -
            Link This issue is blocked by DM-24260 [ DM-24260 ]
            krzys Krzysztof Findeisen made changes -
            Description Add an HSC Gen 3 dataset to the {{scipipe/ap_verify}} Jenkins job. The job should already be designed to let us plug in more datasets, so the main work would be specifying a Gen 3 run (requires adding an internal flag to the Jenkins job?) and working with Groovy.
            swinbank John Swinbank made changes -
            Epic Link: DM-24341 -> DM-25145
            swinbank John Swinbank made changes -
            Sprint: AP S20-6 (May) -> AP S20-6 (May), AP F20-1 (June)
            swinbank John Swinbank added a comment -

            As of 2020-07-02, we're a little bit unclear about what work is actually required to get this done; part of the scope of this ticket is understanding that, then estimating story points appropriately.

            krzys Krzysztof Findeisen added a comment (edited) -

            Adding an 8 SP guess just to make the bookkeeping a bit easier (since Jira's sprint planner counts "no estimate" as 0 SP). Will revise once I understand what needs to be done.

            krzys Krzysztof Findeisen made changes -
            Story Points 8
            swinbank John Swinbank made changes -
            Sprint: AP S20-6 (May), AP F20-1 (June) -> AP S20-6 (May), AP F20-1 (June), AP F20-2 (July)
            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-21916 [ DM-21916 ]
            swinbank John Swinbank made changes -
            Sprint: AP S20-6 (May), AP F20-1 (June), AP F20-2 (July) -> AP S20-6 (May), AP F20-1 (June), AP F20-2 (July), AP F20-3 (August)
            krzys Krzysztof Findeisen added a comment (edited) -

            This doesn't look too bad (famous last words...). The code for the CI run is in ap_verify.groovy and run_ci_dataset.sh; the definitions of the individual runs are in ap_verify.yaml.

            What needs to be done:

            1. Add a line to ap_verify.yaml that specifies the existing run is Gen 2.
            2. It should probably be illegal to set up a run that doesn't declare itself either Gen 2 or Gen 3; add input validation to ap_verify.groovy.
            3. Add a branch to the SQuaSH upload code in ap_verify.groovy that checks the generation. The Gen 2 branch should have the current call to runDispatchVerify; the Gen 3 branch should be left empty, pending DM-21916.
            4. Add a CLI flag to run_ci_dataset.sh that specifies whether to do Gen 2 or Gen 3 processing.
            5. Add code to ap_verify.groovy that sets the flag on run_ci_dataset.sh based on ap_verify.yaml, analogous to how the input dataset is handled.
            6. Add new entries for running CI-CosmosPDR2 in both Gen 2 and Gen 3 to ap_verify.yaml.

            I think 8 SP is an appropriate estimate for this work after including a margin of error. I expect the main difficulty to be unfamiliarity with Groovy rather than anything related to Gen 3 itself.
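
As an illustration of items 2 and 4 above, here is a minimal sketch of how a shell script could accept a generation flag and reject runs that do not declare one. The `-g` option and the `parse_generation` function are hypothetical names chosen for this example; they are not taken from the actual run_ci_dataset.sh.

```shell
#!/bin/bash
# Hypothetical sketch: the -g flag and this function name are assumptions,
# not the real interface of run_ci_dataset.sh.

# Print the requested generation ("2" or "3") on stdout.
# Return a nonzero status if the flag is missing or invalid, so that a run
# can never silently fall back to a default generation.
parse_generation() {
    local gen=""
    local opt
    local OPTIND=1   # reset so the function can be called more than once
    while getopts ":g:" opt; do
        case "$opt" in
            g) gen="$OPTARG" ;;
            :) echo "option -$OPTARG requires an argument" >&2; return 2 ;;
            *) echo "unknown option -$OPTARG" >&2; return 2 ;;
        esac
    done
    case "$gen" in
        2|3) echo "$gen" ;;
        "")  echo "missing required option -g" >&2; return 2 ;;
        *)   echo "invalid generation '$gen' (expected 2 or 3)" >&2; return 2 ;;
    esac
}
```

A caller could then branch on the result, for example `gen=$(parse_generation "$@") || exit 1`, and choose the Gen 2 or Gen 3 processing path from `$gen`.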

            krzys Krzysztof Findeisen made changes -
            Watchers: John Swinbank, Krzysztof Findeisen -> John Swinbank, Kian-Tat Lim, Krzysztof Findeisen
            krzys Krzysztof Findeisen made changes -
            Component/s ap_verify [ 14167 ]
            Component/s jenkins [ 17815 ]
            krzys Krzysztof Findeisen made changes -
            Status: To Do -> In Progress
            krzys Krzysztof Findeisen added a comment (edited) -

            Hi Kian-Tat Lim, would you be willing to review this? I don't know anybody else who can review pipeline scripts.

            I've gone around the usual DM procedure by putting this ticket into review before trying to test it on Jenkins (since doing so will likely produce lots of spam if something goes wrong). I'm not sure how to handle the testing procedure itself; would I have to merge what I have to master, then merge any fixes on top of that from the same branch, without rebasing?

            krzys Krzysztof Findeisen made changes -
            Reviewers Kian-Tat Lim [ ktl ]
            Status: In Progress -> In Review
            ktl Kian-Tat Lim added a comment -

            Looks good. The matrix will add some more jobs to the release postprocessing, but they're pretty fast.

            ktl Kian-Tat Lim made changes -
            Status: In Review -> Reviewed
            ktl Kian-Tat Lim added a comment -

            In terms of merging, you'll note that there are sometimes some very short-lived ticket branches of the form DM-NNNNNa, DM-NNNNNb, etc. in the jenkins-dm-jobs history.

            krzys Krzysztof Findeisen added a comment (edited) -

            Progress report: as of tickets/DM-24262c, I have both CI-HiTS2015 and CI-CosmosPDR2 working in Gen 2. Gen 3 crashes because it's missing a change to ap_pipe that was merged in DM-21919 today, so I won't be able to finish testing until the next daily. Sorry for not realizing that that would be a problem!

            I have also tested the new Gen 2 upload, and it seems to work fine. The only problem is that our SQuaSH dashboards mostly assume ccdnum, which is DECam-specific. I think it should be possible to generalize our queries enough to handle both...

            krzys Krzysztof Findeisen added a comment -

            The Gen 3 pipeline ran automatically last night, metric files were archived, and the log doesn't show anything fishy. Jenkins made no attempt to upload the Gen 3 results to SQuaSH, as expected given DM-21916. I think that means it works.

            krzys Krzysztof Findeisen made changes -
            Resolution Done [ 10000 ]
            Status: Reviewed -> Done

              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                swinbank John Swinbank
                Reviewers:
                Kian-Tat Lim
                Watchers:
                John Swinbank, Kian-Tat Lim, Krzysztof Findeisen
              • Votes:
                0
