Data Management / DM-24262

Run HSC AP processing in CI using Gen 3

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ap_verify, jenkins
    • Labels: None
    • Story Points: 8
    • Sprint: AP S20-6 (May), AP F20-1 (June), AP F20-2 (July), AP F20-3 (August)
    • Team: Alert Production
    • Urgent?: No

      Description

      Add an HSC Gen 3 dataset to the scipipe/ap_verify Jenkins job. The job should already be designed to let us plug in more datasets, so the main work would be specifying a Gen 3 run (which may require adding an internal flag to the Jenkins job) and working with Groovy.

            Activity

            swinbank John Swinbank added a comment -

            As of 2020-07-02, we're a little bit unclear about what work is actually required to get this done; part of the scope of this ticket is understanding that, then estimating story points appropriately.

            krzys Krzysztof Findeisen added a comment - edited

            Adding an 8 SP guess just to make the bookkeeping a bit easier (since Jira's sprint planner counts "no estimate" as 0 SP). Will revise once I understand what needs to be done.

            krzys Krzysztof Findeisen added a comment - edited

            This doesn't look too bad (famous last words...). The code for the CI run is in ap_verify.groovy and run_ci_dataset.sh; the definitions of the individual runs are in ap_verify.yaml.

            What needs to be done:

            1. Add a line to ap_verify.yaml that specifies the existing run is Gen 2.
            2. It should probably be illegal to set up a run that doesn't declare itself either Gen 2 or Gen 3; add input validation to ap_verify.groovy.
            3. Add a branch to the SQuaSH upload code in ap_verify.groovy that checks the generation. The Gen 2 branch should have the current call to runDispatchVerify; the Gen 3 branch should be left empty, pending DM-21916.
            4. Add a CLI flag to run_ci_dataset.sh that specifies whether to do Gen 2 or Gen 3 processing.
            5. Add code to ap_verify.groovy that sets the flag on run_ci_dataset.sh based on ap_verify.yaml, analogous to how the input dataset is handled.
            6. Add new entries for running CI-CosmosPDR2 in both Gen 2 and Gen 3 to ap_verify.yaml.
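
The logic in steps 2 through 5 can be sketched roughly as follows. This is only an illustration in Python, not the Groovy that would go into ap_verify.groovy: the key names ("name", "dataset", "gen") and the "-d" and "-g" flags are assumptions made up for the example, not the actual ap_verify.yaml schema or the real run_ci_dataset.sh interface.

```python
# Hypothetical sketch: key names and CLI flags are assumptions for
# illustration, not the real ap_verify.yaml schema or run_ci_dataset.sh options.
VALID_GENS = {2, 3}


def validate_run(run):
    """Reject a run definition that does not declare Gen 2 or Gen 3 (step 2)."""
    gen = run.get("gen")
    if gen not in VALID_GENS:
        name = run.get("name", "<unnamed>")
        raise ValueError(f"run {name!r} must declare gen 2 or 3, got {gen!r}")
    return gen


def build_command(run):
    """Turn a run definition into a run_ci_dataset.sh call (steps 4 and 5)."""
    gen = validate_run(run)
    return ["run_ci_dataset.sh", "-d", run["dataset"], "-g", str(gen)]


def should_upload(run):
    """Only Gen 2 results go to SQuaSH; the Gen 3 branch stays empty (step 3)."""
    return validate_run(run) == 2


# Run definitions corresponding to steps 1 and 6.
runs = [
    {"name": "hits-gen2", "dataset": "CI-HiTS2015", "gen": 2},
    {"name": "cosmos-gen2", "dataset": "CI-CosmosPDR2", "gen": 2},
    {"name": "cosmos-gen3", "dataset": "CI-CosmosPDR2", "gen": 3},
]

for run in runs:
    print(" ".join(build_command(run)), "| upload:", should_upload(run))
```

Validating before building the command means a run with a missing or misspelled generation fails at configuration time rather than partway through processing.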

            I think 8 SP is an appropriate estimate for this work after including a margin of error. I expect the main difficulty to be unfamiliarity with Groovy rather than anything related to Gen 3 itself.

            krzys Krzysztof Findeisen added a comment - edited

            Hi Kian-Tat Lim, would you be willing to review this? I don't know anybody else who can review pipeline scripts.

            I've gone around the usual DM procedure by putting this ticket into review before trying to test it on Jenkins (since doing so will likely produce lots of spam if something goes wrong). I'm not sure how to handle the testing procedure itself; would I have to merge what I have to master, then merge any fixes on top of that from the same branch, without rebasing?

            ktl Kian-Tat Lim added a comment -

            Looks good. The matrix will add some more jobs to the release postprocessing, but they're pretty fast.

            ktl Kian-Tat Lim added a comment -

            In terms of merging, you'll note that there are sometimes some very short-lived ticket branches of the form DM-NNNNNa, DM-NNNNNb, etc. in the jenkins-dm-jobs history.

            krzys Krzysztof Findeisen added a comment - edited

            Progress report: as of tickets/DM-24262c, I have both CI-HiTS2015 and CI-CosmosPDR2 working in Gen 2. Gen 3 crashes because it's missing a change to ap_pipe that was merged in DM-21919 today, so I won't be able to finish testing until the next daily. Sorry for not realizing that that would be a problem!

            I have also tested the new Gen 2 upload, and it seems to work fine. The only problem is that our SQuaSH dashboards mostly assume ccdnum, which is DECam-specific. I think it should be possible to generalize our queries enough to handle both...

            krzys Krzysztof Findeisen added a comment -

            The Gen 3 pipeline ran automatically last night, metric files were archived, and the log doesn't show anything fishy. Jenkins made no attempt to upload the Gen 3 results to SQuaSH, as expected given DM-21916. I think that means it works.


              People

              Assignee:
              krzys Krzysztof Findeisen
              Reporter:
              swinbank John Swinbank
              Reviewers:
              Kian-Tat Lim
              Watchers:
              John Swinbank, Kian-Tat Lim, Krzysztof Findeisen