  1. Data Management
  2. DM-15983

Create a Jenkins job for ap_verify

    Details

      Description

      ap_verify will be most useful for integration testing if it can be run as a nightly job in Jenkins. Please create a job that exercises ap_verify on the ap_verify_ci_hits2015 dataset. A solution that can be easily extended to additional datasets would be preferred.

      Once the HiTS dataset is downloaded and set up, ap_verify can be run on it using run_ci_dataset.sh -d CI-HiTS2015.

            Activity

            jhoblitt Joshua Hoblitt added a comment -

            I've merged a new scipipe/ap_verify job to production to catch the upcoming weekly.

            https://ci.lsst.codes/blue/organizations/jenkins/scipipe%2Fap_verify/activity

            This job takes the name of a Docker image as input, e.g. lsstsqre/centos:7-stack-lsst_distrib-d_2018_10_04, which is assumed to have the ap_verify eups product installed as current. The dataset(s) to run are defined in this YAML file:

            https://github.com/lsst-sqre/jenkins-dm-jobs/blob/master/etc/scipipe/ap_verify.yaml

            The format is hopefully self-explanatory. At present, the master branch of lsst/ap_verify_ci_hits2015 is the only configured dataset, but this job has been tested as working with multiple branches, e.g.:

            ---
            ap_verify:
              datasets:
                - name: CI-HiTS2015
                  github_repo: lsst/ap_verify_ci_hits2015
                  git_ref: master
                  clone_timelimit: 15
                  retries: 3
                  run_timelimit: 60
                - name: CI-HiTS2015
                  github_repo: lsst/ap_verify_ci_hits2015
                  git_ref: tickets/DM-15872
                  clone_timelimit: 15
                  retries: 1
                  run_timelimit: 60
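
As a rough illustration of how a job could consume this format, here is a minimal Python sketch. It is not the actual jenkins-dm-jobs implementation. The dict mirrors what a YAML loader would return for the example above; the function name plan_runs, the HTTPS clone URL form, and the shape of the returned plan are assumptions made for illustration. The only command taken from this ticket is run_ci_dataset.sh -d <name>.

```python
# Parsed form of the example config above, written as a literal so the
# sketch has no third-party dependency (a YAML loader would return this).
CONFIG = {
    "ap_verify": {
        "datasets": [
            {
                "name": "CI-HiTS2015",
                "github_repo": "lsst/ap_verify_ci_hits2015",
                "git_ref": "master",
                "clone_timelimit": 15,
                "retries": 3,
                "run_timelimit": 60,
            },
            {
                "name": "CI-HiTS2015",
                "github_repo": "lsst/ap_verify_ci_hits2015",
                "git_ref": "tickets/DM-15872",
                "clone_timelimit": 15,
                "retries": 1,
                "run_timelimit": 60,
            },
        ]
    }
}

REQUIRED_KEYS = (
    "name",
    "github_repo",
    "git_ref",
    "clone_timelimit",
    "retries",
    "run_timelimit",
)


def plan_runs(config):
    """Return one plan per dataset entry (hypothetical helper)."""
    plans = []
    for entry in config["ap_verify"]["datasets"]:
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"dataset entry is missing keys: {missing}")
        plans.append(
            {
                # Assumed clone URL form; the real job may clone differently.
                "clone_url": f"https://github.com/{entry['github_repo']}.git",
                "git_ref": entry["git_ref"],
                "retries": entry["retries"],
                # Time limits are passed through unchanged; their units are
                # not stated in this ticket.
                "clone_timelimit": entry["clone_timelimit"],
                "run_timelimit": entry["run_timelimit"],
                # Command form quoted in the ticket description.
                "command": ["run_ci_dataset.sh", "-d", entry["name"]],
            }
        )
    return plans


plans = plan_runs(CONFIG)
for plan in plans:
    print(plan["git_ref"], " ".join(plan["command"]))
```

Under this reading, adding another dataset or branch only requires a new list entry.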
            

            Any output files with the extensions .log and .json are collected as artifacts. A build should be triggered by the daily, weekly, and official Docker image builds.
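
The artifact rule can be sketched in Python as below. This only illustrates the selection logic, not how the Jenkins job actually archives files. The helper name collect_artifacts and the example file names are hypothetical.

```python
import tempfile
from pathlib import Path

# Extensions named in this ticket as the ones collected as artifacts.
ARTIFACT_SUFFIXES = {".log", ".json"}


def collect_artifacts(root):
    """Return sorted paths, relative to root, of files ending in .log or .json."""
    root = Path(root)
    return sorted(
        path.relative_to(root)
        for path in root.rglob("*")
        if path.is_file() and path.suffix in ARTIFACT_SUFFIXES
    )


# Demonstration on a throwaway workspace with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    (workspace / "run").mkdir()
    (workspace / "run" / "ap_verify.log").write_text("example log line\n")
    (workspace / "run" / "metrics.json").write_text("{}\n")
    (workspace / "run" / "image.fits").write_text("")
    found = collect_artifacts(workspace)

print([path.as_posix() for path in found])
```

Files with other extensions (image.fits in this demonstration) are skipped, so any additional output type that needs to be archived would have to be added to the suffix set.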

            I'm leaving this ticket as "in progress" to follow up after the weekly has run and to collect end-user feedback.

            jhoblitt Joshua Hoblitt added a comment -

            Krzysztof Findeisen The run time on a test node (which has a slightly better spec than the Jenkins production nodes until a changeover next month) is ~9.5 minutes. However, it appears to be running completely single-threaded: the load average never goes over 1. This isn't an issue from a CI resource perspective; it's really just an FYI.
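
For anyone who wants to repeat this check, here is a small Python sketch of the reasoning: on a multi-core node, a busy job whose load average stays at or below about 1 is probably using a single core. The function name and the threshold of 1.0 are illustrative assumptions, and the load average is only a coarse, system-wide signal.

```python
import os


def looks_single_threaded(peak_load, cpu_count):
    """Heuristic: load at or below 1 on a multi-core node suggests one busy core."""
    return cpu_count > 1 and peak_load <= 1.0


cpu_count = os.cpu_count() or 1
try:
    # One-minute load average; only available on Unix-like systems.
    one_minute_load = os.getloadavg()[0]
except (AttributeError, OSError):
    one_minute_load = None

if one_minute_load is not None:
    # A single sample is not a peak; a real check would sample while the job runs.
    print(
        f"load={one_minute_load:.2f} cpus={cpu_count} "
        f"single-threaded? {looks_single_threaded(one_minute_load, cpu_count)}"
    )
```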

            jhoblitt Joshua Hoblitt added a comment -

            It would also be great if folks could take a look at the archived files and see whether any needed outputs are missing or unneeded files are being captured, e.g.:

            https://ci.lsst.codes/blue/organizations/jenkins/scipipe%2Fap_verify/detail/ap_verify/1/artifacts

            krzys Krzysztof Findeisen added a comment -

            I wonder if we could somehow use dataset_config.yaml, to avoid duplicating the dataset location... though perhaps that should wait until DM-12850 cleans up the ap_verify config situation.

            The single-threaded behavior is a known issue (DM-13887). 9.5 minutes is only slightly worse than the results quoted in DM-15742.

            Artifacts look fine to me.

            krzys Krzysztof Findeisen added a comment -

            So, what happens if the new build fails? Do we get notifications?

            jhoblitt Joshua Hoblitt added a comment -

            The YAML config file can be downloaded instead of coming from a git clone, but that URL would need to be configured somewhere or hard-coded.

            Notifications (success and failure) are posted to the #dmj-s_ap_verify Slack channel.

            Are changes needed at this time or is this task complete?

            krzys Krzysztof Findeisen added a comment -

            It looks good to me; do John Swinbank or Eric Bellm have any objections to the current system?

            swinbank John Swinbank added a comment -

            No objections from me! Thank you both for getting this going.


              People

              • Assignee:
                jhoblitt Joshua Hoblitt
                Reporter:
                krzys Krzysztof Findeisen
                Watchers:
                Eric Bellm, John Swinbank, Joshua Hoblitt, Krzysztof Findeisen, Simon Krughoff