Data Management / DM-13259

Enable dispatch_verify in validate_drp Jenkins job

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Continuous Integration
    • Labels: None

      Description

      validate_drp was recently ported to the new verification framework (DM-12253); however, no results are sent to SQuaSH yet because the RESTful API first needed a redesign (DM-12194).

      The SQuaSH API was reimplemented in Flask and we deployed a demo instance that can be used to exercise the full pipeline.

      https://squash-restful-api-demo.lsst.codes

      To do that, dispatch_verify must be enabled in the Jenkins pipeline.

      Assuming you have a verify job JSON document, you can execute dispatch_verify from the lsstsw folder like so:

       

       
      $ dispatch_verify.py --url https://squash-restful-api-demo.lsst.codes --user <user name>  --password <passwd> --env jenkins --lsstsw $(pwd) Cfht_output_r.json
      ---
      verify.bin.dispatchverify.main INFO: Loading Cfht_output_r.json
      verify.bin.dispatchverify.main INFO: Refreshing metric definitions from verify_metrics
      verify.bin.dispatchverify.main INFO: Inserting lsstsw package metadata from /Users/afausti/Projects/lsstsw/lsstsw.
      verify.bin.dispatchverify.main INFO: Inserting Jenkins CI environment metadata.
      verify.bin.dispatchverify.main INFO: Uploading Job JSON to https://squash-restful-api-demo.lsst.codes.
      verify.squash.get INFO: GET https://squash-restful-api-demo.lsst.codes status: 200
      verify.squash.post INFO: POST https://squash-restful-api-demo.lsst.codes/auth status: 200
      verify.squash.post INFO: POST https://squash-restful-api-demo.lsst.codes/job status: 201
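
For a pipeline step, the same invocation could be wrapped so that credentials come from the environment rather than the command line. This is only a minimal sketch: it uses just the flags shown in the command above, and the helper names and the SQUASH_USER / SQUASH_PASSWORD variables are hypothetical, not part of dispatch_verify.py or the Jenkins job.

```python
import os
import subprocess


def build_dispatch_command(url, user, password, lsstsw_dir, job_json):
    """Return the argument list for a dispatch_verify.py upload.

    Only the flags that appear in the command above are used.
    """
    return [
        "dispatch_verify.py",
        "--url", url,
        "--user", user,
        "--password", password,
        "--env", "jenkins",
        "--lsstsw", lsstsw_dir,
        job_json,
    ]


def dispatch(job_json, url="https://squash-restful-api-demo.lsst.codes"):
    """Upload one verify job JSON document from the current directory."""
    # SQUASH_USER and SQUASH_PASSWORD are hypothetical variable names,
    # for example injected by the CI system's credential store.
    cmd = build_dispatch_command(
        url=url,
        user=os.environ["SQUASH_USER"],
        password=os.environ["SQUASH_PASSWORD"],
        lsstsw_dir=os.getcwd(),  # plays the role of $(pwd) in the shell command
        job_json=job_json,
    )
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(cmd, check=True)
```

Calling dispatch("Cfht_output_r.json") from the lsstsw folder would then run the command shown above, assuming dispatch_verify.py is on the PATH.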
       
      

       

            Activity

            afausti Angelo Fausti added a comment - - edited

            The errors above were caused by out-of-memory (OOM) conditions on the default GKE node type, n1-standard-1, which has 2.77G free after system allocation.

            This deployment has five containers (nginx, api, celery-worker, redis and google-cloud-sql-proxy), and we are pushing a 400M file through the API, so, yeah, a larger node was required. Moving the deployment to an n1-standard-4 node, which has ~14G available, fixed the problem.

            I also had to increase the request timeout on the client side and the JWT_EXPIRATION_DELTA in the app configuration to avoid this error:

            flask_jwt.JWTError: Invalid token. Signature has expired

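
As a rough illustration of how the two settings relate, the sketch below sets a longer token lifetime and checks whether an upload of a given duration would fit within both limits. The 30-minute lifetime, the 900-second client timeout and the helper name are assumptions for illustration, not the values used in the demo deployment; it also assumes the token has to remain valid for the whole upload.

```python
from datetime import timedelta


class Config:
    # Flask-JWT reads JWT_EXPIRATION_DELTA from the app configuration as a
    # timedelta (documented default: timedelta(seconds=300)).
    # 30 minutes is an assumed value, not the one used in the demo instance.
    JWT_EXPIRATION_DELTA = timedelta(minutes=30)


# Assumed client-side request timeout, in seconds.
CLIENT_TIMEOUT_SECONDS = 900


def upload_fits(upload_seconds,
                token_lifetime=Config.JWT_EXPIRATION_DELTA,
                client_timeout=CLIENT_TIMEOUT_SECONDS):
    """Return True if an upload of the given duration finishes within both
    the client timeout and the token lifetime."""
    return (upload_seconds <= client_timeout
            and timedelta(seconds=upload_seconds) <= token_lifetime)
```

With these assumed values, a ten-minute upload fits, whereas the same upload against a 300-second token lifetime would not.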

            $ dispatch_verify.py  --env jenkins --lsstsw $(pwd) --url https://squash-restful-api-demo.lsst.codes/ --user <user> --password <passwd> data_hsc_rerun_20170105_HSC-I.json
            verify.bin.dispatchverify.main INFO: Loading data_hsc_rerun_20170105_HSC-I.json
            verify.bin.dispatchverify.main INFO: Refreshing metric definitions from verify_metrics
            verify.bin.dispatchverify.main INFO: Inserting lsstsw package metadata from /Users/afausti/Projects/lsstsw/lsstsw.
            verify.bin.dispatchverify.main INFO: Inserting Jenkins CI environment metadata.
            verify.bin.dispatchverify.main INFO: Uploading Job JSON to https://squash-restful-api-demo.lsst.codes/.
            verify.squash.get INFO: GET https://squash-restful-api-demo.lsst.codes/ status: 200
            verify.squash.post INFO: POST https://squash-restful-api-demo.lsst.codes/auth status: 200
            verify.squash.post INFO: POST https://squash-restful-api-demo.lsst.codes/job status: 202
            

            afausti Angelo Fausti added a comment - - edited

            Joshua Hoblitt, the changes in the verify package were merged to master in DM-13394. However, Adam Thornton and I are restoring the data from production, so I think it makes sense to wait for DM-12604 before switching on dispatch_verify.py in Jenkins.

            jhoblitt Joshua Hoblitt added a comment -

            OK - it would also be useful to have a test instance to develop against.

            afausti Angelo Fausti added a comment - - edited

            dispatch_verify can now be enabled in the validate_drp job, using the demo instance at https://squash-restful-api-demo.lsst.codes

            A typical command-line invocation is:

            $ dispatch_verify.py --url https://squash-restful-api-demo.lsst.codes --user <user> --password <passwd> --env jenkins --lsstsw $(pwd) Cfht_output_r.json
            

            afausti Angelo Fausti added a comment -

            This can be closed; Jenkins jobs are now sending data to SQuaSH using dispatch_verify.py.


              People

              Assignee:
              jhoblitt Joshua Hoblitt
              Reporter:
              afausti Angelo Fausti
              Watchers:
              Angelo Fausti, Jonathan Sick, Joshua Hoblitt, Simon Krughoff