Publish verification jobs produced by the HSC reprocessing to SQuaSH


Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s: None
• Labels: None
• Story Points: 7
• Team: SQuaRE

Description

HSC weekly reprocessing is now producing verification jobs, located at

  /datasets/hsc/repo/rerun/RC/w_2018_17/DM-14055/validateDrp/matchedVisitMetrics/*/*/*json 

This ticket is to start a discussion of what is needed to send those results to SQuaSH.

• dispatch_verify.py is used to send a verification job to SQuaSH. It has an --env option that can be used to grab information from the environment, which is added to the job metadata. SQuaSH uses that metadata to identify the dataset being processed, the ID of the run, URLs linking to additional information about the run, the stack version used, etc.
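The idea behind --env can be sketched as a registry mapping an environment name to the variables it reads. This is illustrative only; the names below are not the actual lsst.verify implementation, and the "jenkins" variable list is an assumption:

```python
import os

# Hypothetical registry: which environment variables each --env option reads.
# Names are illustrative, not the real dispatch_verify.py internals.
ENVIRONMENTS = {
    "jenkins": ["BUILD_ID", "BUILD_URL", "DATASET"],
    "ldf": ["DATASET", "DATASET_REPO_URL", "RUN_ID", "RUN_ID_URL", "VERSION_TAG"],
}

def collect_env_metadata(env_name):
    """Collect the variables defined for an environment into job metadata."""
    metadata = {"env_name": env_name}
    for var in ENVIRONMENTS[env_name]:
        # Missing variables become empty strings rather than raising.
        metadata[var.lower()] = os.environ.get(var, "")
    return metadata
```

The collected dictionary would then be merged into the job JSON before it is sent to the SQuaSH API.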

For CI we defined a "jenkins" environment; here we need to create another environment option, named perhaps ldf. My initial suggestion for the environment variables is:

• DATASET: name of the dataset being processed, e.g. HSC RC2
• DATASET_REPO_URL: do we have a Git LFS repo for the dataset?
• RUN_ID: can we use the associated JIRA ticket to identify this run? Following https://confluence.lsstcorp.org/display/DM/Reprocessing+of+the+HSC+RC+dataset it looks like a good idea; is there a better identifier for the runs?
• RUN_ID_URL: could be the corresponding JIRA ticket URL
• VERSION_TAG: the LSST stack version used, e.g. w_2017_14
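Some of these values are already encoded in the rerun path quoted above, so a small helper could derive them instead of requiring manual exports. A sketch, assuming the /datasets/hsc/repo/rerun/RC/&lt;weekly&gt;/&lt;ticket&gt;/... layout holds for all runs (the function name is hypothetical):

```python
import re

def parse_rerun_path(path):
    """Extract the weekly version tag and JIRA ticket from an RC rerun path.

    Assumes the .../rerun/RC/<weekly>/<ticket>/... layout seen in the
    HSC RC2 reprocessing reruns.
    """
    m = re.search(r"/RC/(w_\d{4}_\d{2})/(DM-\d+)/", path)
    if m is None:
        raise ValueError("unrecognized rerun path: %s" % path)
    return {"VERSION_TAG": m.group(1), "RUN_ID": m.group(2)}
```

For the path in this ticket this yields VERSION_TAG w_2018_17 and RUN_ID DM-14055.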

Once this new environment is created in dispatch_verify.py, an example command line to publish the results to SQuaSH would be:

  $ export DATASET="HSC RC2"
  $ export DATASET_REPO_URL=""
  $ export RUN_ID="DM-10084"
  $ export RUN_ID_URL="https://jira.lsstcorp.org/browse/DM-10084"
  $ export VERSION_TAG="w_2017_14"
  $ dispatch_verify.py --url https://squash-restful-api-demo.lsst.codes --user --password --env ldf --lsstsw lsstsw/ output/verify/job.json

Note that the above URL points to a demo instance of SQuaSH so we can test as needed without affecting the production instance.

Here I am assuming we have an lsstsw stack installation and thus access to the manifest file with the versions of the stack packages. We could suppress the --lsstsw option, but it would be useful to carry this information along so we can compare which stack packages changed from weekly to weekly build.
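That weekly-to-weekly comparison is a simple dictionary diff once the manifest has been parsed into package-to-version mappings. A sketch, assuming such mappings are available (the exact lsstsw manifest format is not specified here):

```python
def changed_packages(old, new):
    """Report packages whose version changed between two builds.

    old/new: dicts mapping package name -> version string, e.g. parsed
    from lsstsw manifests of consecutive weeklies (format assumed).
    Returns {package: (old_version_or_None, new_version_or_None)}.
    """
    return {
        pkg: (old.get(pkg), new.get(pkg))
        for pkg in set(old) | set(new)
        if old.get(pkg) != new.get(pkg)
    }
```

Packages added or dropped between weeklies show up with None on the missing side.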

• Simon Krughoff mentioned that we have one verification job per patch. I guess, as a start, we could publish results for one patch only. Later, one way to handle all patches would be to add the patch ID as job metadata so that we can distinguish them in SQuaSH. We can also use dispatch_verify.py to combine multiple verification jobs into a single JSON file if required (see https://sqr-019.lsst.io/#Post-processing-verification-jobs), but that wouldn't scale if we process many patches, which will be the case.
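The tag-then-combine idea above can be sketched over a simplified, hypothetical job JSON layout (a "measurements" list plus a "meta" dict; the real lsst.verify job schema is richer):

```python
def tag_and_combine(jobs):
    """Tag each per-patch verification job with its patch ID, then merge.

    jobs: list of (patch_id, job_dict) pairs, where job_dict follows a
    simplified, assumed layout: {"measurements": [...], "meta": {...}}.
    """
    combined = {"measurements": [], "meta": {}}
    for patch_id, job in jobs:
        for measurement in job.get("measurements", []):
            entry = dict(measurement)
            entry["patch"] = patch_id  # lets SQuaSH distinguish patches
            combined["measurements"].append(entry)
        combined["meta"].update(job.get("meta", {}))
    return combined
```

This keeps every per-patch measurement in one upload while preserving the patch provenance, at the cost of a job whose size grows linearly with the number of patches.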

Activity

Angelo Fausti created issue -
Field Original Value New Value
Epic Link DM-13784 [ 39331 ]
 Risk Score 0
 Epic Link DM-13784 [ 39331 ] DM-13785 [ 39332 ]
 Assignee Simon Krughoff [ krughoff ] Angelo Fausti [ afausti ]
 Watchers Angelo Fausti, Hsin-Fang Chiang [ Angelo Fausti, Hsin-Fang Chiang ] Angelo Fausti, Hsin-Fang Chiang, Simon Krughoff [ Angelo Fausti, Hsin-Fang Chiang, Simon Krughoff ]
 Epic Link DM-13785 [ 39332 ] DM-14312 [ 63991 ]
 Status To Do [ 10001 ] In Progress [ 3 ]
 Description HSC weekly re processing is now producing verification jobs, located at {noformat}  /datasets/hsc/repo/rerun/RC/w_2018_17/DM-14055/validateDrp/matchedVisitMetrics/*/*/*json {noformat}   This ticket is to start a discussion of what is needed to send those results to SQuaSH. - {{dispatch_verify.py}} is used to send a verification job to SQuaSH, it has the {{--env}} option that can be used to grab information from the environment which is added to the job metatada. That's used by SQuaSH to identify the dataset being processed, the ID of the run, URLs linking to additional information about the run, the stack version used, etc. For CI we defined a "jenkins" enviroment, here we need to create another environment option named perhaps {{ldf}}. My initial suggestion for the environment variables is:  - {{DATASET}}: name of the database being processed, e.g HSC RC2  - {{DATASET_REPO_URL}}: do we have a git lfs repo for the dataset?  - {{RUN_ID}} : can we use the associated jira ticket to identify this run? following [https://confluence.lsstcorp.org/display/DM/Reprocessing+of+the+HSC+RC+dataset] it looks like a good ideia, is there a better identifier for the runs? - {{RUN_ID_URL}}: could be the corresponding jira ticket URL - {{STACK_VERSION}}: the LSST stack version used, e.g {{w_2017_14}} Once this new environment is created in {{dispatch_verify.py}} an example of command line to publish the results to SQuaSH would be: {noformat} $export DATASET="HSC RC2"$ export DATASET_REPO_URL="" $export RUN_ID="DM-10084"$ export RUN_ID_URL="https://jira.lsstcorp.org/browse/DM-10084" $export STACK_VERSION="w_2017_14"$ dispatch_verify.py --url https://squash-restful-api-demo.lsst.codes --user --password --env ldf --lsstsw lsstsw/ output/verify/job.json {noformat} Note that the above URL points to a demo instance of SQuaSH so we can test as needed without affecting the production instance. 
Here I am assuming we have an lsstsw stack installation and thus access to the manifest file with the versions of the stack packages. We could suppress {{--lsstsw}}, but it would be useful to carry this information along so we can compare which stack packages changed from one weekly build to the next.
- [~krughoff] mentioned that we have one verification job per patch. I guess, as a start, we could publish results for one patch only? Later, one way to distinguish the jobs in SQuaSH would be to add the patch ID as job metadata. We could also use {{dispatch_verify.py}} to combine multiple verification jobs into a single JSON if required (see [https://sqr-019.lsst.io/#Post-processing-verification-jobs]), but this wouldn't scale if we process many patches, which will be the case.
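Tagging a job with its patch ID could be sketched as below. This is a hypothetical post-processing step; the top-level "meta" mapping is an assumption about the verification job JSON schema and should be adjusted to the actual lsst.verify format:
{noformat}
import json


def tag_job_with_patch(job_path, tract, patch):
    """Illustrative helper: insert tract/patch identifiers into a
    verification job JSON so SQuaSH can distinguish per-patch jobs."""
    with open(job_path) as f:
        job = json.load(f)
    # A top-level "meta" mapping is assumed here; the real lsst.verify
    # job schema may nest metadata differently.
    job.setdefault("meta", {})
    job["meta"]["tract"] = tract
    job["meta"]["patch"] = patch
    with open(job_path, "w") as f:
        json.dump(job, f)
{noformat}
Run over each matchedVisitMetrics job before dispatching, this would let SQuaSH group or filter measurements by patch instead of requiring all jobs to be merged into one JSON.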
 Link This issue relates to DM-13970 [ DM-13970 ]
 Attachment Screen Shot 2018-07-17 at 4.24.55 PM.png [ 33402 ]
 Story Points 7
 Resolution Done [ 10000 ] Status In Progress [ 3 ] Done [ 10002 ]
 Component/s squash [ 14169 ]

People

• Assignee:
Angelo Fausti
• Reporter:
Angelo Fausti
• Watchers:
Angelo Fausti, Hsin-Fang Chiang, John Parejko, John Swinbank, Jonathan Sick, Krzysztof Findeisen, Simon Krughoff