  Data Management / DM-10088

Design verify_ap


    Details

      Description

      Once the design of validate_drp (or another more desirable validation framework) is understood, produce a similar design for the verify_ap product. I will not dictate the format of that design here, but this issue should be resolved by linking to it in some way.

        Attachments

          Issue Links

            Activity

            Meredith Rawls added a comment (edited)

            The verify_ap system will bring together the Minimum Viable System (MVS) with LSST's verify framework. Its primary role is to automatically process a test dataset through the Level 1 pipeline via continuous integration so that Alert Production (AP) performance metrics can be monitored. It will also be straightforward for any user to run it on a test dataset of their choice. The main components of verify_ap are summarized nicely in the Venn diagram by Eric Bellm on the MVS Confluence page.

            Currently, a prototype version of the MVS portion of verify_ap lives on GitHub in lsst-dm/decam_hits. The intention is to rename this repo to ap_pipe (which is called by verify_ap) once it more closely resembles a finished product.

            Our first goal is to have a functional demonstration of verify_ap running in time for the August 2017 Project & Community "All Hands" Meeting. The work is being done by a core team in UW DM led by Eric and Simon, consisting of Meredith, David, Chris, and Krzysztof.

            The main pieces that need to be done are grouped into epics as follows:

            • DM-9676: Promote prototype AP system to verify_ap
            • DM-10770: Implement initial metrics for MVS
            • DM-10771: Identify and procure datasets for calculating metrics for MVS
            • DM-10773: Design and implement MVS for alert production

            My (Meredith's) main role in the next several weeks is to get the MVS into a more useful state so that it can be plugged into the verify_ap system. David is laying the groundwork for verify_ap datasets: the initial test dataset needs to be easily accessible by a continuous integration system, and it should also be straightforward to use verify_ap with other datasets in the future. Chris's piece fits in at the end of the proto-MVS, which currently yields DIASources but will need to create the final data product, DIAObjects. Krzysztof is working on initial metrics, such as runtime measurements and counts of the DIAObjects created. He will also begin building a high-level verify_ap package from which the MVS and friends can be called.

            Meredith Rawls added a comment

            The comment above is a summary of what we have recently discussed for verify_ap, and I think it provides a good outline for the verify_ap-related work that needs doing over the next several weeks. Can you please take a look and see if I've missed anything? Thank you!

            Eric Bellm added a comment

            Hi Meredith, could you summarize here the whiteboard design for the package that you, Simon, and I worked out yesterday?

            I also think the decam_hits package should probably migrate into a package called ap_pipe, which is then called by verify_ap.

            Meredith Rawls added a comment

            Some additional details of the verify_ap design that were discussed by Eric, Simon, and Meredith on 6/29 are below.

            Running verify_ap:

            • Continuous integration (probably the new and improved Jenkins that is coming soon)
            • Also possible to run from the command line

            Datasets used by verify_ap:

            • Will be eups packages that can be set up
            • Make obs_package a dependency of the data repo by adding it to the dataset's eups table file
            • Include a stub input repo with a mapper file
            • Dataset directory structure contains data/*.fz and input_repo/_mapper (see the sketch after this list)
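
            As a concrete illustration of the dataset layout above, here is a minimal Python sketch of how the verify_ap driver might locate such a dataset once its eups package is set up. The package name "hits_data" is the hypothetical name used in the pseudocode further below, the data/ and input_repo/ paths follow the directory structure just listed, and the obs package dependency would be declared in the dataset's eups table file (e.g., a setupRequired line for the relevant obs package); none of this is a finalized interface.

                import glob
                import os

                import lsst.utils  # getPackageDir() works for any eups package that is set up


                def find_dataset(dataset_name="hits_data"):
                    """Locate the raw data and stub input repo of a verify_ap dataset package.

                    "hits_data" is the hypothetical package name from the pseudocode below;
                    the data/ and input_repo/ subdirectories follow the layout described above.
                    """
                    dataset_dir = lsst.utils.getPackageDir(dataset_name)  # raises if the package is not set up
                    raw_files = glob.glob(os.path.join(dataset_dir, "data", "*.fz"))
                    input_repo = os.path.join(dataset_dir, "input_repo")  # contains the _mapper file
                    return raw_files, input_repo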

            Current state of MVS (aka the script in lsst-dm/decam_hits that will be called ap_pipe):

            • ingest
            • ingest calibs
            • process (processCcd)
            • difference image
            • Not yet implemented: association (turning DIASources into DIAObjects)
            • Not yet implemented: afterburners (for measuring metrics etc.)
            • The afterburners are not technically part of the MVS, but are called immediately after

            Pseudocode for main verify_ap routine (note this also appears in DM-11118):

            main():
                find input data
                "--dataset"
                    load_dataset_config --> yaml file with entries like "HiTS": "hits_data"
                                            and "mapper": "obs_package"
                    (for starters, self-contained datasets that we know will work)
                    input_repo_dir = eups.getPackageDir(datasetname)
                    ?? eups setup obs_package ?? (supposedly this can be done within python?)
                "--output"
                    copy stub input repo to output
                "--dataIdString" (note: often just --id as a command-line string elsewhere)
                ap_pipe.ingest_raws(data_location, repo_location)
                ap_pipe.ingest_calibs(...)
                ap_pipe.process(repo_location, rerun, parallelization_level, dataIdString)
                    --> how to get templates? in the same repo as the input data?
                ap_pipe.diff(repo_location, rerun, parallelization_level, dataIdString)
                ap_pipe.assoc(repo_location, rerun)
                ap_pipe.afterburner?? --> the afterburner could live in the same package as main()
                                          and be called like verify_ap_afterburner
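
            As a rough illustration only, the pseudocode above might translate into an argparse-based driver along the following lines. This is a sketch under several assumptions: the ap_pipe function names and arguments are copied from the pseudocode and do not exist yet, the dataset-to-package mapping is read from a hypothetical datasets.yaml, the calib location and rerun name are placeholders, and the afterburner step is omitted.

                import argparse
                import os
                import shutil

                import yaml  # PyYAML, for the dataset config file
                import lsst.utils

                import ap_pipe  # hypothetical package; see the renaming discussion above


                def main():
                    parser = argparse.ArgumentParser(description="Run the AP pipeline on a named test dataset.")
                    parser.add_argument("--dataset", required=True,
                                        help="Dataset label, e.g. 'HiTS', looked up in datasets.yaml")
                    parser.add_argument("--output", required=True,
                                        help="Location of the output repo")
                    parser.add_argument("--dataIdString", default="",
                                        help="Butler data ID string, e.g. 'visit=123456 ccdnum=10'")
                    args = parser.parse_args()

                    # datasets.yaml is assumed to map labels to eups package names,
                    # e.g. {"HiTS": "hits_data"}, as in the pseudocode above.
                    with open("datasets.yaml") as f:
                        dataset_config = yaml.safe_load(f)
                    dataset_dir = lsst.utils.getPackageDir(dataset_config[args.dataset])

                    # Copy the stub input repo (containing _mapper) into the output location.
                    shutil.copytree(os.path.join(dataset_dir, "input_repo"), args.output)

                    # The remaining steps follow the pseudocode; all signatures are placeholders.
                    ap_pipe.ingest_raws(os.path.join(dataset_dir, "data"), args.output)
                    ap_pipe.ingest_calibs(os.path.join(dataset_dir, "calibs"), args.output)
                    ap_pipe.process(args.output, "rerun/test", 1, args.dataIdString)
                    ap_pipe.diff(args.output, "rerun/test", 1, args.dataIdString)
                    ap_pipe.assoc(args.output, "rerun/test")


                if __name__ == "__main__":
                    main()

            Under these assumptions, a continuous-integration job or a developer at the command line could then run something like "python bin/verify_ap.py --dataset HiTS --output /path/to/output --dataIdString 'visit=123456 ccdnum=10'", matching the two run modes listed earlier.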
            

            Eric Bellm added a comment

            looks good!


              People

              Assignee:
              Meredith Rawls
              Reporter:
              Simon Krughoff
              Reviewers:
              Eric Bellm
              Watchers:
              Eric Bellm, Meredith Rawls, Simon Krughoff
