DM-4817 (project: Data Management)

Read and understand `ci_hsc` and plan relationship with `validate_drp`

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Read through and run the `ci_hsc` tests and plan for how this module and efforts should relate to `validate_drp`.

      a. Add capabilities to `validate_drp` to run the tests in `ci_hsc`.
      b. Compare frameworks.
      c. Plan for how such validation and continuous integration data sets should be constructed.

            Activity

            wmwood-vasey Michael Wood-Vasey added a comment -

            Look as well at

            https://jira.lsstcorp.org/browse/DM-4730

            which enables doing comparisons across two different analyses of the same data.

            wmwood-vasey Michael Wood-Vasey added a comment - - edited

            The brief answer is that `validate_drp` picks up just where `ci_hsc` leaves off.

            Running

            validateDrp.py ${CI_HSC_DIR}/DATA
            

            on the `ci_hsc` output repo produces useful plots and summary statistics on the processing of the `ci_hsc` data.
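
            For scripted use, the same invocation can be wrapped so that a missing environment variable or output repo fails early with a clear message. This is only a sketch: the helper names `build_validate_command` and `run_validate` are hypothetical, and the only things taken from the comment above are the `validateDrp.py` script name and the `${CI_HSC_DIR}/DATA` path.

```python
import os
import subprocess
from pathlib import Path


def build_validate_command(env=None):
    """Return the argv list for validateDrp.py on the ci_hsc output repo."""
    env = os.environ if env is None else env
    ci_hsc_dir = env.get("CI_HSC_DIR")
    if not ci_hsc_dir:
        raise RuntimeError("CI_HSC_DIR is not set; set up ci_hsc first")
    # Same path as ${CI_HSC_DIR}/DATA in the shell form of the command.
    repo = Path(ci_hsc_dir) / "DATA"
    return ["validateDrp.py", str(repo)]


def run_validate(env=None):
    """Run validateDrp.py, failing early if the output repo is missing."""
    command = build_validate_command(env)
    repo = Path(command[1])
    if not repo.is_dir():
        raise FileNotFoundError(f"ci_hsc output repo not found: {repo}")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(command, check=True)
```

            Keeping the command construction separate from the execution makes the path logic checkable without the stack being set up.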

            wmwood-vasey Michael Wood-Vasey added a comment -

            `ci_hsc` runs many tests through an SCons-based build. In the long term it should move to a more Task-driven interface that can be driven from the command line, from SCons, or by calls from outside code.

            Continuous integration will occupy a spectrum that extends toward validation: quick 5-minute CI checks on "small" data sets, nightly runs on "medium" data sets, and even weekly full builds on "large" data sets with full focal planes and deep coadds. `ci_hsc` currently occupies the "nightly", or "medium", point on that scale.

            This review led to the following two tickets for further work.

            1. Provide general comparison data sets, both input and processed. These give developers convenient comparisons and enable a range of tests against "known" results, from quick-turnaround CI up to full focal-plane coadd comparisons in nightly builds.
            DM-5147 - Provide usable repos in `validation_data_*` packages.
            2. Compare changes to specific measurements of objects when the same data are processed through different pipelines or with different algorithms or configurations.
            DM-5270 - Provide comparison routines for comparing two repos of the same data.

            The comparison routines in u/laurenm/DM-4730 should map relatively easily to general cameras, given some work to remove HSC-specific prints and text annotations and to add robustness against cameras or sources that lack certain properties.
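
            The kind of comparison DM-5270 asks for can be sketched as matching objects by identifier across two runs and differencing one measurement. This is a minimal illustration and not the routines in u/laurenm/DM-4730: the catalogs are plain dicts standing in for data read from two repos, and the name `compare_measurements` and the sample values are hypothetical.

```python
import math
import statistics


def compare_measurements(run_a, run_b):
    """Difference one measurement for objects present in both runs.

    run_a, run_b: dicts mapping object id -> measured value
    (hypothetical stand-ins for catalogs read from two repos).
    """
    common = sorted(set(run_a) & set(run_b))
    # Skip objects where either run failed to produce a finite value.
    deltas = {
        obj: run_b[obj] - run_a[obj]
        for obj in common
        if math.isfinite(run_a[obj]) and math.isfinite(run_b[obj])
    }
    return {
        "n_matched": len(deltas),
        "only_in_a": sorted(set(run_a) - set(run_b)),
        "only_in_b": sorted(set(run_b) - set(run_a)),
        "median_delta": statistics.median(deltas.values()) if deltas else None,
        "max_abs_delta": max((abs(d) for d in deltas.values()), default=None),
    }


# Made-up magnitudes from two processings of the same data.
run_a = {1: 20.0, 2: 21.5, 3: float("nan"), 4: 19.0}
run_b = {1: 20.25, 2: 21.5, 3: 22.0, 5: 18.0}
summary = compare_measurements(run_a, run_b)
print(summary)
```

            Reporting the unmatched objects and skipping non-finite values is one simple way to get the robustness mentioned above when a source lacks a given property in one of the runs.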


              People

              • Assignee:
                wmwood-vasey Michael Wood-Vasey
                Reporter:
                wmwood-vasey Michael Wood-Vasey
                Watchers:
                John Swinbank, Michael Wood-Vasey