Data Management / DM-32211

Test eotask-gen3 at NCSA

Details

    • Improvement
    • Status: Done
    • Resolution: Done

    Description

      Confirm that eotask-gen3 works at NCSA, and provide command examples.

          Activity

            Current issues:

            • Needs to switch to using lsst.utils.introspection.get_full_type_name.
            • Method eoSelectSuperFlatHigh uses a less-than sign where it should use a greater-than sign.
            • Missing physical_filter dimension on a flat input.
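
The comparison issue can be illustrated with a minimal sketch. The record class, both helper functions, the choice of exposure time as the compared quantity, and the 10-second cutoff are hypothetical stand-ins, not the eotask-gen3 code. The only point is that a "high" selector has to keep values above its cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExposureRecord:
    """Hypothetical stand-in for a butler exposure record."""
    exposure_id: int
    exposure_time: float  # seconds (assumed quantity, for illustration only)


def select_high(records, cutoff):
    """Keep records strictly above the cutoff (the corrected comparison)."""
    return [r for r in records if r.exposure_time > cutoff]


def select_high_buggy(records, cutoff):
    """Same selector with the sign flipped, as in the reported issue."""
    return [r for r in records if r.exposure_time < cutoff]


records = [ExposureRecord(1, 2.0), ExposureRecord(2, 30.0)]
high_ids = [r.exposure_id for r in select_high(records, cutoff=10.0)]
buggy_ids = [r.exposure_id for r in select_high_buggy(records, cutoff=10.0)]
```

With the flipped sign, the "high" selector returns exactly the low exposures, so a downstream task would silently process the wrong set.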

            Concerns:

            • There are additional data restrictions that exclude data that doesn't match hardcoded expectations. As an example, eoSelectSuperFlatHigh requires exposure.observation_type == 'flat' AND exposure.observation_reason == 'sflat'. This removes the ability to choose the exposures to process through the pipetask query.
            • If the data restrictions exclude all inputs, the task raises an exception, stopping all processing.
            • eoTask products are not in the same assembly state as those from cp_pipe, making direct comparisons difficult.
            • eoDefects creates more "large box" defects than cp_pipe.  It does a better job of identifying "dot-like" defects, but misses the LATISS C11 bad column.
            (Comment above added by czw Christopher Waters.)

            One known issue that I ran into was that none of the eoTask tasks were able to operate with CALIBRATION collections.  After debugging this, and comparing to other tasks, I was able to narrow down the issue to an incorrect set of dimensions for the task connections.

            The existing dimensions are ("instrument", "detector"). This prevents parallelization of the ISR processing steps, because all inputs are passed into the task at once. It also causes the collection issue: with multiple input exposures, the butler cannot determine the time range to use when looking up the calibrations, as it assumes there is a single unique dateobs for the inputs.
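
A toy model of that lookup, assuming a calibration is certified for a validity range and is found by asking which range contains the observation time. All names here are hypothetical and this is not the butler implementation. It only shows why a single timestamp resolves cleanly while a set of exposures spanning two validity ranges does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertifiedCalib:
    """Hypothetical calibration with a half-open validity range [begin, end)."""
    name: str
    begin: int
    end: int


def find_calib(calibs, obs_times):
    """Return the one calibration that is valid for every observation time.

    Raises LookupError if any time has no unique match, or if the times
    resolve to different calibrations.
    """
    found = set()
    for t in obs_times:
        valid = [c for c in calibs if c.begin <= t < c.end]
        if len(valid) != 1:
            raise LookupError(f"no unique calibration at time {t}")
        found.add(valid[0])
    if len(found) != 1:
        raise LookupError(f"inputs resolve to {len(found)} calibrations")
    return found.pop()


calibs = [CertifiedCalib("bias_a", 0, 100), CertifiedCalib("bias_b", 100, 200)]

# One exposure: the observation time picks out a single validity range.
single = find_calib(calibs, [50])

# Several exposures at once: the times straddle two validity ranges.
try:
    find_calib(calibs, [50, 150])
    ambiguous = False
except LookupError:
    ambiguous = True
```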

            Using a list of run collections avoids this error, but defeats the main benefit of the butler, which is having it look up the appropriate calibrations. Switching these tasks to a map-reduce design, where we map any ISR/image processing steps and then reduce those results to create the eoTask result set, would solve this issue.
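
The map-reduce split could look roughly like the following sketch. It uses plain Python rather than pipeline tasks, and the function names, data IDs, and the mean-based statistic are hypothetical. The map step sees one (exposure, detector) pair at a time, so each call has a single observation time, and only the reduce step groups results by detector. In pipeline terms this would presumably mean map connections whose dimensions include exposure, with the reduce task keeping ("instrument", "detector"), but that mapping is an assumption.

```python
from collections import defaultdict
from statistics import mean


def map_step(exposure, detector, pixels):
    """Per-(exposure, detector) work, e.g. ISR; here just a mean pixel value."""
    return {"exposure": exposure, "detector": detector, "mean": mean(pixels)}


def reduce_step(per_exposure_results):
    """Combine the map outputs into one value per detector."""
    grouped = defaultdict(list)
    for result in per_exposure_results:
        grouped[result["detector"]].append(result["mean"])
    return {det: mean(values) for det, values in grouped.items()}


# Hypothetical inputs: (exposure id, detector id, pixel values).
inputs = [
    (1001, 0, [10.0, 12.0]),
    (1002, 0, [20.0, 22.0]),
    (1001, 1, [5.0, 7.0]),
]

mapped = [map_step(*item) for item in inputs]  # each call is independent
summary = reduce_step(mapped)
```

Because the map calls do not depend on each other, they can run in parallel, which addresses the parallelization concern as well.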

            (Comment above added by czw Christopher Waters.)

            For the above concerns: the data selection can be overridden in the pipeline to use the "any" selector. This allows user-specified exposures to be processed without being excluded. It also avoids the exception raised when all inputs are excluded.
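
As a rough illustration of the difference between a restrictive selector and the "any" selector: the registry, function names, and record fields below are hypothetical, and the actual override is a pipeline configuration setting whose field name is not given here.

```python
def select_superflat(records):
    """Restrictive selector: mirrors the hardcoded type/reason check."""
    return [
        r for r in records
        if r["observation_type"] == "flat" and r["observation_reason"] == "sflat"
    ]


def select_any(records):
    """Permissive selector: keeps whatever the user's query supplied."""
    return list(records)


# Hypothetical registry standing in for a configurable selector choice.
SELECTORS = {"superflat": select_superflat, "any": select_any}


def choose_inputs(records, selector_name):
    """Apply the named selector and raise if nothing survives."""
    selected = SELECTORS[selector_name](records)
    if not selected:
        raise RuntimeError(f"selector {selector_name!r} excluded all inputs")
    return selected


# A flat taken for some other reason than 'sflat'.
records = [{"id": 7, "observation_type": "flat", "observation_reason": "test"}]

kept = choose_inputs(records, "any")
try:
    choose_inputs(records, "superflat")
    raised = False
except RuntimeError:
    raised = True
```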

            (Comment above added by czw Christopher Waters.)
            tjenness Tim Jenness added a comment -

            Please can we not have "gen3" in the name? We can see a very near future where "gen2" no longer exists and in a year people are going to be wondering what "gen3" means and why is it in the name of the package.

            I've marked this as done, as I don't think there's any further useful work to do.

            (Comment above added by czw Christopher Waters.)

            People

              czw Christopher Waters
              Christopher Waters, Merlin Fisher-Levine, Tim Jenness