  1. Data Management
  2. DM-24247

butler validation error in ci_hsc_gen3

Details

    • 0.5
    • Architecture
    • No

    Description

      npease ran the butler config validation on the ci_hsc_gen3 output repo and got the following report:

      Template failure with key 'default': Template '{run:/}/{datasetType}.{component:?}/{tract:?}/{patch:?}/{label:?}/{abstract_filter:?}/{subfilter:?}/{physical_filter:?}/{visit:?}/{datasetType}_{component:?}_{tract:?}_{patch:?}_{label:?}_{abstract_filter:?}_{physical_filter:?}_{calibration_label:?}_{visit:?}_{exposure:?}_{detector:?}_{instrument:?}_{skymap:?}_{skypix:?}_{run}' is inconsistent with DatasetType(ps1_pv3_3pi_20170110, {htm7}, SimpleCatalog): {'exposure', 'patch', 'skypix', 'abstract_filter', 'tract', 'label', 'physical_filter', 'detector', 'instrument', 'calibration_label', 'skymap', 'subfilter', 'visit'} is not a superset of {'htm7'}.
      

      Fix the template.

          Activity

            tjenness Tim Jenness added a comment -

            I assume the problem is that htm7 is not in the template but since we never "put" any of these dataset types that's completely irrelevant.

            jbosch Given that the dataset type name is completely unpredictable, how should the validation command know that this dataset type should be ignored? Is there a special dimensions definition for these that could tell us?

            jbosch Jim Bosch added a comment -

            Nothing now. My first reaction to this ticket was that it's always going to be dangerous to try to validate all dataset types, especially since other users can add them to a shared repo and you shouldn't care, so maybe we should only validate an explicitly given set.

            For `htm7` and other skypix dimensions, we should probably move the special handling that replaces "skypix" with real dimension names from pipe_base to daf_butler, and that would also take care of the immediate problem.

            tjenness Tim Jenness added a comment -

            Template validation should be fairly reliable. It's only asking that every dimension in the dataset type is somewhere in the template.
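            As a minimal sketch of the check described above (not the actual daf_butler implementation), the following parses a format-style template and reports dataset type dimensions that do not appear in it. The names `template_fields`, `missing_dimensions`, and `NON_DIMENSION_FIELDS` are hypothetical, and the set of non-dimension fields is an assumption inferred from the error report in the description.

```python
import string

# Assumption: these template fields are not dimensions. They are the
# fields absent from the set printed in the error report above.
NON_DIMENSION_FIELDS = {"run", "datasetType", "component"}


def template_fields(template):
    """Return the set of field names used in a format-style template."""
    fields = set()
    for _, field_name, _, _ in string.Formatter().parse(template):
        if field_name:
            fields.add(field_name)
    return fields


def missing_dimensions(template, dimensions):
    """Return dataset type dimensions that the template never mentions."""
    available = template_fields(template) - NON_DIMENSION_FIELDS
    return set(dimensions) - available


template = "{run:/}/{datasetType}.{component:?}/{tract:?}/{skypix:?}_{run}"
print(missing_dimensions(template, {"htm7"}))   # htm7 is not a template field
print(missing_dimensions(template, {"tract"}))  # nothing missing
```

            With a template that only contains the generic `skypix` field, a dataset type whose dimension is `htm7` fails the superset test, which is the failure reported in this ticket.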

            Formatter validation is a bit trickier, since dataset types that are ingest-only and never put don't need a properly configured formatter.

            tjenness Tim Jenness added a comment -

            Where in pipe_base does it know that htm7 is a skypix? Alternatively if I knew the options I could always say that if I see htm\d I turn those directly into skypix for validation.
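            The pattern-matching alternative floated above could look roughly like the sketch below. It is an illustration only: `normalize_skypix` and `SKYPIX_PATTERN` are hypothetical names, and treating any `htm` name followed by one or more digits as a skypix dimension is an assumption rather than something the butler defines this way.

```python
import re

# Assumption: skypix dimension names look like "htm" followed by digits.
SKYPIX_PATTERN = re.compile(r"htm\d+")


def normalize_skypix(dimensions):
    """Replace concrete skypix names such as htm7 with the generic 'skypix'."""
    return {
        "skypix" if SKYPIX_PATTERN.fullmatch(name) else name
        for name in dimensions
    }


print(normalize_skypix({"htm7"}))           # the htm7 name becomes skypix
print(normalize_skypix({"htm7", "tract"}))  # other dimensions are untouched
```

            After normalization, the dimension set can be compared against a template that uses the generic `skypix` field.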

            jbosch Jim Bosch added a comment -

            https://github.com/lsst/pipe_base/blob/master/python/lsst/pipe/base/pipeline.py#L468

            I'm not sure if the best place to put this is in Registry.getDatasetType or something similar that could be used to put a DatasetType into "standard form", or something more localized to the template code. Ideally it'd also be a place we could guarantee that PlaceholderStorageClasses are replaced with real storage classes; I think that's a pretty similar problem.

            tjenness Tim Jenness added a comment -

            I don't really understand. I have the DatasetRef so I have the DatasetType. The DatasetType includes htm7 as a dimension. My problem is not getting the dataset type; my problem (I think) is understanding that htm7 is a skypix.

            tjenness Tim Jenness added a comment -

            Actually, the FileTemplate code already does that look up so I need to reuse that code. That doesn't seem bad.

            tjenness Tim Jenness added a comment -

            Turned out to be a quick fix. There are a couple of cleanup commits as well.


            People

              tjenness Tim Jenness
              tjenness Tim Jenness
              Jim Bosch
              Jim Bosch, Michelle Gower, Nate Pease [X] (Inactive), Tim Jenness