  1. Data Management
  2. DM-24247

butler validation error in ci_hsc_gen3

    Details

    • Story Points:
      0.5
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      Nate Pease [X] ran the butler config validation on the ci_hsc_gen3 output repo and got the following report:

      Template failure with key 'default': Template '{run:/}/{datasetType}.{component:?}/{tract:?}/{patch:?}/{label:?}/{abstract_filter:?}/{subfilter:?}/{physical_filter:?}/{visit:?}/{datasetType}_{component:?}_{tract:?}_{patch:?}_{label:?}_{abstract_filter:?}_{physical_filter:?}_{calibration_label:?}_{visit:?}_{exposure:?}_{detector:?}_{instrument:?}_{skymap:?}_{skypix:?}_{run}' is inconsistent with DatasetType(ps1_pv3_3pi_20170110, {htm7}, SimpleCatalog): {'exposure', 'patch', 'skypix', 'abstract_filter', 'tract', 'label', 'physical_filter', 'detector', 'instrument', 'calibration_label', 'skymap', 'subfilter', 'visit'} is not a superset of {'htm7'}.
      

      Fix the template.

            Activity

            tjenness Tim Jenness added a comment -

            I assume the problem is that htm7 is not in the template, but since we never "put" any of these dataset types, that's completely irrelevant.

            Jim Bosch, given that the dataset type name is completely unpredictable, how should the validation command know that this dataset type should be ignored? Is there a special dimensions definition for these that could tell us?

            jbosch Jim Bosch added a comment -

            Nothing now. My first reaction to this ticket was that it's always going to be dangerous to try to validate all dataset types, especially since other users can add them to a shared repo and you shouldn't care, so maybe we should only validate an explicitly given set.

            For `htm7` and other skypix dimensions, we should probably move the special handling that replaces "skypix" with real dimension names from pipe_base to daf_butler, and that would also take care of the immediate problem.

            tjenness Tim Jenness added a comment -

            Template validation should be fairly reliable. It's only asking that every dimension in the dataset type is somewhere in the template.

            Formatter validation is a bit trickier, since dataset types that are ingest-only and never put don't need a properly configured formatter.
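
The template check described in this comment (every dimension of the dataset type must appear somewhere in the template) can be sketched in a few lines. This is a standalone illustration, not the daf_butler implementation: the helper names `template_fields` and `missing_dimensions` are hypothetical, the template is a shortened version of the one in the report, and the field names are extracted with the standard library's `string.Formatter` rather than with butler's own template code.

```python
from string import Formatter


def template_fields(template: str) -> set[str]:
    """Return the field names used in a format-style template.

    Format specs such as ':/' or ':?' are ignored; only the name before
    the colon is kept.
    """
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_dimensions(template: str, dimensions: set[str]) -> set[str]:
    """Return the dimensions that do not appear as fields in the template."""
    return set(dimensions) - template_fields(template)


# Shortened version of the 'default' template from the report (assumption:
# the omitted fields do not matter for the illustration).
template = (
    "{run:/}/{datasetType}.{component:?}/{tract:?}/{patch:?}/"
    "{datasetType}_{visit:?}_{skypix:?}_{run}"
)

# The dataset type in the report has the single dimension 'htm7', which the
# template never names, so the check reports it as missing.
print(missing_dimensions(template, {"htm7"}))
print(missing_dimensions(template, {"tract", "patch"}))
```

With this check, `htm7` is reported as missing even though the template has a `skypix` field, which is the mismatch behind the reported error.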

            tjenness Tim Jenness added a comment -

            Where in pipe_base does it know that htm7 is a skypix? Alternatively, if I knew the options, I could always say that if I see htm\d I turn those directly into skypix for validation.
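
The pattern-based alternative floated in this comment could look roughly like the sketch below. It is only an illustration of that idea, under stated assumptions: the names `SKYPIX_PATTERN`, `normalize_dimensions`, and `is_consistent` are hypothetical, the pattern accepts one or more digits after `htm`, and other skypix naming schemes are not covered. A later comment in this thread notes that the FileTemplate code already does this lookup, and this sketch does not reproduce that code.

```python
import re
from string import Formatter

# Assumption: skypix dimensions are named 'htm' followed by a level number.
SKYPIX_PATTERN = re.compile(r"htm\d+")


def normalize_dimensions(dimensions: set[str]) -> set[str]:
    """Replace concrete skypix dimension names with the generic 'skypix'."""
    return {"skypix" if SKYPIX_PATTERN.fullmatch(d) else d for d in dimensions}


def is_consistent(template: str, dimensions: set[str]) -> bool:
    """Check that every (normalized) dimension is a field of the template."""
    fields = {name for _, name, _, _ in Formatter().parse(template) if name}
    return normalize_dimensions(dimensions) <= fields


template = "{run:/}/{datasetType}_{tract:?}_{skypix:?}_{run}"

print(is_consistent(template, {"htm7"}))           # htm7 is treated as skypix
print(is_consistent(template, {"htm7", "visit"}))  # 'visit' is not in this template
```

Normalizing before the subset test keeps the template unchanged: a single `skypix` field covers every level of the pixelization.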

            jbosch Jim Bosch added a comment -

            https://github.com/lsst/pipe_base/blob/master/python/lsst/pipe/base/pipeline.py#L468

            I'm not sure if the best place to put this is in Registry.getDatasetType or something similar that could be used to put a DatasetType into "standard form", or something more localized to the template code. Ideally it'd also be a place we could guarantee that PlaceholderStorageClasses are replaced with real storage classes; I think that's a pretty similar problem.

            tjenness Tim Jenness added a comment -

            I don't really understand. I have the DatasetRef so I have the DatasetType. The DatasetType includes htm7 as a dimension. My problem is not getting the dataset type, my problem (I think) is understanding that htm7 is a skypix.

            tjenness Tim Jenness added a comment -

            Actually, the FileTemplate code already does that lookup, so I just need to reuse that code. That doesn't seem bad.

            tjenness Tim Jenness added a comment -

            Turned out to be a quick fix. There are a couple of cleanup commits as well.


              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, Michelle Gower, Nate Pease [X] (Inactive), Tim Jenness
