Data Management / DM-11234

Gen 2 butler queryMetadata fails to look up bias keys when date is ambiguous

    Details

    • Type: Story
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: daf_persistence
    • Labels:
      None
    • Team:
      Data Access and Database

      Description

      I wanted to look up which rafts had available biases (actually I'd have preferred to look up all the fields – see DM-10632), so I tried butler.queryMetadata("bias", ['raft']) but it failed with this traceback. I'd have expected it to return all the distinct values that had one of the 239 valid dates. It'd be acceptable to return all 239 values without making them distinct.

      Note that bias is a calib dataset. The same call works if I use raw, but that's not useful.

      butler.queryMetadata("bias", ['raft'])
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/ssd/lsstsw/stack/Linux64/daf_persistence/13.0-20-g8cd6840/python/lsst/daf/persistence/butler.py", line 1211, in queryMetadata
          tuples = repoData.repo.queryMetadata(datasetType, format, dataId)
        File "/ssd/lsstsw/stack/Linux64/daf_persistence/13.0-20-g8cd6840/python/lsst/daf/persistence/repository.py", line 259, in queryMetadata
          ret = self._mapper.queryMetadata(*args, **kwargs)
        File "/ssd/lsstsw/stack/Linux64/daf_persistence/13.0-20-g8cd6840/python/lsst/daf/persistence/mapper.py", line 127, in queryMetadata
          val = func(format, self.validate(dataId))
        File "/home/rlupton/LSST/obs/base/python/lsst/obs/base/cameraMapper.py", line 382, in queryClosure
          return mapping.lookup(format, dataId)
        File "/home/rlupton/LSST/obs/base/python/lsst/obs/base/mapping.py", line 409, in lookup
          (columns, dataId, len(lookups)))
      RuntimeError: No unique lookup for set([u'date']) from DataId(initialdata={}, tag=set([])): 239 matches
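
For illustration only, the behaviour requested in the description (return the distinct values across all matching rows rather than demanding a unique date) can be sketched against a stand-in registry. This is not the butler API: the in-memory SQLite table, its column names, the sample rows, and the helper query_distinct are all assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for a calib registry; table, columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bias (raft TEXT, calibDate TEXT)")
conn.executemany(
    "INSERT INTO bias VALUES (?, ?)",
    [("R00", "2017-06-01"), ("R00", "2017-06-02"), ("R01", "2017-06-02")],
)


def query_distinct(conn, table, columns):
    """Return the sorted distinct tuples of `columns`, without requiring a unique date."""
    # Identifiers are interpolated directly: acceptable in a sketch, not for untrusted input.
    sql = "SELECT DISTINCT %s FROM %s" % (", ".join(columns), table)
    return sorted(conn.execute(sql).fetchall())


rafts = query_distinct(conn, "bias", ["raft"])
print(rafts)
```

Here several dates match, yet the query still yields one tuple per raft, which is the result the description asks queryMetadata to produce.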
      

          Activity

          price Paul Price added a comment -

          I suspect this may be a mapper misconfiguration. Is this for ComCam?

          swinbank John Swinbank added a comment -

          Assigning this to DAX (& paging Fritz Mueller) on the basis that this is tagged as a Butler issue, but perhaps Paul Price can expand on his comment — if it's a problem with pipelines code, we should debug it locally.

          price Paul Price added a comment -

          For master of obs_comCam, I see:

              bias: {
                  template:    "bias/%(calibDate)s/bias-%(calibDate)s.fits.gz"
          ...
                  columns:     "date"
                  obsTimeName: "date"
          ...
              }
          

          1. I think the template should include raft and ccd (or sensor or whatever), or at least some method of distinguishing between individual CCDs.
          2. I think the columns should also include raft and ccd (or sensor or whatever) — you need those in order to select a bias.
          3. I would be wary of using obsTimeName (instead of using the standard column name taiObs for the date) because the CameraMapper code may not be respecting it everywhere it needs to.
          4. Ditto with the other calibs.
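
A small sketch of point 1, assuming ordinary Python %-formatting of the template against a data ID. The first template is the one quoted above; the second is a hypothetical alternative, and the key names raft and ccd as well as the sample data IDs are assumptions for illustration, not the obs_comCam configuration.

```python
# Template copied from the obs_comCam snippet quoted in this comment.
current = "bias/%(calibDate)s/bias-%(calibDate)s.fits.gz"
# Hypothetical alternative; the key names "raft" and "ccd" are assumptions.
proposed = "bias/%(calibDate)s/bias-%(raft)s-%(ccd)s-%(calibDate)s.fits.gz"

# Two invented data IDs that differ only in the CCD.
ids = [
    {"calibDate": "2017-06-01", "raft": "R00", "ccd": "S00"},
    {"calibDate": "2017-06-01", "raft": "R00", "ccd": "S01"},
]

current_paths = [current % d for d in ids]
proposed_paths = [proposed % d for d in ids]

print(current_paths)   # both entries expand to the same path
print(proposed_paths)  # one path per CCD
```

With the current template both CCDs map to a single file, so there is nothing in the path to tell them apart.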

          rhl Robert Lupton added a comment -

          While the mapper should indeed include a ccd-specification, and may have other problems, I still think that this is a bug.

          I can look up the data using get; queryMetadata fails with a traceback saying that the result is ambiguous, although that's not supposed to be a problem – it should return the set of rafts that are in those 239 matches.

          rhl Robert Lupton added a comment -

          This bug is foiling my attempts to look up missing-but-degenerate keys.

          I think that the problem may be related to this function:

              def need(self, properties, dataId):
                  """Ensures all properties in the provided list are present in
                  the data identifier, looking them up as needed.  This is only
                  possible for the case where the data identifies a single
                  exposure.
                  @param properties (list of strings) Properties required
                  @param dataId     (dict) Partial dataset identifier
                  @return (dict) copy of dataset identifier with enhanced values
                  """
                  newId = dataId.copy()
                  newProps = []                    # Properties we don't already have
                  for prop in properties:
                      if prop not in newId:
                          newProps.append(prop)
                  if len(newProps) == 0:
                      return newId
           
                  lookups = self.lookup(newProps, newId)
                  if len(lookups) != 1:
                      raise NoResults("No unique lookup for %s from %s: %d matches" %
                                      (newProps, newId, len(lookups)),
                                      self.datasetType, dataId)
                  for i, prop in enumerate(newProps):
                      newId[prop] = lookups[0][i]
                  return newId
          

          Note that self.lookup() is not called if properties is empty. Now, if you look at lookup you'll see:

              def lookup(self, properties, dataId):
                  """Look up properties for in a metadata registry given a partial
                  dataset identifier.
                  @param properties (list of strings)
                  @param dataId     (dict) Dataset identifier
                  @return (list of tuples) values of properties"""
           
          # Either look up taiObs in reference and then all in calibRegistry
          # Or look up all in registry
           
                  newId = dataId.copy()
                  if self.reference is not None:
                      where = []
                      values = []
                      for k, v in dataId.items():
                          if self.refCols and k not in self.refCols:
                              continue
                          where.append(k)
                          values.append(v)
           
                      # Columns we need from the regular registry
                      if self.columns is not None:
                          columns = set(self.columns)
                          for k in dataId.keys():
                              columns.discard(k)
                      else:
                          columns = set(properties)
          ...
          

          Note that we always look up self.columns before looking up properties. For biases we usually have columns: taiObs, and taiObs generally takes a range of values, so the lookup is not unique and an exception is raised. The logic that looks up self.columns is not new, and is presumably needed elsewhere in the calib lookup, but in this case it does the wrong thing.
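
The failure mode can be reproduced in a toy model. This is not the obs_base code: the rows, the helper names (lookup, resolve_unique, query_strict, query_tolerant) and the idea of skipping the column resolution are all assumptions, meant only to show why resolving taiObs first fails while a direct query would succeed.

```python
# Toy registry rows; all values are invented for illustration.
ROWS = [
    {"raft": "R00", "taiObs": "2017-06-01"},
    {"raft": "R00", "taiObs": "2017-06-02"},
    {"raft": "R01", "taiObs": "2017-06-02"},
]


def lookup(columns, data_id):
    """Distinct tuples of `columns` over the rows that match `data_id`."""
    found = set()
    for row in ROWS:
        if all(row.get(k) == v for k, v in data_id.items()):
            found.add(tuple(row[c] for c in columns))
    return sorted(found)


def resolve_unique(properties, data_id):
    """Fill in missing properties, insisting on exactly one match."""
    new_props = [p for p in properties if p not in data_id]
    if not new_props:
        return dict(data_id)
    matches = lookup(new_props, data_id)
    if len(matches) != 1:
        raise RuntimeError("No unique lookup for %s from %s: %d matches"
                           % (new_props, data_id, len(matches)))
    new_id = dict(data_id)
    new_id.update(zip(new_props, matches[0]))
    return new_id


def query_strict(fmt, data_id, mapping_columns):
    """Resolve the mapping's columns first, then look up the requested format."""
    return lookup(fmt, resolve_unique(mapping_columns, data_id))


def query_tolerant(fmt, data_id):
    """Hypothetical alternative: look up the requested format directly."""
    return lookup(fmt, data_id)


try:
    query_strict(["raft"], {}, ["taiObs"])
    strict_failed = False
except RuntimeError as exc:
    strict_failed = True
    print(exc)

rafts = query_tolerant(["raft"], {})
print(rafts)
```

In this model the strict path raises because two dates match an empty data ID, while the tolerant path returns the distinct rafts. Whether the real lookup can safely skip the column resolution for metadata queries is exactly the open question in this ticket.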

          tjenness Tim Jenness added a comment -

          Is this ticket still relevant?


            People

            Assignee:
            Unassigned
            Reporter:
            rhl Robert Lupton
            Watchers:
            John Swinbank, Merlin Fisher-Levine, Paul Price, Robert Lupton, Tim Jenness