Data Management / DM-6912

Please add a way to interpret opaque sourceId in terms of visit/tract/patch/ccd/...


    Details

    • Type: Story
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Team: Architecture

      Description

      The mappers provide a way to pack useful information into sourceIds (e.g. visit/raft/chip or tract/patch/filter), but no way to do the unpacking.

      Please provide a unified interface that all mappers realise to do this; my trial implementations use `splitId`, and I provide a couple of examples below. If the way that the packing is done is standardised (in terms of variables giving the number of bits per field), this might be done just once in some base Mapper.

      As implied, you need to know a bit about where the sourceId came from (a visit? a patch?); this is something the user knows, as they got the data from the butler, which implies that this function should be findable under the same name (including butler aliases etc.).

      We also need a corresponding routine to unpack the exposureId.

      My implementation uses a MapperInfo class, but putting it directly into the mapper would probably be preferable. Note that `splitId` accepts numpy arrays (or afwTable columns), and optionally returns a dict; both features have proved useful.

             @staticmethod
             def splitId(oid, asDict=False):
                 """Split an ObjectId into visit, raft, sensor, and objId.

                 The id is laid out (high to low) as visit | raft+sensor (9 bits) | objId (16 bits).
                 """
                 objId = int((oid & 0xffff) - 1)     # Should be the same value as was set by apps code
                 oid >>= 16
                 raftSensorId = oid & 0x1ff
                 oid >>= 9
                 visit = int(oid)

                 # raftSensorId encodes raft*10 + sensor; format each as an "i,j" grid position
                 raftId, sensorId = int(raftSensorId//10), int(raftSensorId%10)
                 raft = "%d,%d" % (raftId//5, raftId%5)
                 sensor = "%d,%d" % (sensorId//3, sensorId%3)

                 if asDict:
                     return dict(visit=visit, raft=raft, sensor=sensor, objId=objId)
                 else:
                     return visit, raft, sensor, objId
      

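      If the packing were standardised as suggested above, in terms of named fields and a number of bits per field, the pack/split pair could live in one place. Below is a rough sketch of that idea (every name in it is hypothetical, not existing Mapper API); the same mechanism would cover the exposureId simply by declaring a second field list.

             # Hypothetical sketch only: none of these names exist in the current mappers.
             # Each concrete mapper declares its bit layout once; the base class then
             # provides both the packing and the splitting.
             import numpy as np

             class IdPackingMapper:
                 # (field, nbits) pairs, least-significant field first; concrete mappers
                 # override this (and could declare a second list for exposureId).
                 ID_FIELDS = [("objId", 16), ("sensor", 9), ("visit", 32)]

                 @classmethod
                 def packId(cls, **fields):
                     """Pack named fields into a single integer id."""
                     oid = 0
                     for name, nbits in reversed(cls.ID_FIELDS):
                         oid = (oid << nbits) | (int(fields[name]) & ((1 << nbits) - 1))
                     return oid

                 @classmethod
                 def splitId(cls, oid, asDict=False):
                     """Split an id (scalar or numpy array) back into its fields."""
                     oid = np.array(oid, dtype="int64")
                     values = {}
                     for name, nbits in cls.ID_FIELDS:
                         values[name] = np.bitwise_and(oid, (1 << nbits) - 1)
                         oid = oid >> nbits
                     return values if asDict else tuple(values[name] for name, _ in cls.ID_FIELDS)
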
      Here's an example that splits a coaddId from HSC (there may be some irrelevant details in here; I hope not...). The mapper is the mapper in use (this is a static method of HscMapperInfo):

              # Requires: import numpy as np; import lsst.afw.image as afwImage
              @staticmethod
              def splitCoaddId(oid, asDict=True, hasFilter=True):
                  """Split an ObjectId (maybe a numpy array) into tract, patch, [filter], and objId.

                  The bit widths (_nbit_id, _nbit_filter, _nbit_patch) come from the mapper;
                  see obs/subaru/python/lsst/obs/hscSim/hscMapper.py"""
                  mapper = HscMapperInfo.Mapper   # the mapper currently in use

                  try:                            # ensure oid is at least 1-dimensional
                      oid[0]
                  except TypeError:
                      oid = [oid]

                  oid = np.array(oid, dtype='int64')
                  objId = np.bitwise_and(oid, 2**mapper._nbit_id - 1)
                  oid >>= mapper._nbit_id

                  if hasFilter:
                      filterId = np.bitwise_and(oid, 2**mapper._nbit_filter - 1).astype('int32')
                      oid >>= mapper._nbit_filter

                      # unicode dtype (same width as the original "a6") so str() gives a clean name
                      filterName = np.empty(oid.size, "U6")

                      for fid in np.unique(filterId):
                          name = afwImage.Filter(int(fid)).getName()
                          filterName[filterId == fid] = name
                  else:
                      filterName = None

                  patchY = np.bitwise_and(oid, 2**mapper._nbit_patch - 1).astype('int32')
                  oid >>= mapper._nbit_patch
                  patchX = np.bitwise_and(oid, 2**mapper._nbit_patch - 1).astype('int32')
                  oid >>= mapper._nbit_patch
                  patch = np.char.add(np.char.add(patchX.astype(str), ","), patchY.astype(str))
                  patch.shape = patchY.shape    # keep the original shape (a no-op for 1-d inputs)

                  tract = oid.astype('int32')

                  if oid.size == 1:     # sqlite doesn't like numpy types
                      if filterName is not None:
                          filterName = str(filterName[0])
                      tract = int(tract[0])
                      patch = str(patch[0])
                      objId = int(objId[0])

                  if asDict:
                      return {"filter": filterName, "tract": tract, "patch": patch, "objId": objId}
                  else:
                      return filterName, tract, patch, objId
      

      (This is called splitCoaddId, but there's another level of indirection to call the appropriate splitter for a given dataset.)
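
      That extra level of indirection could be no more than a dispatch keyed on the dataset type the user got from the butler. A sketch (the keys and the splitSourceId name are illustrative, and the values simply reuse the example methods above):

             # Sketch of the dispatch layer; the keys and the function name are illustrative.
             _SPLITTERS = {
                 "src": MapperInfo.splitId,                    # per-visit source ids
                 "deepCoadd_src": HscMapperInfo.splitCoaddId,  # coadd source ids
             }

             def splitSourceId(datasetType, oid, **kwargs):
                 """Split oid with the splitter appropriate to the given dataset type."""
                 return _SPLITTERS[datasetType](oid, **kwargs)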

            Activity

            tjenness Tim Jenness added a comment -

            source ID management is still an open question that has to be solved at some point. It won't involve gen2 mappers though.

            swinbank John Swinbank added a comment -

            Assigning this to Tim, as Middleware Manager.

            tjenness Tim Jenness added a comment -

            Jim Bosch do you have a plan for this in gen3?

            jbosch Jim Bosch added a comment -

            No complete plan, but:

            • A static method on afw.table.IdFactory is probably the appropriate place to put the code that splits the source ID into a per-data-ID autoincrement integer and the packed data ID integer, because that's the class that mangles those two together.
            • There is already a daf.butler.DimensionPacker.unpack method to convert the packed data ID integer back into a DataCoordinate (i.e. a mapping). But of course we have the issue that this only works if one used daf.butler.DimensionPacker.pack to pack the original DataCoordinate into that integer, and we also have packing code in astro_metadata_translator (and while I think those happen to do the same thing right now, at least for HSC, it'd be easy for them to get out of sync).
            • High-level code that puts the two of those together probably belongs in meas_base somewhere (see the sketch after this list).
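
            A rough sketch of what that high-level helper might look like (the splitSourceId name and the way the DimensionPacker instance is obtained are assumptions; the split convention assumes the per-data-ID counter sits in the low-order reserved bits, which appears to be how IdFactory.makeSource composes the two):

                def splitSourceId(sourceId, packer, reservedBits):
                    """Split sourceId into (dataId, counter).

                    Assumes the low `reservedBits` bits hold the per-data-ID counter
                    and the bits above them hold the packed data ID.
                    """
                    counter = sourceId & ((1 << reservedBits) - 1)
                    packedDataId = sourceId >> reservedBits
                    dataId = packer.unpack(packedDataId)  # daf.butler.DimensionPacker.unpack -> DataCoordinate
                    return dataId, counter

            Whatever number of reserved bits was used when the ids were created would have to be known to the caller, which is part of the argument for keeping this logic next to afw.table.IdFactory.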

              People

              Assignee:
              Unassigned
              Reporter:
              rhl Robert Lupton
              Watchers:
              Gregory Dubois-Felsmann, Hsin-Fang Chiang, Jim Bosch, John Parejko, Kian-Tat Lim, Robert Lupton, Tim Jenness
              Votes:
              0

                Dates

                Created:
                Updated:
