  1. Data Management
  2. DM-22496

Provide an interim definition of the obs_publisher_did field in ObsCore that is usable for Gen3-based image data

Details

    • Story
    • Status: In Progress
    • Resolution: Unresolved
    • None
    • Design Documents
    • None
    • Architecture

    Description

      In order to facilitate the independent development of both:

      1. ObsCore-compliant image metadata services (ObsTAP and SIAv2); and
      2. a SODA service

      we need a near-term-usable definition of how to construct ObsCore obs_publisher_did values for data in a Gen3 repository.

      "Near-term" means, at a minimum, "expected to work for Gen3-based HSC datasets" and usable for the DP0 data release and the Spring 2020 LSP testing round.

      It would be good to have an idea of how this would also work for LSST simulated images (e.g., from DESC DC2), should these end up being part of DP0, but this is not a requirement for this ticket.

      Nor is it a requirement for this ticket that either

      1. we devise a solution that is relevant to the LSST operations era or
      2. the proposed solution continue to work without change throughout the rest of construction and commissioning.

      It is OK if a redesign turns out to be needed somewhere along the way. It is more important that we get this off the ground.

      The deliverable from this ticket is a written spec, not code.

          Activity

            gpdf Gregory Dubois-Felsmann added a comment -

            I will post some more about this by tomorrow to clarify what the issues are.

            At a very crude level, though, I believe what is needed is a way to represent the triple of (BG3 repo, DataId, DatasetType).
            kennylo Kenny Lo added a comment -

            That's what I was thinking as well. Instead of treating the three separately, as in legacy Imgserv, the SODA service for Butler Gen3 must start treating the triple as the new image ID in the LSST context.

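To make the triple idea concrete, here is a minimal sketch of one way a (repo, DatasetType, DataId) triple could be packed into a single `ivo://`-style string and unpacked again. Everything specific in it is an assumption for illustration, not something agreed in this ticket: the authority `example.org`, the resource key `images`, the delimiter layout, the repo name `hsc_rc2`, and the function names `make_did` and `parse_did`. It uses only the Python standard library and does not touch the Butler API.

```python
from urllib.parse import quote, unquote

# Hypothetical naming authority and resource key; not taken from the ticket.
AUTHORITY = "example.org"
RESOURCE = "images"


def _enc(value):
    """Percent-encode a value so '/', '&', '=' and '?' are free to act as delimiters."""
    return quote(str(value), safe="")


def make_did(repo, dataset_type, data_id):
    """Build an ivo://-style identifier from (repo, dataset_type, data_id)."""
    # Sort the keys so that equal data IDs always yield the same string.
    pairs = "&".join(f"{_enc(k)}={_enc(v)}" for k, v in sorted(data_id.items()))
    return f"ivo://{AUTHORITY}/{RESOURCE}?{_enc(repo)}/{_enc(dataset_type)}/{pairs}"


def parse_did(did):
    """Recover the triple. Data ID values come back as strings."""
    prefix, sep, local = did.partition("?")
    if prefix != f"ivo://{AUTHORITY}/{RESOURCE}" or not sep:
        raise ValueError(f"not a recognised identifier: {did!r}")
    # Raises ValueError if the local part does not have exactly three fields.
    repo, dataset_type, pairs = local.split("/")
    data_id = {}
    for pair in filter(None, pairs.split("&")):
        key, _, value = pair.partition("=")
        data_id[unquote(key)] = unquote(value)
    return unquote(repo), unquote(dataset_type), data_id


# Illustrative values only: a made-up repo name and an HSC-style data ID.
did = make_did("hsc_rc2", "calexp", {"instrument": "HSC", "visit": 903334, "detector": 16})
print(did)
print(parse_did(did))
```

One consequence of a string encoding like this is that the original types of the data ID values are lost on the way back, so a consuming service would have to re-interpret them against the repository's dimension definitions.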
            tjenness Tim Jenness added a comment -

            I believe we decided in the SODA discussion that the best way to reference a dataset is to use the butler registry UUID for it and not try to store a tuple of dataId, DatasetType, and run collection. The UUID should in theory be portable to other registries so does not need a butler repository root URI either.

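For comparison, the UUID approach described in the comment above could look roughly like the following sketch. The `ivo://` layout, the authority `example.org`, the resource key `images`, and the function names are assumptions made for illustration; the ticket does not specify them. The UUID value is a placeholder, not a real registry entry, and the sketch uses only Python's standard `uuid` module rather than any Butler call.

```python
import uuid


def make_uuid_did(dataset_uuid, authority="example.org", resource="images"):
    """Build an identifier whose local part is just the dataset UUID.

    The authority and resource defaults are hypothetical placeholders.
    """
    # Round-trip through uuid.UUID to validate and normalise the text form.
    canonical = uuid.UUID(str(dataset_uuid))
    return f"ivo://{authority}/{resource}?{canonical}"


def parse_uuid_did(did):
    """Extract the dataset UUID, raising ValueError on malformed input."""
    prefix, sep, local = did.partition("?")
    if not prefix.startswith("ivo://") or not sep:
        raise ValueError(f"not a recognised identifier: {did!r}")
    return uuid.UUID(local)


# Placeholder UUID for illustration only.
example = uuid.UUID("12345678-1234-5678-1234-567812345678")
did = make_uuid_did(example)
print(did)
print(parse_uuid_did(did) == example)
```

Compared with encoding a triple, the identifier carries no repository, DatasetType, or DataId, so nothing needs escaping and the string stays short. The trade-off is that a service must consult a registry to resolve it.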

            People

              Unassigned Unassigned
              gpdf Gregory Dubois-Felsmann
              Andy Salnikov, Colin Slater, Fritz Mueller, Frossie Economou, Gregory Dubois-Felsmann, Kenny Lo, Kian-Tat Lim, Tim Jenness