Data Management / DM-18700

Devise use cases and possible methods for associating documentation, examples, etc. with data in a TAP service

    Details

    • Type: Story
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Design Documents
    • Labels:
    • Templates:
    • Story Points: 8
    • Team: Architecture

      Description

      (This is a "theory ticket" meant to lay groundwork for possible future feature requests in Firefly and/or proposals to the IVOA. It is potentially relevant to LSST and all the IPAC archives.)

      A few starting points for use cases:

      • Providing links to table-level documentation on tables in a TAP service.
      • More generally: TAP_SCHEMA already provides for a "name" and a "description" for all the entities it covers: schemas, tables, columns, and foreign-key relationships. In a UI it might be useful to have available additional related information: mouseover text - which might be of a length intermediate between the name and the description - and one or more URLs for further information (e.g., a natural-language description of the entire table, a data quality report on the table, a curated plot of information on a column, an explanation of the meaning of a foreign-key link, etc.).
      • Various standards such as ObsCore and CAOM2 envision category columns, either ones that are completely defined by the standard such as ObsCore's dataproduct_type and calib_level, or ones that are defined by the data publisher, such as dataproduct_subtype and obs_collection. Of necessity the values in these columns are generally short strings or even integers, and the standards do not provide a pre-defined place in the TAP table data model for additional information. However, it would be extremely useful for client tool implementers, principally UI tools, but even programmatic ones, to be able to access additional information - e.g., a human-readable "description" field and a documentation URL corresponding to each value of obs_collection, or a collection-specific set of documentation on the meanings of the calib_level values in the context of a particular dataset's pipeline processing.
      • To support the usability of a TAP query client tool, it may be useful to provide executable/clickable example text for constraint values and expressions on specified columns. See the CADC advanced-query tool for the CAOM2 data model for an elegant example of the user value of such a system. The CADC tool's implementation is greatly simplified, however, by having to deal with only a single well-defined data model that the tool provider itself understands and publishes. Currently there is no way for a generic TAP query tool to provide similarly rich contextual help across multiple archives.
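      To make the use cases above more concrete, here is a minimal sketch of the supplementary information a client might want for each entity. It is an illustration only: the class names, field names, and the example.org URL are hypothetical and are not part of TAP_SCHEMA or any IVOA standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExampleConstraint:
    """A clickable example a query UI could paste into a WHERE clause."""
    label: str        # short text shown to the user
    expression: str   # ADQL fragment supplied by the data publisher


@dataclass
class EntityDocs:
    """Hypothetical supplementary documentation for one TAP_SCHEMA entity."""
    entity_type: str                 # "schema", "table", "column", "key", or "value"
    entity_name: str                 # e.g. "ivoa.ObsCore.calib_level"
    value: Optional[str] = None      # set only when documenting one category value
    mouseover: Optional[str] = None  # longer than the name, shorter than the description
    doc_urls: List[str] = field(default_factory=list)
    examples: List[ExampleConstraint] = field(default_factory=list)


def tooltip(docs: EntityDocs, fallback: str) -> str:
    """Return the publisher's mouseover text if present, else the fallback."""
    return docs.mouseover if docs.mouseover else fallback


# Illustrative record for one column (all values are made up).
calib = EntityDocs(
    entity_type="column",
    entity_name="ivoa.ObsCore.calib_level",
    mouseover="Calibration level of the data product",
    doc_urls=["https://example.org/docs/calib_level"],
    examples=[ExampleConstraint("Calibrated data only", "calib_level >= 2")],
)
```

      A UI could call tooltip(calib, "calib_level") to choose hover text, and fall back to the plain TAP_SCHEMA name when a publisher supplies nothing extra.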

      This ticket asks that these and perhaps other more concrete use cases be fleshed out (which may involve collating information previously generated and published by others in the IVOA and other fora) and that one or more implementation models be sketched.

      Implementation models might involve one or more of the following:

      • Use of existing IVOA mechanisms such as <FIELD> and <OPTION> metadata in VOTable (with the accompanying implication that a data publisher's TAP service would have to know where to find the necessary metadata). In part this would require that this sort of metadata be attached to VOTable results on TAP_SCHEMA queries.
      • Use of TAP's ability to document foreign-key relationships to provide joinable tables to be used in conjunction with the standard TAP_SCHEMA tables, to provide additional metadata on the entities described by TAP_SCHEMA. In this model the TAP server itself would not need to know about the documentation mechanisms envisioned.
      • Use of DataLink to provide links to additional information. Standardized service descriptors could be developed to be used in conjunction with TAP_SCHEMA tables, for instance.
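      The joinable-table model in the second bullet can be sketched with an in-memory SQLite database standing in for a TAP service. This is only an illustration under stated assumptions: tap_columns is a cut-down stand-in for TAP_SCHEMA.columns, while column_docs and its mouseover and doc_url columns are hypothetical names invented here. In a real service the relationship between the two tables would be advertised through TAP_SCHEMA.keys and TAP_SCHEMA.key_columns, and the query would be written in ADQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tap_columns (          -- stand-in for TAP_SCHEMA.columns
    table_name  TEXT,
    column_name TEXT,
    description TEXT
);
CREATE TABLE column_docs (          -- hypothetical companion table
    table_name  TEXT,
    column_name TEXT,
    mouseover   TEXT,
    doc_url     TEXT
);
""")

conn.executemany(
    "INSERT INTO tap_columns VALUES (?, ?, ?)",
    [
        ("ivoa.ObsCore", "calib_level", "Calibration level"),
        ("ivoa.ObsCore", "obs_collection", "Data collection name"),
    ],
)
# Only one column has extra documentation; the other must still appear.
conn.execute(
    "INSERT INTO column_docs VALUES (?, ?, ?, ?)",
    ("ivoa.ObsCore", "calib_level",
     "How far the data have been processed",
     "https://example.org/docs/calib_level"),
)

# LEFT JOIN keeps columns that have no companion documentation row.
rows = conn.execute("""
SELECT c.table_name, c.column_name, d.mouseover, d.doc_url
FROM tap_columns AS c
LEFT JOIN column_docs AS d
  ON c.table_name = d.table_name AND c.column_name = d.column_name
ORDER BY c.column_name
""").fetchall()

for row in rows:
    print(row)
```

      The outer join matters here: a client should still see every column, with empty documentation fields where the publisher has provided nothing.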

      The implementation sketches should consider pros and cons associated with the different approaches, including the likelihood of reaching interoperable agreements on these matters.
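      When weighing the approaches, it may help to see what a client would have to do under the VOTable-based model. The sketch below reads the <OPTION> elements under a <FIELD>'s <VALUES> and maps each value to its human-readable name. The sample document and its labels are made up for illustration, and the assumption that a service would populate <OPTION> names this way is exactly what an interoperable agreement would need to settle.

```python
import xml.etree.ElementTree as ET

# Namespace of VOTable 1.3 documents.
VOTABLE_NS = {"v": "http://www.ivoa.net/xml/VOTable/v1.3"}

# Illustrative response; the OPTION labels are invented for this example.
SAMPLE = """<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="calib_level" datatype="int">
        <DESCRIPTION>Calibration level</DESCRIPTION>
        <VALUES>
          <OPTION name="Raw instrumental data" value="0"/>
          <OPTION name="Calibrated data" value="2"/>
        </VALUES>
      </FIELD>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def option_labels(votable_xml: str, field_name: str) -> dict:
    """Map each OPTION value of the named FIELD to its OPTION name."""
    root = ET.fromstring(votable_xml)
    for fld in root.iterfind(".//v:FIELD", VOTABLE_NS):
        if fld.get("name") == field_name:
            return {
                opt.get("value"): opt.get("name")
                for opt in fld.iterfind("v:VALUES/v:OPTION", VOTABLE_NS)
            }
    return {}  # field not present in this document


labels = option_labels(SAMPLE, "calib_level")
```

      Note that <OPTION> carries only a name and a value, so documentation URLs or longer text per value would need some other carrier, which is one of the trade-offs the analysis should address.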

      This work is meant to be TAP-focused. At some point it might be useful to understand how it could be extended to SIAv2 queries. A possible start for the implementation sketch here would be to think about VOTable and DataLink information that might be returned with an SIAv2 MAXREC=0 query against a service. The analysis should consider whether it's possible to share some or all of the same mechanism with a TAP-based model.
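      As a starting point for that analysis, a client could request a metadata-only response by adding MAXREC=0 to a query. The helper below only builds the request URL and sends nothing. The endpoint shown is hypothetical; a real client would discover the service's query endpoint from its registry record or capabilities document.

```python
from urllib.parse import urlencode


def metadata_only_query_url(query_endpoint: str) -> str:
    """Append MAXREC=0 to a query endpoint URL, asking for zero result rows."""
    separator = "&" if "?" in query_endpoint else "?"
    return f"{query_endpoint}{separator}{urlencode({'MAXREC': 0})}"


# Hypothetical SIAv2 query endpoint, used for illustration only.
url = metadata_only_query_url("https://example.org/sia2/query")
print(url)
```

      The VOTable returned for such a request would be the natural place to look for the same kind of <FIELD> metadata and DataLink service descriptors considered for the TAP case.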

      If time permits it would be very useful for a first version of such an analysis to be ready in time for presentation at the 2019 IVOA Northern Spring InterOp.

            People

            • Assignee: Gregory Dubois-Felsmann (gpdf)
            • Reporter: Gregory Dubois-Felsmann (gpdf)
            • Watchers: Brian Van Klaveren, Christine Banek, Gregory Dubois-Felsmann, Jonathan Sick, Serge Monkewitz, Trey Roby, Vandana Desai