  1. Data Management
  2. DM-17472

Add additional TAP_SCHEMA data to the Felis file format

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Data Access
    • Labels:
    • Team:
      Architecture

      Description

      The TAP_SCHEMA data model, which describes the schema of a set of data exposed via TAP, includes a variety of elements intended to give user interfaces hints about how the organization of the data should be presented. These include:

      For "schemas"

      In TAP_SCHEMA.schemas, the description and schema_index attributes for a "schema" (in the TAP sense) affect the UI presentation. schema_index, in particular, defines a suggested order of presentation of the "schemas" in a user interface.

      For tables

      In TAP_SCHEMA.tables, the description and table_index attributes for a table affect the UI presentation. From the standard:

      Clients may order by table_index (ascending) so lower index tables would appear earlier in a listing.

      For columns

      In TAP_SCHEMA.columns, the description, principal, and column_index attributes for a column affect the UI presentation. From the standard:

      The principal, indexed, and std columns are boolean values implemented as integers. As such, the value must be 0 or 1; no other values are allowed.

      The principal flag indicates that the column is considered a core part of the content; clients can use this hint to make the principal column(s) visible, for example by selecting them by default in generating an ADQL query. In cases where the service selects the columns to return (such as a query language without an explicit output selection), the principal column indicates those columns that are returned by default.
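As a minimal sketch of how a client might act on the principal hint, the Python below builds a default ADQL SELECT from rows shaped like TAP_SCHEMA.columns entries. The function name, the sample table and its column names are hypothetical, and falling back to "*" when no column is flagged is an assumption of this sketch; only the 0/1 rule and the meaning of principal come from the standard text quoted above.

```python
def default_select(table_name, column_rows):
    """Build a default ADQL query that selects only principal columns.

    column_rows: iterable of dicts with 'column_name' and 'principal' keys,
    mimicking rows of TAP_SCHEMA.columns for one table (assumed shape).
    """
    selected = []
    for row in column_rows:
        flag = row["principal"]
        # The standard allows only the values 0 and 1 for boolean flags.
        if flag not in (0, 1):
            raise ValueError(f"principal must be 0 or 1, got {flag!r}")
        if flag == 1:
            selected.append(row["column_name"])
    # Assumption of this sketch: select everything if nothing is flagged.
    column_list = ", ".join(selected) if selected else "*"
    return f"SELECT {column_list} FROM {table_name}"


# Hypothetical sample rows, for illustration only.
rows = [
    {"column_name": "objectId", "principal": 1},
    {"column_name": "ra", "principal": 1},
    {"column_name": "flags", "principal": 0},
]
query = default_select("dp.Object", rows)
```

Here query selects objectId and ra and leaves flags out, which is the "visible by default" behaviour the standard describes.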

      (...)

      The column_index is used to recommend column ordering for clients. Clients may order by column_index (ascending) so lower index columns would appear earlier in a listing. This is useful for keeping related columns together in output or display.

      Note that there is a single TAP_SCHEMA.columns table for all the tables on a TAP service. So all the column_index values appear in a single table. However, they are primarily only meaningful in the context of the display of attributes of a single table, so it is OK if columns from different tables share a column_index value.
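To illustrate why shared column_index values are harmless, here is a small sketch using Python's sqlite3 module. The table tap_schema_columns is a stand-in for TAP_SCHEMA.columns reduced to three fields, and the table and column names in the sample data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for TAP_SCHEMA.columns with only the fields needed here.
conn.execute(
    "CREATE TABLE tap_schema_columns ("
    "table_name TEXT, column_name TEXT, column_index INTEGER)"
)
conn.executemany(
    "INSERT INTO tap_schema_columns VALUES (?, ?, ?)",
    [
        ("Object", "dec", 3),
        ("Object", "objectId", 1),
        ("Object", "ra", 2),
        ("Source", "sourceId", 1),  # same column_index as Object.objectId
        ("Source", "psfFlux", 2),
    ],
)


def ordered_columns(table_name):
    """Return one table's column names in ascending column_index order."""
    cursor = conn.execute(
        "SELECT column_name FROM tap_schema_columns "
        "WHERE table_name = ? ORDER BY column_index",
        (table_name,),
    )
    return [name for (name,) in cursor]


object_columns = ordered_columns("Object")
source_columns = ordered_columns("Source")
```

Because each listing filters on table_name first, the index value 1 used by both Object.objectId and Source.sourceId never competes within a single result.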

      In general

      For the *_index attributes, as they have to define an ordering, it may not be appropriate for them to be sourced "locally" with the definitions of the tables and columns in the Felis source files. Most likely the system that takes Felis definitions and produces TAP_SCHEMA content from them needs an additional input source that can provide the ordering "after the fact".
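One possible shape for such an "after the fact" input is a plain curated list of names from which the index values are derived. The sketch below assumes that form; the function name, the table names, and the rule for names missing from the list are all hypothetical choices, not anything defined by Felis.

```python
def assign_indexes(names, preferred_order):
    """Map each name to a 1-based index following preferred_order.

    Names missing from preferred_order are placed after the listed ones,
    in alphabetical order (an arbitrary choice for this sketch).
    """
    listed = [name for name in preferred_order if name in names]
    unlisted = sorted(set(names) - set(listed))
    return {name: i for i, name in enumerate(listed + unlisted, start=1)}


# Hypothetical table names and curated ordering.
table_names = ["Source", "Object", "Visit", "ForcedSource"]
curated_order = ["Object", "Source", "ForcedSource"]
table_index = assign_indexes(table_names, curated_order)
```

Keeping the ordering in one list means that reordering the presentation does not require touching any individual table definition.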

      (The following is not part of the task of this ticket, it's here just for context.) We should establish a pattern for the description elements - an approximate length range for the content, for instance, and a standard for their character representation (UTF-8?) and markup, if any.

            Activity

            bvan Brian Van Klaveren added a comment -

            There's already support for adding:

            Schema object:
            tap:schema_index

            Table object:
            tap:table_index

            Column Object:
            tap:principal
            tap:std
            tap:column_index

            These will be properly propagated to TAP_SCHEMA by load-tap.
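As a rough illustration of that propagation, the sketch below flattens a Felis-like structure into TAP_SCHEMA-style rows. It is not the load-tap implementation: only the tap:* property names come from the list above, while the surrounding dictionary layout ("name", "tables", "columns"), the sample names, and the default of 0 for missing flags are assumptions made for the example.

```python
# Hypothetical Felis-like structure; only the tap:* keys are from the ticket.
schema = {
    "name": "dp",
    "tap:schema_index": 1,
    "tables": [
        {
            "name": "Object",
            "tap:table_index": 1,
            "columns": [
                {"name": "objectId", "tap:principal": 1, "tap:std": 0,
                 "tap:column_index": 1},
                {"name": "ra", "tap:principal": 1, "tap:column_index": 2},
            ],
        }
    ],
}


def to_tap_schema_rows(schema):
    """Flatten one schema into (schema_row, table_rows, column_rows)."""
    schema_row = {
        "schema_name": schema["name"],
        "schema_index": schema.get("tap:schema_index"),
    }
    table_rows, column_rows = [], []
    for table in schema["tables"]:
        table_name = f'{schema["name"]}.{table["name"]}'
        table_rows.append({
            "schema_name": schema["name"],
            "table_name": table_name,
            "table_index": table.get("tap:table_index"),
        })
        for column in table["columns"]:
            column_rows.append({
                "table_name": table_name,
                "column_name": column["name"],
                # Assumed default: a missing flag is treated as 0.
                "principal": column.get("tap:principal", 0),
                "std": column.get("tap:std", 0),
                "column_index": column.get("tap:column_index"),
            })
    return schema_row, table_rows, column_rows


schema_row, table_rows, column_rows = to_tap_schema_rows(schema)
```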

            There's no need to use a tap:description because we already have a description that will also fill that field. I can add support for a tap:description that will override a description on a Schema/Table/Column, the same way you can override a datatype with a mysql:datatype, for example.

            I will consider this ticket a request to update the documentation. I was trying to see if some of the standardization of the vocabulary needs to be something IVOA agrees on. There wasn't that much interest at the last IVOA meeting, and I'd been working on auth so I punted on it. I think, as defined, the properties are reasonable though. If you think we should also have an extra tap:description I can add it as well, but I think we should use description where possible.

            gpdf Gregory Dubois-Felsmann added a comment - - edited

            I was not requesting an extra tap:description; it's fine to use the main description for this.

            Updating the documentation would definitely be welcome.

            I think it is acceptable to close this ticket after that documentation is provided, but in that case I would like Brian Van Klaveren to create a follow-on ticket for the "graph merging" that he mentioned in an Architecture team discussion on March 27th. I do think that we'll have to have a way to set values like table_index "later" in the data release flow process than the creation of the table schemas themselves.

            gpdf Gregory Dubois-Felsmann added a comment -

            I think there was also an alternative suggestion that the "merge" be done by making the TAP_SCHEMA tables be views that join the "original" Felis-derived metadata with post-hoc table_index-type metadata?
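For concreteness, the view-based alternative might look roughly like the following sqlite3 sketch. All table and view names here are hypothetical flat stand-ins (a real service would expose the result as TAP_SCHEMA.tables), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Metadata derived from the Felis files (hypothetical table name).
    CREATE TABLE felis_tables (table_name TEXT PRIMARY KEY, description TEXT);
    -- Ordering supplied later by whoever curates the service.
    CREATE TABLE table_ordering (table_name TEXT PRIMARY KEY, table_index INTEGER);
    -- The view that clients would query in place of a plain table.
    CREATE VIEW tap_schema_tables AS
        SELECT f.table_name, f.description, o.table_index
        FROM felis_tables AS f
        LEFT JOIN table_ordering AS o ON o.table_name = f.table_name;
    INSERT INTO felis_tables VALUES
        ('Object', 'Astronomical objects'),
        ('Source', 'Single-epoch detections');
    INSERT INTO table_ordering VALUES ('Source', 2), ('Object', 1);
    """
)

listing = [
    name
    for (name,) in conn.execute(
        "SELECT table_name FROM tap_schema_tables ORDER BY table_index"
    )
]
```

The LEFT JOIN keeps tables that have no curated ordering yet; they simply get a NULL table_index. Every read of the view pays for the join.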

            Perhaps an Architecture discussion is needed to resolve this.

            cbanek Christine Banek added a comment -

            Please don't try to do views on TAP_SCHEMA. This just seems like a lot of work for an extra column that should be available. The TAP_SCHEMA tables will also be the most queried tables of the whole database, since the TAP service checks TAP_SCHEMA for every query. I think we should just have it all in the same place unless there is some real blocker on doing it that way.

            gpdf Gregory Dubois-Felsmann added a comment -

            Pinging this ticket.

            After playing with what we have a bit, I feel comfortable with originating the column_index and principal values for each table as part of the Felis description for the table itself.

            But I still feel like schema_index and table_index values will need to be editable logically after the moment when individual tables' Felis schemas are set. This information is likely to come from someone who has the point of view of curating the entire user experience of the service, rather than the content of individual tables.

            So I think we still need to devise a workflow that loads TAP_SCHEMA from a combination of column-level metadata from Felis for individual tables with schema- and table-level metadata that is provided "later" and in a more centralized way.
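A minimal sketch of the table-level half of such a workflow, assuming the centrally provided metadata arrives as a simple mapping. The function name, the row shape, and the rule that a curated value wins over a Felis-provided one are assumptions for illustration; column-level rows would pass through unchanged.

```python
def apply_curated_indexes(table_rows, curated_table_index):
    """Return new table rows with table_index replaced by curated values.

    table_rows: dicts with 'table_name' and optionally 'table_index',
    as might be derived from per-table Felis files (assumed shape).
    curated_table_index: mapping of table_name to table_index supplied
    centrally; a curated value takes precedence over the Felis one.
    """
    merged = []
    for row in table_rows:
        updated = dict(row)  # copy so the Felis-derived input is not mutated
        if row["table_name"] in curated_table_index:
            updated["table_index"] = curated_table_index[row["table_name"]]
        merged.append(updated)
    return merged


# Hypothetical inputs.
felis_rows = [
    {"table_name": "dp.Object", "table_index": 5},
    {"table_name": "dp.Source"},
]
curated = {"dp.Object": 1, "dp.Source": 2}
merged_rows = apply_curated_indexes(felis_rows, curated)
```

Doing the merge at load time writes ordinary rows into TAP_SCHEMA, so queries against it need no join.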

            gpdf Gregory Dubois-Felsmann added a comment -

            I'd like to close this ticket, because most of it is/was already in Felis, and open a new ticket that much more narrowly addresses the question of curation of schema_index (especially) and table_index.

            Brian Van Klaveren, you agree this is all in current Felis apart from that, with the recent bug fix in DM-29939?

            bvan Brian Van Klaveren added a comment -

            Yeah, closed.


              People

              Assignee:
              bvan Brian Van Klaveren
              Reporter:
              gpdf Gregory Dubois-Felsmann
              Reviewers:
              Brian Van Klaveren
              Watchers:
              Brian Van Klaveren, Christine Banek, Colin Slater, Fritz Mueller, Gregory Dubois-Felsmann, Tatiana Goldina