  1. Request For Comments
  2. RFC-651

Explicitly add TAP UPLOAD (temporary-table-upload-and-join) to the LSP DAX and database requirements

    Details

      Description

      It has come to my attention that we under-documented a planned capability of the LSP, and I would like to add corresponding requirements to LDM-554. If this is unacceptable because of the present scope concerns, I would like to at least add tickets that recognize the existence of the specific work needed.

      The capability is the one in the TAP and ADQL standards that involves the ability to upload an explicitly ephemeral table to the TAP service for use in a single query. This is distinct from the "user database workspace" which is foreseen in the LSP requirements, in which a user can create their own persistent tables, "MyDB"-style.

      The relevant references are:

      This capability is used, for instance, when a user has a list of N objects and wishes to perform cone searches around every one of them in a single bulk operation.
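As a rough illustration of such a bulk operation, the sketch below builds a single ADQL query that cone-searches around every row of an uploaded table. The `TAP_UPLOAD` schema and the `CONTAINS`/`POINT`/`CIRCLE` functions come from the TAP and ADQL standards; the catalog table name `dp.object`, the upload name `targets`, and all column names (`target_id`, `ra_deg`, `dec_deg`, `coord_ra`, `coord_dec`) are assumptions made up for this example.

```python
def bulk_cone_search_adql(catalog_table, radius_deg, upload_name="targets"):
    """Build one ADQL query that cone-searches around every uploaded row.

    All table and column names are hypothetical placeholders.
    """
    if radius_deg <= 0:
        raise ValueError("radius_deg must be positive")
    return (
        "SELECT u.target_id, c.* "
        f"FROM TAP_UPLOAD.{upload_name} AS u "
        f"JOIN {catalog_table} AS c "
        # Spatial join: keep catalog rows inside a circle around each target.
        "ON 1 = CONTAINS(POINT('ICRS', c.coord_ra, c.coord_dec), "
        f"CIRCLE('ICRS', u.ra_deg, u.dec_deg, {radius_deg}))"
    )


query = bulk_cone_search_adql("dp.object", radius_deg=0.001)
print(query)
```

One query of this shape replaces N separate cone searches, which is the batching behavior described above.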

      Many legacy pre-TAP query interfaces have this capability (e.g., the legacy IRSA query services). It is important to provide it in order to reduce the likelihood that users will submit thousands of trivial queries instead of batching them up.

      There is a Portal Aspect requirement, DMS-PRTL-REQ-0021, to support an interface to such queries, and the Discussion for this requirement states:

      Efficient implementation of list-based queries requires a corresponding API aspect / Data Access Web API service, to avoid the submission of large numbers of separate queries.

      (Firefly supports such queries, though the capability was not exposed in the TAP query UI due to the absence of the underlying feature in the LSST TAP service.)

      Temporary-table-upload queries are already supported by PyVO and are therefore immediately germane to the Notebook Aspect environment as well. They could easily be used, for example, to query for the light curves of multiple objects in a single operation, or to perform cone searches around a user's list of favorite AGNs.
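A minimal sketch of what this might look like from a notebook, assuming a `service` object that behaves like `pyvo.dal.TAPService` (whose `search` method accepts an `uploads` mapping) and a `targets` table with an `object_id` column. The catalog table `dp.forced_source` and the column names are hypothetical and do not refer to any actual LSST schema.

```python
def light_curves_for_targets(service, targets, table="dp.forced_source"):
    """Fetch light-curve rows for every uploaded object ID in one query.

    `service` is expected to behave like pyvo.dal.TAPService, and `targets`
    like an astropy Table with an `object_id` column. The catalog table and
    column names are hypothetical.
    """
    query = (
        "SELECT s.* "
        f"FROM {table} AS s "
        # ID-equality join against the temporary uploaded table.
        "JOIN TAP_UPLOAD.targets AS t ON s.object_id = t.object_id"
    )
    # The upload is sent via the TAP UPLOAD parameter and is visible to the
    # query as TAP_UPLOAD.targets.
    return service.search(query, uploads={"targets": targets})
```

With PyVO installed, `service` could be created as `pyvo.dal.TAPService(url)` for whatever TAP endpoint is in use; the function is written against the object rather than the import so that the sketch does not assume a particular service URL.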

      Due largely to an oversight in the LSP requirements-generation process, the precisely corresponding DAX requirement was never included in LDM-554.

      There is a related requirement, DMS-API-REQ-0032, which states:

      The API Aspect shall provide a capability for users to upload catalog data products (formatted as VOTables) residing within their allocated VOSpace, such that the catalog products after upload may be joined in queries against data release catalog products, subject to limitations of a resource quota system.

      but this has been interpreted (including by me) as referring to the persistent User Database Workspace functionality.

      I would like to suggest that we accept the following requirement, as a child of DMS-API-REQ-0006 "TAP Service for Tabular Queries":

      DMS-API-REQ-xxx1 "TAP service temporary table upload":

      Specification: The API Aspect TAP service shall support the standard UPLOAD parameter for the use of temporary, user-uploaded tables in ADQL expressions.  Such temporary tables shall be able to be joined (including both ID-equality and spatial joins) against the principal LSST catalog data products.

      Discussion: This requirement is distinct from requirements for a User Database Workspace for persistent, user-created databases.
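For orientation, the sketch below shows how a client might assemble the request parameters that this requirement refers to. In the TAP standard the `UPLOAD` value is a semicolon-separated list of `name,uri` pairs, where a `param:` URI points at a table sent inline with the request. The helper name `upload_parameter`, the table and column names in the query, and the form-part label `targets_votable` are assumptions for illustration only.

```python
def upload_parameter(tables):
    """Format a TAP UPLOAD value: 'name,uri' pairs joined by ';'.

    `tables` maps an upload table name to its URI. The name check here is a
    loose illustrative guard, not the rule from the standard.
    """
    pairs = []
    for name, uri in tables.items():
        if not name.isidentifier():
            raise ValueError(f"unsuitable table name: {name!r}")
        pairs.append(f"{name},{uri}")
    return ";".join(pairs)


# Hypothetical request parameters for an ID-equality join on an inline upload.
params = {
    "LANG": "ADQL",
    "QUERY": (
        "SELECT c.* FROM dp.object AS c "
        "JOIN TAP_UPLOAD.targets AS t ON c.object_id = t.object_id"
    ),
    "UPLOAD": upload_parameter({"targets": "param:targets_votable"}),
}
print(params["UPLOAD"])
```

The uploaded table exists only for the query that references it, which is what distinguishes this mechanism from the persistent User Database Workspace.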

      As part of the discussion of this RFC, we should determine whether this needs to be expressed more clearly in the database requirements, LDM-555, as well. The existing requirement DMS-DB-REQ-0014 "Cross-matching with external/user data" could be pressed into service to support both temporary tables and the User Database Workspace, but it's fairly vague:

      Users shall be able to cross-match the LSST catalogs with external catalogs. Some catalogs shall be provided by LSST, whilst other catalogs can be uploaded by the user. Results from these cross-matches can be used in subsequent queries.

            Activity

            ctslater Colin Slater added a comment -

            I'm ambivalent about this. Since the requirements already include non-ephemeral table upload, we know that users won't be left high-and-dry without an ability to use their own tables, though it might take an extra step or two compared to the proposed ephemeral upload. Given that existing situation, it seems like all this requirement can do is kind of tie our own hands when deciding prioritization in the future. It's certainly a feature that I'd be happy to have, but I'd rather consider it in comparison to other potential next-steps in the development process as we go along than mandate it now.

             

            BTW, do we know if the CADC TAP service already supports this?

            gpdf Gregory Dubois-Felsmann added a comment -

            Yes, CADC supports this.

            I don't think non-ephemeral upload is a satisfactory substitute. There is no IVOA standard for how to do it, so "generic" clients like TOPCAT, PyVO, and Firefly (unless LSST resumes development) will not be able to take advantage of it. In the absence of UPLOAD facilities, generic clients will typically fall back to issuing a long list of single-row queries, which will substantially increase the transaction load on our servers.

            Furthermore, non-ephemeral upload is subject to a user's personal space quota, whereas ephemeral upload is able to support large temporary tables precisely because it knows that it doesn't have to keep them around.

            ctslater Colin Slater added a comment -

            OK, if that's the situation we have with the TAP standard at the moment, then I support adding this. Thanks for the explanation.

            gpdf Gregory Dubois-Felsmann added a comment -

            Planned end was yesterday. Given the discussion above, I will proceed to implement the above-proposed requirement text in MagicDraw and produce a draft of LDM-554 for the CCB to consider.

            womullan Wil O'Mullane added a comment -

            since you are doing the other change anyway..

             

            tjenness Tim Jenness added a comment -

            Gregory Dubois-Felsmann, do you have a timeline for a draft document? Can we change the planned end?

            tjenness Tim Jenness added a comment -

            The CCB discussed this today. Given that we have three LDM-554 changes being considered, we are happy to process these tickets in parallel without requiring the MagicDraw updates, which seem to be blocking progress.


              People

              Assignee:
              gpdf Gregory Dubois-Felsmann
              Reporter:
              gpdf Gregory Dubois-Felsmann
              Watchers:
              Christine Banek, Colin Slater, Fritz Mueller, Frossie Economou, Gregory Dubois-Felsmann, Leanne Guy, Tim Jenness, Wil O'Mullane