Details
Description
I understand that there's some thinking going on about possible rearrangement of the components of the prototype TAP browser UI, and in this context I heard a remark that the upper-left pane, that is, the schema-and-table selection pane, has some blank space that we could think about reclaiming.
This is to point out that at some point we'll want to facilitate joins in the UI, based on the foreign-key information in TAP_SCHEMA.
The currently empty space below the "Tables" pull-down would be exactly the place for this. I imagine that below that pull-down there would be UI elements for selecting a table to join to and the keys to use for the join.
I've started a sketch for that which I'll upload as an image.
Once the join is defined, Firefly would have to determine what the joined tables' effective schema would be, either by analyzing the schemas of the two tables and computing the combination itself, or by issuing a MAXREC=0 query for the join against the TAP service. With that in hand, the UI's table-of-columns-and-constraints (the current bottom half) could be populated with the joined schema, and then all its other features (e.g., column selection, constraint definition) would proceed as usual. The spatial-constraint and temporal-constraint search helpers (upper right pane) should also be able to work against the joined schema. This is all pretty well-defined and users would have very natural expectations.
Getting the joined-table schema itself defined may be the hardest part of this whole task.
Implementing this is NOT an LSST close-out task (sadly). But we should not give the UI space away without at least taking this into account.
The sketch I've attached shows a maximalist fantasy, where the UI supports both by-key and spatial ("fuzzy") joins, with that choice under the control of radio buttons.
1) The by-key joins, which would be easier to implement and my higher priority, would offer tables eligible to be joined to the "primary" table that was selected, based on the content of the TAP_SCHEMA.keys table, which describes which tables have foreign-key links to each other.
Once a second table is chosen, there will be one or more actual key pairs that can be used for the join, as defined in the TAP_SCHEMA.key_columns table. The final drop-down allows the user to make the choice of these, if there is a choice available.
(I didn't show this in my sketch, but on reflection the "Upload" function I sketched for the spatial join also makes sense for by-key joins. However, here, without TAP_SCHEMA.key_columns metadata available, the user would have to be offered the choice of any columns in either table to be used to perform the foreign-key join. That would require two key drop-downs, instead of a single key-pair drop-down.)
2) For spatial joins, the first step would be to define a target table. In this case it will be extremely common for the user to want this to be a transiently uploaded private table, so I showed an upload button.
Once the two tables have been defined, a fairly complex UI that probably will not fit in this pane would be needed to define the columns in each table that would be used for the spatial ("neighborhood") join. Some of the logic of the existing spatial-constraints helper would be appropriate here, as the job would include identifying columns that are "ra-like" and "dec-like" to be used in the spatial distance calculation.
It might also be useful to be able to perform one-dimensional "fuzzy joins", e.g., on time.
All of this second mode requires a lot of careful thought. I would recommend deploying a version of the TAP explorer with the by-key joins first, and then add the spatial joins in a further round of work. Support of joins to user tables in the by-key joins could/should also be deferred.