Status: In Progress
Fix Version/s: None
Write up a comment for RFC-484 that compares Jim's Gen3 Butler schema (dmtn-073.lsst.io) with CAOM and documents anything that should be changed to align them.
Due 2018-06-19, but any major inconsistencies should be identified by 2018-06-01.
"Intent" in terms of which scheduler "proposal" "asked" for the image to be taken will come from the EFD, as will a bevy of ambient temperature readings. If there's one particular temperature that we should provide, we can arrange for that to be generated by the EFD Transformation.
I don't think coadds are or need to be Observations for CAOM at all.
Create View Observation as Visit JOIN Exposure JOIN Dataset JOIN DatasetCollection
Do I understand correctly that to create a view for SimpleObservation, you'd have no GROUP BY here, while for CompositeObservation, you would GROUP BY Visit and somehow aggregate snap-level quantities from Exposure? If so, I don't actually understand why Exposure can't be an Observation, but I'm willing to accept that if it means I don't need to educate myself about CAOM2 in detail now.
Would it be a problem if we didn't have a CAOM2 representation of calibration Exposures? I was intending to create Visits for those.
It would be necessary to define how the Visits/Snaps join to the EFD to determine this.
The Gen3 schema is permitted to also include additional per-Visit or per-Exposure tables that are specific to a particular Camera. I think I'd advocate for putting LSST-specific values there, and populating those tables from the EFD at or around raw data ingest.
I'm a little confused by the conversation here and I suggest that Brian Van Klaveren contact Pat Dowler to clarify some concepts. At my previous telescope we used CAOM2, and it was fully able to represent multiple exposures within a single observation (even if they were at different wavelengths), data products derived from a single observation (e.g. a PVI of a visit), and coadds combining multiple observations.
Gregory Dubois-Felsmann you may have missed this ticket in your ObsCore musings.
Gregory Dubois-Felsmann do you feel that the ObsCore compliance of raws in gen3 registry is sufficient to allow this ticket to be shut down?
For CAOM2, the main sort-of problem I see is that Exposures can't be (easily) grouped as a full focal plane observation without a Visit defined, so there appears to be no way to logically group a "snap". That said, that just means an Exposure is not analogous to an Observation in CAOM2. In particular, since a Visit may include only a single snap, a Visit itself, as defined in the Visit table, can be either a CompositeObservation (e.g. 2x15s Snaps) or a SimpleObservation (e.g. 1x30s Snap). That's probably fine, though there seems to be no easy way of determining which kind of Observation a Visit is without joining to the Exposure table and counting the unique Snaps. You have to do a join through the Exposure table in any case (plus a few more) to actually determine the equivalent of the Collection field in the CAOM2 Observation. One major question is intent, though I think "intent" in CAOM2 speak is really implicit in the Gen3 definition of a Dataset, as are type and metaRelease.
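To make the "join to the Exposure table and count the unique Snaps" step concrete, here is a minimal sketch using Python's built-in sqlite3. The table layout and the column names (visit_id, exposure_id, snap) are simplified assumptions for illustration, not the actual Gen3 schema from DMTN-073.

```python
import sqlite3

# In-memory database with a deliberately simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Visit (visit_id INTEGER PRIMARY KEY);
    CREATE TABLE Exposure (
        exposure_id INTEGER PRIMARY KEY,
        visit_id INTEGER REFERENCES Visit(visit_id),
        snap INTEGER
    );
    INSERT INTO Visit VALUES (1), (2);
    -- Visit 1 has two snaps (2x15s); Visit 2 has one snap (1x30s).
    INSERT INTO Exposure VALUES (10, 1, 0), (11, 1, 1), (20, 2, 0);
""")

# Count distinct snaps per Visit and label the Visit accordingly.
rows = conn.execute("""
    SELECT v.visit_id,
           COUNT(DISTINCT e.snap) AS n_snaps,
           CASE WHEN COUNT(DISTINCT e.snap) > 1
                THEN 'CompositeObservation'
                ELSE 'SimpleObservation'
           END AS observation_type
    FROM Visit v
    JOIN Exposure e ON e.visit_id = v.visit_id
    GROUP BY v.visit_id
    ORDER BY v.visit_id
""").fetchall()

observation_types = {visit_id: obs_type for visit_id, _, obs_type in rows}
print(observation_types)
```

If this classification were needed often, the snap count could presumably be stored on the Visit row at ingest instead of being recomputed by a join each time.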
In terms of defining an Observation in CAOM via a Gen3 schema, we need to approximately execute the following:
Create View Observation as Visit JOIN Exposure JOIN Dataset JOIN DatasetCollection (+ a few extras)
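As a rough, runnable version of that statement, the sketch below builds the view in an in-memory SQLite database. All column names, the direct Dataset-to-Exposure link, and the camera name are assumptions made for illustration; the real Gen3 schema relates these tables differently and would need the "few extras" mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical stand-ins for the Gen3 tables.
    CREATE TABLE Visit (visit_id INTEGER PRIMARY KEY, camera TEXT);
    CREATE TABLE Exposure (
        exposure_id INTEGER PRIMARY KEY,
        visit_id INTEGER REFERENCES Visit(visit_id)
    );
    CREATE TABLE Dataset (
        dataset_id INTEGER PRIMARY KEY,
        exposure_id INTEGER REFERENCES Exposure(exposure_id),
        dataset_type TEXT
    );
    CREATE TABLE DatasetCollection (
        dataset_id INTEGER REFERENCES Dataset(dataset_id),
        collection TEXT
    );

    -- One Observation row per (collection, Visit), however many snaps it has.
    CREATE VIEW Observation AS
    SELECT DISTINCT
        dc.collection AS collection,
        v.visit_id    AS observation_id,
        v.camera      AS instrument_name
    FROM Visit v
    JOIN Exposure e           ON e.visit_id = v.visit_id
    JOIN Dataset d            ON d.exposure_id = e.exposure_id
    JOIN DatasetCollection dc ON dc.dataset_id = d.dataset_id;

    INSERT INTO Visit VALUES (1, 'HypotheticalCam');
    INSERT INTO Exposure VALUES (10, 1), (11, 1);
    INSERT INTO Dataset VALUES (100, 10, 'raw'), (101, 11, 'raw');
    INSERT INTO DatasetCollection VALUES (100, 'raw'), (101, 'raw');
""")

observations = conn.execute(
    "SELECT collection, observation_id, instrument_name FROM Observation"
).fetchall()
print(observations)
```

The two-snap Visit collapses to a single Observation row here because of the DISTINCT; snap-level quantities would need explicit aggregation instead.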
We don't have, within the Gen3 Schema, a way of determining the algorithm used to pick the Snap (scheduler?), nor do we have the environmental information (e.g. Ambient Temperature). It would be necessary to define how the Visits/Snaps join to the EFD to determine this.
CAOM2 definition for Instrument is roughly equivalent to our definition for Camera. Again, a few joins would be necessary to represent this.
I think a Plane is roughly equivalent to a combination of Visit, Exposure, and Dataset, though filter information would need to be converted into Energy information about the Observation/Visit. It may also be applicable for coadded images.
I'm not quite sure how coadded images/multi-camera observations are easily represented either in Gen3 (are they a logical visit?) or in CAOM2 (though I think they are modeled as multiple planes for an observation, but then it appears the 1:1 relation of attributes of Observation falls apart, e.g. Environment). A full sky image might just be a single CompositeObservation as well.
The artifacts attribute of a plane is roughly equivalent to the Dataset. One tricky thing is that the "access URL" would likely need to be computed from attributes of a Dataset and translated to an imgserv URL, for example, if we were modeling this as a View. That's not so bad, but it might mean we have multiple views in a database defining CAOM2 Tables (views) for each service instance. That's also not terrible - implementation wise we'd want to probably create a user/tablespace for each individual imgserv service we deploy, which makes sure to materialize access URLs that are relevant for that particular service.
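A minimal sketch of computing a per-service access URL from Dataset attributes follows. The service names, hosts, the /image path, and the query parameters are all invented for illustration and do not reflect imgserv's actual API; the Dataset attributes are likewise assumed.

```python
from urllib.parse import urlencode

# Hypothetical service instances; real imgserv hosts and paths may differ.
SERVICES = {
    "imgserv-a": "https://imgserv-a.example.org",
    "imgserv-b": "https://imgserv-b.example.org/",
}


def access_url(service: str, dataset: dict) -> str:
    """Build an access URL for one service from (assumed) Dataset attributes."""
    base = SERVICES[service].rstrip("/")
    query = urlencode({
        "dataset_type": dataset["dataset_type"],
        "visit": dataset["visit"],
        "detector": dataset["detector"],
    })
    return f"{base}/image?{query}"


# The same Dataset yields a different URL for each service instance, which is
# why each service would need its own materialized view of the CAOM2 tables.
dataset = {"dataset_type": "calexp", "visit": 1, "detector": 42}
urls = {name: access_url(name, dataset) for name in SERVICES}
print(urls)
```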
In short, I think Gen3 is fine for representing PVIs in CAOM2, and the mapping can mostly be done by performing joins. Implementation-wise, we'll want those joins to be fast, but I believe the foreign key relations will make sure we have indices for most of them, so that will be fine. I'm not fully sure how we model coadds/full sky images/multi-camera/etc., except to model them as a single composite observation and drop most of the attributes of an "Observation" on the floor (we need to investigate prior art, if any). I think we'd potentially need another table in order to represent that. It's probably not easy to wholly adopt CAOM2 for Butler Gen3, especially as CAOM2 nearly requires a materialized URL for accessing the data, and we don't have a single service defined to implement that.