Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: System Integration and Test
- Labels: None
- Team: Architecture
Description
Informally we have agreed to handle data coming from the Camera Calibration Optical Bench (CCOB) as part of early integration. To make this more formal we need to place a requirement on DM, in LSE-61, to be capable of archiving, and performing meaningful processing of, data from Camera I&T data collection on the CCOB.
Issue Links
- relates to DM-13073 "Add CCOB Milestone test in LDM-503" (Done)
Activity
I don't think it has to be automated in the full sense of, say, Operations-era Prompt Processing.
The real action here is for someone (from Science Pipelines) to configure an appropriate pipeline, and someone from a "DM Early Operations" group (whatever that means in practice) to run that pipeline fairly routinely on data from the CCOB.
I'd be surprised if it were appropriate to process all CCOB data; most likely there will only be certain configurations that are worth processing. Some input from Robert Lupton might be helpful to sharpen this up.
The requirement that I'm interested in is that we can run DM analysis code on archived CCOB data. This requires that the data appear somewhere (NCSA?) and be properly ingested into a repository; that we be able to instantiate a butler that points to that repository; and that DM be able to process that data in a way that could be automated (i.e., currently by issuing a command from the shell that processes all or a subset of the data; soon, I presume, with the processing coordinated by SuperTask).
Initially the processing will be running the DM-stack-enabled version of the Camera eotest scripts that Merlin Fisher-Levine has been working on.
Something like the following?
DMS-REQ-????? Archiving of CCOB data
Specification: The Archive Centre shall be capable of archiving CCOB data and making it available to Science Pipelines users via the standard data access interfaces.
DMS-REQ-????? Processing of CCOB data
Specification: The Science Pipelines shall be able to process CCOB data in a similar manner to standard LSST data.
I think something needs to be said about timelines; this is a requirement on the DM System prior to the completion of Construction.
With regard to the second requirement, "in a similar manner" is pretty vague, and "Science Pipelines" is unclear about whether this is code or people. I'd say that as long as the data is made available according to the standard data access interfaces, nothing more can usefully be (or need be) said about processing.
On the other hand, if particular desired outputs can be specified, that would be suitable for a second requirement.
You mean timeline in the sense of how long from acquiring the data in eTraveler before it's available in the data backbone? (ok, without us saying it will be in the data backbone). Also, are the data permanently archived in the Archive Center once they arrive or do they get purged? Is there a plan for the entire contents of eTraveler to be migrated to the data backbone before operations start?
We (== LSST) need to decide how we represent the data that's currently in eTraveler at Cerro Pachon. I'd include this decision or product in this or another requirement.
To the question: I don't think that all eTraveler data necessarily need to be ingested for DM's use. However, we do need to get at the data with low latency (this is needed for the auxTel too) – if we take auxTel data, it needs to be available within a small number of seconds (1 s? 2 s?). The same latency should be supported for CCOB, subject to network limitations.
No, by timeline I meant that this is not only a requirement on the fully-constructed DM system, like almost all of our other requirements, but instead this is a requirement on a during-Construction version of the DM system.
During Construction, the permanent archive for CCOB data rests with the Camera. At or before the start of Operations, this must be transferred, along with whatever eTraveler contents must be preserved, to DM.
There seems to be a big difference between “transfer this bunch of data from eTraveler to NCSA” and “annotate these files for transfer as they are acquired and ingest them within 1 minute”.
I don't think the eTraveler situation during operations is specified anywhere yet. Are you talking about eTraveler being used at the summit when the camera first arrives, but before it's installed on the telescope for commissioning? Is auxTel relevant to this ticket?
I tried to indicate that I didn't literally mean the current eTraveler, but I could have been clearer. I meant that we need to get at the information that (e.g., for TS8) is in the data store and indexed in eTraveler with very low latency.
We should be treating proto-lsstCam data taken with the CCOB as if it were lsstCam or auxTel data taken on the mountain, and there we will need immediate access to the data. So we're going to have to face this problem at some point, and I think we should face it now.
I would be unhappy with a solution that works for CCOB with a 60s delay before DM can see the data. This would strongly discourage the camera from ever learning to use the LSST Stack.
After talking to Gregory Dubois-Felsmann and Kian-Tat Lim it's not clear that anything in LSE-61 can constrain camera testing. A DM requirement along the lines of:
DM shall be able to archive a designated subset of Camera test data and make it available in an environment matching the data backbone interfaces.
sounds reasonable and does not involve the Camera. We can write DM code to get the files from eTraveler, DM code to transfer them, and DM code to ingest them, but we aren't requiring the Camera to use an OCS and the official Archiver.
I understand that DM can't constrain the rest of the project, although people in DM can try! Shouldn't we add a latency figure to Tim's text in the previous comment? We need as low a latency as possible given Camera deliverables, and isn't that something that we can require?
This then becomes a requirement for a data detection system to be running. This can't be the normal Archiver, because there is no OCS running. We would need to write new code that finds the data in eTraveler as it arrives and triggers the transfer. Kian-Tat Lim may want to comment on the feasibility of doing that.
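The data-detection idea above could be as simple as polling a staging area and triggering a transfer for each new file. The sketch below is an assumption about what such code might look like: the staging/archive directories and the `poll_once` function are hypothetical, a real service would loop on a timer, query eTraveler rather than a directory, and track its state persistently.

```python
import os
import shutil
import tempfile

# Hypothetical polling detector: one pass over a staging area, copying
# any file not yet seen to the archive. A real service would query
# eTraveler, loop continuously, and persist the `seen` set.

def poll_once(staging_dir, archive_dir, seen):
    """Transfer every file in `staging_dir` not already in `seen`."""
    transferred = []
    for entry in sorted(os.scandir(staging_dir), key=lambda e: e.name):
        if entry.is_file() and entry.name not in seen:
            shutil.copy(entry.path, os.path.join(archive_dir, entry.name))
            seen.add(entry.name)
            transferred.append(entry.name)
    return transferred

staging = tempfile.mkdtemp()
archive = tempfile.mkdtemp()
open(os.path.join(staging, "ccob_raw_0001.fits"), "w").close()
seen = set()
print(poll_once(staging, archive, seen))  # → ['ccob_raw_0001.fits']
print(poll_once(staging, archive, seen))  # → [] (already transferred)
```

The polling interval would be what ultimately bounds the achievable latency, which is why the feasibility question above matters.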
I think I would avoid using LSE-61 to start specifying performance and implied architectural constraints on behaviors of temporary data transfer mechanisms that should exist only during Construction / Early Integration. We're not using LSE-61 in any other place (as far as I can recall at the moment) to constrain phasing like that.
It makes sense to require in LSE-61 that DM be able to ingest, Butlerize, and interpret (i.e., with a suitable obs_* package) Construction-era CCOB data. Performance and design of the mechanisms that support that now should be documented as part of definition of Early Integration activities, I think - expanding their scope beyond OCS-centric activities to these more data-centric activities.
The performance constraints could also be part of the milestone definition that DM-13073 mentions needs to be developed. (Though LDM-503 may ultimately be the right place only for the DM parts of this cross-subsystem testing milestone.)
An LSE-* document defining the early integration milestones would be a good thing...
At the CCB meeting today we decided that I will submit this requirement text to the upstream LCR.
Suggested requirements text was submitted to LCR on 2018-02-13.
Are these two distinct requirements? One, that we will permanently archive data from these instruments; and second, that we agree to write code to process and analyze the data? Does it have to be automated processing? Are we doing bulk processing and archiving the results?