Data Management / DM-13072

New requirement to explicitly deal with CCOB data


    Details

    • Team:
      Architecture

      Description

      Informally we have agreed to handle data coming from the Camera Calibration Optical Bench (CCOB) as part of early integration. To make this more formal we need to place a requirement on DM, in LSE-61, to be capable of archiving, and performing meaningful processing on, data from Camera I&T data collection on the CCOB.


            Activity

            tjenness Tim Jenness added a comment -

            Are these two distinct requirements? One, that we will permanently archive data from these instruments; and two, that we agree to write code to process and analyze the data? Does it have to be automated processing? Are we doing bulk processing and archiving the results?

            gpdf Gregory Dubois-Felsmann added a comment -

            I don't think it has to be automated in the full sense of, say, Operations-era Prompt Processing.

            The real action here is for someone (from Science Pipelines) to configure an appropriate pipeline, and someone from a "DM Early Operations" group (whatever that means in practice) to run that pipeline fairly routinely on data from the CCOB.

            I'd be surprised if it were appropriate to process all CCOB data; most likely there will only be certain configurations that are worth processing. Some input from Robert Lupton might be helpful to sharpen this up.

            rhl Robert Lupton added a comment -

            The requirement that I'm interested in is that we can run DM analysis code on archived CCOB data. This requires that it appear somewhere (NCSA?) and be properly ingested into a repository; that we be able to instantiate a butler that points to that repository; and that DM be able to process those data in a way that could be automated (i.e., currently by issuing a command from the shell that processes all or a subset of the data; soon, I presume, with the processing coordinated by SuperTask).

            Initially the processing will be running the DM-stack-enabled version of the Camera eotest scripts that Merlin Fisher-Levine has been working on.
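
            For concreteness, a minimal sketch of the "instantiate a butler that points to that repository" step, using the current Butler API (lsst.daf.persistence); the repository path and dataId keys are placeholders rather than anything agreed:

                # Minimal sketch, assuming the CCOB data have already been ingested
                # into a Butler repository at a hypothetical path; the dataId keys
                # are placeholders for whatever the eventual obs_* package defines.
                from lsst.daf.persistence import Butler

                # Instantiate a butler pointing at the repository of ingested CCOB data.
                butler = Butler("/datasets/ccob/repo")  # hypothetical path

                # Enumerate the available raw frames and fetch them for processing,
                # much as a shell-level command-line task would do internally.
                for visit in butler.queryMetadata("raw", ["visit"]):
                    raw = butler.get("raw", visit=visit)
                    # ... run the eotest-based analysis on `raw` here ...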

             

            tjenness Tim Jenness added a comment -

            Something like this?

            DMS-REQ-????? Archiving of CCOB data

            Specification: The Archive Centre shall be capable of archiving CCOB data and making it available to Science Pipelines users via the standard data access interfaces.

            DMS-REQ-????? Processing of CCOB data

            Specification: The Science Pipelines shall be able to process CCOB data in a similar manner to standard LSST data.

            ktl Kian-Tat Lim added a comment -

            I think something needs to be said about timelines; this is a requirement on the DM System prior to the completion of Construction.

            With regard to the second requirement, "in a similar manner" is pretty vague, and "Science Pipelines" is unclear about whether this is code or people. I'd say that as long as the data is made available according to the standard data access interfaces, nothing more can usefully be (or need be) said about processing.

            ktl Kian-Tat Lim added a comment -

            On the other hand, if particular desired outputs can be specified, that would be suitable for a second requirement.

            tjenness Tim Jenness added a comment -

            Do you mean timeline in the sense of how long it takes from acquiring the data in eTraveler until it's available in the data backbone? (OK, without us saying it will be in the data backbone.) Also, are the data permanently archived in the Archive Center once they arrive, or do they get purged? Is there a plan for the entire contents of eTraveler to be migrated to the data backbone before operations start?

            rhl Robert Lupton added a comment -

            We (== LSST) need to decide how we represent the data that's currently in eTraveler at Cerro Pachon. I'd include this decision or product in this or another requirement.

            To the question: I don't think that all eTraveler data need necessarily be ingested for DM's use. However, we do need to get at the data with low latency (this is needed for the auxTel too); if we take auxTel data, it needs to be available within a small number of seconds (1s? 2s?). The same latency should be supported for CCOB, subject to network limitations.

             

            ktl Kian-Tat Lim added a comment -

            No, by timeline I meant that this is not only a requirement on the fully-constructed DM system, like almost all of our other requirements, but instead this is a requirement on a during-Construction version of the DM system.

            During Construction, the permanent archive for CCOB data rests with the Camera.  At or before the start of Operations, this must be transferred, along with whatever eTraveler contents must be preserved, to DM.

            tjenness Tim Jenness added a comment -

            There seems to be a big difference between “transfer this bunch of data from eTraveler to NCSA” and “annotate these files for transfer as they are acquired and ingest them within 1 minute”.

            I don't think the eTraveler situation during operations is specified anywhere yet. Are you talking about eTraveler being used at the summit when the camera first arrives but before it's installed on the telescope for commissioning? Is AuxTel relevant to this ticket?

            rhl Robert Lupton added a comment -

            I tried to indicate that I didn't literally mean the current eTraveler, but I could have been clearer. I meant that we need to get at the information that (e.g. for TS8) is in the data store and indexed in eTraveler with very low latency.

            We should be treating photo-lsstCam data taken with the CCOB as if it were lsstCam or auxTel data taken on the mountain, and there we will need immediate access to the data.  So we're going to have to face this problem at some point, and I think we should face it now.

            I would be unhappy with a solution that works for CCOB with a 60s delay before DM can see the data.  This would strongly discourage the camera from ever learning to use the LSST Stack.

            tjenness Tim Jenness added a comment -

            After talking to Gregory Dubois-Felsmann and Kian-Tat Lim it's not clear that anything in LSE-61 can constrain camera testing. A DM requirement along the lines of:

            DM shall be able to archive a designated subset of Camera test data and make it available in an environment matching the data backbone interfaces.

            sounds reasonable and does not involve the Camera. We can write DM code to get the files from eTraveler, DM code to transfer them, and DM code to ingest them, but we aren't requiring the Camera to use an OCS and the official Archiver.

            rhl Robert Lupton added a comment -

            I understand that DM can't constrain the rest of the project, although people in DM can try! Shouldn't we add a latency requirement to Tim's text in the previous comment? We need as low a latency as possible given Camera deliverables, and isn't that something that we can require?

            tjenness Tim Jenness added a comment -

            This then becomes a requirement for a data detection system to be running. This can't be the normal Archiver because there is no OCS running. We would need to write new code that finds the data in eTraveler as it arrives and triggers the transfer. Kian-Tat Lim may want to comment on the feasibility of doing that.
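
            To make the shape of that new code concrete, here is a purely illustrative sketch (not an existing DM service); the staging path, polling interval, and transfer step are all assumptions:

                # Illustrative only: poll a staging area that mirrors the eTraveler
                # data store and hand any newly appearing FITS files to a
                # transfer/ingest step. Paths, the interval, and the transfer
                # function are placeholders, not an agreed design.
                import time
                from pathlib import Path

                STAGING = Path("/staging/ccob")  # hypothetical mirror of the data store
                POLL_SECONDS = 5

                def transfer_and_ingest(path):
                    """Placeholder for the DM-written transfer and ingest step."""
                    print("would transfer and ingest", path)

                def watch():
                    seen = set()
                    while True:
                        for fits_file in STAGING.glob("**/*.fits"):
                            if fits_file not in seen:
                                seen.add(fits_file)
                                transfer_and_ingest(fits_file)
                        time.sleep(POLL_SECONDS)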

            gpdf Gregory Dubois-Felsmann added a comment - - edited

            I think I would avoid using LSE-61 to start specifying performance and implied architectural constraints on behaviors of temporary data transfer mechanisms that should exist only during Construction / Early Integration. We're not using LSE-61 in any other place (as far as I can recall at the moment) to constrain phasing like that.

            It makes sense to require in LSE-61 that DM be able to ingest, Butlerize, and interpret (i.e., with a suitable obs_* package) Construction-era CCOB data.  Performance and design of the mechanisms that support that now should be documented as part of definition of Early Integration activities, I think - expanding their scope beyond OCS-centric activities to these more data-centric activities.
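
            As a sketch of what "ingest, Butlerize, and interpret" would mean in practice once a suitable obs_* package exists (the package, repository path, and dataId keys here are hypothetical):

                # Assumes the raw CCOB files have been ingested into a Butler
                # repository and that a hypothetical obs_* package describes the
                # CCOB camera; names and dataId keys are placeholders.
                from lsst.daf.persistence import Butler

                butler = Butler("/repo/ccob")                 # hypothetical repository
                camera = butler.get("camera")                 # geometry from the obs_* package
                raw = butler.get("raw", visit=1, detector=0)  # placeholder dataId
                print(camera.getName(), raw.getDetector().getName())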

            gpdf Gregory Dubois-Felsmann added a comment -

            The performance constraints could also be part of the milestone definition that DM-13073 mentions needs to be developed.  (Though LDM-503 may ultimately be the right place only for the DM parts of this cross-subsystem testing milestone.)

            An LSE-* document defining the early integration milestones would be a good thing...

            tjenness Tim Jenness added a comment -

            At the CCB meeting today we decided that I will submit this requirement text to the upstream LCR.

            tjenness Tim Jenness added a comment -

            Suggested requirements text was submitted to LCR on 2018-02-13.


              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              womullan Wil O'Mullane
              Reviewers:
              Robert Lupton
              Watchers:
              Gregory Dubois-Felsmann, Kian-Tat Lim, Lauren MacArthur, Merlin Fisher-Levine, Ranpal Gill, Robert Lupton, Tim Jenness, Wil O'Mullane

