Commissioning Activity Planning / CAP-816

High-Frequency Data Array Timestamp Definition


Details

    • Type: Task
    • Status: To Do
    • Priority: Blocker
    • Resolution: Unresolved

    Description

      Most (or all) data arrays carry only a single "time of read" timestamp (such as cRIO_timestamp). We apparently need to establish a better way of defining how to unpack these sequential data arrays to reconstruct the time-stream telemetry (for example, to rendezvous with other EFD telemetry). It can certainly be done by hand, but we need this to happen with minimal effort for most end users.
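      A minimal sketch of the by-hand unpacking described above, assuming the topic publishes one cRIO_timestamp per sample plus fixed-length value arrays at a known, uniform internal sample rate. The function and field handling here are illustrative only, not an existing utility; the choice of which element the single timestamp refers to is exactly the ambiguity this ticket asks to resolve.

      import numpy as np
      import pandas as pd

      def unpack_packed_sample(values, crio_timestamp, sample_rate_hz):
          """Expand one packed telemetry sample into a per-element time series.

          values          -- the packed array of measurements from one DDS sample
          crio_timestamp  -- the single "time of read" published with the sample (seconds)
          sample_rate_hz  -- assumed uniform internal sampling rate of the array
          """
          values = np.asarray(values, dtype=float)
          # Assume the timestamp refers to the first element and the remaining
          # elements follow at 1/sample_rate_hz intervals.
          offsets = np.arange(values.size) / sample_rate_hz
          times = pd.to_datetime(crio_timestamp + offsets, unit="s", utc=True)
          return pd.Series(values, index=times)

      # Example: a 100-element array sampled at 100 Hz spans one second of telemetry.
      series = unpack_packed_sample(np.random.rand(100), 1_600_000_000.0, 100.0)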

    Activity

            dmills Dave Mills added a comment - https://ts-xml.lsst.io/v/tickets-dm-20234/sal_constraints_and_recommendations.html#timestamps has the naming conventions for timestamp arrays

            bstalder Brian Stalder added a comment - So these aren't required? This seems to be the root problem.
            cslage Craig Lage added a comment - FYI, I have a technote at https://sitcomtn-018.lsst.io/ which attempts to summarize the current status.

            afausti Angelo Fausti added a comment -

            From the EFD perspective, it would be much better if we could avoid packed data.

            Packed data is bad because it cannot be stored properly in InfluxDB: the arrays get unpacked into individual fields and all the values share the same timestamp, which makes it impossible to visualize the time series in Chronograf, for example.

            It is also bad for fine-tuning EFD performance. When sending data to InfluxDB you can configure Kafka to send it in batches by setting a batch size, and packed data is essentially already batching DDS samples.
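            To make the storage problem concrete, here is a sketch of how one packed DDS sample lands in InfluxDB versus the per-point records a time-series plot actually needs. Field names and values are invented for illustration.

            # Illustrative only: field names and values are made up.

            # One packed sample as stored today: 50 values unpacked into indexed
            # fields, all sharing the single sample timestamp.
            packed_record = {
                "time": "2021-11-08T12:00:00Z",
                **{f"accelerationX{i}": 0.01 * i for i in range(50)},
                "cRIO_timestamp": 1636372800.0,
            }

            # What a time-series visualization needs instead: one record per point,
            # each carrying its own timestamp.
            unpacked_records = [
                {"time": 1636372800.0 + i / 1000.0, "accelerationX": 0.01 * i}
                for i in range(50)
            ]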

            bstalder Brian Stalder added a comment - afausti My understanding is that packing was introduced to avoid sending high-frequency data as it was being collected, because we had issues with writing to the EFD. Is there an alternative to this array packing?

            krughoff Simon Krughoff (Inactive) added a comment - edited - bstalder We should talk about this at the meeting tomorrow, but my understanding was that this was put in place before the current EFD design was implemented. I'm not sure if it was because of the original EFD design or because of concerns about DDS. I don't think the Kafka->InfluxDB design imposed the packing.

            pkubanek Petr Kubanek added a comment - My understanding is that VMS produces packed data primarily due to SAL/DDS concerns. I can run tests, but given the SAL/DDS overhead, it's hard to believe that producing 1000 (1 kHz) data messages instead of 20 (packed by 50) longer messages would not be felt on SAL/DDS. As I advertised, VMS comes with Python code to unpack the data and store it unpacked in HDF5 files (and display it). If you can introduce that code somewhere in the EFD ingestors, that would allow us to produce packed data yet still have correct single-record data with timestamps in the EFD.
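            A rough sketch of what such unpacking into HDF5 could look like. This is not the actual VMS code, just an h5py illustration under the same single-timestamp-plus-known-rate assumption used above.

            import h5py
            import numpy as np

            def append_unpacked(h5_path, values, crio_timestamp, sample_rate_hz):
                """Append one unpacked VMS-style sample (values + per-point times) to an HDF5 file."""
                values = np.asarray(values, dtype=float)
                times = crio_timestamp + np.arange(values.size) / sample_rate_hz
                with h5py.File(h5_path, "a") as f:
                    for name, data in (("timestamp", times), ("value", values)):
                        if name not in f:
                            # Create a resizable 1-D dataset on first write.
                            f.create_dataset(name, data=data, maxshape=(None,), chunks=True)
                        else:
                            ds = f[name]
                            ds.resize(ds.shape[0] + data.size, axis=0)
                            ds[-data.size:] = data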
            dmills Dave Mills added a comment -

            Yes, if you send DDS data in a one-value-plus-timestamp topic for high-rate telemetry, you would incur roughly 54 bytes of per-packet overhead to send 16 bytes, assuming your data is double precision. That is why there is a spec for how to carry arrays and their associated timestamps in a single topic: to avoid that on-the-wire inefficiency. The producers could automatically unpack this for InfluxDB if we stick to a standard naming convention.
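            The back-of-the-envelope arithmetic behind that point, taking the quoted ~54-byte per-message overhead and a 16-byte payload (one double value plus one double timestamp) at face value:

            OVERHEAD_BYTES = 54   # approximate per-DDS-message overhead quoted above
            SAMPLE_BYTES = 16     # one double value + one double timestamp
            RATE_HZ = 1000        # 1 kHz telemetry
            PACK = 50             # samples per packed message

            unpacked_bps = RATE_HZ * (OVERHEAD_BYTES + SAMPLE_BYTES)                  # 70,000 B/s
            packed_bps = (RATE_HZ // PACK) * (OVERHEAD_BYTES + PACK * SAMPLE_BYTES)   # 17,080 B/s

            print(f"unpacked: {unpacked_bps} B/s, packed: {packed_bps} B/s")
            # Roughly 4x less on-the-wire traffic, and 50x fewer messages per second.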
            mareuter Michael Reuter added a comment - edited

            What this means is that the cRIO_timestamp attribute needs to be a packed array like the other data. This means a change to the interface XML. Then, once the interface is available, the LabVIEW code needs to be changed to provide a measurement time for each data point in the packed array. I'd recommend this for the AT-related data that uses the packed arrays.
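            Once each packed topic carries a timestamp array alongside its value arrays, unpacking no longer needs an assumed sample rate. A hedged sketch of what that could look like (names illustrative, not an existing utility):

            import numpy as np
            import pandas as pd

            def unpack_with_timestamp_array(values, timestamps):
                """Pair each packed value with its own measurement time.

                Both arguments are equal-length arrays taken from a single DDS sample,
                e.g. a packed encoder array and a packed cRIO_timestamp array.
                """
                values = np.asarray(values, dtype=float)
                timestamps = np.asarray(timestamps, dtype=float)
                if values.shape != timestamps.shape:
                    raise ValueError("value and timestamp arrays must have the same length")
                index = pd.to_datetime(timestamps, unit="s", utc=True)
                return pd.Series(values, index=index)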

            pkubanek Petr Kubanek added a comment - The C++ code needs to be changed for VMS. I can do that.
            cslage Craig Lage added a comment -

            I did a little more work on this. On the AuxTel, there is lsst.sal.ATMCS.logevent_target data, which is non-packed data giving the target the mount is aiming for. We can compare this to the packed lsst.sal.ATMCS.mount_AzEl_Encoders data and ask how much we have to shift the packed data so that it lines up with the unpacked target. The answer is that we have to shift the packed data about 1.67 seconds earlier. Then everything lines up nicely, at least within the error I can see. See the plots below, made by this notebook: https://github.com/craiglagegit/ScratchStuff/blob/master/notebooks/Plot_Tracking_UTC_08Nov21.ipynb
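            For reference, the ad-hoc correction described above amounts to shifting the reconstructed packed-data index by a fixed amount. A sketch of that shift; the 1.67 s value is the empirical offset quoted in the comment, not a defined constant.

            import pandas as pd

            # `encoders` is a Series/DataFrame of unpacked mount_AzEl_Encoders data
            # indexed by the naive per-element timestamps; shift it ~1.67 s earlier
            # so it lines up with the (non-packed) logevent_target data.
            EMPIRICAL_OFFSET = pd.Timedelta(seconds=1.67)

            def align_packed_to_target(encoders):
                shifted = encoders.copy()
                shifted.index = shifted.index - EMPIRICAL_OFFSET
                return shifted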

            krughoff Simon Krughoff (Inactive) added a comment - I just realized this is assigned to me. I still don't think this should be the responsibility of the lsst_efd_client package since it's inherent to the system, not a data handling issue. I guess I could add some mechanism to include an offset factor per topic, or something, but I would be very uncomfortable defaulting that.
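            If such a mechanism were added, it would presumably look something like a per-topic offset table applied at query time. This is a purely hypothetical sketch, not an existing lsst_efd_client feature.

            import pandas as pd

            # Hypothetical per-topic time offsets (seconds), to be maintained by the
            # relevant teams rather than defaulted by the client.
            TOPIC_OFFSETS = {
                "lsst.sal.ATMCS.mount_AzEl_Encoders": -1.67,
            }

            def apply_topic_offset(df, topic):
                """Shift a queried DataFrame's DatetimeIndex by the configured offset, if any."""
                offset = TOPIC_OFFSETS.get(topic, 0.0)
                if offset:
                    df = df.copy()
                    df.index = df.index + pd.Timedelta(seconds=offset)
                return df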
            cslage Craig Lage added a comment - I agree. I thought the agreed-upon fix was to add a timestamp for each datapoint in the packed array, as mareuter suggested above.

            People

              Assignee: krughoff Simon Krughoff (Inactive)
              Reporter: bstalder Brian Stalder
              Watchers: 9 (Angelo Fausti, Brian Stalder, Craig Lage, Dave Mills, Michael Reuter, Petr Kubanek, Robert Lupton, Simon Krughoff (Inactive), Stratejos Slack Integration bot)
              Votes: 0

