Data Management / DM-21419

The definition of which timestamp the EFD uses should be done by the SAL Kafka producer

    Details

    • Story Points:
      0
    • Sprint:
      TSSW Sprint - Sep 16 - Sep 28, TSSW Sprint - Sep 29 - Oct 13, TSSW Sprint - Oct 14 - Oct 27
    • Team:
      Telescope and Site

      Description

      In the future, we'll record EFD data in multiple stores such as InfluxDB, Oracle, Parquet, etc.

      Kafka replicates data using connectors. Right now, the decision of which timestamp is used as the InfluxDB time is made in the connector configuration, by a KSQL query:

      INSERT INTO mytopic SELECT * FROM mytopic WITHTIMESTAMP private_sndStamp
      

      A similar configuration would be needed in the Oracle connector to define which timestamp field should be indexed.

      To avoid making this decision separately in each connector, and potentially in each deployment of the EFD, the choice of which timestamp the EFD uses should be made in the SAL Kafka producer instead.

      There are different ways to accomplish this.

      The easiest one is to create a new field in the Avro schema called, for example, private_efdStamp, which is a copy of whatever timestamp field we should use. Initially, it can be hardcoded to copy private_sndStamp.
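
A minimal sketch of the copied-field option, assuming schemas and messages are handled as plain Python dicts. The helper names, the example schema, and the "double" field type are hypothetical and are not taken from the SAL Kafka producer code.

```python
import copy

# Assumption: for now the EFD timestamp is a copy of private_sndStamp.
EFD_STAMP_SOURCE = "private_sndStamp"
EFD_STAMP_FIELD = "private_efdStamp"


def add_efd_stamp_to_schema(schema: dict) -> dict:
    """Return a copy of an Avro record schema with private_efdStamp added."""
    new_schema = copy.deepcopy(schema)
    names = {field["name"] for field in new_schema["fields"]}
    if EFD_STAMP_FIELD not in names:
        # Reuse the type of the source field so the two always agree.
        source = next(
            f for f in new_schema["fields"] if f["name"] == EFD_STAMP_SOURCE
        )
        new_schema["fields"].append(
            {"name": EFD_STAMP_FIELD, "type": source["type"]}
        )
    return new_schema


def add_efd_stamp_to_message(message: dict) -> dict:
    """Return a copy of a message dict with private_efdStamp filled in."""
    return {**message, EFD_STAMP_FIELD: message[EFD_STAMP_SOURCE]}


# Hypothetical topic schema and message, for illustration only.
schema = {
    "type": "record",
    "name": "mytopic",
    "fields": [
        {"name": "private_sndStamp", "type": "double"},
        {"name": "value", "type": "double"},
    ],
}
message = {"private_sndStamp": 1569000000.5, "value": 1.0}

efd_schema = add_efd_stamp_to_schema(schema)
efd_message = add_efd_stamp_to_message(message)
```

Changing which timestamp the EFD uses would then mean changing EFD_STAMP_SOURCE in one place, with the connectors left untouched.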

      The other way is making use of "aliases" in Avro:

      https://avro.apache.org/docs/1.8.1/spec.html#Aliases

      which would create an alternate name, private_efdStamp, for the timestamp field we want to use.

      NOTE: we assume aliases work the way we expect, but that needs to be tested. In the above KSQL query, SELECT * FROM mytopic ensures that all fields in mytopic are inserted as fields in InfluxDB with their original names, and WITHTIMESTAMP private_efdStamp would use the alternate name to select the right timestamp as the InfluxDB timestamp.
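
A sketch of how the alias option could be declared, assuming the schema is built as a Python dict and serialized to JSON. The example schema and the resolve_field_name helper are hypothetical; the helper only illustrates the intended name mapping. In the Avro specification, aliases are applied when a reader resolves its schema against the writer's, so whether a connector accepts an alias in WITHTIMESTAMP is exactly the part that needs testing.

```python
import json

# Hypothetical topic schema: private_efdStamp is declared as an alias of
# private_sndStamp instead of being a separate field.
schema = {
    "type": "record",
    "name": "mytopic",
    "fields": [
        {
            "name": "private_sndStamp",
            "type": "double",
            "aliases": ["private_efdStamp"],
        },
        {"name": "value", "type": "double"},
    ],
}


def resolve_field_name(schema: dict, name: str) -> str:
    """Return the real field name for a field name or one of its aliases."""
    for field in schema["fields"]:
        if name == field["name"] or name in field.get("aliases", []):
            return field["name"]
    raise KeyError(name)


# JSON form of the schema, as it would be registered.
schema_json = json.dumps(schema, indent=2)
efd_stamp_source = resolve_field_name(schema, "private_efdStamp")
```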

      Either way, the different connectors always use private_efdStamp.
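
With a fixed field name, the connector query no longer depends on the topic's choice of timestamp and could be generated per topic. A small sketch, assuming the query syntax shown earlier in this ticket; the helper and the topic names are hypothetical.

```python
EFD_STAMP_FIELD = "private_efdStamp"


def efd_insert_query(topic: str) -> str:
    """Build the connector query for one topic, always using private_efdStamp."""
    return (
        f"INSERT INTO {topic} SELECT * FROM {topic} "
        f"WITHTIMESTAMP {EFD_STAMP_FIELD}"
    )


# Hypothetical topic names, for illustration only.
topics = ["mytopic", "othertopic"]
queries = [efd_insert_query(topic) for topic in topics]
```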

      Another benefit of doing this in the SAL Kafka producer code or in the Avro schema is that changes to the code or to the Avro schema are versioned, while changes in the connector configuration are not.

      This ticket is not blocking anything at the moment, but having private_efdStamp would make the EFD configuration more reliable.

              People

              • Assignee:
                rowen Russell Owen
                Reporter:
                afausti Angelo Fausti
                Reviewers:
                Angelo Fausti
                Watchers:
                Angelo Fausti, Frossie Economou, Patrick Ingraham, Russell Owen, Simon Krughoff