Fix Version/s: None
Sprint: TSSW Sprint - Sep 16 - Sep 28, TSSW Sprint - Sep 29 - Oct 13, TSSW Sprint - Oct 14 - Oct 27
Team: Telescope and Site
In the future, we'll record EFD data in multiple stores such as InfluxDB, Oracle, Parquet, etc.
Kafka replicates data using connectors. Right now, the decision of which timestamp is used as the InfluxDB time is made in the connector configuration, by a KSQL query.
A similar configuration would be needed in the Oracle connector to define which timestamp field should be indexed.
To avoid making this decision in different connectors and potentially in different deployments of the EFD, the definition of which timestamp to use in the EFD should be done in the SAL Kafka Producer instead.
There are different ways to accomplish this.
The easiest one is to create a new field in the Avro schema called, for example, private_efdStamp, which is a copy of whatever timestamp field we should use. Initially, it can be hardcoded to copy private_sndStamp.
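A minimal sketch of the copy approach, assuming the producer handles each message as a plain Python dict before Avro serialization. The function name add_efd_stamp, the constant EFD_TIMESTAMP_SOURCE, and the sample record are hypothetical and are not taken from the SAL Kafka producer code:

```python
# Hypothetical constant: which existing timestamp feeds private_efdStamp.
# Hardcoded to private_sndStamp initially, as proposed in this ticket.
EFD_TIMESTAMP_SOURCE = "private_sndStamp"


def add_efd_stamp(record, source_field=EFD_TIMESTAMP_SOURCE):
    """Return a copy of record with private_efdStamp set from source_field."""
    if source_field not in record:
        raise KeyError(f"record has no field {source_field!r}")
    stamped = dict(record)  # shallow copy; the caller's record is not modified
    stamped["private_efdStamp"] = record[source_field]
    return stamped


# Made-up sample message, for illustration only.
sample = {"private_sndStamp": 1570000000.5, "temperature": 12.3}
stamped_sample = add_efd_stamp(sample)
```

With this in place, changing the timestamp used by every connector would be a one-line, versioned change to EFD_TIMESTAMP_SOURCE. The Avro schema would also need the extra private_efdStamp field.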
The other way is to use "aliases" in Avro, which create an alternate name private_efdStamp for the timestamp field we want to use.
NOTE: we assume aliases work the way we expect, but that needs to be tested. In the connector's KSQL query, SELECT * FROM mytopic ensures that all fields in mytopic are inserted as fields in InfluxDB with their original names, and WITHTIMESTAMP private_efdStamp uses the alternate name to select the right timestamp as the InfluxDB timestamp.
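As a starting point for that test, here is a pure-Python sketch of the alias rule as described in the Avro specification: aliases are declared in the reader's schema, and a reader field is matched against a writer field carrying either its name or one of its aliases. This only imitates the rule for illustration. It calls no Avro library, and whether a given connector applies aliases at all is exactly what still has to be verified. The function resolve_with_aliases and the sample fields are hypothetical:

```python
def resolve_with_aliases(writer_record, reader_fields):
    """Map a writer's record onto reader field names, honoring reader aliases."""
    resolved = {}
    for field in reader_fields:
        # Try the reader's own field name first, then each declared alias.
        candidates = [field["name"]] + list(field.get("aliases", []))
        for name in candidates:
            if name in writer_record:
                resolved[field["name"]] = writer_record[name]
                break
    return resolved


# Hypothetical reader-side field list: private_efdStamp is declared as the
# field name, with the writer's private_sndStamp listed as its alias.
READER_FIELDS = [
    {"name": "private_efdStamp", "type": "double", "aliases": ["private_sndStamp"]},
    {"name": "temperature", "type": "double"},
]

# Made-up writer record, for illustration only.
written = {"private_sndStamp": 1570000000.5, "temperature": 12.3}
resolved_record = resolve_with_aliases(written, READER_FIELDS)
```

Under this rule the reader sees the value under private_efdStamp only, so the original name private_sndStamp is renamed rather than duplicated. If both names must appear downstream, the copy approach may be the safer choice.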
Either way, the different connectors always use private_efdStamp.
Another benefit of doing this in the SAL Kafka producer code or in the Avro schema is that changes to the code or to the Avro schema are versioned, while changes in the connector configuration are not.
This ticket is not blocking anything at the moment, but having private_efdStamp would make the EFD configuration more reliable.