Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: ts_auxiliary_telescope
- Labels:
- Story Points: 0
- Epic Link:
- Sprint: TSSW Sprint - Sep 16 - Sep 28, TSSW Sprint - Sep 29 - Oct 13, TSSW Sprint - Oct 14 - Oct 27
- Team: Telescope and Site
Description
In the future, we'll record EFD data in multiple stores such as InfluxDB, Oracle, Parquet, etc.
Kafka replicates data using connectors. Right now, the decision of which timestamp is used as the InfluxDB time is made in the connector configuration, by a KSQL query:
INSERT INTO mytopic SELECT * FROM mytopic WITHTIMESTAMP private_sndStamp
A similar configuration would be done in the Oracle connector to define which timestamp field should be indexed.
To avoid making this decision in different connectors and potentially in different deployments of the EFD, the definition of which timestamp to use in the EFD should be done in the SAL Kafka Producer instead.
There are different ways to accomplish this.
The easiest one is to create a new field in the Avro schema called, for example, private_efdStamp which is a copy of whatever timestamp field we should use. Initially, that can be hardcoded to use private_sndStamp.
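A minimal sketch of this first option, assuming the schema and the message are available as plain Python dicts before serialization. The helper names `add_efd_stamp_field` and `add_efd_stamp` are hypothetical, as is the `double` field type used in the usage note; this is not the actual SAL Kafka producer code.

```python
import copy

# Hardcoded initially, as proposed in the description.
EFD_STAMP_SOURCE = "private_sndStamp"
EFD_STAMP_FIELD = "private_efdStamp"


def add_efd_stamp_field(schema: dict) -> dict:
    """Return a copy of an Avro record schema with private_efdStamp appended.

    The new field reuses the type of the source timestamp field.
    """
    new_schema = copy.deepcopy(schema)
    source = next(f for f in new_schema["fields"] if f["name"] == EFD_STAMP_SOURCE)
    new_schema["fields"].append({"name": EFD_STAMP_FIELD, "type": source["type"]})
    return new_schema


def add_efd_stamp(record: dict) -> dict:
    """Return a copy of a message with private_efdStamp set from the source field."""
    out = dict(record)
    out[EFD_STAMP_FIELD] = record[EFD_STAMP_SOURCE]
    return out
```

For example, `add_efd_stamp({"private_sndStamp": 1.5, "private_rcvStamp": 2.5})` returns the same message with `private_efdStamp` equal to `1.5`. Changing which timestamp the EFD uses then means changing `EFD_STAMP_SOURCE` in one versioned place.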
The other way is making use of "aliases" in Avro:
https://avro.apache.org/docs/1.8.1/spec.html#Aliases
that create an alternate name private_efdStamp for the timestamp field we want to use.
NOTE: we assume aliases work the way we expect, but that needs to be tested. In the KSQL query above, SELECT * FROM mytopic ensures that all fields in mytopic are inserted as fields in InfluxDB with their original names, and WITHTIMESTAMP private_efdStamp would use the alternate name to select the right timestamp as the InfluxDB timestamp.
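A sketch of what the alias option could look like in the schema, built here as a Python dict and dumped to JSON. The record name, the two fields shown, and the `double` type are assumptions for illustration, and `make_schema_with_alias` is a hypothetical helper. Note that the Avro specification describes aliases as something an implementation may optionally use when mapping a writer's schema to a reader's schema, so whether a given connector resolves `private_efdStamp` this way is exactly the untested assumption in the note above.

```python
import json


def make_schema_with_alias(topic: str, stamp_field: str = "private_sndStamp") -> dict:
    """Build an illustrative Avro record schema with an alias on one timestamp field."""
    fields = [
        {"name": "private_sndStamp", "type": "double"},  # type is an assumption
        {"name": "private_rcvStamp", "type": "double"},  # type is an assumption
    ]
    for field in fields:
        if field["name"] == stamp_field:
            # "aliases" is the attribute defined by the Avro spec for fields.
            field["aliases"] = ["private_efdStamp"]
    return {"type": "record", "name": topic, "fields": fields}


schema_json = json.dumps(make_schema_with_alias("mytopic"), indent=2)
```

With this approach the message payload is unchanged; only the schema carries the extra name.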
Either way, the different connectors always use private_efdStamp.
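To illustrate the point, the connector query no longer depends on which underlying timestamp was chosen. The sketch below generates the query from the description for any topic; `connector_query` is a hypothetical helper, and the query syntax is simply copied from the example in this ticket.

```python
# The one name every connector would reference, regardless of the source field.
EFD_TIMESTAMP_FIELD = "private_efdStamp"


def connector_query(topic: str) -> str:
    """Return the InfluxDB connector query for a topic, using private_efdStamp."""
    return (
        f"INSERT INTO {topic} SELECT * FROM {topic} "
        f"WITHTIMESTAMP {EFD_TIMESTAMP_FIELD}"
    )
```

For `mytopic` this yields `INSERT INTO mytopic SELECT * FROM mytopic WITHTIMESTAMP private_efdStamp`.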
Another benefit of doing this in the SAL Kafka producer code or in the Avro schema is that changes to the code or to the Avro schema are versioned, while changes in the connector configuration are not.
This ticket is not blocking anything at the moment, but having private_efdStamp would make the EFD configuration more reliable.
For the record: I am willing to use aliases, if that works. I will produce a ticket branch with an alias so we can try that. But I am not at all convinced we should bother. private_sndStamp is the only field that makes any sense to use in the long run. The main reason I see to make this configurable is if we wanted to use private_rcvStamp in the short term in order to reliably get TAI. But Patrick Ingraham prefers that we use private_sndStamp now and fix the 37-second error as soon as practical.
In other words: my personal preference would be to not merge this work.