Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: ts_middleware
- Labels:
- Story Points: 0
- Sprint: TSSW Sprint - Sep 26 - Oct 10
- Team: Telescope and Site
- Urgent?: No
Description
For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. What happens if a Kafka Consumer is computing topic offsets to read historical data while the Kafka broker is purging data? Clearly the Consumer may then specify offsets for data that no longer exists. So the question is: what happens if a Consumer specifies a topic offset for which no data is available?
I did some prototype testing and research and found the following:
By default the Consumer ignores all historical data and only reads new data, which is not at all what we want. However, the Consumer option "auto.offset.reset": "earliest" makes the Consumer read from the oldest available data, which is exactly what we want.
So: add that option when constructing the Consumer, and also document each option so readers understand why it is present.
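As a rough illustration of the request above, a documented Consumer configuration might look like the following sketch. It only builds the configuration dict. The helper name make_consumer_config, the broker address, and every option other than "auto.offset.reset" are assumptions made for the example, not the options ts_salobj actually sets.

```python
import uuid


def make_consumer_config(broker_addr: str) -> dict[str, str]:
    """Build a Kafka Consumer configuration dict (hypothetical helper)."""
    return {
        # Assumed for illustration: address of the Kafka broker, as "host:port".
        "bootstrap.servers": broker_addr,
        # Assumed for illustration: a unique consumer group per reader, so that
        # no committed offsets are shared with any other reader.
        "group.id": str(uuid.uuid4()),
        # From this ticket: if the requested offset has no data (for example
        # because the broker purged it), fall back to the oldest available
        # data instead of skipping ahead to new data only.
        "auto.offset.reset": "earliest",
    }


config = make_consumer_config("localhost:9092")
```

With the confluent_kafka package installed, a dict like this would typically be passed straight to the constructor, as in confluent_kafka.Consumer(config).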
Activity
Field | Original Value | New Value |
---|---|---|
Description | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer seeks back to a position for which data is no longer available. I did some testing and found that the default behavior is to start reading newest data if there is no data at the specified offset. If we change that to reading the oldest data, then everything works as desired (based on some tests). To do this, specify option "auto.offset.reset": "earliest" when constructing the Consumer. Also, document that option and the others being specified, so readers know why each option is present. | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer seeks back to a position for which data is no longer available. I did some research and testing and found that the default behavior is to start reading newest data if there is no data at the specified offset. If we change that to reading the oldest data, then everything works as desired (based on some tests). To do this, specify option "auto.offset.reset": "earliest" when constructing the Consumer. Also, document that option and the others being specified, so readers know why each option is present. |
Status | To Do [ 10001 ] | In Progress [ 3 ] |
Reviewers | Eric Coughlin [ ecoughlin ] | |
Status | In Progress [ 3 ] | In Review [ 10004 ] |
Story Points | 1 | 0 |
Description | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer seeks back to a position for which data is no longer available. I did some research and testing and found that the default behavior is to start reading newest data if there is no data at the specified offset. If we change that to reading the oldest data, then everything works as desired (based on some tests). To do this, specify option "auto.offset.reset": "earliest" when constructing the Consumer. Also, document that option and the others being specified, so readers know why each option is present. | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer is computing topic offsets to read historical data while the Kafka broker is purging data? It is clear that the Consumer may specify offsets to data that no longer exists. The question is: what happens if a Consumer specifies a topic offset for which no data is available? I did some prototype testing and research and found the following: The default behavior is to ignore all historical data and only read new data, which is not at all what we want. However, Consumer option "auto.offset.reset": "earliest" will make the consumer read the oldest data, which is exactly what we want. In addition to adding that option when constructing the consumer, also document each option, so readers understand why it is present. |
Description | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer is computing topic offsets to read historical data while the Kafka broker is purging data? It is clear that the Consumer may specify offsets to data that no longer exists. The question is: what happens if a Consumer specifies a topic offset for which no data is available? I did some prototype testing and research and found the following: The default behavior is to ignore all historical data and only read new data, which is not at all what we want. However, Consumer option "auto.offset.reset": "earliest" will make the consumer read the oldest data, which is exactly what we want. In addition to adding that option when constructing the consumer, also document each option, so readers understand why it is present. | For some time I have been concerned about a potential race condition in the way Kafka ts_salobj gets historical data. The issue is what happens if a Kafka Consumer is computing topic offsets to read historical data while the Kafka broker is purging data? It is clear that the Consumer may specify offsets to data that no longer exists. The question is: what happens if a Consumer specifies a topic offset for which no data is available? I did some prototype testing and research and found the following: The default behavior is to ignore all historical data and only read new data, which is not at all what we want. However, Consumer option "auto.offset.reset": "earliest" will make the consumer read the oldest data, which is exactly what we want. So: add that option when constructing the consumer, and also document each option so readers understand why it is present. |
Status | In Review [ 10004 ] | Reviewed [ 10101 ] |
Resolution | Done [ 10000 ] | |
Status | Reviewed [ 10101 ] | Done [ 10002 ] |
Pull request: https://github.com/lsst-ts/ts_salobj/pull/255