Status: To Do
Fix Version/s: None
Hsin-Fang reports encountering a FileNotFound error during a run, even though s3CheckFileExists returns True and manually navigating to the faulty Key in the Bucket clearly shows that it exists.
Two potential causes are:
- transient network/connectivity issue
- S3 Eventual Consistency model
The error appeared in Butler.ingest following a Butler.put of the exact same dataset. It is not far-fetched that, due to the S3 consistency model, the Bucket had not yet been "updated" with the newly inserted Key in the time between the Key being written in Butler.put and the Key being checked for existence in Butler.ingest.
In this case, a waiting loop that allows the eventual consistency model to catch up would fix the problem.
In the case of a network/connectivity issue, it would be nicer to raise a more specific error.
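A minimal sketch of such a waiting loop follows. It is not the S3Datastore implementation: the names `wait_until_exists` and `DatasetNotYetVisibleError` are hypothetical, and the existence check is passed in as a callable (for example, a wrapper around s3CheckFileExists) so that the loop itself makes no assumptions about the S3 client. Connectivity failures are re-raised as a ConnectionError so that they are not reported as a missing file.

```python
import time
from typing import Callable


class DatasetNotYetVisibleError(FileNotFoundError):
    """Key did not become visible within the retry budget (hypothetical name)."""


def wait_until_exists(
    exists: Callable[[], bool],
    attempts: int = 5,
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll ``exists`` until it returns True; return the number of checks made.

    Connectivity failures (ConnectionError, TimeoutError) are re-raised with
    context so they are not mistaken for a missing key.
    """
    for attempt in range(1, attempts + 1):
        try:
            if exists():
                return attempt
        except (ConnectionError, TimeoutError) as err:
            raise ConnectionError(
                f"existence check failed on attempt {attempt}: {err}"
            ) from err
        if attempt < attempts:
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
    raise DatasetNotYetVisibleError(
        f"key still not visible after {attempts} checks"
    )
```

The retry count and delays are placeholders; suitable values would depend on how long the Key takes to become visible in practice.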
- relates to
DM-25818 S3Datastore tests existence before writing
I agree that the check for whether the file has appeared should be performed as a follow-up action to put. This can lock the S3Datastore for a little while, but it is the best solution going forward. The ingest functionality already tests whether the file exists; that is the error that is eventually displayed. Looking back at the code, it occurred to me that I could strengthen it a bit, so I cast a wider net just in case.
I haven't had a look yet. Thanks for the input.
Are we doing an existence check before the write? In that case the subsequent read is only eventually consistent, and a wait may be necessary. If so, it may be better to wait in the put method until the dataset is known to exist. If we're not doing an existence check, a get after a put should be read-after-write consistent. A more detailed explanation can be found at, e.g., https://codeburst.io/quick-explanation-of-the-s3-consistency-model-6c9f325e3f82
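Waiting inside put could be sketched roughly as below, assuming a boto3 S3 client is passed in as `client`. The function name `put_and_wait` and its parameters are hypothetical and not part of S3Datastore; `put_object`, `get_waiter("object_exists")`, and the `WaiterConfig` keys are standard boto3 calls. The sketch deliberately performs no existence check before the write.

```python
from typing import Any


def put_and_wait(
    client: Any,
    bucket: str,
    key: str,
    body: bytes,
    delay: int = 1,
    max_attempts: int = 10,
) -> None:
    """Upload ``body`` and block until the key is visible (hypothetical helper).

    ``client`` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    # Write first, with no prior existence check on the key.
    client.put_object(Bucket=bucket, Key=key, Body=body)

    # The "object_exists" waiter polls HEAD requests on the key until one succeeds.
    waiter = client.get_waiter("object_exists")
    waiter.wait(
        Bucket=bucket,
        Key=key,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
```

If the Key does not appear within `max_attempts` polls, the boto3 waiter raises botocore.exceptions.WaiterError, which the caller could translate into a more specific error.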