Is it possible for a datastore.trash to happen on this dataset, and then for a new put to occur at the same time that some other process is running emptyTrash (which could remove a file that was just written)? This requires that the new put and the old dataset share the same dataId and collection.
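To make the interleaving concrete, here is a toy sketch (not real daf_butler calls; FakeDatastore and its methods are purely illustrative) of how an emptyTrash that started before the re-put can delete the file the re-put just wrote:

```python
# Illustrative sketch only -- NOT the daf_butler API. It just shows why the
# interleaving is dangerous when the new put reuses the same dataId/collection
# (and therefore the same filename from the template).
from pathlib import Path
import tempfile


class FakeDatastore:
    """Toy datastore: one file per (collection, dataId), plus a trash list."""

    def __init__(self, root: Path):
        self.root = root
        self.trashed = []

    def _path(self, collection: str, data_id: str) -> Path:
        # No dataset_id in the template, so a re-put of the same
        # dataId/collection lands on exactly the same path.
        return self.root / collection / f"{data_id}.fits"

    def put(self, collection: str, data_id: str, payload: bytes) -> Path:
        path = self._path(collection, data_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(payload)
        return path

    def trash(self, collection: str, data_id: str) -> None:
        self.trashed.append(self._path(collection, data_id))

    def empty_trash(self) -> None:
        for path in self.trashed:
            path.unlink(missing_ok=True)
        self.trashed.clear()


with tempfile.TemporaryDirectory() as tmp:
    store = FakeDatastore(Path(tmp))
    store.put("run1", "visit123", b"old dataset")
    store.trash("run1", "visit123")                       # process A trashes the old dataset
    new = store.put("run1", "visit123", b"new dataset")   # process B re-puts the same dataId
    store.empty_trash()                                   # process A empties trash...
    print(new.exists())                                   # ...and the brand-new file is gone: False
```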
I believe it is possible: once the dataset has been trashed, the registry will allow a new one with the same dataset_type, collection, and data ID. The result would be that the datastore's internal records and the locations table both claim the dataset exists, even though it really doesn't. That's a fairly mild form of repo corruption for something that can only happen if someone is already doing something pretty inadvisable, and I think it's fully recoverable in the sense that you can just trash and empty the trash again on those data IDs. But it's also a hard one to detect (you actually have to try to get the dataset to notice something has gone awry), so if we're concerned that someone is going to do something inadvisable, it may be worth working harder to address it.
The simple, brute-force way to fix this would be to always include the dataset_id in the filename template; that will just avoid the clash entirely.
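As a toy illustration (the template syntax and placeholder names below are assumptions, not the real daf_butler file-template configuration), once the unique dataset_id is part of the path, the old and new datasets can no longer share a file, so emptying the trash can only ever remove the old one:

```python
import uuid

# Hypothetical template -- actual daf_butler file templates differ; the point
# is only that including the unique dataset_id makes the two paths distinct.
TEMPLATE = "{collection}/{dataset_type}/{data_id}_{dataset_id}.fits"

old_path = TEMPLATE.format(collection="run1", dataset_type="calexp",
                           data_id="visit123", dataset_id=uuid.uuid4())
new_path = TEMPLATE.format(collection="run1", dataset_type="calexp",
                           data_id="visit123", dataset_id=uuid.uuid4())
assert old_path != new_path  # emptyTrash on the old file can no longer delete the new one
```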
Another approach would be to define a UNIQUE constraint on the internal datastore records that prohibits the concurrent put. The constraint would need to include at least the URI, plus whatever columns need to go with it to cover entries that are allowed to share a URI (e.g. for composition). With that in place, I think a concurrent put that happens just before the trash is emptied will fail at the Registry level (perhaps not before it actually writes to datastore storage, but that's just one more case of "don't trust datastore", because the database transaction will still roll back both the internal datastore records insertion and the dataset_location insertion).
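A minimal sketch of that constraint, using SQLAlchemy against an in-memory SQLite database; the table and column names here are assumptions rather than the actual datastore schema, and `component` stands in for whatever columns legitimately let several records share one URI:

```python
# Sketch of the UNIQUE-constraint idea; not the real daf_butler schema.
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        UniqueConstraint, create_engine)
from sqlalchemy.exc import IntegrityError

metadata = MetaData()
records = Table(
    "datastore_records", metadata,
    Column("dataset_id", Integer, primary_key=True),
    Column("path", String, nullable=False),
    Column("component", String, nullable=False, default=""),
    # URI plus the columns that may legitimately share it (e.g. composition).
    UniqueConstraint("path", "component", name="uq_datastore_path_component"),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(records.insert().values(dataset_id=1, path="run1/visit123.fits"))

# A concurrent put targeting the same path (because the trashed record is
# still present) now fails at the database level, and the enclosing
# transaction rolls back instead of leaving two records pointing at one file.
try:
    with engine.begin() as conn:
        conn.execute(records.insert().values(dataset_id=2, path="run1/visit123.fits"))
except IntegrityError as err:
    print("second put rejected:", err.orig)
```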
Dino Bektesevic: ingest has been completely rewritten, so don't worry about it.