Kian-Tat Lim, I was trying to imagine a way to make it work in the current schema. I agree that extending the schema with a "canonical ID" would work, but there are non-trivial issues with that:
- we need to upgrade the registry schema (though this is rather trivial at this point in time)
- building a canonical ID in a completely backward- and forward-compatible way is problematic, and it does not fit very well in the schema that we have now
- I think it will either be rather inefficient space-wise or need to rely on some database locking mechanism (which I want to avoid as potentially non-portable).
One implementation of a canonical ID that I can imagine is just a string representation of a DatasetRef (the DatasetType and DataId parts of it, e.g. Patch(patch=42,skymap=MySkyMap,tract=100)). This should be done very carefully to avoid ambiguities and keep it compatible w.r.t. potential schema changes (which is hard when you cannot predict the future).
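
A minimal sketch of such a string-based canonical ID, to make the idea concrete. The function name `canonical_id`, the plain `dict` used for the data ID, and the choice of reserved characters are all assumptions made for illustration; none of this is the existing registry API.

```python
# Hypothetical helper: not part of the actual registry code.
# Characters that would make the string ambiguous if they appeared
# inside a key or a value (an assumption of this sketch).
RESERVED = set("(),=")


def canonical_id(dataset_type: str, data_id: dict) -> str:
    """Build a string such as Patch(patch=42,skymap=MySkyMap,tract=100)."""
    parts = []
    # Sort keys so the result does not depend on dict insertion order.
    for key in sorted(data_id):
        value = str(data_id[key])
        # Refuse input that could be confused with the separators.
        if RESERVED & set(key) or RESERVED & set(value):
            raise ValueError(f"reserved character in {key!r}={value!r}")
        parts.append(f"{key}={value}")
    return f"{dataset_type}({','.join(parts)})"
```

Sorting the keys gives the same string regardless of how the data ID was assembled. Rejecting reserved characters is only one way to avoid ambiguity; escaping them would be another, and neither addresses compatibility across schema changes.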
Still, I agree with one thing: we need a table-level constraint check for this, otherwise things will get very ugly. I think implementing that kind of thing is beyond the scope of this ticket. What I want to do here is add a simple check that works in a single-user environment: more or less the same thing that we have today in addDataset(), but done in a more efficient way.
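
For reference, a table-level constraint of the kind mentioned above could look roughly like the sketch below. It uses Python's standard `sqlite3` module with an in-memory database; the `dataset` table, its `canonical_id` column, and the helper names are hypothetical and do not reflect the real registry schema.

```python
import sqlite3


def make_registry() -> sqlite3.Connection:
    """Create a toy in-memory table with a uniqueness constraint."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the UNIQUE constraint lets the database
    # itself reject a second dataset with the same canonical ID.
    conn.execute(
        "CREATE TABLE dataset ("
        " dataset_id INTEGER PRIMARY KEY,"
        " canonical_id TEXT NOT NULL UNIQUE)"
    )
    return conn


def add_dataset_sketch(conn: sqlite3.Connection, canonical_id: str) -> bool:
    """Insert a row; return False if the canonical ID already exists."""
    try:
        # The connection context manager commits on success and
        # rolls back if the insert raises.
        with conn:
            conn.execute(
                "INSERT INTO dataset (canonical_id) VALUES (?)",
                (canonical_id,),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

Because the check is enforced by the database at insert time, it needs no separate "select, then insert" step and no explicit locking, which is what distinguishes it from a client-side check that is only safe in a single-user environment.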