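Currently, dataset IDs in butler registries are stored as auto-incrementing integers. This works well for a standalone registry that will never receive datasets from other registries, but integer IDs assigned independently by two registries can collide when their contents are combined.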
The middleware team would like to change the dataset ID to a UUID instead. This is needed to simplify the change we are making to batch processing, in which batch jobs are given a prepopulated SQLite registry and, at the end of processing, the new datasets are merged into the main registry. The merge is significantly simpler if the UUIDs generated by the batch job are retained when its datasets are added to the main registry.
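As an illustration of why retained UUIDs simplify the merge, here is a minimal sketch using only the Python standard library. It is not the butler implementation: the single-table schema and the helper names (make_registry, add_dataset, merge) are hypothetical, and random version 4 UUIDs are assumed for newly created datasets.

```python
import sqlite3
import uuid


def make_registry() -> sqlite3.Connection:
    """Create a toy in-memory registry (hypothetical one-table schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dataset (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def add_dataset(conn: sqlite3.Connection, name: str) -> str:
    """Insert a dataset with a freshly generated random UUID and return the ID."""
    dataset_id = str(uuid.uuid4())
    conn.execute("INSERT INTO dataset (id, name) VALUES (?, ?)", (dataset_id, name))
    return dataset_id


def count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM dataset").fetchone()[0]


def merge(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy rows from source into target, keeping IDs unchanged.

    Rows whose UUID already exists in target are skipped, so no ID
    remapping is needed. Returns the number of rows actually added.
    """
    before = count(target)
    rows = source.execute("SELECT id, name FROM dataset").fetchall()
    target.executemany(
        "INSERT OR IGNORE INTO dataset (id, name) VALUES (?, ?)", rows
    )
    return count(target) - before


# The main registry holds an input dataset.
main = make_registry()
input_id = add_dataset(main, "raw_exposure")

# The batch job starts from a registry prepopulated with its inputs.
batch = make_registry()
merge(main, batch)

# The job produces a new dataset, then its registry is merged back.
output_id = add_dataset(batch, "calexp")
added = merge(batch, main)
```

With auto-incrementing integers, the batch registry and the main registry could each assign the same integer to different datasets, so a merge would have to renumber rows and rewrite every reference to them. Here the input dataset is recognised as already present and only the new dataset is added.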
This UUID system will also allow us to ingest raw files predictably, such that the UUID of a raw file in the OODS registry (or any other registry) matches the one at the data facility even though the file was ingested independently.
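One way to obtain predictable IDs is a name-based (version 5) UUID computed from values that every site knows at ingest time. The sketch below assumes that approach; the namespace, the choice of inputs (dataset type, run name, and data ID), and the function name deterministic_dataset_id are all hypothetical and are not taken from the butler code.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID shared by all
# sites would serve the same purpose.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def deterministic_dataset_id(dataset_type: str, run: str, data_id: dict) -> uuid.UUID:
    """Derive a UUID from the dataset type, run name, and data ID.

    The data ID keys are sorted so that the result does not depend on
    the order in which the keys were supplied.
    """
    items = ",".join(f"{key}={data_id[key]}" for key in sorted(data_id))
    return uuid.uuid5(EXAMPLE_NAMESPACE, f"{dataset_type}|{run}|{items}")


# Two sites ingesting the same raw file independently compute the same ID.
oods_id = deterministic_dataset_id(
    "raw", "example/raw/all", {"instrument": "ExampleCam", "exposure": 42, "detector": 7}
)
facility_id = deterministic_dataset_id(
    "raw", "example/raw/all", {"detector": 7, "exposure": 42, "instrument": "ExampleCam"}
)
other_id = deterministic_dataset_id(
    "raw", "example/raw/all", {"instrument": "ExampleCam", "exposure": 42, "detector": 8}
)
```

Because the ID depends only on its inputs, no coordination between registries is needed at ingest time. A different detector, exposure, or run yields a different UUID.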
Since this requires a schema change, the RFC will be flagged to DMCCB. The UUID code is implemented and we are currently working on migration scripts. We would like to migrate the main NCSA and IDF repositories so that they can make use of the UUID features.