Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Labels:
- Story Points: 10
- Epic Link:
- Sprint: DB_S21_12, DB_F21_06
- Team: Data Access and Database
- Urgent?: No
Description
With some initial work done on DM-29593, it is now time to move on to the specific task of migrating the datasets schema to UUID keys. A few items need work:
- Existing repos have a relatively large number of records, tens of millions.
- This needs special care for performance. There are a couple of possible options for performing the migration:
  - Dump everything into CSV, run bulk transforms on those CSVs, and import the result back into the database.
  - Make a special mapping table from the integer dataset_id to UUID and generate new tables with "INSERT ... SELECT ... JOIN".
- The latter should be cleaner, but there is a question of JOIN performance, which I want to study first.
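The mapping-table option could look roughly like the sketch below. It uses Python's built-in sqlite3 module on a toy schema. The table and column names (`dataset`, `dataset_tags`, `id_map`, and the `_new` tables) and both function names are illustrative assumptions, not the actual butler registry schema or migration code, and it says nothing about JOIN performance at the scale of tens of millions of rows.

```python
import sqlite3
import uuid


def make_demo_db() -> sqlite3.Connection:
    """Create a toy in-memory repo with integer dataset ids (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE dataset_tags ("
        "dataset_id INTEGER NOT NULL REFERENCES dataset (id), tag TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO dataset VALUES (?, ?)", [(1, "raw"), (2, "calexp")])
    conn.executemany(
        "INSERT INTO dataset_tags VALUES (?, ?)", [(1, "a"), (2, "b"), (2, "c")]
    )
    conn.commit()
    return conn


def migrate_to_uuid(conn: sqlite3.Connection) -> None:
    """Build UUID-keyed copies of the tables through an int-to-UUID mapping table."""
    # 1. Mapping table: one freshly generated UUID per existing integer id.
    conn.execute(
        "CREATE TABLE id_map (old_id INTEGER PRIMARY KEY, new_id TEXT NOT NULL UNIQUE)"
    )
    old_ids = [row[0] for row in conn.execute("SELECT id FROM dataset").fetchall()]
    conn.executemany(
        "INSERT INTO id_map VALUES (?, ?)",
        ((old_id, str(uuid.uuid4())) for old_id in old_ids),
    )

    # 2. New primary table, filled with INSERT ... SELECT ... JOIN.
    conn.execute("CREATE TABLE dataset_new (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "INSERT INTO dataset_new (id, name) "
        "SELECT m.new_id, d.name FROM dataset d JOIN id_map m ON m.old_id = d.id"
    )

    # 3. Every table that references dataset ids is rebuilt the same way.
    conn.execute(
        "CREATE TABLE dataset_tags_new ("
        "dataset_id TEXT NOT NULL REFERENCES dataset_new (id), tag TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO dataset_tags_new (dataset_id, tag) "
        "SELECT m.new_id, t.tag FROM dataset_tags t "
        "JOIN id_map m ON m.old_id = t.dataset_id"
    )
    conn.commit()
```

Calling `migrate_to_uuid(make_demo_db())` leaves the old tables in place next to the new ones, so the copy can be checked before the old tables are dropped and the new ones renamed.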
Issue Links
- is blocked by
  - DM-29593 Design migration system for data repositories (Done)
- is triggered by
  - DM-29593 Design migration system for data repositories (Done)
  - RFC-777 Change dataset ID type in butler registries to a UUID (Implemented)
- relates to
  - DM-30316 Write UUID migration script for sqlite (Done)
  - DM-29765 Migrate gen3 repos to new schema (Done)
I'll summarize things separately; here is just a quick result of today's migration, from the notes that I made: