Details
- Type: Story
- Status: Invalid
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Labels:
- Story Points: 2
- Epic Link:
- Sprint: DRP S19-2
- Team: Data Release Production
Description
Move schema definitions out of YAML and into Python, and in particular into overridable methods, so that they are easier to customize for different DBs. These methods will probably live on a separate class that is overridden along with Registry, rather than on Registry itself, so we don't pollute Registry with tons of _createTableX methods.
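A minimal sketch of that layout, using only the standard library and SQLite. Everything named here (`SchemaFactory`, `make_dataset_table`, `make_run_table`, `StrictSchemaFactory`, `SketchRegistry`, and the table columns) is a hypothetical illustration, not the actual daf_butler classes or schema.

```python
from __future__ import annotations

import sqlite3


class SchemaFactory:
    """Hypothetical holder for per-table DDL, kept separate from the registry."""

    def make_dataset_table(self) -> str:
        # Default definition; a subclass may override it for its database.
        return (
            "CREATE TABLE dataset ("
            "dataset_id INTEGER PRIMARY KEY, "
            "dataset_type TEXT NOT NULL)"
        )

    def make_run_table(self) -> str:
        return "CREATE TABLE run (run_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"

    def all_tables(self) -> list[str]:
        return [self.make_dataset_table(), self.make_run_table()]


class StrictSchemaFactory(SchemaFactory):
    """Hypothetical override: this database adds a UNIQUE constraint on run names."""

    def make_run_table(self) -> str:
        return (
            "CREATE TABLE run ("
            "run_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
        )


class SketchRegistry:
    """Stand-in registry: it receives a factory instead of defining tables itself."""

    def __init__(self, connection: sqlite3.Connection, schema: SchemaFactory):
        self.connection = connection
        for ddl in schema.all_tables():
            self.connection.execute(ddl)

    def table_names(self) -> list[str]:
        rows = self.connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]


registry = SketchRegistry(sqlite3.connect(":memory:"), StrictSchemaFactory())
```

With this split, a database-specific variant only subclasses the factory and overrides the methods it needs; the registry stand-in carries no per-table methods at all.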
I'd also like to cook up some approach that makes it easier for different DB implementations to generate compatible schemas (or at least makes it difficult for them to define incompatible ones), so that we can keep common code for running at least most SELECT queries across databases. That may involve keeping some of the YAML around and using it either for validation or to extract something to pass to the Python methods.
An alternative would be waiting for Brian Van Klaveren's Felis to become a viable option, but I don't think we can wait, and I don't think we can reasonably expect it to handle the case where one DB uses a view onto one or more private tables while another just uses a single public table. Its role will probably instead be to provide a schema against which we can validate the per-DB Python-defined schemas.
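To make the view-versus-table case concrete, here is a rough sketch in SQLite of two implementations that expose the same logical `dataset` relation, one as a single public table and one as a view over private tables, plus a check against a shared logical schema. The `LOGICAL_SCHEMA` dictionary, the function names, the columns, and the sample rows are all assumptions made up for illustration; this is not how Felis or daf_butler represent schemas.

```python
import sqlite3

# Hypothetical logical schema that every implementation must expose.
LOGICAL_SCHEMA = {"dataset": ["dataset_id", "dataset_type", "run_name"]}


def create_single_table(conn):
    """Implementation A: one public table holds everything."""
    conn.execute(
        "CREATE TABLE dataset ("
        "dataset_id INTEGER PRIMARY KEY, dataset_type TEXT, run_name TEXT)"
    )
    conn.execute("INSERT INTO dataset VALUES (1, 'raw', 'run_a')")


def create_view_over_private_tables(conn):
    """Implementation B: private tables, with a public view joining them."""
    conn.execute("CREATE TABLE _run (run_id INTEGER PRIMARY KEY, run_name TEXT)")
    conn.execute(
        "CREATE TABLE _dataset ("
        "dataset_id INTEGER PRIMARY KEY, dataset_type TEXT, run_id INTEGER)"
    )
    conn.execute(
        "CREATE VIEW dataset AS "
        "SELECT d.dataset_id AS dataset_id, "
        "d.dataset_type AS dataset_type, "
        "r.run_name AS run_name "
        "FROM _dataset d JOIN _run r ON d.run_id = r.run_id"
    )
    conn.execute("INSERT INTO _run VALUES (10, 'run_a')")
    conn.execute("INSERT INTO _dataset VALUES (1, 'raw', 10)")


def missing_columns(conn):
    """Return {relation: [expected columns the database does not expose]}."""
    problems = {}
    for relation, expected in LOGICAL_SCHEMA.items():
        # PRAGMA table_info reports columns for views as well as tables.
        found = {row[1] for row in conn.execute(f"PRAGMA table_info({relation})")}
        absent = [column for column in expected if column not in found]
        if absent:
            problems[relation] = absent
    return problems


# The shared query that common code would run against either implementation.
COMMON_QUERY = "SELECT dataset_id, dataset_type, run_name FROM dataset"

results = []
for build in (create_single_table, create_view_over_private_tables):
    conn = sqlite3.connect(":memory:")
    build(conn)
    results.append((missing_columns(conn), conn.execute(COMMON_QUERY).fetchall()))
```

The check only compares column names on the public relation, so it does not care whether that relation is a table or a view. A real validator would presumably also need to compare types and keys.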
I had originally planned to put this off until other things were done, but it's looking both easier than I expected and desirable for DM-16227 (which would otherwise require some extensions to the YAML definition system), so I'm going to give it a time-boxed attempt today and Monday. If that bogs down I will bail out, do DM-16227 by modifying the YAML definition system, and return to this later.