Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: dax_obscore
- Labels:
- Story Points: 1
- Sprint: DB_F22_6
- Team: Data Access and Database
- Urgent?: No
Description
dax_obscore CSV output produces an empty value for NULL/None values. MySQL needs the \N token for NULL when importing from CSV, so currently we have to post-process the CSV and substitute \N for empty values. It would be nice to output the expected value without post-processing.
We use pyarrow for output. Its CSV writer has some options for the output format, but it does not support specifying the NULL value. The C++ code actually has that option: https://github.com/apache/arrow/blob/master/cpp/src/arrow/csv/options.h#L201, but the Python wrapper for that class does not expose it. I commented on the Arrow Jira about it (https://issues.apache.org/jira/browse/ARROW-16893); maybe it will get fixed one day. For now we just need a workaround.
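One possible shape for such a workaround, as a minimal sketch only: bypass the pyarrow CSV writer and write rows with Python's standard csv module, replacing None with \N on the way out. The function name write_rows_with_null_token and the sample rows are hypothetical and are not taken from dax_obscore; the sketch assumes the rows have already been converted to Python tuples.

```python
import csv
import io

# Token that MySQL reads as NULL on CSV import (with default escaping).
MYSQL_NULL = r"\N"


def write_rows_with_null_token(rows, out, null_token=MYSQL_NULL):
    """Write rows as CSV to `out`, emitting `null_token` for None values.

    Hypothetical helper for illustration; `rows` is any iterable of
    sequences of Python values.
    """
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # Only None is replaced; empty strings stay empty strings.
        writer.writerow([null_token if value is None else value for value in row])


# Usage with made-up sample data.
buf = io.StringIO()
write_rows_with_null_token([(1, "a", None), (2, None, 3.5)], buf)
print(buf.getvalue())
```

This sketch does not escape backslashes that occur inside real string values, which would probably need separate handling before a MySQL import.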