Details
Type: Story
Status: Done
Resolution: Done
Fix Version/s: None
Component/s: daf_butler
Labels: None
Story Points: 1
Epic Link:
Team: Data Release Production
Urgent?: No
Description
The new InMemoryDatasetHandle needs a delegate to be able to read columns from a DataFrame.
This is straightforward to add for simple use cases (specifically, getting a set of columns that are specified with a list of strings). The full Parquet formatter also supports various index tuples. I am leaving support for those to the future, if it becomes necessary; in the meantime, requesting columns that way will raise a NotImplementedError.
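As a rough illustration of the behaviour described above, here is a minimal sketch of the column-selection logic. The function name `select_columns` and the `parameters` dictionary layout are assumptions made for this example, not the actual daf_butler delegate API. The only requirement on `frame` is that it exposes `.columns` and accepts a list of names in `frame[...]`, as a pandas DataFrame does.

```python
def select_columns(frame, parameters=None):
    """Return the requested columns of a DataFrame-like object.

    Hypothetical helper, not the real delegate: ``frame`` is assumed to
    expose ``.columns`` and to accept a list of column names in
    ``frame[...]``, the way ``pandas.DataFrame`` does.
    """
    # No column request means the whole dataset is returned unchanged.
    if not parameters or "columns" not in parameters:
        return frame

    columns = parameters["columns"]

    # Only a plain list of strings is handled; index tuples and any
    # other richer forms are deliberately left unimplemented.
    if not isinstance(columns, list) or not all(isinstance(c, str) for c in columns):
        raise NotImplementedError("Only a list of column-name strings is supported.")

    # Report unknown names explicitly instead of relying on the
    # container's own lookup error.
    missing = [c for c in columns if c not in frame.columns]
    if missing:
        raise ValueError(f"Unknown columns requested: {missing}")

    return frame[columns]
```

With a pandas DataFrame `df`, a call such as `select_columns(df, {"columns": ["a", "b"]})` would return a frame holding just those two columns, while a tuple-based request would raise NotImplementedError.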
When this is done, it can be used in tests such as https://github.com/lsst/pipe_tasks/blob/main/tests/test_isolatedStarAssociation.py
These changes look fine to me, although I don't like all the code duplication between the formatter tests and the delegate tests.
I think you might get the same test coverage in a simpler way if your delegate test case is a class that inherits from the formatter one but has a different setUp that configures an in-memory datastore rather than a file datastore. This should exercise all the delegate code (which can be checked in the code coverage report). You can then add separate test code for the error conditions and for the storage class lookup.
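The inheritance pattern suggested here could look roughly like the sketch below. `FileStore`, `InMemoryStore`, `FormatterTestCase` and `DelegateTestCase` are all stand-ins invented for this example; the real tests would configure a butler with the appropriate datastore instead.

```python
import unittest


class FileStore:
    """Toy stand-in for a file-backed datastore (not the butler API)."""

    def __init__(self):
        self._data = {}

    def put(self, name, rows):
        # Mimic serialization by storing a copy of every row.
        self._data[name] = [dict(row) for row in rows]

    def get(self, name, columns=None):
        rows = self._data[name]
        if columns is None:
            return rows
        return [{c: row[c] for c in columns} for row in rows]


class InMemoryStore(FileStore):
    """Toy stand-in for an in-memory datastore: keeps the object itself."""

    def put(self, name, rows):
        self._data[name] = rows


class FormatterTestCase(unittest.TestCase):
    """Shared tests, run here against the file-backed stand-in."""

    def setUp(self):
        self.rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
        self.store = FileStore()
        self.store.put("table", self.rows)

    def test_read_all(self):
        self.assertEqual(self.store.get("table"), self.rows)

    def test_read_columns(self):
        self.assertEqual(self.store.get("table", columns=["a"]), [{"a": 1}, {"a": 3}])


class DelegateTestCase(FormatterTestCase):
    """Inherits every test above; only setUp swaps in the in-memory store."""

    def setUp(self):
        self.rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
        self.store = InMemoryStore()
        self.store.put("table", self.rows)

    def test_unknown_column(self):
        # Extra checks specific to this code path live in the subclass.
        with self.assertRaises(KeyError):
            self.store.get("table", columns=["missing"])
```

Because the subclass overrides only setUp, each inherited test runs a second time against the in-memory store, and the duplicated test bodies disappear.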
Regarding the findStorageClass testing: the code short-circuits if the DataFrame storage class Python type has already been loaded (which will have happened in the other tests), so the test is likely not testing what you think it is, because compare_types has no effect if the type is already known. You may need to change the test to first get the DataFrame storage class from the factory and then force its pytype to be None.
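To make the short-circuit concern concrete, here is a toy model of a lazily imported type. `LazyStorageClass`, its `is_type` method and its `_pytype` cache are assumptions written for this illustration and are not the daf_butler implementation. The example uses `json.JSONDecoder`, whose defining module is `json.decoder`, so a name-only comparison against `"json.JSONDecoder"` fails until the type is actually imported.

```python
import importlib
import json


class LazyStorageClass:
    """Toy stand-in for a storage class that imports its Python type lazily."""

    def __init__(self, name, pytype_name):
        self.name = name
        self._pytype_name = pytype_name
        self._pytype = None  # filled in on first import

    @property
    def pytype(self):
        if self._pytype is None:
            module_name, _, attr = self._pytype_name.rpartition(".")
            self._pytype = getattr(importlib.import_module(module_name), attr)
        return self._pytype

    def is_type(self, other, compare_types=False):
        # Short circuit: once the type is cached, compare_types is ignored.
        if self._pytype is not None:
            return self._pytype is other
        # Cheap check first: compare fully qualified names only.
        if f"{other.__module__}.{other.__qualname__}" == self._pytype_name:
            return True
        # Only now does compare_types matter: it forces the import.
        if compare_types:
            return self.pytype is other
        return False


sc = LazyStorageClass("Decoder", "json.JSONDecoder")
by_name = sc.is_type(json.JSONDecoder)                        # names differ
by_import = sc.is_type(json.JSONDecoder, compare_types=True)  # import, then match
cached = sc.is_type(json.JSONDecoder)                         # cache hides the flag

# Forcing the cached type back to None restores the code path
# that a compare_types test is meant to exercise.
sc._pytype = None
after_reset = sc.is_type(json.JSONDecoder)
```

In this toy model, a test that runs after the type has been cached passes regardless of compare_types, which is why resetting the cached type first is needed for the test to be meaningful.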