Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: pipe_supertask
- Labels:
- Story Points: 2
- Epic Link:
- Sprint: BG3_F18_11
- Team: Data Access and Database
Description
The pipeline builder needs to instantiate DatasetTypes, which depend on StorageClass (both are configured via task config). We want to keep the pipeline builder independent of butler, but that makes initialization of StorageClassFactory a problem. DM-15850 adds support for loading all standard StorageClasses (which come from a standard YAML config), but any non-standard configuration is still not handled by that approach.
I want to see how we can solve this problem by either pre-loading non-standard config for the factory or avoiding its use entirely.
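To make the pre-loading option concrete, here is a minimal sketch. It does not use the real butler API: `StorageClassSketch`, `StorageClassFactorySketch`, the `STANDARD` and `EXTRA` mappings, and the `pytype` key are all hypothetical stand-ins chosen for illustration. The idea is only that non-standard definitions get registered with the factory before the pipeline builder asks for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClassSketch:
    """Hypothetical stand-in for a StorageClass: a name plus a Python type name."""

    name: str
    pytype: str


class StorageClassFactorySketch:
    """Hypothetical registry; NOT the real StorageClassFactory API."""

    def __init__(self):
        self._classes = {}

    def load(self, definitions):
        """Register storage classes from a mapping shaped like parsed YAML."""
        for name, info in definitions.items():
            self._classes[name] = StorageClassSketch(name, info["pytype"])

    def get(self, name):
        """Return a registered storage class or fail with a clear message."""
        try:
            return self._classes[name]
        except KeyError:
            raise KeyError(f"StorageClass {name!r} is not registered") from None


# Assumed shape of the standard definitions (would come from the standard YAML).
STANDARD = {"Catalog": {"pytype": "example.Catalog"}}
# Assumed non-standard definitions supplied alongside the task config.
EXTRA = {"MyCustomTable": {"pytype": "example.custom.Table"}}

factory = StorageClassFactorySketch()
factory.load(STANDARD)
factory.load(EXTRA)  # the pre-load step, done before the pipeline builder runs
```

The open question in this ticket is where `EXTRA` would come from without the pipeline builder depending on butler; the sketch leaves that unanswered on purpose.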
Issue Links
- is triggered by: DM-15850 Standard StorageClasses should always be loaded (Done)
Could we make attachment of a StorageClass instance to a DatasetType instance something only a Butler/Registry/Datastore can do, in the same sense that a Registry is essentially necessary to add a dataset_id to a DatasetRef?
I think in most user-facing contexts requiring only a string is clearly better, but I gather that not having the StorageClass attached to the DatasetType would be a pretty big annoyance in the internals (which I think is confirmed by Tim Jenness's comments above). We could consider having a variant/subclass of DatasetType that adds that information, and in a more strongly typed or rigorously OO language I think that'd be the way to go. In Python it's less clear.
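A rough sketch of the variant/subclass idea, to make the trade-off easier to discuss. All names here (`DatasetTypeSketch`, `ResolvedDatasetTypeSketch`, `RegistrySketch`, `resolve`) are hypothetical and not part of the existing code; the only assumption is that a Registry-like object knows the available storage classes by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetTypeSketch:
    """User-facing form: the storage class is referenced by name only."""

    name: str
    storage_class_name: str


@dataclass(frozen=True)
class ResolvedDatasetTypeSketch(DatasetTypeSketch):
    """Internal form: same fields plus the attached storage class object."""

    storage_class: object


class RegistrySketch:
    """Hypothetical Registry-like object; the only thing that can resolve."""

    def __init__(self, storage_classes):
        # Mapping of storage class name -> storage class object.
        self._storage_classes = dict(storage_classes)

    def resolve(self, dataset_type):
        """Attach the storage class named by dataset_type, if it is known."""
        storage_class = self._storage_classes[dataset_type.storage_class_name]
        return ResolvedDatasetTypeSketch(
            dataset_type.name,
            dataset_type.storage_class_name,
            storage_class,
        )


# The pipeline builder would only ever create the name-only form.
unresolved = DatasetTypeSketch("calexp", "ExposureF")

# Placeholder object standing in for a real StorageClass instance.
exposure_storage_class = object()
registry = RegistrySketch({"ExposureF": exposure_storage_class})
resolved = registry.resolve(unresolved)
```

Because the resolved form is a subclass, code that only needs the name accepts either form, while internals that need the StorageClass can require the resolved one. Whether that distinction is worth a second class in Python is the part that remains unclear.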