Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-21875

Add StorageClass and Formatter support necessary to persist lsst.verify.Measurement in Gen3 repos

    XMLWordPrintable

    Details

      Description

      Make it possible to persist and unpersist lsst.verify.Measurement objects via the Gen3 Butler.

        Attachments

          Issue Links

            Activity

            Hide
            krzys Krzysztof Findeisen added a comment - - edited

            As soon as a formatter becomes big it has to be tested somehow but in many cases you don't necessarily want to depend on daf_butler so testing it is impossible (you must inherit from Formatter).

            If you're not worried about swapping out persistence libraries, then I don't see why a type-defining package can't depend on daf_butler or, hypothetically, any number of other persistence libraries that it's compatible with. The example code you linked to in daf_base keeps the Butler-specific serialization code (and yes, I realize it's not a full-fledged Formatter) external to the classes being serialized, which seems like adequate separation of concerns.

            Can you clarify exactly what problem you're trying to solve by factoring out a dependency on daf_butler? I assume it's a big one, if you're using language as strong as "impossible" or "no choice".

            Show
            krzys Krzysztof Findeisen added a comment - - edited As soon as a formatter becomes big it has to be tested somehow but in many cases you don't necessarily want to depend on daf_butler so testing it is impossible (you must inherit from Formatter). If you're not worried about swapping out persistence libraries, then I don't see why a type-defining package can't depend on daf_butler or, hypothetically, any number of other persistence libraries that it's compatible with. The example code you linked to in daf_base keeps the Butler-specific serialization code (and yes, I realize it's not a full-fledged Formatter ) external to the classes being serialized, which seems like adequate separation of concerns. Can you clarify exactly what problem you're trying to solve by factoring out a dependency on daf_butler ? I assume it's a big one, if you're using language as strong as "impossible" or "no choice".
            Hide
            krzys Krzysztof Findeisen added a comment - - edited

            Please disregard my last (deleted) post. I'd have to make significant changes to the current persistence code to make Measurement self-contained, so it's no less effort than writing new functions for YAML.

            Show
            krzys Krzysztof Findeisen added a comment - - edited Please disregard my last (deleted) post. I'd have to make significant changes to the current persistence code to make Measurement self-contained, so it's no less effort than writing new functions for YAML.
            Hide
            tjenness Tim Jenness added a comment -

            The concern on how this plays out is the following:

            • Package X has a nice class, C, in it that you want to use.
            • X only depends on core python classes.
            • You decide that C is something that should be stored in a butler repository.
            • Where do you write the Formatter?
            • Do you add it to daf_butler? If you do that you can't test it without depending on X. daf_butler should not be depending on every single thing that it can serialize (afw is not going to be a daf_butler dependency eventually).
            • Do you add the Formatter to X? This means that X now must depend on daf_butler and to test it you need to set up a test reps. Your nice standalone package now depends on lots of things you didn't need it to depend on. It may even be that X is not directly under LSST control.
            • Where does the Formatter go? If it can't go in daf_butler and it can't go in X, where? obs_base? Then obs_base has the problem we currently have of circular dependencies with things like meas_algorithms. Most formatters are only being use from tasks so should pipe_tasks be the place? I don't know.
            • Related question, should every possible datasetType storage class and formatter class be specified in daf_butler itself? daf_butler doesn't rely on this. Maybe an obs_base search path should be included?

            As for the formatter itself, the implementation should be fairly trivial. You should make _fromBytes and _toBytes implementations and that is sufficient if you subclass JsonFormatter (to be honest if you do that there is little point JsonFormatter being involved but that suggests that FileFormatter is doing it wrong).

            Show
            tjenness Tim Jenness added a comment - The concern on how this plays out is the following: Package X has a nice class, C, in it that you want to use. X only depends on core python classes. You decide that C is something that should be stored in a butler repository. Where do you write the Formatter? Do you add it to daf_butler? If you do that you can't test it without depending on X. daf_butler should not be depending on every single thing that it can serialize (afw is not going to be a daf_butler dependency eventually). Do you add the Formatter to X? This means that X now must depend on daf_butler and to test it you need to set up a test reps. Your nice standalone package now depends on lots of things you didn't need it to depend on. It may even be that X is not directly under LSST control. Where does the Formatter go? If it can't go in daf_butler and it can't go in X, where? obs_base? Then obs_base has the problem we currently have of circular dependencies with things like meas_algorithms. Most formatters are only being use from tasks so should pipe_tasks be the place? I don't know. Related question, should every possible datasetType storage class and formatter class be specified in daf_butler itself? daf_butler doesn't rely on this. Maybe an obs_base search path should be included? As for the formatter itself, the implementation should be fairly trivial. You should make _fromBytes and _toBytes implementations and that is sufficient if you subclass JsonFormatter (to be honest if you do that there is little point JsonFormatter being involved but that suggests that FileFormatter is doing it wrong).
            Hide
            krzys Krzysztof Findeisen added a comment -

            Hi Tim Jenness, can you take a look? It's 380 lines in total.

            Show
            krzys Krzysztof Findeisen added a comment - Hi Tim Jenness , can you take a look? It's 380 lines in total.
            Hide
            tjenness Tim Jenness added a comment -

            Minor comments on the PR concerning organization of the butler test code.

            Show
            tjenness Tim Jenness added a comment - Minor comments on the PR concerning organization of the butler test code.

              People

              Assignee:
              krzys Krzysztof Findeisen
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Tim Jenness
              Watchers:
              Angelo Fausti, Jim Bosch, Krzysztof Findeisen, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.