Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-1903

Implementation of calibration transformation framework

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: meas_base, pipe_tasks
    • Labels:
      None

      Description

      Following DM-1598 there will be a detailed design and prototype implementation for the calibration & ingest system. This issue covers cleaning up that code, documenting it, having it reviewed, and merging to master.

        Attachments

          Issue Links

            Activity

            Hide
            swinbank John Swinbank added a comment -

            Hi Serge,

            Would you have time to look at this review for me?

            This is for the "calibration and ingest" system we've previously discussed. Here, I'm defining the framework that can be used to perform arbitrary transformations on the output of measurement algorithms. This story is just about setting up that infrastructure, not actually defining any of those transformations (beyond the most trivial). Note that the basic concept has already been discussed elsewhere (RFC-12).

            In tickets/DM-1903 on meas_base, I have added the ability for measurement algorithm plugins to specify an associated piece of code to transform their outputs. That code can be written in C++ or Python (examples of both are provided), and follows a standard interface.

            In tickets/DM-1903 on pipe_tasks, I have defined a task which transforms a SourceCatalog based on the configuration of the measurement algorithms which were used to generate it and their associated transformations. I've also defined a command line task which provide a simple interface to this. The expectation is that further command line tasks can be added to work with other repository types, load calibration data from different locations, etc.

            In tickets/DM-1903 on obs_test, I've modified the test mapper configuration to support storing both the outputs (as "trSrc", or transformed sources) and the configuration of the above tasks. (Rather than changing all the mappers I can find at once, I think it makes more sense to update them to support this framework as & when needed.)

            This is my first time working with much of this code, so apologies in advance for my naivety and misconceptions! If you aren't able to take a look at the moment, by all means send it back to me & I'll find somebody else.

            Thanks!

            Show
            swinbank John Swinbank added a comment - Hi Serge, Would you have time to look at this review for me? This is for the "calibration and ingest" system we've previously discussed. Here, I'm defining the framework that can be used to perform arbitrary transformations on the output of measurement algorithms. This story is just about setting up that infrastructure, not actually defining any of those transformations (beyond the most trivial). Note that the basic concept has already been discussed elsewhere ( RFC-12 ). In tickets/DM-1903 on meas_base , I have added the ability for measurement algorithm plugins to specify an associated piece of code to transform their outputs. That code can be written in C++ or Python (examples of both are provided), and follows a standard interface. In tickets/DM-1903 on pipe_tasks , I have defined a task which transforms a SourceCatalog based on the configuration of the measurement algorithms which were used to generate it and their associated transformations. I've also defined a command line task which provide a simple interface to this. The expectation is that further command line tasks can be added to work with other repository types, load calibration data from different locations, etc. In tickets/DM-1903 on obs_test , I've modified the test mapper configuration to support storing both the outputs (as "trSrc", or transformed sources) and the configuration of the above tasks. (Rather than changing all the mappers I can find at once, I think it makes more sense to update them to support this framework as & when needed.) This is my first time working with much of this code, so apologies in advance for my naivety and misconceptions! If you aren't able to take a look at the moment, by all means send it back to me & I'll find somebody else. Thanks!
            Hide
            swinbank John Swinbank added a comment -

            Perry – Serge is already looking at this, but he reckons it would also be helpful for somebody more familiar with the measurement framework to take a look. Could you oblige? Thanks!

            Show
            swinbank John Swinbank added a comment - Perry – Serge is already looking at this, but he reckons it would also be helpful for somebody more familiar with the measurement framework to take a look. Could you oblige? Thanks!
            Hide
            pgee Perry Gee added a comment -

            I am looking at this today. Sorry for the delay. I am primarily going to review issues concerning meas_base

            Show
            pgee Perry Gee added a comment - I am looking at this today. Sorry for the delay. I am primarily going to review issues concerning meas_base
            Hide
            smonkewitz Serge Monkewitz added a comment -

            I've left my few minor quibbles on the github PRs. I also asked a few questions just for my understanding - answering those need not hold up this ticket (I'm sure things are busy as W15 draws to a close).

            Overall - kudos on the very readable documentation and thorough unit tests especially. Great job!

            Show
            smonkewitz Serge Monkewitz added a comment - I've left my few minor quibbles on the github PRs. I also asked a few questions just for my understanding - answering those need not hold up this ticket (I'm sure things are busy as W15 draws to a close). Overall - kudos on the very readable documentation and thorough unit tests especially. Great job!
            Hide
            pgee Perry Gee added a comment -

            Design comments:

            Yes, the code is nicely written. I do have a few comments. I did not read the code line-for-line, as I think that is what Serge was doing. But I did look at how it fit into meas_base, and particularly into the plugin structure.

            In your test examples, you use a single plugin and test various TransformClasses with it. But it seems like the design for the use of the Transform Classes only allows one transform to be used with each plugin. The reason is that the call to the Wrapper class which initializes the TransformClass is generally in the initialization of the module in which the plugin lives, which would seem to mean that when a transform is used with a given plugin, it must be the one registered. I have no quibble with that design, or at least I don't know enough about use cases to quibble. But I'm curious to know if it was your intention to allow the plugin and the TransformClass to be both coupled to each other (as in the pipe_task framework), and uncoupled (as in the testTransform.py tests). Same comment about the relationship between the field name which is given as the "name" argument on construction of the Transform, and the name of the plugin. They seem to be uncoupled in the same way.

            I'm not sure why NullTransform is the default for getTransformClass(). On the other hand, I am not clear how a user is going to actually use this capability – to transform some measurement fields while leaving the fields which do not need transformation alone, or by creating a separate table just for the ones which need a transform?

            testTransform.py

            why doesn't the testPassThroughTransform do field value tests?

            Does the fact that the calib argument of the transform operator has to be constructed even if it is not used indicate that it should be an optional parameter?

            Show
            pgee Perry Gee added a comment - Design comments: Yes, the code is nicely written. I do have a few comments. I did not read the code line-for-line, as I think that is what Serge was doing. But I did look at how it fit into meas_base, and particularly into the plugin structure. In your test examples, you use a single plugin and test various TransformClasses with it. But it seems like the design for the use of the Transform Classes only allows one transform to be used with each plugin. The reason is that the call to the Wrapper class which initializes the TransformClass is generally in the initialization of the module in which the plugin lives, which would seem to mean that when a transform is used with a given plugin, it must be the one registered. I have no quibble with that design, or at least I don't know enough about use cases to quibble. But I'm curious to know if it was your intention to allow the plugin and the TransformClass to be both coupled to each other (as in the pipe_task framework), and uncoupled (as in the testTransform.py tests). Same comment about the relationship between the field name which is given as the "name" argument on construction of the Transform, and the name of the plugin. They seem to be uncoupled in the same way. I'm not sure why NullTransform is the default for getTransformClass(). On the other hand, I am not clear how a user is going to actually use this capability – to transform some measurement fields while leaving the fields which do not need transformation alone, or by creating a separate table just for the ones which need a transform? testTransform.py why doesn't the testPassThroughTransform do field value tests? Does the fact that the calib argument of the transform operator has to be constructed even if it is not used indicate that it should be an optional parameter?
            Hide
            pgee Perry Gee added a comment -

            Oh, and carry on! The code is great and should be merged, with these considerations!

            Show
            pgee Perry Gee added a comment - Oh, and carry on! The code is great and should be merged, with these considerations!
            Hide
            swinbank John Swinbank added a comment -

            Hi Serge, Perry –

            Thanks for the reviews. I've not had chance to go through your comments in detail yet, but I'll aim to respond and merge this in the next day or so.

            Show
            swinbank John Swinbank added a comment - Hi Serge, Perry – Thanks for the reviews. I've not had chance to go through your comments in detail yet, but I'll aim to respond and merge this in the next day or so.
            Hide
            swinbank John Swinbank added a comment -

            I've made some changes and responded to Serge's comments where appropriate on GitHub. Here's a response to Perry:

            ...
            But I'm curious to know if it was your intention to allow the plugin and the TransformClass to be both coupled to each other (as in the pipe_task framework), and uncoupled (as in the testTransform.py tests). Same comment about the relationship between the field name which is given as the "name" argument on construction of the Transform, and the name of the plugin. They seem to be uncoupled in the same way.

            Right, good question.

            The expectation is that there is a single, well defined transform which is appropriate for the results of a particular measurement. Thus, there is a tight coupling from measurement algorithm to transformation.

            This is not true in reverse: a given transform may be applicable to the results of more than one measurement. For example, a simple conversion of flux to magnitude could be appropriate for a bunch of different flux measurements.

            However, even though in production I think there should always be a link from a given measurement to its associated transform, that's not always the case in testing: it's much easier to write good tests for things if they can be broken down into the smallest possible pieces, which is what I tried to do here.

            I'm not sure why NullTransform is the default for getTransformClass().

            Serge raised the same question; I've changed it to PassThroughTransform.

            why doesn't the testPassThroughTransform do field value tests?

            Good question. It does now.

            Does the fact that the calib argument of the transform operator has to be constructed even if it is not used indicate that it should be an optional parameter?

            Considering the transform stand-alone, I think that's true – things like Null or PassThrough clearly don't need a WCS and calibration. But the task interface will always provide those arguments, and I don't think that (outside test cases) the transforms will usually be accessed in other ways, so I'm not sure the flexibility is necessary; I tend to prefer less code and a rigid interface unless we actually have a good reason to do something else.

            Thanks to both of you for the helpful comments!

            Show
            swinbank John Swinbank added a comment - I've made some changes and responded to Serge's comments where appropriate on GitHub. Here's a response to Perry: ... But I'm curious to know if it was your intention to allow the plugin and the TransformClass to be both coupled to each other (as in the pipe_task framework), and uncoupled (as in the testTransform.py tests). Same comment about the relationship between the field name which is given as the "name" argument on construction of the Transform, and the name of the plugin. They seem to be uncoupled in the same way. Right, good question. The expectation is that there is a single, well defined transform which is appropriate for the results of a particular measurement. Thus, there is a tight coupling from measurement algorithm to transformation. This is not true in reverse: a given transform may be applicable to the results of more than one measurement. For example, a simple conversion of flux to magnitude could be appropriate for a bunch of different flux measurements. However, even though in production I think there should always be a link from a given measurement to its associated transform, that's not always the case in testing: it's much easier to write good tests for things if they can be broken down into the smallest possible pieces, which is what I tried to do here. I'm not sure why NullTransform is the default for getTransformClass(). Serge raised the same question; I've changed it to PassThroughTransform . why doesn't the testPassThroughTransform do field value tests? Good question. It does now. Does the fact that the calib argument of the transform operator has to be constructed even if it is not used indicate that it should be an optional parameter? Considering the transform stand-alone, I think that's true – things like Null or PassThrough clearly don't need a WCS and calibration. But the task interface will always provide those arguments, and I don't think that (outside test cases) the transforms will usually be accessed in other ways, so I'm not sure the flexibility is necessary; I tend to prefer less code and a rigid interface unless we actually have a good reason to do something else. Thanks to both of you for the helpful comments!

              People

              Assignee:
              swinbank John Swinbank
              Reporter:
              swinbank John Swinbank
              Reviewers:
              Perry Gee
              Watchers:
              John Swinbank, Perry Gee, Serge Monkewitz
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: