Data Management / DM-1364

replace "bad data" flag in SdssCentroid


    Details

      Description

      SdssCentroid has a "bad data" flag that doesn't actually convey any information about what went wrong. This should be replaced with one or more flags that provide more information.
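
      As a rough illustration of the intent (the field names below are hypothetical placeholders, not the actual meas_base schema), splitting a catch-all flag into per-failure-mode flags lets downstream code tell what actually went wrong:

          # Illustration only; flag names are hypothetical placeholders.
          record = {
              "base_SdssCentroid_flag": True,                   # general failure (kept)
              "base_SdssCentroid_flag_noMaximum": True,         # specific: no local maximum found
              "base_SdssCentroid_flag_no2ndDerivative": False,  # specific: 2nd derivative vanished
          }

          # A consumer can now distinguish failure modes instead of just "bad data":
          if record["base_SdssCentroid_flag_noMaximum"]:
              print("centroid failed: no local maximum near the starting position")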

          Activity

          jbosch Jim Bosch created issue -
          jbosch Jim Bosch made changes -
          Epic Link: DM-1100
          jbosch Jim Bosch made changes -
          Epic Link: DM-1100 → DM-1769
          swinbank John Swinbank made changes -
          Sprint: Science Pipelines DM-S15-1
          jbosch Jim Bosch added a comment -

          I'm modifying the tests for SdssCentroid on DM-1161 (as part of a change to the test utility code). Please ping me before starting this issue, unless DM-1161 has already been merged before you branch.

          pgee Perry Gee made changes -
          Status: To Do → In Progress
          pgee Perry Gee added a comment -

          I'm working on this issue if you have anything to communicate about DM-1161

          I am also wondering how you feel about the style for throwing out of lower-level code, that is, whether to pass the flagHandler to the lowest level of algorithmic code, or to have measure() catch specific errors from that code and translate them into the correct MeasurementError.

          My preference is to pass the flagHandler so that all of the code knows about the possible failure modes, but I am not sure whether you agree.

          jbosch Jim Bosch added a comment -

          I'm working on this issue if you have anything to communicate about DM-1161

          I've changed all the unit tests quite a bit on the DM-1161 branch, but that's the only way I imagine these could conflict. I don't think this will require a huge amount of new test code, so I think it will be okay.

          I am also wondering how you feel about the style for throwing out of lower-level code, that is, whether to pass the flagHandler to the lowest level of algorithmic code, or to have measure() catch specific errors from that code and translate them into the correct MeasurementError.

          I think it depends on where the lower-level code lives - if this is just some utility function that's called only by one measurement algorithm, then passing the FlagHandler is fine. If it's something shared by multiple algorithms, I'm not so sure (as those would have differently-configured FlagHandlers in general), and if it's code that might be called by things other than algorithms, we should definitely just catch and re-throw exceptions.
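
          A minimal sketch of the two styles under discussion (the class and function names below are made up for illustration, not the real meas_base interfaces):

              # Hypothetical sketch of the two error-handling styles.
              class MeasurementError(RuntimeError):
                  """Failure that names the specific flag to set."""
                  def __init__(self, message, flag_index):
                      super().__init__(message)
                      self.flag_index = flag_index

              NO_MAXIMUM = 1  # example flag index

              # Style 1: pass the flag handler all the way down, so the low-level
              # code sets the specific flag itself.
              def fit_centroid_with_handler(pixel_values, flag_handler, record):
                  if max(pixel_values) <= 0.0:
                      flag_handler.set_flag(record, NO_MAXIMUM)
                      return None
                  return pixel_values.index(max(pixel_values))

              # Style 2: the low-level code throws an ordinary exception and
              # measure() translates it into the appropriate MeasurementError.
              def fit_centroid(pixel_values):
                  if max(pixel_values) <= 0.0:
                      raise ValueError("no maximum: peak value is %g" % max(pixel_values))
                  return pixel_values.index(max(pixel_values))

              def measure(record, pixel_values):
                  try:
                      record["x"] = fit_centroid(pixel_values)
                  except ValueError as err:
                      raise MeasurementError(str(err), NO_MAXIMUM) from err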

          pgee Perry Gee added a comment -

          What I really hipChatted you about:

          I wanted to know if you had resolved the weirdness about error handling and bubbling from SdssShapeImpl back up through the measurement framework. It is similar to the error issues in SdssCentroid, though slightly more complex.

          The SdssShapeImpl code uses an enumeration to set a flags variable as I recall, and then assumes that the enumeration is the same as a second enumeration in SdssShape. I called out this ugliness at one time, but postponed fixing it during the port.

          Now doMeasureCentroidImpl is just a pair of anonymous routines, not a class. One strategy is to just fold this code into the SdssCentroidAlgorithm itself, which would make it possible to get to the flagHandler from this routine. I'm not sure whether this is allowed, or whether the code should stay separate, but it would be the cleanest approach.

          Another choice would be to make a CentroidImpl class and give it a flags variable like SdssShapeImpl has. I don't like this much, but would like to be consistent with whatever you are doing to SdssShape.

          Another possibility is to use the old-style return-value system, returning one of a list of enumerated errors. I actually like this pretty well, assuming that I can't just absorb this method into the class.
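
          For reference, a minimal sketch of the enumerated-return-value style (names are illustrative, not the actual SdssShapeImpl or SdssCentroid code):

              # Illustrative only: the low-level routine reports failures through an
              # enum return value, and the caller maps that onto its own flags.
              import enum

              class CentroidStatus(enum.IntEnum):
                  SUCCESS = 0
                  NO_MAXIMUM = 1
                  NO_2ND_DERIVATIVE = 2

              def centroid_impl(peak_value, second_derivative):
                  if peak_value <= 0.0:
                      return CentroidStatus.NO_MAXIMUM
                  if second_derivative == 0.0:
                      return CentroidStatus.NO_2ND_DERIVATIVE
                  return CentroidStatus.SUCCESS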
          --------------------------------------
          One other question. Do you envision that when a MeasurementError is thrown, the error message could be customized by the thrower, so that the bad values which caused the throw could be published in the log? You may have this already somewhere, though I don't remember seeing it.

          jbosch Jim Bosch added a comment -

          On DM-1161 I've removed SdssShapeImpl entirely, and the function that computes the moments that used to modify an SdssShapeImpl now just modifies an SdssShapeResult, so it can set the final flags there directly. I didn't go further than that and fold all of the old routines into SdssShapeAlgorithm methods, because I thought it was more important to keep the real algorithmic code intact.

          I think it probably makes sense to follow a similar path in SdssCentroid, though there you don't have a full Result object to fill (and I don't think you need one); I do think passing the FlagHandler and/or the record directly to those anonymous routines would be fine. If the anonymous routines don't need to know about all the failure modes, having them return a bool that determines whether to set a flag in the main routine would be fine too. And throwing a MeasurementError from within those routines would be fine too. I don't really have a preference between those options.

          On custom error messages for MeasurementError: I think having custom messages would be fine, as long as it's understood that those messages can really only convey debug information. So if a MeasurementError is thrown with the same flag enum value in two different places, they better mean the same thing as far as the user is concerned.
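
          A small sketch of that point (reusing the hypothetical MeasurementError class from the sketch a few comments up; names are illustrative): the flag value carries the canonical meaning, while the message just publishes the offending numbers for the log.

              NO_2ND_DERIVATIVE = 2  # example flag index

              def check_second_derivative(d2x, d2y):
                  # The flag index is what the user-facing meaning is tied to;
                  # the message text only adds debug detail.
                  if d2x <= 0.0 or d2y <= 0.0:
                      raise MeasurementError(
                          "non-positive 2nd derivative at centroid: d2x=%g, d2y=%g" % (d2x, d2y),
                          NO_2ND_DERIVATIVE,
                      )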

          pgee Perry Gee added a comment -

          I think I miscommunicated somewhere. I wasn't proposing to make doMeasureCentroidImpl a class method of the algorithm. It is fine with me that it is a hidden implementation detail. What I really wanted to know is whether it could be intimately linked to the algorithm class, which to me is a consequence of having it take a flagHandler as input. That means we never plan to call it from outside the class.

          pgee Perry Gee added a comment -

          I have pushed the simplest possible change to u/pgee/DM-1364, which is to throw a MeasurementError, with the debugging info appended to the standard documentation string for the specific error, from the underlying code in doMeasureCentroidImpl. I removed the generic BAD_DATA error, as it was not informative. Any unknown throws should be caught by the measurement framework and expressed as a FAILURE.

          I do not know how to reproduce the errors I added (NO_MAXIMUM, NO_2ND_DERIVATIVE, ALMOST_NO_2ND_DERIVATIVE) with test cases. But I did hardwire the code to check that the composition of the exception messages was being done correctly.

          The anonymous doMeasureCentroidImpl routines in SdssCentroid.cc now take flagHandlers from SdssCentroidAlgorithm, which effectively means that they can only be called from SdssCentroidAlgorithm.
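
          As a rough sketch of that fallback behaviour (hypothetical names, building on the MeasurementError sketch earlier in this ticket): a MeasurementError sets its specific flag plus the general failure flag, while any other exception is recorded only as the general failure.

              FAILURE = 0  # example index of the general failure flag

              def call_plugin(measure, record, exposure, flag_handler):
                  try:
                      measure(record, exposure)
                  except MeasurementError as err:
                      # Known failure mode: set its named flag and the general flag.
                      flag_handler.set_flag(record, err.flag_index)
                      flag_handler.set_flag(record, FAILURE)
                  except Exception:
                      # Unknown throw: only the general failure flag is set.
                      flag_handler.set_flag(record, FAILURE)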

          pgee Perry Gee added a comment -

          Please note the exchange with Jim which occurred on this ticket. I was trying to discover if it was desirable to create a pattern for separating the Impl and its error handling from the Algorithm code. I believe Jim is saying that the current structure of SdssCentroid is OK as is, and that the doMeasureCentroidImpl can accept a flagHandler as an input parameter.

          I made the 3 errors thrown in this service routine into flags in the SdssCentroidAlgorithm FlagsDefinition, but also appended detailed values to the error message.

          I also removed the try/catch in the measure() method, as well as the "BAD_DATA" flag, which seems uninformative.

          Also note that only one of the templated doMeasureCentroidImpl methods is actually called here.

          pgee Perry Gee made changes -
          Reviewers: John Swinbank
          Status: In Progress → In Review
          swinbank John Swinbank added a comment -

          The code submitted here is fine, I think – I made one tiny suggestion on the pull request, but there's nothing substantive.

          However, there are two other things to consider before you merge. One is that you rewrite the commit message, bearing in mind the advice on Confluence – the current message's initial line is too long, and it also contains a misplaced apostrophe. We also don't normally mention the ticket number at the start of the message, although I don't think that's an important concern.

          Secondly, I think it would be a good idea to think again about writing some tests. I note your comment above, but I don't see why writing some simple tests should be too difficult. To get you started, I hacked together a simple test of NO_2ND_DERIVATIVE:

              def testFlagNo2ndDeriv(self):
                  self.truth.defineCentroid("truth")
                  centroid = self.truth[0].getCentroid()
                  psfImage = self.calexp.getPsf().computeImage(centroid)
                  # construct a box that won't fit the full PSF model
                  bbox = psfImage.getBBox()
                  bbox.grow(10)
                  subImage = lsst.afw.image.ExposureF(self.calexp, bbox)
                  subImage.getMaskedImage().getImage().getArray()[:] = 0
                  subCat = self.measCat[:1]
                  # we also need to install a smaller footprint, or NoiseReplacer complains before we even get to
                  # measuring the centroid
                  measRecord = subCat[0]
                  newFootprint = lsst.afw.detection.Footprint(bbox)
                  newFootprint.getPeaks().push_back(measRecord.getFootprint().getPeaks()[0])
                  measRecord.setFootprint(newFootprint)
                  # just measure the one object we've prepared for
                  self.task.measure(subCat, subImage)
                  self.assertTrue(measRecord.get("base_SdssCentroid_flag"))
                  self.assertTrue(measRecord.get("base_SdssCentroid_flag_no_2nd_derivative"))

          Note that "hacked" is really the appropriate word here – I just grabbed the extant testEdge() and butchered that into something that does the right thing; you should definitely clean it up before committing it. Doing something similar for the other flags shouldn't be too hard.

          swinbank John Swinbank made changes -
          Status: In Review → Reviewed
          pgee Perry Gee added a comment -

          This is a perfectly reasonable request. Since I didn't really understand the reason for the almost_no_2nd_derivative error, I thought it might be better for someone who actually wrote this algorithm to come up with test cases. But I certainly can make an artificial test which produces the error by just looking at the failure case.

          pgee Perry Gee made changes -
          Resolution: Done
          Status: Reviewed → Done
          jbosch Jim Bosch added a comment -

          While resolving conflicts between this and DM-1161 during a rebase, I noticed that the test for almost_no_2nd_derivative wasn't actually doing what it claimed to be doing; the test for that particular flag had been commented out, and the main failure flag was only set because the "edge" condition was being hit.

          After a lot of fiddling with the test code and trying to reproduce that error condition, I've basically given up, and just removed that test in a commit on DM-1161. Unless Perry Gee actually has a prescription in hand for triggering it (and it just got left off this issue by mistake), I'm content to leave it that way.

          I also went ahead and renamed the new flags on DM-1161 after resolving the conflicts and getting the rest of the test code working again with my changes to the test utility code. The schema naming conventions use underscores as a "group" separator, with camelCase used as a word separator - so it should be "base_SdssCentroid_flag_notAtMaximum" instead of "base_SdssCentroid_flag_not_at_maximum".
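
          A tiny illustration of that convention (the helper below is hypothetical, just to show the pattern): underscores separate schema groups, camelCase separates words within a name.

              def flag_field_name(prefix, words):
                  # e.g. ["not", "at", "maximum"] -> "notAtMaximum"
                  camel = words[0] + "".join(w.capitalize() for w in words[1:])
                  return "%s_flag_%s" % (prefix, camel)

              print(flag_field_name("base_SdssCentroid", ["not", "at", "maximum"]))
              # -> base_SdssCentroid_flag_notAtMaximum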

          robyn Robyn Allsman [X] (Inactive) added a comment -

          I would like to fix this. Is it on master or still on a ticket branch?

          jbosch Jim Bosch added a comment -

          It's on master now; sorry. Probably best to just open a new ticket.


            People

            Assignee:
            pgee Perry Gee
            Reporter:
            jbosch Jim Bosch
            Reviewers:
            John Swinbank
            Watchers:
            Jim Bosch, John Swinbank, Perry Gee, Robyn Allsman [X] (Inactive)
