Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-433

Add slot support for meas_base-style outputs

    XMLWordPrintable

    Details

      Description

      The slot mechanism in afwTable currently uses compound fields to save the 3 slot types: flux, centroid, and shape. Since the new measurement framework uses a flattened representation in the SourceTable where these types are saved as multiple scalar fields, the slot mechanism need to be altered to handle this new table type.

      1. An alternative to KeyTuple for storing the keys required by the slot
      2. Fixup get(Centroid, Flux, Shape) in SourceRecord to use correct keys.
      3. Fixup the single value getters (getX, getY, etc) to use the correct keys.
      4. Persist slot info to fits correctly, based on table version.

        Attachments

          Issue Links

            Activity

            Hide
            pgee Perry Gee added a comment -

            I have decided to add the functorKeys work to this ticket.

            I was able to get my original implementation without functorKeys working, but as the code will be prettier with functorKeys, I have elected to do that now.

            I think that will add 2 story points to this issue.

            Show
            pgee Perry Gee added a comment - I have decided to add the functorKeys work to this ticket. I was able to get my original implementation without functorKeys working, but as the code will be prettier with functorKeys, I have elected to do that now. I think that will add 2 story points to this issue.
            Hide
            pgee Perry Gee added a comment -

            I am about to put this into review. I just need to do a little cleanup. And I am waiting for suggestions on the issue about the sourceMatch.py test in AFW.

            I have moved the code in Source.h to functorKeys. It was worth the extra effort, as the code looks much better with this change. I also added some unit tests which include functorKeys, though they are probably a little circular ( since the slots use functorKeys, it would probably we odd if the slots did not match fetching the data with functorKeys). This all added about 1.5 story points to this issue.

            I will add the discussion about the issue I brought up with Jim today about what to do if only part of an error (sigma vector) or covariance (off-diagonal pieces of the covariance matrix) are defined in a schema. We really expect n sigma fields for an n-dimensional error, and (n*n-n)/2 off-diagonal terms. Jim's response is the if there are any missing sigma fields, we should not define an ErrorKey. If any of the off-diagonal terms are missing, we should set them to zero.

            I have also added persistence tests for both version 0 and version 1 tables to afw. Just in case the changes to the persistence macro broke anything.

            Show
            pgee Perry Gee added a comment - I am about to put this into review. I just need to do a little cleanup. And I am waiting for suggestions on the issue about the sourceMatch.py test in AFW. I have moved the code in Source.h to functorKeys. It was worth the extra effort, as the code looks much better with this change. I also added some unit tests which include functorKeys, though they are probably a little circular ( since the slots use functorKeys, it would probably we odd if the slots did not match fetching the data with functorKeys). This all added about 1.5 story points to this issue. I will add the discussion about the issue I brought up with Jim today about what to do if only part of an error (sigma vector) or covariance (off-diagonal pieces of the covariance matrix) are defined in a schema. We really expect n sigma fields for an n-dimensional error, and (n*n-n)/2 off-diagonal terms. Jim's response is the if there are any missing sigma fields, we should not define an ErrorKey. If any of the off-diagonal terms are missing, we should set them to zero. I have also added persistence tests for both version 0 and version 1 tables to afw. Just in case the changes to the persistence macro broke anything.
            Hide
            pgee Perry Gee added a comment -

            Given that this needs to be done pretty quickly, you are the only logical candidate. Hopefully, this is the last big review for you.

            Some notes:
            --I added some new defineSlot routines using functorKeys, but was unable to get them to appear in Python. The swig begin generated is odd looking, and I am not sure what is causing the defineCentroid and defineShape routines to get defined the way they are in Python. Perhaps you can help.
            --I would like to get Source.h back to looking like it did before I started, with nice automatically defined typedefs for each slot type. But I don't think I can do this the way the FunctorKeys in aggregates.h are structured. So I gave up and will push this to a later ticket, after I understand how much of the old slot and field structure we plan to retain.
            --testPsfFlux.py seems to be broken. Not sure why. I don't think it is this checkin, as I see the failure with earlier afw branches.
            --Changed the default version to 0 for now.

            Show
            pgee Perry Gee added a comment - Given that this needs to be done pretty quickly, you are the only logical candidate. Hopefully, this is the last big review for you. Some notes: --I added some new defineSlot routines using functorKeys, but was unable to get them to appear in Python. The swig begin generated is odd looking, and I am not sure what is causing the defineCentroid and defineShape routines to get defined the way they are in Python. Perhaps you can help. --I would like to get Source.h back to looking like it did before I started, with nice automatically defined typedefs for each slot type. But I don't think I can do this the way the FunctorKeys in aggregates.h are structured. So I gave up and will push this to a later ticket, after I understand how much of the old slot and field structure we plan to retain. --testPsfFlux.py seems to be broken. Not sure why. I don't think it is this checkin, as I see the failure with earlier afw branches. --Changed the default version to 0 for now.
            Hide
            pgee Perry Gee added a comment -

            The changes are in tickets/DM-433 in afw.

            The unit tests in meas_base were also altered in tickets/DM-433 to work with the change in version default in BaseTable.

            I broke Jim's testPsfFlux.py, or it was previously broken. Not sure why.

            Show
            pgee Perry Gee added a comment - The changes are in tickets/ DM-433 in afw. The unit tests in meas_base were also altered in tickets/ DM-433 to work with the change in version default in BaseTable. I broke Jim's testPsfFlux.py, or it was previously broken. Not sure why.
            Hide
            jbosch Jim Bosch added a comment -

            So, as you've warned me, this is messy, but I think you did the best you could given the constraints. We'll be able to clean it up quite a bit when we've added aliases, and I think moving some of the logic into the FunctorKeys could also help with that. I don't think we want to do any of this now (even though I think we could do some of this without aliases), but so you have an idea of where I'd like to take this in the future, here's what I'm thinking, given my own vague ideas and the reality of making these coexist that I see in your code:

            • Adding aliases will remove the need to store the string field names for the slots in the SourceTable. That will also become how slots are persisted, so a lot of the code you struggled with in that department will simply go away. Unfortunately, it will be replaced be a different problem: the slots will then always be defined to correspond to certain strings that are aliase in the schema (e.g. "slots_PsfFlux"), but what Keys those names relate to can change during the lifetime of the table, so we'll need to have some ability to tell the table "your slots have changed, go look up the keys again".
            • I think we should make it so the CovarianceMatrixKey FunctorKey has an option to either square or not square the diagonal. That will make it compatible with the old covariance matrix compound keys as well as the new one. We can then make the covariance matrices, points, and quadrupole results always handled by FunctorKeys (even for the old-style slots), by adding constructors-from-compound-keys to the FunctorKey classes. I think that will clean up the BaseTable data members. We could do this part now, but I don't think it's worth it; better to get what you have on master now so you can proceed with the meas_base work, and clean this up when we clean up the rest.
            • I don't think we need to add FunctorKeys for scalar values like flux; I'm okay with the Centroid/Shape slots being based on FunctorKeys and the flux slots not being based on FunctorKeys - adding a FunctorKey for Flux seems like forcing something unnatural, now that's we're really considering compound fields to be intrinsically different from scalar fields.

            I'm still working on the rest of the review. Expect more comments in a bit.

            Show
            jbosch Jim Bosch added a comment - So, as you've warned me, this is messy, but I think you did the best you could given the constraints. We'll be able to clean it up quite a bit when we've added aliases, and I think moving some of the logic into the FunctorKeys could also help with that. I don't think we want to do any of this now (even though I think we could do some of this without aliases), but so you have an idea of where I'd like to take this in the future, here's what I'm thinking, given my own vague ideas and the reality of making these coexist that I see in your code: Adding aliases will remove the need to store the string field names for the slots in the SourceTable. That will also become how slots are persisted, so a lot of the code you struggled with in that department will simply go away. Unfortunately, it will be replaced be a different problem: the slots will then always be defined to correspond to certain strings that are aliase in the schema (e.g. "slots_PsfFlux"), but what Keys those names relate to can change during the lifetime of the table, so we'll need to have some ability to tell the table "your slots have changed, go look up the keys again". I think we should make it so the CovarianceMatrixKey FunctorKey has an option to either square or not square the diagonal. That will make it compatible with the old covariance matrix compound keys as well as the new one. We can then make the covariance matrices, points, and quadrupole results always handled by FunctorKeys (even for the old-style slots), by adding constructors-from-compound-keys to the FunctorKey classes. I think that will clean up the BaseTable data members. We could do this part now, but I don't think it's worth it; better to get what you have on master now so you can proceed with the meas_base work, and clean this up when we clean up the rest. I don't think we need to add FunctorKeys for scalar values like flux; I'm okay with the Centroid/Shape slots being based on FunctorKeys and the flux slots not being based on FunctorKeys - adding a FunctorKey for Flux seems like forcing something unnatural, now that's we're really considering compound fields to be intrinsically different from scalar fields. I'm still working on the rest of the review. Expect more comments in a bit.
            Hide
            jbosch Jim Bosch added a comment -

            And here's the actual code review:

            • In if/else blocks, our preferred style is:

              if (...) {
                  ...
              } else {
                  ...
              }

              not

              if (...) {
                  ...
              }
              else {
                  ...
              }

              In addition, please use braces for any if block that contains an else clause or in which the block contains more than one line (I've gotten in the habit of just always using braces, as the case we're not required to is very restrictive).

            • When constructing the FunctorKeys for new-style slots, please use the constructors that just take a partial field name, rather than the ones that take Keys directly. I tried to make it so the ones that take partial field names do exactly what you want; if they don't, we should change them so they do.
            • The docstrings for the defineX() methods should be updated to indicate that what's currently there only applies to version 0, and the name conventions for version > 0 are different.
            • The reason the Swig wrappers for your new defineX() methods don't work is because of some customization I had done of the old ones; see around line 74 of Source.i. That code allows you to pass keyword arguments and use None when calling the old defineX() methods (e.g. "defineCentroid(key, flag=None, err=errKey)"). Personally, I'd be happy to just remove the new ones, as I think ultimately the one that just takes a field name is all we really want to support. But if those are useful for testing or something I can help try to get them working in Python.
            • I don't see a test failure in testPsfFlux.py in meas_base. Perhaps its something intermittent, or a setup problem?
            • At some point you mentioned a problem with converting code in sourceMatch.py, but what I see there looks fine, and I can't find your original statement on that. Has that been resolved?
            Show
            jbosch Jim Bosch added a comment - And here's the actual code review: In if/else blocks, our preferred style is: if (...) { ... } else { ... } not if (...) { ... } else { ... } In addition, please use braces for any if block that contains an else clause or in which the block contains more than one line (I've gotten in the habit of just always using braces, as the case we're not required to is very restrictive). When constructing the FunctorKeys for new-style slots, please use the constructors that just take a partial field name, rather than the ones that take Keys directly. I tried to make it so the ones that take partial field names do exactly what you want; if they don't, we should change them so they do. The docstrings for the defineX() methods should be updated to indicate that what's currently there only applies to version 0, and the name conventions for version > 0 are different. The reason the Swig wrappers for your new defineX() methods don't work is because of some customization I had done of the old ones; see around line 74 of Source.i. That code allows you to pass keyword arguments and use None when calling the old defineX() methods (e.g. "defineCentroid(key, flag=None, err=errKey)"). Personally, I'd be happy to just remove the new ones, as I think ultimately the one that just takes a field name is all we really want to support. But if those are useful for testing or something I can help try to get them working in Python. I don't see a test failure in testPsfFlux.py in meas_base. Perhaps its something intermittent, or a setup problem? At some point you mentioned a problem with converting code in sourceMatch.py, but what I see there looks fine, and I can't find your original statement on that. Has that been resolved?
            Hide
            pgee Perry Gee added a comment -

            I don't see the FunctorKey constructor you are talking about. I assume you mean something like Point2DKey(std::string const &)?

            I still see a failure in testPsfFlux.py. Did you check out the DM-433 ticket branch? I don't see it on the master branch, which probably means that I caused it, but the problem was very odd (having to do with the bounding boxes not matching). SInce this was your test, I thought you might be able to spot it more easily.

            The problem with sourceMatch only appears when the default version is 1. You have to change BaseTable.cc to see it.

            I'm not sure how to fix this problem with Swig. In C++, the parameter signatures disambiguate the defineSlot methods. For now, I am going to rename the methods which take FunctorKey parameters.

            Show
            pgee Perry Gee added a comment - I don't see the FunctorKey constructor you are talking about. I assume you mean something like Point2DKey(std::string const &)? I still see a failure in testPsfFlux.py. Did you check out the DM-433 ticket branch? I don't see it on the master branch, which probably means that I caused it, but the problem was very odd (having to do with the bounding boxes not matching). SInce this was your test, I thought you might be able to spot it more easily. The problem with sourceMatch only appears when the default version is 1. You have to change BaseTable.cc to see it. I'm not sure how to fix this problem with Swig. In C++, the parameter signatures disambiguate the defineSlot methods. For now, I am going to rename the methods which take FunctorKey parameters.
            Hide
            jbosch Jim Bosch added a comment -

            I don't see the FunctorKey constructor you are talking about. I assume you mean something like Point2DKey(std::string const &)?

            Oops, I actually meant the one that takes a SubSchema (which is just a combination of a Schema and string: it's what you get when you say schema[string]).

            I still see a failure in testPsfFlux.py. Did you check out the DM-433 ticket branch? I don't see it on the master branch, which probably means that I caused it, but the problem was very odd (having to do with the bounding boxes not matching). SInce this was your test, I thought you might be able to spot it more easily.

            I didn't realize there was a branch for meas_base (this is one reason we typically paste the output of "git diff --stat" when requesting a review). Now that I'm trying that, I'm seeing all the tests fail. Are there any other branches I need to set up? In the meantime, I'll rebuild everything against the latest afw; maybe that's the problem.

            The problem with sourceMatch only appears when the default version is 1. You have to change BaseTable.cc to see it.

            Ah, ok. So it's not a problem right now, but it will be again in the future when we change the default version.

            I'm not sure how to fix this problem with Swig. In C++, the parameter signatures disambiguate the defineSlot methods. For now, I am going to rename the methods which take FunctorKey parameters.

            As I said, I don't think we need to fix it. I'd prefer to just remove those methods, and just leave the ones that take strings. Do you need them for anything?

            Show
            jbosch Jim Bosch added a comment - I don't see the FunctorKey constructor you are talking about. I assume you mean something like Point2DKey(std::string const &)? Oops, I actually meant the one that takes a SubSchema (which is just a combination of a Schema and string: it's what you get when you say schema[string] ). I still see a failure in testPsfFlux.py. Did you check out the DM-433 ticket branch? I don't see it on the master branch, which probably means that I caused it, but the problem was very odd (having to do with the bounding boxes not matching). SInce this was your test, I thought you might be able to spot it more easily. I didn't realize there was a branch for meas_base (this is one reason we typically paste the output of " git diff --stat " when requesting a review). Now that I'm trying that, I'm seeing all the tests fail. Are there any other branches I need to set up? In the meantime, I'll rebuild everything against the latest afw; maybe that's the problem. The problem with sourceMatch only appears when the default version is 1. You have to change BaseTable.cc to see it. Ah, ok. So it's not a problem right now, but it will be again in the future when we change the default version. I'm not sure how to fix this problem with Swig. In C++, the parameter signatures disambiguate the defineSlot methods. For now, I am going to rename the methods which take FunctorKey parameters. As I said, I don't think we need to fix it. I'd prefer to just remove those methods, and just leave the ones that take strings. Do you need them for anything?
            Hide
            pgee Perry Gee added a comment -

            Yes, I should have included the git diff --stat. But I did say in the previous comment that meas_base also had a DM-433 change. There may also be an issue with the fact that setupTable was commented out in sfm.py. Probably my fault – it was causing problems when I tried to break my changes into pre-433 changes and post-433 changes.

            I am confused about the SubSchema defineSlot method. My assumption was that SubSchema were only for the old schema hierarchies, and that I should not make any assumptions about the organization of the components of a slot. Are you saying that you can define a SubSchema solely by name, without regard to the position of the components in the record?

            You are saying that I should remove the FunctorKeys defineSlot(FunctorKeys) methods, and do all defineSlot's by name only? l put these methods in to mirror the defineSlot(Keys) methods you had in the previous version. I thought you wanted this abstraction, but I am happy to remove it. I had the idea that we would want the FunctorKeys to replace the compound Field Keys, but perhaps that is not the case?

            Show
            pgee Perry Gee added a comment - Yes, I should have included the git diff --stat. But I did say in the previous comment that meas_base also had a DM-433 change. There may also be an issue with the fact that setupTable was commented out in sfm.py. Probably my fault – it was causing problems when I tried to break my changes into pre-433 changes and post-433 changes. I am confused about the SubSchema defineSlot method. My assumption was that SubSchema were only for the old schema hierarchies, and that I should not make any assumptions about the organization of the components of a slot. Are you saying that you can define a SubSchema solely by name, without regard to the position of the components in the record? You are saying that I should remove the FunctorKeys defineSlot(FunctorKeys) methods, and do all defineSlot's by name only? l put these methods in to mirror the defineSlot(Keys) methods you had in the previous version. I thought you wanted this abstraction, but I am happy to remove it. I had the idea that we would want the FunctorKeys to replace the compound Field Keys, but perhaps that is not the case?
            Hide
            pgee Perry Gee added a comment -

            OK, I see how the SubSchema is intended to work now. The comment about a.x a.y etc. seems to be more confining than the code actually is.

            Show
            pgee Perry Gee added a comment - OK, I see how the SubSchema is intended to work now. The comment about a.x a.y etc. seems to be more confining than the code actually is.
            Hide
            jbosch Jim Bosch added a comment -

            Actually, I realized after you asked about it that SubSchema probably won't work right now for what you want, because it does rely on periods, not underscores. To fix that, I think we'd need to move the version number from BaseTable down to Schema, which could open up new problems. So I think you should just keep using FunctorKeys the way you are, and I'll open a new ticket to figure out what to do with SubSchema (which I'll take). That probably will involve moving the version down to Schema, but I don't want that to get in the way of your work on meas_base.

            We do want the FunctorKeys to replace the compound Field Keys in general, but being able to set the slots from a set of unrelated keys was a bad idea to begin with, because we persist the slots via the names, and so we can't round-trip the slots through I/O in general if the non-name ones are what was called. If I was confident it wouldn't break anything downstream, I'd recommend moving the compound Field Key versions of defineSlot() as well.

            Any thoughts on why I'm seeing more tests fail in meas_base fail? I've tried rebuilding the whole stack with master everything and tickets/DM-433 for afw and meas_base, and I'm still seeing four test failures, not just the one you report. They seem to all be slot-related (e.g. calls to getShape() when the shape slot is not set). Could you make sure you've pushed all your commits? If you don't have any ideas, I'll debug deeper on my end.

            Show
            jbosch Jim Bosch added a comment - Actually, I realized after you asked about it that SubSchema probably won't work right now for what you want, because it does rely on periods, not underscores. To fix that, I think we'd need to move the version number from BaseTable down to Schema, which could open up new problems. So I think you should just keep using FunctorKeys the way you are, and I'll open a new ticket to figure out what to do with SubSchema (which I'll take). That probably will involve moving the version down to Schema, but I don't want that to get in the way of your work on meas_base. We do want the FunctorKeys to replace the compound Field Keys in general, but being able to set the slots from a set of unrelated keys was a bad idea to begin with, because we persist the slots via the names, and so we can't round-trip the slots through I/O in general if the non-name ones are what was called. If I was confident it wouldn't break anything downstream, I'd recommend moving the compound Field Key versions of defineSlot() as well. Any thoughts on why I'm seeing more tests fail in meas_base fail? I've tried rebuilding the whole stack with master everything and tickets/ DM-433 for afw and meas_base, and I'm still seeing four test failures, not just the one you report. They seem to all be slot-related (e.g. calls to getShape() when the shape slot is not set). Could you make sure you've pushed all your commits? If you don't have any ideas, I'll debug deeper on my end.
            Hide
            pgee Perry Gee added a comment -

            I cut and pasted the old code for defineCentroid and defineShape back into the Source.h.m4, but left the SubSchema code in, surrounded with comments.

            I removed the code to define slots from Functor Keys. I was never actually sure how it would be used, and it is not clear that it is useful.

            I am going to check to code in and merge to master as soon as one of use figures out what happened to testPsfFlux.py

            Show
            pgee Perry Gee added a comment - I cut and pasted the old code for defineCentroid and defineShape back into the Source.h.m4, but left the SubSchema code in, surrounded with comments. I removed the code to define slots from Functor Keys. I was never actually sure how it would be used, and it is not clear that it is useful. I am going to check to code in and merge to master as soon as one of use figures out what happened to testPsfFlux.py

              People

              Assignee:
              pgee Perry Gee
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Jim Bosch
              Watchers:
              Jim Bosch, Perry Gee
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.