Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-419

Use aliases in slots

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw
    • Labels:

      Attachments

        Issue Links

          Activity

          Hide
          jbosch Jim Bosch added a comment -

          Simon, could you take a look at this? It's rather large, but not huge: it's a rewrite of how the slot mechanism in SourceTable works, to make use of the new alias mechanism added just before.

          I've squashed commits in order to make them a bit cleaner, and you might find it easier to review this one commit at a time rather than looking at the full diff at once (essentially none of the commits overlap, but this way the git logs will give you an idea of what each changeset was for).

          However, you probably want to start with the docstring for the SlotDefinition class in afw/table/slots.h, as that gives a pretty good overview of how this is all supposed to work.

          Changes are limited to afw, and all on branch u/jbosch/DM-419:

          afw:u/jbosch/DM-419 % git diff --stat master...u/jbosch/DM-419
           include/lsst/afw/table/Schema.h    |    4 +-
           include/lsst/afw/table/Source.h.m4 |  396 +++++++++---------------------------
           include/lsst/afw/table/slots.h     |  287 ++++++++++++++++++++++++++
           python/lsst/afw/table/Match.i      |   23 ---
           python/lsst/afw/table/Source.i     |   27 +--
           src/table/AliasMap.cc              |    2 +-
           src/table/Source.cc                |  237 ++-------------------
           src/table/io/FitsReader.cc         |   27 +++
           src/table/slots.cc                 |  197 ++++++++++++++++++
           tests/data/slotsVersion0.fits      |    1 +
           tests/testSourceTable.py           |   58 ++++--
           tests/ticketDM-433.py              |   34 ++--
           12 files changed, 678 insertions(+), 615 deletions(-)

          Show
          jbosch Jim Bosch added a comment - Simon, could you take a look at this? It's rather large, but not huge: it's a rewrite of how the slot mechanism in SourceTable works, to make use of the new alias mechanism added just before. I've squashed commits in order to make them a bit cleaner, and you might find it easier to review this one commit at a time rather than looking at the full diff at once (essentially none of the commits overlap, but this way the git logs will give you an idea of what each changeset was for). However, you probably want to start with the docstring for the SlotDefinition class in afw/table/slots.h, as that gives a pretty good overview of how this is all supposed to work. Changes are limited to afw, and all on branch u/jbosch/ DM-419 : afw:u/jbosch/DM-419 % git diff --stat master...u/jbosch/DM-419 include/lsst/afw/table/Schema.h | 4 +- include/lsst/afw/table/Source.h.m4 | 396 +++++++++--------------------------- include/lsst/afw/table/slots.h | 287 ++++++++++++++++++++++++++ python/lsst/afw/table/Match.i | 23 --- python/lsst/afw/table/Source.i | 27 +-- src/table/AliasMap.cc | 2 +- src/table/Source.cc | 237 ++------------------- src/table/io/FitsReader.cc | 27 +++ src/table/slots.cc | 197 ++++++++++++++++++ tests/data/slotsVersion0.fits | 1 + tests/testSourceTable.py | 58 ++++-- tests/ticketDM-433.py | 34 ++-- 12 files changed, 678 insertions(+), 615 deletions(-)
          Hide
          jbosch Jim Bosch added a comment -

          Just added a GitHub PR link to this issue page to help with the review.

          Show
          jbosch Jim Bosch added a comment - Just added a GitHub PR link to this issue page to help with the review.
          Hide
          krughoff Simon Krughoff added a comment -

          Comments on the GitHub PR. Good to merge.

          Show
          krughoff Simon Krughoff added a comment - Comments on the GitHub PR. Good to merge.
          Hide
          jbosch Jim Bosch added a comment -

          Merged to master. Tests in meas_astrom revealed an oversight in one feature, which I fixed just before merge (one-line change).

          Show
          jbosch Jim Bosch added a comment - Merged to master. Tests in meas_astrom revealed an oversight in one feature, which I fixed just before merge (one-line change).
          Hide
          ktl Kian-Tat Lim added a comment -

          Please summarize the comments from GitHub here; if GH were to go away, we'd still want some kind of record of what happened.

          Show
          ktl Kian-Tat Lim added a comment - Please summarize the comments from GitHub here; if GH were to go away, we'd still want some kind of record of what happened.
          Hide
          jbosch Jim Bosch added a comment -

          Aside from several trivial doc fixes, the review mostly discussed the question of whether the efforts at backwards compatibility were worthwhile:
          Simon Krughoff:

          I notice a lot of work goes into making sure that version 0 tables work. I realize that backward compatibility is ideal, but we have frequently broken backward compatibility. Is it necessary to go to these lengths to make version 0 tables work, or am I missing the point of version 0 tables?

          Jim Bosch:

          If this was just about the transition period before meas_base can do everything the old meas_algorithms could do, I'd be more inclined to agree. But this is also necessary to allow outputs produced by older versions of the pipeline to be read by newer versions of the pipeline. Eventually I hope we can remove the backwards compatibility layer for that too, but I imagine that could be a long time after we've made the transition, and there's no longer any old outputs we care about being able to read.

          Robert Lupton:

          Couldn't we do this at the butler level? Read the old inputs and rename the columns?

          Jim Bosch:

          We could, I suppose. That seems harder, because it won't always be a one-to-one mapping between new fields and old fields. It would also make it less useful for the immediate (but more temporary) purpose of easing the transition to meas_base.

          If we were to translate the old catalogs into new ones, I think it'd work best as a script you'd run to convert an entire data repository, at which point we could force the user to provide additional information (e.g. via config) to clear up any ambiguities in the mapping. But I'd rather not worry about that until the transition is over. It may have been a mistake to work this hard on this kind of backwards compatibility from the beginning, but I definitely don't want to switch to a different strategy right now.

          Robert Lupton:

          I agree that it's too late.

          Note that my on-the-fly solution would allow old and new versions of the code to both read the files.

          Jim Bosch:

          True, and that'd definitely be desirable. It'd be difficult to implement at the Butler level, though, in the sense that all the Butler does with these things is pass a filename to the afw FITS-reading code. We'd either need to read it in and then write it back out to a temporary file using low-level FITS manipulations (using e.g. PyFITS), before passing it to afw, or read it into an old-style afw::table object in memory before doing the mapping to a new-style afw::table object (in which case we'd need at least some of this version code to support that temporary).

          Show
          jbosch Jim Bosch added a comment - Aside from several trivial doc fixes, the review mostly discussed the question of whether the efforts at backwards compatibility were worthwhile: Simon Krughoff : I notice a lot of work goes into making sure that version 0 tables work. I realize that backward compatibility is ideal, but we have frequently broken backward compatibility. Is it necessary to go to these lengths to make version 0 tables work, or am I missing the point of version 0 tables? Jim Bosch : If this was just about the transition period before meas_base can do everything the old meas_algorithms could do, I'd be more inclined to agree. But this is also necessary to allow outputs produced by older versions of the pipeline to be read by newer versions of the pipeline. Eventually I hope we can remove the backwards compatibility layer for that too, but I imagine that could be a long time after we've made the transition, and there's no longer any old outputs we care about being able to read. Robert Lupton : Couldn't we do this at the butler level? Read the old inputs and rename the columns? Jim Bosch : We could, I suppose. That seems harder, because it won't always be a one-to-one mapping between new fields and old fields. It would also make it less useful for the immediate (but more temporary) purpose of easing the transition to meas_base. If we were to translate the old catalogs into new ones, I think it'd work best as a script you'd run to convert an entire data repository, at which point we could force the user to provide additional information (e.g. via config) to clear up any ambiguities in the mapping. But I'd rather not worry about that until the transition is over. It may have been a mistake to work this hard on this kind of backwards compatibility from the beginning, but I definitely don't want to switch to a different strategy right now. Robert Lupton : I agree that it's too late. Note that my on-the-fly solution would allow old and new versions of the code to both read the files. Jim Bosch : True, and that'd definitely be desirable. It'd be difficult to implement at the Butler level, though, in the sense that all the Butler does with these things is pass a filename to the afw FITS-reading code. We'd either need to read it in and then write it back out to a temporary file using low-level FITS manipulations (using e.g. PyFITS), before passing it to afw, or read it into an old-style afw::table object in memory before doing the mapping to a new-style afw::table object (in which case we'd need at least some of this version code to support that temporary).
          Hide
          rhl Robert Lupton added a comment -

          I was confused by Jim's statement:

          read it into an old-style afw::table object in memory before doing the mapping to a new-style afw::table object (in which case we'd need at least some of this version code to support that temporary).

          I had thought that the only change was the names of columns, in which case this would just be aliases, but actually I seem to remember that the representation of e.g. Shapes changed, but the data must still be there so a slight extension to the alias mechanism should allow us to reinterpret the old tables as new ones in place. But this may in fact be very messy.

          This approach has the great advantage of being totally separate from the meas code; it'd be a conversion that runs somewhere in the butler.

          I am agnostic on whether it's too late to switch to doing things this way.

          Show
          rhl Robert Lupton added a comment - I was confused by Jim's statement: read it into an old-style afw::table object in memory before doing the mapping to a new-style afw::table object (in which case we'd need at least some of this version code to support that temporary). I had thought that the only change was the names of columns, in which case this would just be aliases, but actually I seem to remember that the representation of e.g. Shapes changed, but the data must still be there so a slight extension to the alias mechanism should allow us to reinterpret the old tables as new ones in place. But this may in fact be very messy. This approach has the great advantage of being totally separate from the meas code; it'd be a conversion that runs somewhere in the butler. I am agnostic on whether it's too late to switch to doing things this way.
          Hide
          jbosch Jim Bosch added a comment -

          My comment was essentially saying that I didn't know how to implement a field name change in the Butler without reading the file in first and modifying it in-memory. But I hadn't using considered aliases for this part of the backwards compatibility; that's a good idea worth looking into, and it might indeed let us do it at the Butler level. It doesn't seem like it should be too hard to make it work with the new approach to compound fields, but I haven't looked at it too closely.

          I think for now it'll be best to proceed on our current path, since the work is essentially done, but when the time comes to actually remove the old versions of the algorithms in meas_algorithms, we should consider switching to this proposal for keeping the backwards compatibility for already-saved catalogs instead of what's in afw::table now.

          Show
          jbosch Jim Bosch added a comment - My comment was essentially saying that I didn't know how to implement a field name change in the Butler without reading the file in first and modifying it in-memory. But I hadn't using considered aliases for this part of the backwards compatibility; that's a good idea worth looking into, and it might indeed let us do it at the Butler level. It doesn't seem like it should be too hard to make it work with the new approach to compound fields, but I haven't looked at it too closely. I think for now it'll be best to proceed on our current path, since the work is essentially done, but when the time comes to actually remove the old versions of the algorithms in meas_algorithms, we should consider switching to this proposal for keeping the backwards compatibility for already-saved catalogs instead of what's in afw::table now.

            People

            • Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Simon Krughoff
              Watchers:
              Jim Bosch, Kian-Tat Lim, Robert Lupton, Simon Krughoff
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Summary Panel