Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-3198

Reduce danger of circular imports in Python due to afw table persistence

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Invalid
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw
    • Labels:
      None
    • Team:
      Data Release Production

      Description

      persistence using afw table is prone to causing Python circular imports in python. When folks at HSC added table persistence to afw.geom.Polygon they found that this caused a circular dependency in Python. They resolved this moving Polygon a level deeper. That worked, but has the unfortunate effect of moving Polygon to a surprising namespace and making it harder to access.

      This ticket asks for changes that reduce the danger of circular dependencies and allow a more natural namespace for Polygon and future classes that are persisted using afw table.

      I do not fully understand the issue, but it I saw it while moving the HSC code over, and what I observed is that "import lsst.afw.geom" imported afw table, which in turn imported afw.coord.Coord, which in turn imported afw geom. I'm not sure why adding Polygon persistence caused this, nor how we got away with the current design before Polygon adding persistence.

        Attachments

          Issue Links

            Activity

            No builds found.
            rowen Russell Owen created issue -
            rowen Russell Owen made changes -
            Field Original Value New Value
            Link This issue relates to DM-2981 [ DM-2981 ]
            Hide
            jbosch Jim Bosch added a comment -

            The problem is that some afw.table classes (e.g. SourceRecord) need access to other afw classes (like Footprint or Box), but some of those classes (like Footprint) need to inherit from afw.table.io.Persistable, or (like Box) they share a Swig module with a class (e.g. Polygon) that needs to be persistable.

            I think the right way to address this is to split afw.table into multiple libraries: a low-level library that just deals with POD data (enough to support persistence), and one or more higher-level library with specific subclasses like SourceRecord, AmpInfoRecord, and ExposureRecord, which need access to e.g. geometry classes. We could also do this by moving some of these classes into other existing packages (e.g. move SourceRecord to meas_base) - but in some cases that just moves the problem (afw.image and afw.detection also have both low-level and high-level components, so moving ExposureRecord there would just create new circular dependency problems).

            I hadn't planned on doing this soon, in the hopes we'd have a more global plan for package reorganization first, but we could consider moving it up.

            We could also address this by making afw.table-based persistence not rely on inheritance, since that's what really makes the Swig circular dependencies fatal (and this is at least partly a Python limitation, not just a Swig one, I think - two Python modules can't both import each other). I think that would be a lot more complex than what we have now, and I haven't really thought about alternative designs, though I know they're possible (since Boost.Serialiazation doesn't rely on inheritance). The problem is that it's very much a natural polymorphism problem, so to solve it without persistence essentially involves reimplementing a virtual function mechanism somehow.

            Show
            jbosch Jim Bosch added a comment - The problem is that some afw.table classes (e.g. SourceRecord) need access to other afw classes (like Footprint or Box), but some of those classes (like Footprint) need to inherit from afw.table.io.Persistable, or (like Box) they share a Swig module with a class (e.g. Polygon) that needs to be persistable. I think the right way to address this is to split afw.table into multiple libraries: a low-level library that just deals with POD data (enough to support persistence), and one or more higher-level library with specific subclasses like SourceRecord, AmpInfoRecord, and ExposureRecord, which need access to e.g. geometry classes. We could also do this by moving some of these classes into other existing packages (e.g. move SourceRecord to meas_base) - but in some cases that just moves the problem (afw.image and afw.detection also have both low-level and high-level components, so moving ExposureRecord there would just create new circular dependency problems). I hadn't planned on doing this soon, in the hopes we'd have a more global plan for package reorganization first, but we could consider moving it up. We could also address this by making afw.table-based persistence not rely on inheritance, since that's what really makes the Swig circular dependencies fatal (and this is at least partly a Python limitation, not just a Swig one, I think - two Python modules can't both import each other). I think that would be a lot more complex than what we have now, and I haven't really thought about alternative designs, though I know they're possible (since Boost.Serialiazation doesn't rely on inheritance). The problem is that it's very much a natural polymorphism problem, so to solve it without persistence essentially involves reimplementing a virtual function mechanism somehow.
            Hide
            rowen Russell Owen added a comment -

            I wonder if there is some way to keep the current design, but somehow hide the circular dependencies from Python. They don't cause trouble in C++ and that's where they are primarily needed.

            Show
            rowen Russell Owen added a comment - I wonder if there is some way to keep the current design, but somehow hide the circular dependencies from Python. They don't cause trouble in C++ and that's where they are primarily needed.
            Hide
            jbosch Jim Bosch added a comment -

            We could try putting #ifndef SWIG blocks around all the inheritance from afw::table::io::Persistable. I'm not eager to undertake all the Swig debugging that might ensue, but I can't immediately think of a reason why it wouldn't work.

            But I tried a number of ways to avoid moving Polygon out of geomLib.i back when we did this on the HSC side, and that certainly seems like one of the things I would have tried (it sounds vaguely familiar, but sadly I don't remember those experiments very well). The conversation on that HSC ticket may be illuminating: https://hsc-jira.astro.princeton.edu/jira/browse/HSC-972 (unfortunately it's also frustrating to read now, because I refer to a "big change" I tried there that didn't work, without actually saying what it was I tried).

            Show
            jbosch Jim Bosch added a comment - We could try putting #ifndef SWIG blocks around all the inheritance from afw::table::io::Persistable . I'm not eager to undertake all the Swig debugging that might ensue, but I can't immediately think of a reason why it wouldn't work. But I tried a number of ways to avoid moving Polygon out of geomLib.i back when we did this on the HSC side, and that certainly seems like one of the things I would have tried (it sounds vaguely familiar, but sadly I don't remember those experiments very well). The conversation on that HSC ticket may be illuminating: https://hsc-jira.astro.princeton.edu/jira/browse/HSC-972 (unfortunately it's also frustrating to read now, because I refer to a "big change" I tried there that didn't work, without actually saying what it was I tried).
            swinbank John Swinbank made changes -
            Team Data Release Production [ 10301 ]
            swinbank John Swinbank made changes -
            Rank Ranked lower
            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 20458 ]
            Hide
            swinbank John Swinbank added a comment -

            Jim Bosch, Russell Owen — has this ticket been rendered obsolete by the pybind11 port? Certainly, I currently seem to be able to run import lsst.afw.geom without also getting afw.table.

            Show
            swinbank John Swinbank added a comment - Jim Bosch , Russell Owen — has this ticket been rendered obsolete by the pybind11 port? Certainly, I currently seem to be able to run import lsst.afw.geom without also getting afw.table .
            Hide
            jbosch Jim Bosch added a comment -

            The problem has certainly been mitigated, and the nature of it has changed.  I wouldn't consider it fully dealt with until we have switched to the new WrapperCollection-based approach in afw, but this ticket is no longer a useful statement of the problem.

            Show
            jbosch Jim Bosch added a comment - The problem has certainly been mitigated, and the nature of it has changed.  I wouldn't consider it fully dealt with until we have switched to the new WrapperCollection-based approach in afw, but this ticket is no longer a useful statement of the problem.
            Hide
            swinbank John Swinbank added a comment -

            Thanks Jim. Given that, let's close this “no longer useful statement” as invalid!

            Show
            swinbank John Swinbank added a comment - Thanks Jim. Given that, let's close this “no longer useful statement” as invalid!
            swinbank John Swinbank made changes -
            Resolution Done [ 10000 ]
            Status To Do [ 10001 ] Invalid [ 11005 ]

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              rowen Russell Owen
              Watchers:
              Jim Bosch, John Swinbank, Russell Owen
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.