Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-270

Decorators for adding Python code to wrapped C++ classes

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None
    • Location:
      this issue page

      Description

      As it's implemented in C++, pybind11 doesn't have an equivalent to Swig's %pythoncode blocks, and that means any Python code we add to wrapped classes needs to go in a regular Python file. This is straightforward to do, but verbose and a bit hard to read, especially if you want to avoid introducing unnecessary symbols:

      from ._wcs import Wcs
       
      def wcsStr(self):
          return "Wcs(%g, %g)" % tuple(self.getPixelOrigin())
      Wcs.__str__ = wcsStr
      

      We had a brief and anticonvergent discussion about how to make this work better in the #dm-pybind11-porting Slack channel a few days ago, and I've since taken a bit of time to explore the options and come up with my own preferred proposal.

      I propose we add a class decorator that effectively re-opens wrapped classes, allowing us to add new methods to them using the usual Python class definition syntax:

      from lsst.utils import extend
      from ._wcs import Wcs
       
      @extend
      class Wcs:
          def __str__(self):
              return "Wcs(%g, %g)" % tuple(self.getPixelOrigin())
      

      I'll admit the implementation of this decorator is going to be a bit tricky: it can't actually reopen the class definition, so it actually iterates over the decorated class's attributes and adds them to the existing class with the same name, while making sure not to copy special attributes that are added implicitly to all classes (e.g. __module__). But that implementation is still at most about 10 lines of code in a module most devs will never have to touch, and I think the benefits are quite obvious: a large fraction of our wrapper code will be vastly more readable.

      The big downside of this decorator is that it doesn't allow the type to be extended to be computed. That's primarily a problem with wrappers for C++ template instantiations, however, and I have a different proposal for how to add Python extensions (among other things) to those on DM-8805. For other cases, I think we should also add a function decorator that would allow the following syntax:

      from lsst.utils import inClass
      from .wcs import Wcs
       
      @inClass(Wcs)
      def __str__(self):
          return "Wcs(%g, %g)" % tuple(self.getPixelOrigin())
      

      This is just as compact as the class decorator when only a single attribute is added, and it's quite simple to implement. But the readability is much worse when multiple attributes are involved: there's no common indentation block, and the class name is repeated and is much less prominent. Like the decorator-free approach, it also adds unwanted module-level symbols as side effect, though these can be ignored by using __all__ in most contexts we'd use the decorator.

      I'd rather not have a different policy just for adding single attributes; we'll all too often want to add additional attributes and have to reformat what was done before. So I think our pybind11 coding conventions should suggest the function decorator be used only in the (rare) cases where the class decorator would not work because the to-be-extended type must be computed and it is not a wrapped template class suitable for the DM-8805 solution.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment - - edited

            I don't think we have ever passed a Python object back to C++. I would think this is a fairly borderline case?

            I'm afraid it's extremely common. Any time you pass a C++ object to a C++ method/function/constructor in Python code (e.g. pass a Box2I to an ImageF constructor), you're passing a C++ object to Python. That's the case that works in your proposal, though, because the pybind11 converters would recognize the Box2I as an instance of _Box2I, which it knows about.

            What wouldn't work is the opposite, such as ImageF.getBBox() - pybind11 is doing that conversion on its own, and we don't have any way to have it return a Python Box2I instead of the _Box2I it knows about.

            Show
            jbosch Jim Bosch added a comment - - edited I don't think we have ever passed a Python object back to C++. I would think this is a fairly borderline case? I'm afraid it's extremely common. Any time you pass a C++ object to a C++ method/function/constructor in Python code (e.g. pass a Box2I to an ImageF constructor), you're passing a C++ object to Python. That's the case that works in your proposal, though, because the pybind11 converters would recognize the Box2I as an instance of _Box2I , which it knows about. What wouldn't work is the opposite, such as ImageF.getBBox() - pybind11 is doing that conversion on its own, and we don't have any way to have it return a Python Box2I instead of the _Box2I it knows about.
            Hide
            nlust Nate Lust added a comment -

            I am strongly in favor of a class level decorator to add additional code to a class. I agree with the general consensus that the name should be informative and long, but the added formatting standards that defining a class gives makes the code much more readable. I am not opposed to a function level decorator, but it should be used as minimally as possible. I really fear having isolated functions defined all over the place that add to one class, making it very difficult to get an over view on what the class does. Having one c++ definition, and one python extension will make code so much easier to get up to speed with when you are learning something new.

            I much prefer this approach as opposed to adding functionality and filtering out with _all_. Filtering out relies on a user understanding the same import time behavior that decorators use, and additionally magic dunder methods and why they are called. Additionally the code modification (things being left out or modified) does not happen local to the code itself, so it might not be immediately obvious to someone why functions are not getting imported, what modified code, or why new code they wrote in a long file is not showing up. This is similar to our principal of defining variables close to where they are being used.

            I don't think there is a problem using decorators in regards to general programmer familiarity. Decorators are common enough in python that people generally understand what they are seeing, or at least have an easy time looking up what they are seeing (googling what @ means). Programmers are also generally comfortable using decorators even when they are not sure how they work, ( how many people know how @staticmethod and @classmethod work without looking up the code ).

            Show
            nlust Nate Lust added a comment - I am strongly in favor of a class level decorator to add additional code to a class. I agree with the general consensus that the name should be informative and long, but the added formatting standards that defining a class gives makes the code much more readable. I am not opposed to a function level decorator, but it should be used as minimally as possible. I really fear having isolated functions defined all over the place that add to one class, making it very difficult to get an over view on what the class does. Having one c++ definition, and one python extension will make code so much easier to get up to speed with when you are learning something new. I much prefer this approach as opposed to adding functionality and filtering out with _ all _. Filtering out relies on a user understanding the same import time behavior that decorators use, and additionally magic dunder methods and why they are called. Additionally the code modification (things being left out or modified) does not happen local to the code itself, so it might not be immediately obvious to someone why functions are not getting imported, what modified code, or why new code they wrote in a long file is not showing up. This is similar to our principal of defining variables close to where they are being used. I don't think there is a problem using decorators in regards to general programmer familiarity. Decorators are common enough in python that people generally understand what they are seeing, or at least have an easy time looking up what they are seeing (googling what @ means). Programmers are also generally comfortable using decorators even when they are not sure how they work, ( how many people know how @staticmethod and @classmethod work without looking up the code ).
            Hide
            jbosch Jim Bosch added a comment -

            I'm accepting this with the name of the class decorator changed to continueClass, which seems to be the most popular option.

            I'm currently marking the implementation ticket as DM-8805, since there's a WIP implementation there already; I'll update the ticket description shortly to indicate the change in scope.

            Show
            jbosch Jim Bosch added a comment - I'm accepting this with the name of the class decorator changed to continueClass , which seems to be the most popular option. I'm currently marking the implementation ticket as DM-8805 , since there's a WIP implementation there already; I'll update the ticket description shortly to indicate the change in scope.
            Hide
            tjenness Tim Jenness added a comment -

            Can you please switch round the RFC/DM relationship? RFCs trigger work, that's how we know the work has been done and the RFC can be marked Implemented. A triggering ticket doesn't have that relationship.

            Show
            tjenness Tim Jenness added a comment - Can you please switch round the RFC/DM relationship? RFCs trigger work, that's how we know the work has been done and the RFC can be marked Implemented. A triggering ticket doesn't have that relationship.
            Hide
            tjenness Tim Jenness added a comment -

            Jim Bosch is this RFC now complete? There is no more work pending.

            Show
            tjenness Tim Jenness added a comment - Jim Bosch is this RFC now complete? There is no more work pending.

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Watchers:
              Jim Bosch, John Parejko, John Swinbank, Krzysztof Findeisen, Nate Lust, Pim Schellart [X] (Inactive), Russell Owen, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.