Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-812

A standard pattern for class attributes as interfaces

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Adopted
    • Resolution: Unresolved
    • Component/s: DM
    • Labels:
      None

      Description

      It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:

      class BaseAlgorithm(ABC):
          ConfigClass = None
          """Type of the configuration object for this algorithm.
       
          Must be set by all derived classes to a subclass of
          lsst.pex.config.Config.
          """

      The approach above is ubiquitous in our codebase (especially in Task), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

      A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:

      class BaseAlgorithm(ABC):
          ConfigClass: typing.Type[lsst.pex.config.Config]
          """Type of the configuration object for this algorithm.
       
          Must be set by all derived classes to a subclass of
          lsst.pex.config.Config.
          """

      This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the abc module an a classmethod, instead of an attribute:

      class BaseAlgorithm(ABC):
          @classmethod
          @abstractmethod
          def getConfigClass(cls):
              """Return the type of the configuration object for this algorithm."""
              raise NotImplementedError()

      This is my preferred solution to the problem, because:

      • It is the built-in approach to defining abstract interfaces in Python.
      • It is immediately obvious to any reader familiar with the abc module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
      • The abc module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an abstractmethod with an implementation that's called with super).
      • Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface without assuming particular implementations).
      • Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the abc module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
      • I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
      • No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

      I am deeply unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

      In the code-review debate that inspired this RFC, Nate Lust made a point that I am much more sympathetic to: that the instantiation-time checks done by abc are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

      In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves new metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract classmethod approach to be the proposal on the table, but I'm curious what others think.

        Attachments

          Issue Links

            Activity

            No builds found.
            jbosch Jim Bosch created issue -
            jbosch Jim Bosch made changes -
            Field Original Value New Value
            Description It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            The debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    """Return ype of the configuration object for this algorithm."""
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            In the code-review debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            jbosch Jim Bosch made changes -
            Description It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    """Return ype of the configuration object for this algorithm."""
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            In the code-review debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    """Return the t ype of the configuration object for this algorithm."""
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            In the code-review debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            jbosch Jim Bosch made changes -
            Description It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    """Return the t ype of the configuration object for this algorithm."""
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            In the code-review debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass = None
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            The approach above is ubiquitous in our codebase (especially in {{Task}}), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

            A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:
            {code:java}
            class BaseAlgorithm(ABC):
                ConfigClass: typing.Type[lsst.pex.config.Config]
                """Type of the configuration object for this algorithm.

                Must be set by all derived classes to a subclass of
                lsst.pex.config.Config.
                """{code}
            This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the {{abc}} module an a {{classmethod}}, instead of an attribute:
            {code:java}
            class BaseAlgorithm(ABC):
                @classmethod
                @abstractmethod
                def getConfigClass(cls):
                    """Return the type of the configuration object for this algorithm."""
                    raise NotImplementedError(){code}
            This is my preferred solution to the problem, because:
             * It is *the* built-in approach to defining abstract interfaces in Python.
             * It is immediately obvious to any reader familiar with the {{abc}} module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
             * The {{abc}} module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an {{abstractmethod}} with an implementation that's called with {{super}}).
             * Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface _without_ assuming particular implementations).
             * Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the {{abc}} module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
             * I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
             * No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

            I am _deeply_ unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

            In the code-review debate that inspired this RFC, [~nlust] made a point that I am much more sympathetic to: that the instantiation-time checks done by {{abc}} are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

            In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves _new_ metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract {{classmethod}} approach to be the proposal on the table, but I'm curious what others think.
            Hide
            fred3m Fred Moolekamp added a comment -

            +1 for preferring the "abstractmethod" solution. I also sympathize with Nate's argument and agree that it isn't foolproof, but I think that it is both better than what we have and a good solution that I have seen used effectively in other software.

            Show
            fred3m Fred Moolekamp added a comment - +1 for preferring the "abstractmethod" solution. I also sympathize with Nate's argument and agree that it isn't foolproof, but I think that it is both better than what we have and a good solution that I have seen used effectively in other software.
            tjenness Tim Jenness made changes -
            Remote Link This issue links to "Page (Confluence)" [ 31369 ]
            Hide
            ktl Kian-Tat Lim added a comment -

            How about using @abstractmethod along with @property to get more of the syntax of the current code with the safety of the proposal?

            Show
            ktl Kian-Tat Lim added a comment - How about using @abstractmethod along with @property to get more of the syntax of the current code with the safety of the proposal?
            Hide
            jsick Jonathan Sick added a comment -

            I agree that abstractmethod and property are what I'd expect as a Python programmer; "get" methods aren't common for this application.

            Show
            jsick Jonathan Sick added a comment - I agree that abstractmethod and property are what I'd expect as a Python programmer; "get" methods aren't common for this application.
            Hide
            jbosch Jim Bosch added a comment -

            I should have clarified the property possibility up-front, because this isn't the first time it was brought up: property only works on instances, not types, and I'm specifically concerned here with attributes that are accessible on the type (as well as the instance). For instance attributes, I do think

            @property
            @abstractmethod
            def my_attribute(self):
                raise NotImplementedError()
            

            is a very good pattern.

            There are third-party implementations of a "class property" out there (including one in astropy, IIRC), but they all have some drawbacks, which is why you won't find one in the standard library. I don't know if any of them play nicely with ABC, but I personally don't consider adopting one to be a worthwhile trade for dropping parentheses.

            Show
            jbosch Jim Bosch added a comment - I should have clarified the property possibility up-front, because this isn't the first time it was brought up: property only works on instances , not types , and I'm specifically concerned here with attributes that are accessible on the type (as well as the instance). For instance attributes, I do think @property @abstractmethod def my_attribute(self): raise NotImplementedError() is a very good pattern. There are third-party implementations of a "class property" out there (including one in astropy, IIRC), but they all have some drawbacks, which is why you won't find one in the standard library. I don't know if any of them play nicely with ABC, but I personally don't consider adopting one to be a worthwhile trade for dropping parentheses.
            Hide
            nlust Nate Lust added a comment -

            I personally think this style (abstract class method) to get a simple attribute is:

            • Overly verbose and inefficient
            • distracting and harder to read on what is going on, and ensure you conform to some api
            • Problematic in that is is late error throwing, A problem with defining a class should raise when the class is defined

            I think it is a big hole in python right now that there is no standard lib way of doing this, but I don't think there is any pressure to change it because there ARE ways to get the desirable behavior, and any project is free to use the language features to implement it. What I am proposing can be accomplished with a simple class decorator. Users would do something like:

            from lsst.utils import checkAbstractAttributes, AbstractAttribute
             
            @checkAbstractAttributes
            class Base:
                ATTRIBUTE = AbstractAttribute
             
            #Attempting to do this will cause some sort of exception (Runtime or something else we choose)
            class Test(Base):
                pass

            Moreover if we chose to, we could also have it raise if an `abstractmethod` is present at declaration time if we chose to, thus avoiding errors that might happen in a completely different file.

            Also, if it is desirable we could even make it a standalone script and put into PyPi, if we wanted it to not be locked up in the stack.

            Show
            nlust Nate Lust added a comment - I personally think this style (abstract class method) to get a simple attribute is: Overly verbose and inefficient distracting and harder to read on what is going on, and ensure you conform to some api Problematic in that is is late error throwing, A problem with defining a class should raise when the class is defined I think it is a big hole in python right now that there is no standard lib way of doing this, but I don't think there is any pressure to change it because there ARE ways to get the desirable behavior, and any project is free to use the language features to implement it. What I am proposing can be accomplished with a simple class decorator. Users would do something like: from lsst.utils import checkAbstractAttributes, AbstractAttribute   @checkAbstractAttributes class Base: ATTRIBUTE = AbstractAttribute   #Attempting to do this will cause some sort of exception (Runtime or something else we choose) class Test(Base): pass Moreover if we chose to, we could also have it raise if an `abstractmethod` is present at declaration time if we chose to, thus avoiding errors that might happen in a completely different file. Also, if it is desirable we could even make it a standalone script and put into PyPi, if we wanted it to not be locked up in the stack.
            Hide
            Parejkoj John Parejko added a comment -

            I too was going to suggest an abstract method and a property, but as you say that doesn't work on class attributes. Having a `get` for a single value like this seems odd and unnatural (in python!) to me. Sadly, dataclasses won't work for this either. All of the other approaches I've done that I can think of also involved instance attributes.

            I do agree that adding a "use property+abstractmethod for instance attributes in abstract base classes" is worth adding to the dev guide, independent of the rest of this RFC.

            One advantage to the `get` member style proposed here is that the documentation for that attribute does get included in the output docs, whereas it doesn't for the typical "write a docstring under the attribute declaration" style, which raises this approach a little bit in my view.

            What are the drawbacks of the astropy "class property"? I wasn't aware of it.

            Nate's proposal requires us to write and support the code he describes; if that's more than an hour or two of work, I'm not sure it's worth while. Although, if it is good and we can get it lifted into pip, maybe that's a service. Are there any similar things already broadly available?

            Show
            Parejkoj John Parejko added a comment - I too was going to suggest an abstract method and a property, but as you say that doesn't work on class attributes. Having a `get` for a single value like this seems odd and unnatural (in python!) to me. Sadly, dataclasses won't work for this either. All of the other approaches I've done that I can think of also involved instance attributes. I do agree that adding a "use property+abstractmethod for instance attributes in abstract base classes" is worth adding to the dev guide, independent of the rest of this RFC. One advantage to the `get` member style proposed here is that the documentation for that attribute does get included in the output docs, whereas it doesn't for the typical "write a docstring under the attribute declaration" style, which raises this approach a little bit in my view. What are the drawbacks of the astropy "class property"? I wasn't aware of it. Nate's proposal requires us to write and support the code he describes; if that's more than an hour or two of work, I'm not sure it's worth while. Although, if it is good and we can get it lifted into pip, maybe that's a service. Are there any similar things already broadly available?
            Hide
            nlust Nate Lust added a comment -

            I can do a search to see if there is anything more broadly available, but I already wrote the other code, its about 15 lines, and took 5-10 minutes.

            Show
            nlust Nate Lust added a comment - I can do a search to see if there is anything more broadly available, but I already wrote the other code, its about 15 lines, and took 5-10 minutes.
            Hide
            jbosch Jim Bosch added a comment -

            What are the drawbacks of the astropy "class property"? I wasn't aware of it.

            I can't speak to astropy's implementation specifically; the general problem is that if you define a descriptor that implements dunder-get (not gonna fight your underscore-handling today, Jira!) to return something other than self when it is given the type, rather than an instance, it's really hard to get access to the descriptor object anymore, and that plays havoc with at least docs:

            class ExampleClass:
                @some_class_property
                def my_class_property(cls):
                    return 4
             
            print(ExampleClass.my_class_property.__doc__)  # docs for int!

            I don't recall if it's worse than that, and it's even possible that it's actually better than that, because there might be tools clever enough to do something like 

            ExampleClass.__dict__["my_class_property"].__docs__

            to grab docs or do other things with the descriptor itself; this is all just from my memories of looking around a while back for solutions and finding that the language boxes you in pretty thoroughly, and thinking that it wasn't worth it for just some mild syntactic sugar.

            Which, in the end, is how I feel about metaprogramming solutions to this RFC's more immediate problem as well, especially because the 5-10 minutes it takes to code up the first draft of a bit of metaprogramming is usually a small fraction of the total cost of ownership for such things.

            Show
            jbosch Jim Bosch added a comment - What are the drawbacks of the astropy "class property"? I wasn't aware of it. I can't speak to astropy's implementation specifically; the general problem is that if you define a descriptor that implements dunder-get (not gonna fight your underscore-handling today, Jira!) to return something other than self when it is given the type, rather than an instance, it's really hard to get access to the descriptor object anymore, and that plays havoc with at least docs: class ExampleClass: @some_class_property def my_class_property(cls): return 4   print(ExampleClass.my_class_property.__doc__) # docs for int ! I don't recall if it's worse than that, and it's even possible that it's actually better than that, because there might be tools clever enough to do something like  ExampleClass.__dict__[ "my_class_property" ].__docs__ to grab docs or do other things with the descriptor itself; this is all just from my memories of looking around a while back for solutions and finding that the language boxes you in pretty thoroughly, and thinking that it wasn't worth it for just some mild syntactic sugar. Which, in the end, is how I feel about metaprogramming solutions to this RFC's more immediate problem as well, especially because the 5-10 minutes it takes to code up the first draft of a bit of metaprogramming is usually a small fraction of the total cost of ownership for such things.
            Hide
            ktl Kian-Tat Lim added a comment -

            Isn't raising an exception for a class definition actually a problem because it doesn't allow progressive concretizing in a subclass hierarchy but instead forces only concrete subclasses immediately?

            class Abstract(ABC):
                @abstractmethod
                @classmethod
                def foo(cls):
                    raise NotImplementedError()
             
                @abstractmethod
                def bar(self):
                    raise NotImplementedError()
             
            class LessAbstract(Abstract):
                @classmethod
                def foo(cls):
                    return 1
             
            class Concrete(LessAbstract):
                def bar(self):
                    return 2

            I would argue that instantiation is exactly when a concreteness failure should be recognized – although that could be via static checker and not at runtime.

            Show
            ktl Kian-Tat Lim added a comment - Isn't raising an exception for a class definition actually a problem because it doesn't allow progressive concretizing in a subclass hierarchy but instead forces only concrete subclasses immediately? class Abstract(ABC): @abstractmethod @classmethod def foo(cls): raise NotImplementedError() @abstractmethod def bar(self): raise NotImplementedError()   class LessAbstract(Abstract): @classmethod def foo(cls): return 1   class Concrete(LessAbstract): def bar(self): return 2 I would argue that instantiation is exactly when a concreteness failure should be recognized – although that could be via static checker and not at runtime.
            Hide
            jbosch Jim Bosch added a comment -

            I would argue that instantiation is exactly when a concreteness failure should be recognized

            , and if this wasn't Jira, also :100: and :facepalm:.

            Show
            jbosch Jim Bosch added a comment - I would argue that instantiation is exactly when a concreteness failure should be recognized , and if this wasn't Jira, also :100: and :facepalm:.
            Hide
            nlust Nate Lust added a comment -

            Kian-Tat Lim you are exactly correct. As much as I dislike the style of building up interfaces, I can understand people using it. And as much as I would like import time to be like compile time, its really not. So without some kind of finalize marker, import time just cant offer the kind of protection against this unless we disallow this pattern.

            Not quite on topic, but this has convinced me even more that trying to do relations through abstract type inheritance just does not map well to python in a way that is 'pythonic' and that we should really just lean into mypy, typing, and protocols more (with various runtime checking in addition to import time).

            Show
            nlust Nate Lust added a comment - Kian-Tat Lim you are exactly correct. As much as I dislike the style of building up interfaces, I can understand people using it. And as much as I would like import time to be like compile time, its really not. So without some kind of finalize marker, import time just cant offer the kind of protection against this unless we disallow this pattern. Not quite on topic, but this has convinced me even more that trying to do relations through abstract type inheritance just does not map well to python in a way that is 'pythonic' and that we should really just lean into mypy, typing, and protocols more (with various runtime checking in addition to import time).
            Hide
            jbosch Jim Bosch added a comment -

            A number of people have brought up regular (instance) abstract properties here, and a Slack discussion made me realize that I was probably too quick to dismiss those as not relevant for this RFC: there are occasions where we do need to define an interface on the type, and for those we can't use a regular property and that's what I had intended to focus on.  But we often define an interface in terms of class attributes when we don't need to, because we don't actually need to get the values from a type object.  So that's probably the first thing any dev-guide text that comes from this RFC should say: if you find yourself wanting a derived class to provide an attribute to satisfy some interface, first ask whether the user of that class needs to access it from the type, and if they don't, it should be a regular base class instance attribute, an abstract regular property, or a regular abstract method.

            Show
            jbosch Jim Bosch added a comment - A number of people have brought up regular (instance) abstract properties here, and a Slack discussion made me realize that I was probably too quick to dismiss those as not relevant for this RFC: there are occasions where we do need to define an interface on the type, and for those we can't use a regular property and that's what I had intended to focus on.  But we often define an interface in terms of class attributes when we don't need to , because we don't actually need to get the values from a type object.  So that's probably the first thing any dev-guide text that comes from this RFC should say: if you find yourself wanting a derived class to provide an attribute to satisfy some interface, first ask whether the user of that class needs to access it from the type, and if they don't, it should be a regular base class instance attribute, an abstract regular property, or a regular abstract method.
            tjenness Tim Jenness made changes -
            Remote Link This issue links to "Page (Confluence)" [ 31433 ]
            tjenness Tim Jenness made changes -
            Remote Link This issue links to "Page (Confluence)" [ 31541 ]
            tjenness Tim Jenness made changes -
            Remote Link This issue links to "Page (Confluence)" [ 31655 ]
            jbosch Jim Bosch made changes -
            Link This issue is triggering DM-33054 [ DM-33054 ]
            jbosch Jim Bosch made changes -
            Status Proposed [ 10805 ] Adopted [ 10806 ]
            Hide
            jbosch Jim Bosch added a comment -

            It's been a while since my last summarization post, with no comments since, so I think this is ready to adopt: to take into account previous comments, I'll recommend regular abstract methods in all cases where the user of an interface only works on instances, not subclasses, with classmethods only when needed.

            Show
            jbosch Jim Bosch added a comment - It's been a while since my last summarization post, with no comments since, so I think this is ready to adopt: to take into account previous comments, I'll recommend regular abstract methods in all cases where the user of an interface only works on instances, not subclasses, with classmethods only when needed.
            Hide
            jbosch Jim Bosch added a comment -

            I should add that I'm not going to jump into implementing this immediately, as it's the holidays and I want to give watchers a chance to yell if they disagree; writing an implementation ticket and adopting this was just something I wanted to cross off my to-do list.

            Show
            jbosch Jim Bosch added a comment - I should add that I'm not going to jump into implementing this immediately, as it's the holidays and I want to give watchers a chance to yell if they disagree; writing an implementation ticket and adopting this was just something I wanted to cross off my to-do list.
            Hide
            Parejkoj John Parejko added a comment -

            Please mention in your dev guide update a note like your comment from November: most use cases should just be instance attributes.

            Show
            Parejkoj John Parejko added a comment - Please mention in your dev guide update a note like your comment from November: most use cases should just be instance attributes.
            Hide
            jbosch Jim Bosch added a comment -

            To be clear, I actually think most of these cases are best dealt with via an abstract instance method or property, and a base class that demands its subclasses define an instance attribute is something I would consider an antipattern. (A base class that defines its own instance attribute that isn't expected to be overridden is fine, too, but isn't really a customization hook.)

            Show
            jbosch Jim Bosch added a comment - To be clear, I actually think most of these cases are best dealt with via an abstract instance method or property, and a base class that demands its subclasses define an instance attribute is something I would consider an antipattern. (A base class that defines its own instance attribute that isn't expected to be overridden is fine, too, but isn't really a customization hook.)
            tjenness Tim Jenness made changes -
            Remote Link This issue links to "Page (Confluence)" [ 31899 ]

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Watchers:
              Fred Moolekamp, Jim Bosch, John Parejko, Jonathan Sick, Kian-Tat Lim, Nate Lust, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

                Dates

                Created:
                Updated:
                Planned End:

                  Jenkins

                  No builds found.