# A standard pattern for class attributes as interfaces

XMLWordPrintable

#### Details

• Type: RFC
• Resolution: Unresolved
• Component/s:
• Labels:
None

#### Description

It is a relatively common problem in Python to define a base-class interface that demands that all derived classes provide a class attribute with a particular name, e.g.:

 class BaseAlgorithm(ABC):  ConfigClass = None  """Type of the configuration object for this algorithm.    Must be set by all derived classes to a subclass of  lsst.pex.config.Config.  """

The approach above is ubiquitous in our codebase (especially in Task), but I regard it as a serious antipattern, because it is all too easy for derived classes to ignore this instruction, especially if the base class initializes the attribute to a value of the appropriate type as a sort of attempt at documentation (rather than a viable default), yielding silently incorrect behavior (I have seen this in the wild, in our codebase, with the problem going unnoticed for months).

A slightly better approach is to leave the attribute undefined in the base class, which can be done by using a type annotation instead:

 class BaseAlgorithm(ABC):  ConfigClass: typing.Type[lsst.pex.config.Config]  """Type of the configuration object for this algorithm.    Must be set by all derived classes to a subclass of  lsst.pex.config.Config.  """

This will cause any attempt to access the attribute to fail, as it should, and the type annotation clarifies the type on its own.  But it'd be nice if we could get a failure earlier.  We can move it up to the point that the derived class is constructed by using the abc module an a classmethod, instead of an attribute:

 class BaseAlgorithm(ABC):  @classmethod  @abstractmethod  def getConfigClass(cls):  """Return the type of the configuration object for this algorithm."""  raise NotImplementedError()

This is my preferred solution to the problem, because:

• It is the built-in approach to defining abstract interfaces in Python.
• It is immediately obvious to any reader familiar with the abc module (and all of our developers should be, I think) what is expected of any derived class, without any documentation.  (I am not by any means suggesting that documentation could or should be skipped in this case - but it is all too easy for the author of a class to unknowingly provide incomplete or vague documentation, especially for derived-class implementers, and in-code declaration of an abstract method is a strong guard against that.)
• The abc module provides features for edge cases like, "I want to provide a default, but want to require derived classes to opt-in" (i.e. an abstractmethod with an implementation that's called with super).
• Being able to change the implementation to calculate something rather than return a constant is more often a feature than a bug (especially because we are talking about defining an interface without assuming particular implementations).
• Static analysis tools (at least MyPy; maybe some linters, too?) are aware of the abc module and will catch cases where such a method has been forgotten but code attempts to construct the derived class instance anyway.
• I believe an instantiation-time check (when static checking is possible) is enough to catch violations early enough, in practice.
• No development or maintenance of a custom solution (which would have to involve some amount of metaprogramming) is needed.

I am deeply unsympathetic to the argument that we can't tolerate an extra pair of parenthesis to make this a method call.

In the code-review debate that inspired this RFC, Nate Lust made a point that I am much more sympathetic to: that the instantiation-time checks done by abc are both later than we'd like and incomplete (they just check that a derived class attribute exists, not the degree to which it is compatible with the base class).  I don't think this is enough of a problem that it merits ignoring the built-in solution in favor of providing our own, and I think I'd also vote for the undefined-but-type-annotated approach over any custom metaprogramming I can think of (but we'll see if we get anything more concrete).

In any case, I think we do need to write something down in the dev guide (and provide utility code, if that's what we decide to recommend), because this is a common problem, we've got (IMO) a bad solution out in the wild and likely to be cargo-culted, and we shouldn't have developers inventing their own solutions every time they come across it (especially if it involves new metaprogramming each time).  Since I wrote this RFC, I'm considering the abstract classmethod approach to be the proposal on the table, but I'm curious what others think.

#### Activity

Hide
Jim Bosch added a comment -

A number of people have brought up regular (instance) abstract properties here, and a Slack discussion made me realize that I was probably too quick to dismiss those as not relevant for this RFC: there are occasions where we do need to define an interface on the type, and for those we can't use a regular property and that's what I had intended to focus on.  But we often define an interface in terms of class attributes when we don't need to, because we don't actually need to get the values from a type object.  So that's probably the first thing any dev-guide text that comes from this RFC should say: if you find yourself wanting a derived class to provide an attribute to satisfy some interface, first ask whether the user of that class needs to access it from the type, and if they don't, it should be a regular base class instance attribute, an abstract regular property, or a regular abstract method.

Show
Jim Bosch added a comment - A number of people have brought up regular (instance) abstract properties here, and a Slack discussion made me realize that I was probably too quick to dismiss those as not relevant for this RFC: there are occasions where we do need to define an interface on the type, and for those we can't use a regular property and that's what I had intended to focus on.  But we often define an interface in terms of class attributes when we don't need to , because we don't actually need to get the values from a type object.  So that's probably the first thing any dev-guide text that comes from this RFC should say: if you find yourself wanting a derived class to provide an attribute to satisfy some interface, first ask whether the user of that class needs to access it from the type, and if they don't, it should be a regular base class instance attribute, an abstract regular property, or a regular abstract method.
Hide
Jim Bosch added a comment -

It's been a while since my last summarization post, with no comments since, so I think this is ready to adopt: to take into account previous comments, I'll recommend regular abstract methods in all cases where the user of an interface only works on instances, not subclasses, with classmethods only when needed.

Show
Jim Bosch added a comment - It's been a while since my last summarization post, with no comments since, so I think this is ready to adopt: to take into account previous comments, I'll recommend regular abstract methods in all cases where the user of an interface only works on instances, not subclasses, with classmethods only when needed.
Hide
Jim Bosch added a comment -

I should add that I'm not going to jump into implementing this immediately, as it's the holidays and I want to give watchers a chance to yell if they disagree; writing an implementation ticket and adopting this was just something I wanted to cross off my to-do list.

Show
Jim Bosch added a comment - I should add that I'm not going to jump into implementing this immediately, as it's the holidays and I want to give watchers a chance to yell if they disagree; writing an implementation ticket and adopting this was just something I wanted to cross off my to-do list.
Hide
John Parejko added a comment -

Please mention in your dev guide update a note like your comment from November: most use cases should just be instance attributes.

Show
John Parejko added a comment - Please mention in your dev guide update a note like your comment from November: most use cases should just be instance attributes.
Hide
Jim Bosch added a comment -

To be clear, I actually think most of these cases are best dealt with via an abstract instance method or property, and a base class that demands its subclasses define an instance attribute is something I would consider an antipattern. (A base class that defines its own instance attribute that isn't expected to be overridden is fine, too, but isn't really a customization hook.)

Show
Jim Bosch added a comment - To be clear, I actually think most of these cases are best dealt with via an abstract instance method or property, and a base class that demands its subclasses define an instance attribute is something I would consider an antipattern. (A base class that defines its own instance attribute that isn't expected to be overridden is fine, too, but isn't really a customization hook.)

#### People

Assignee:
Jim Bosch
Reporter:
Jim Bosch
Watchers:
Fred Moolekamp, Jim Bosch, John Parejko, Jonathan Sick, Kian-Tat Lim, Nate Lust, Tim Jenness