Wield Python's super() like a Jedi (rhettinger.wordpress.com)
141 points by raymondh on May 26, 2011 | 63 comments



It seems to me that code that uses super() in these creative ways will be very difficult to maintain. You need to understand the subtleties of the MRO just to find out which method is being called. Pity the poor Python programmer who stumbles into such code without being aware that super() may not call the class's direct base.

I suppose similar criticism can be leveled against many other dynamic techniques, except that this one sacrifices more readability than I'm comfortable with.


super() supports arguments which allow you to specify which class's implementation to look up, which is an explicit way to resolve ambiguities. That said, well-designed multiple inheritance should try to avoid those ambiguities in the first place -- a class shouldn't inherit from multiple bases with conflicting implementations of a method.

I think in a case like the one you describe, where the direct ancestor's implementation is circumvented to access a different ancestor's implementation, it should be done by calling

  super(Ancestor).someMethod()
That should improve understanding.


I disagree; that's extremely fragile. It might work for the particular class you are implementing. However, it might cause subclasses of that class to break, since those subclasses will have a different MRO than the current class, and you could end up skipping more of the MRO than you intended.


Personally I'm not in favor of enforcing such strict conventions, especially in a dynamic language like Python. Also isn't this pattern he repeats throughout:

    class Shape:
        def __init__(self, **kwds):
            self.shapename = kwds.pop('shapename')
            super().__init__(**kwds)
Better written like this?

    class Shape:
        def __init__(self, shapename=None, **kwds):
            self.shapename = shapename
            super().__init__(**kwds)
It seems like there are some things about Python which the author doesn't like, and he solves them by adding a lot of boilerplate to each of his classes. It's probably a better idea to just embrace the Python way or use a different language.


These two aren't equivalent: the first example requires the shapename keyword argument and raises a KeyError when it isn't specified.
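A minimal sketch of that difference. The class names `StrictShape` and `LenientShape` are hypothetical labels for the two versions quoted above, not names from the article:

```python
class StrictShape:
    def __init__(self, **kwds):
        # pop() without a default raises KeyError when the key is absent
        self.shapename = kwds.pop('shapename')
        super().__init__(**kwds)


class LenientShape:
    def __init__(self, shapename=None, **kwds):
        # A missing shapename silently becomes None
        self.shapename = shapename
        super().__init__(**kwds)


try:
    StrictShape()
    strict_error = None
except KeyError as exc:
    strict_error = exc

lenient = LenientShape()
print(repr(strict_error), lenient.shapename)
```

Passing a default to pop(), as in `kwds.pop('shapename', None)`, would make the first version behave like the second.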


That's true. The second one might be better written as:

  def __init__(self, shapename, **kwds):
Or, in Python 3:

  def __init__(self, *, shapename, **kwds):
(making shapename a required keyword-only argument)
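As a rough illustration of the keyword-only form in Python 3 (the helper `try_call` is a hypothetical name added only for this sketch), omitting shapename or passing it positionally both fail with TypeError:

```python
class Shape:
    def __init__(self, *, shapename, **kwds):
        # shapename is required and can only be passed by keyword
        self.shapename = shapename
        super().__init__(**kwds)


def try_call(*args, **kwds):
    """Return the exception type raised by Shape(...), or None on success."""
    try:
        Shape(*args, **kwds)
    except TypeError as exc:
        return type(exc)
    return None


missing = try_call()                # shapename omitted
positional = try_call('circle')     # shapename passed positionally
ok = try_call(shapename='circle')   # accepted
```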


  def __init__(self, shapename, **kwds):
is not better written; it allows shapename to be passed positionally which makes cooperative inheritance fragile.


The version that I was rewriting:

  def __init__(self, shapename=None, **kwds):
also allows that.


Haha, the author is one of the main contributors to Python, not some high-schooler.

  kwds.pop('shapename', None) 
Will get you what you're looking for.


The issue with your second version is

  In [1]: def init(shapename=None, **kwargs):
     ...:     pass
     ...: 

  In [2]: init(shapename='circle', **{'shapename': 'circle'})
  ---------------------------------------------------------------------------
  TypeError                                 Traceback (most recent call last)
  
  /home/teh/Desktop/<ipython console> in <module>()

  TypeError: init() got multiple values for keyword argument 'shapename'


That's entirely the caller's fault, not the function's. Compare:

  >>> def init(**kwargs):
  ...   pass
  ...
  >>> init(shapename='circle', **{'shapename': 'circle'})
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: init() got multiple values for keyword argument 'shapename'


It seems to me that the example of combining built-in dictionary classes is naively optimistic. For starters, OrderedDict, as it happens, does not use super! It calls the dict super-class methods directly. Since dict happens to be the next class in the MRO, this doesn't really matter for the purpose of this example, but I can envision a scenario where some plucky programmer inherits from both OrderedCounter and some other dict subclass, and the result doesn't work because OrderedDict does the wrong thing.

And OrderedDict isn't the only one. Maybe for some reason I would like to have an OrderedCounter where all the counts default to 42. So I do this:

  class DefaultOrderedCounter(defaultdict, OrderedCounter):
      pass
  doc = DefaultOrderedCounter(lambda: 42)
  doc.update('abracadabra')
Which results in:

  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "c:\python32\lib\collections.py", line 507, in update
      _count_elements(self, iterable)
    File "c:\python32\lib\collections.py", line 63, in __setitem__
      self.__map[key] = link = Link()
  AttributeError: 'DefaultOrderedCounter' object has no attribute '_OrderedDict__map'
Whoops! Apparently defaultdict doesn't use super either. Of course a better way to do this would be to subclass DefaultOrderedCounter and just override the __missing__ method by hand, but that's not the point.

The article goes into "How to Incorporate a Non-cooperative Class", which basically says "wrap it up in a proxy class". But that's not really going to work here, since the result would be two separate dicts, with the defaultdictwrapper methods operating on one dict, and the other methods operating on the other.


Okay, I give. What's the markup for posting code snippets on this site?


Put two spaces at the beginning of a line for that line to be displayed monospaced.

  Example


why would you have to subclass DefaultOrderedCounter again in order to add in the missing method?

Though I agree there's something annoying going on there with the multiple inheritance.


I mistyped. I meant "subclass OrderedCounter to add in the __missing__ method".
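A sketch of what that could look like, assuming the article's `OrderedCounter` is defined as `Counter` combined with `OrderedDict`. Only lookups of keys that are not stored are affected here; counting during construction still starts from zero:

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""


class DefaultOrderedCounter(OrderedCounter):
    def __missing__(self, key):
        # Hypothetical default: report 42 for keys that are not stored
        return 42


doc = DefaultOrderedCounter('abracadabra')
print(list(doc), doc['a'], doc['z'])
```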


I didn't see it linked in the article nor mentioned here already but several of the problems with `super` have been discussed in "Python's Super is nifty, but you can't use it"[1] for quite some time.

[1]: http://fuhm.net/super-harmful/


That article is about Python 2's super, which is a different beast. Lessons were learnt in the design of Python 3's version.


Not really. The only thing that changed about super in Python 3 is that you can call it with no arguments, which is just a short-hand for the way it was already used in Python 2. Everything in that article applies equally to both Python versions.
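To make the shorthand concrete, here is a small sketch with hypothetical class names. Both spellings resolve to the same next method; only the second is limited to Python 3:

```python
class Base(object):
    def greet(self):
        return 'Base'


class ExplicitChild(Base):
    def greet(self):
        # Python 2 spelling, still valid in Python 3
        return 'Explicit -> ' + super(ExplicitChild, self).greet()


class ImplicitChild(Base):
    def greet(self):
        # Python 3 shorthand: the class and instance are filled in for you
        return 'Implicit -> ' + super().greet()


explicit_result = ExplicitChild().greet()
implicit_result = ImplicitChild().greet()
```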


Nice article, but one thing struck me when reading the code examples posted in the article: What's the use of using kwds in __init__?

    class Shape(Root):
        def __init__(self, **kwds):
            self.shapename = kwds.pop('shapename')
            super().__init__(**kwds)

    class ColoredShape(Shape):
        def __init__(self, **kwds):
            self.color = kwds.pop('color')
            super().__init__(**kwds)
I would personally just have written this somewhat similar to this:

    class Shape(Root):
        def __init__(self, shapename):
            self.shapename = shapename
            super().__init__()

    class ColoredShape(Shape):
        def __init__(self, shapename, color):
            self.shapename = shapename
            self.color = color
            super().__init__(shapename)
But then again I'm by no means what anyone could call a Python 'Jedi'. Can anyone explain to me why using kwds in this way would be better? Would it not lead to having to look up the declaration of the class for the exact names of arguments to pop() every time?


The reason for using keywords is that there is no guarantee that the next class in the MRO after ColoredShape will be Shape. When the instance is a ColoredShape, it will be, but if ColoredShape is subclassed with a more complex inheritance structure, then something else could end up in between them in the MRO.

Thus, ColoredShape.__init__ can't simply pass on shapename to the super method, because that might not be the argument expected by the next __init__ method in the MRO. Instead, each method needs to gracefully accept arbitrary arguments and then pass on whatever it received (minus what it consumed).
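A sketch of that situation. `Bordered` and `BorderedColoredShape` are hypothetical classes added for illustration, and `Root` is left out for brevity. In the combined class, the next `__init__` after ColoredShape is Bordered, not Shape, so forwarding the remaining keywords is what keeps the chain working:

```python
class Shape:
    def __init__(self, **kwds):
        self.shapename = kwds.pop('shapename')
        super().__init__(**kwds)


class ColoredShape(Shape):
    def __init__(self, **kwds):
        self.color = kwds.pop('color')
        # Next in line depends on the instance's MRO, not on this class alone
        super().__init__(**kwds)


class Bordered(Shape):
    def __init__(self, **kwds):
        self.border = kwds.pop('border')
        super().__init__(**kwds)


class BorderedColoredShape(ColoredShape, Bordered):
    pass


mro_names = [cls.__name__ for cls in BorderedColoredShape.__mro__]
obj = BorderedColoredShape(shapename='circle', color='red', border=2)
print(mro_names)
```

Each `__init__` consumes its own keyword and passes the rest along, so by the time the call reaches `object.__init__` nothing is left over.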


Am I missing some points here?

It seems that the examples are tightly coupled and favor inheritance over composition.


A true Jedi favors composition.


Good advice and well spoken :-) That being said, composing a number of object classes also involves some measure of cooperation, caller/callee ordination, and some plan for calling the methods in the correct order.

IOW, if the decomposition of the classes is the same, then the solutions using composition or using multiple inheritance will have much in common.

Things should be as simple as possible, but no simpler :-)


One of the reasons I'm starting to love Go more than Python.

In Go something like super() is superfluous.


The only reason you don't need super in Go is that you cannot embed multiple interfaces unless their sets of methods are disjoint. This eliminates super's job of allowing programmers to differentiate between base classes with conflicting method implementations.

Without this, Go requires you to design your class hierarchy in a way that is cleaner, but the super concept is still not superfluous -- because you do sometimes need to reference an ancestor's implementation in Go.

You embed an interface and then explicitly reference the embedded interface by name. The Go interface

  type ReaderWriter interface {
    Reader
    Writer
  }
is exactly like this Python

  class ReaderWriter(Reader, Writer):
    pass
To get at a particular superclass implementation, you pass the class to super, like so:

  def read_and_write(self):
    super(Reader).read()
    super(Writer).write()
which in Go is:

  func (rw *ReaderWriter) ReadAndWrite() {
    rw.Reader.read()
    rw.Writer.write()
  }

So it's not superfluous -- there's an equivalent syntax to super which involves referencing embedded interfaces.


Sigh. No, that is not at all how super works in Python. The equivalent code to that Go function would be:

  def read_and_write(self):
      Reader.read(self)
      Writer.write(self)
That is, if you want to get at a particular superclass implementation, then you just call it directly. super() is used when you're not calling a particular superclass, instead relying on Python to compute the correct "next method" for you.
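A runnable sketch of the direct-call form, with minimal stand-in `Reader` and `Writer` classes invented for this example. It also checks what happens when the one-argument `super(Reader)` form is used for dispatch:

```python
class Reader:
    def read(self):
        return 'read'


class Writer:
    def write(self):
        return 'write'


class ReaderWriter(Reader, Writer):
    def read_and_write(self):
        # Call specific base-class implementations directly
        return Reader.read(self), Writer.write(self)


result = ReaderWriter().read_and_write()

try:
    # One-argument super() is unbound, so it cannot dispatch to read()
    super(Reader).read()
    unbound_error = None
except AttributeError as exc:
    unbound_error = exc
```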

"super(Reader).read()", if you actually try it, will just result in an AttributeError. This is because super(Reader) returns an unbound super object, not a bound super object that you can actually dispatch from. Why this advanced usage might be useful is beyond the scope of this thread, but if you're curious then just google for python unbound super and hit the Feeling Lucky button.


Yeah you're right, I was mistaken. super uses the MRO of the first argument, or inspects the stack if none is given.

BTW your sighing is not necessary.


I apologize. It was late, and my supply of patience for the day was exhausted.


The "kwds" technique is actually something I came up with independently to solve a different problem when I was using PyGame: ridiculously long argument lists when constructing sprites. In this case it significantly increased the flexibility and brevity of the code, although perhaps with a more complicated inheritance tree it wouldn't be worth it.

Nice to see this pattern being acknowledged somewhere else; I thought I was the only one that did that.


It's a great technique. I'm sure your PyGame code benefitted in a big way :-)


This is a particularly unnecessary title change.


how does the runtime performance of super() compare to the hardcoding method? (from your example, using dict.__setitem__ as opposed to super().__setitem__ )


In CPython, calling super() has about the same cost as calling any other builtin like len() or int(). The MRO itself is precomputed and stored in inst.__class__.__mro__.

Contrasting dict.__setitem__ with super().__setitem__, the latter adds one function call. In addition, both forms require a builtin lookup and an attribute lookup.

In PyPy, much of the overhead of builtin lookups, function calls, and attribute lookups is automatically optimized away.
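For anyone who wants to measure this on their own interpreter, a rough timing sketch follows. The class names and the helper `time_setitem` are hypothetical, and the numbers will vary by machine and Python version, so no expected output is given:

```python
import timeit


class DirectDict(dict):
    def __setitem__(self, key, value):
        # Hard-coded parent call
        dict.__setitem__(self, key, value)


class SuperDict(dict):
    def __setitem__(self, key, value):
        # Cooperative call through super()
        super().__setitem__(key, value)


def time_setitem(cls, number=100_000):
    """Time repeated item assignment on an instance of cls."""
    d = cls()
    return timeit.timeit(lambda: d.__setitem__('k', 1), number=number)


direct_seconds = time_setitem(DirectDict)
super_seconds = time_setitem(SuperDict)
print(f'direct: {direct_seconds:.4f}s  super(): {super_seconds:.4f}s')
```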


Unfortunately, on PyPy super is less efficient than referring directly to the parent class's method; however, it's fixable (and will be fixed).


Do you have some sort of timing measure?


No, I haven't timed it myself, I just know how it's implemented and what code we're generating internally.

Edit: To be clear I would not advocate not using super() on account of this, as I said it'll eventually be fixed.


I don't think it suffers much as the method resolution order (mro) is computed ahead-of-time when creating the class.


so super() is computed ahead of time? That doesn't make sense when you look at his comment:

"The calculation depends on both the class where super is called and on the instance’s tree of ancestors."

I agree that the class's MRO can be computed ahead of time, but theoretically the instance's order could be different (if a class method changed the definition of __setitem__)


The method resolution order for instances of the same class is always the same for all methods. If a class doesn't implement the method, the interpreter just skips it and goes to the next place in the call chain. All super does is call the next method in the chain. So if you have class A(object), class B(A), class C(A), and class D(B, C), the MRO of D's instances will be D, B, C, A, object. Each time an instance of D calls super, it will call the next method in the chain, not necessarily B's or C's or A's, even if the call to super is in B's method. This is why super() receives the class as a parameter, so it can know from where in the object's MRO to start looking up the next method.
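The diamond described above can be checked directly. In this sketch the method name `who` is made up for illustration; note that the MRO starts with D itself, and that B's super call reaches C when the instance is a D but A when the instance is a plain B:

```python
class A(object):
    def who(self):
        return ['A']


class B(A):
    def who(self):
        # The class argument tells super where in the MRO to continue from
        return ['B'] + super(B, self).who()


class C(A):
    def who(self):
        return ['C'] + super(C, self).who()


class D(B, C):
    def who(self):
        return ['D'] + super(D, self).who()


mro_names = [cls.__name__ for cls in D.__mro__]
chain_from_d = D().who()
chain_from_b = B().who()
print(mro_names, chain_from_d, chain_from_b)
```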


"This is why super() receives the class as a parameter" <-- this is not necessary in python 3. Your argument is valid in python2, but python3 allows a super() call without a type argument.

From the docs:

"super(type[, object-or-type])" <-- python2 doc, type is required

"super([type[, object-or-type]])" <-- type is optional in python3


It's a super() good overview of super(). Nice to see a post that uses a real example.


tl;dr

Just looks like super/parent in other languages. Did I miss something from not reading this?


> Did I miss something from not reading this?

Everything? Mostly the part where, Python being an MI language, a given class A has multiple parents. And these parents are a sequence, not a set, which define a Method Resolution Order, and that method resolution order usually isn't breadth-first either, so Python's super is a graph walker. And because it's explicit, any node in the graph can stop the walking.

This means permuting two parent classes can have significant impact on the semantics of the graph walking, therefore the results of calling super.
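A small sketch of both points, using hypothetical classes: `Quiet` does not call super, so it stops the walk, and swapping the order of the two parents changes the result:

```python
class Base:
    def describe(self):
        return []


class Loud(Base):
    def describe(self):
        return ['loud'] + super().describe()


class Quiet(Base):
    def describe(self):
        # No super() call here, so the walk along the MRO stops
        return ['quiet']


class LoudFirst(Loud, Quiet):
    pass


class QuietFirst(Quiet, Loud):
    pass


loud_first = LoudFirst().describe()
quiet_first = QuietFirst().describe()
print(loud_first, quiet_first)
```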


Thanks for the short explanation. I totally forgot about Python being MI.


Some small steps in the direction of a meta-object protocol: more fine-grained control over call dispatch in a multiply-inheriting class hierarchy.


You didn't miss as much as others seem to think. Nothing here changes the fact that it subverts the object model and overuse causes kudzu.


Yeah, OO is too hard.


Yes, Python's super is more like Dylan's next-method than, say, C#'s base.


Yes.


Ah, the "call super" antipattern revived.


How's it an antipattern?


Have to deal with enough languages as it is. Don't want to further add to my brain mess by learning Python 3 syntax just yet.


super() was introduced in Python 2.2 many years ago. Everything in the article works with very old versions of Python.

Here's a link to the Python 2 version of the examples: http://code.activestate.com/recipes/577721-how-to-use-super-...


super() was introduced in CPython 2.2, but it changed in Python3: http://www.python.org/dev/peps/pep-3135/. The link you posted uses super as "super(cls, instance)", but the main article uses Python 3's super, which can just be called as "super()", and it figures out the class and instance.

So no, super really has changed, and the syntax in the article does not work with very old versions of Python.


> So no, super really has changed, and the syntax in the article does not work with very old versions of Python.

Oh come on, he's saying if you do the incredibly minor syntax adjustment that all of the actual meat of the article still works in Python 2.


I upvoted the parent because that syntax distinction is actually fairly giant: always having to name the class, and changing it on renames etc., is a PITA.

It is actually one of the things that annoys me most about Python; I guess this is the first thing that actually tempts me with Python 3. Damn.


Python 2.x doesn't have parameterless super() so apparently most of the article's code doesn't work with old versions.


Well, you just change to the Python 2 super call form and it works fine. The substitution hardly requires a master programmer...


super(myclass, self) was. Like I posted below, sometimes I have real trouble keeping the 5 or so languages I use straight. The other day I actually forgot what 'None' is in python because I hadn't used it in a few days, and None is different than in the others.


Python 3 was introduced two and a half years ago. Its major syntactic changes, while frustrating if you're set in your ways, are hardly insurmountable.


This. Most of the time I run into an issue with a change I wasn't aware of, a quick search through the 3.X docs just points me in the right direction.


I think it's fine if you're doing python all day long, but I'm using Python, Java, PHP, Obj-C and Javascript each at least once a month on a variety of paid and personal projects. Like I said, I have enough trouble keeping those syntaxes straight.



