Summary
I have argued many times that multiple inheritance is bad. Is it possible to set it straight without losing too much expressive power? My strait module is a proof of concept that it is indeed possible. Read and wonder ...

As you may know, for some time I have been running a campaign against
multiple inheritance and mixins - you may want to read the
conclusion of the third article of the series
Things to Know About Super and
the first and the second article of the series
Mixins considered harmful.
Following that route, last week I decided to
release a module I wrote this summer, the strait module, which
implements traits for Python.

The implementation is inspired by the 2003
paper Traits - Composable Units of Behavior. Traits are
simple since they cannot have common methods and the method
resolution order is trivial. There is an implementation of the concept
in the Smalltalk implementation Squeak.

The strait module was written as a design exercise, to prove a few points:

that you can replace multiple inheritance with a less powerful but also
simpler and less dangerous mechanism, without losing much
expressive power;

that a language such as Python is powerful enough that you can implement
traits in 100 lines by using single inheritance only;

that you can keep a kind of method cooperation even using traits,
but in a simpler way than using multiple inheritance, basically
by setting straight the original hierarchy.

The documentation of the strait module is intended for language designers,
framework writers and advanced Python programmers. It actually was
written for the guys of the python-dev list, as a companion to
a thread about my articles on super. It is not intended for
the average Joe programmer, and it is somewhat technical, focusing
on the details of the Python implementation. On the other hand,
knowing that alternatives to multiple inheritance and mixins exist
in my opinion is good for everybody.

Thus, I have decided to supplement the documentation of the strait
module with a few notes explaining what traits are, the differences
with multiple inheritance and mixins and what we mean by method
cooperation. The notes here are intended for any programmer with
experience in OOP, and they are not Python specific at all.

Multiple inheritance, mixins and traits are usually considered
advanced techniques of object oriented programming, since the most
popular languages (Java, C#, VisualBasic, PHP) do not support them, or
support them in a poor way (C++). On the other hand, those techniques
are pretty common in the coolest languages out there, such as Python
(featuring multiple inheritance), Ruby (featuring mixins) and Scala
(featuring "traits"). I am quoting the term "traits" when referring to
Scala, since Scala traits are more similar to Python mixins
than to Squeak traits. Actually, Scala traits can be composed when
they override the same method and the order of the composition
determines the resulting pattern of super calls: that means that Scala
traits have basically all the complications of Python mixins, which
I would rather avoid.

Multiple inheritance is the most general technique among the three
cited before: mixins can be seen as a restricted form of multiple
inheritance and traits as a restricted form of mixins.
Multiple inheritance is available in various languages, such as
C++, Common Lisp, Python, Eiffel, and others.
In a multiple inheritance language, a class can have more than one parent
and thus can inherit methods and attributes from more sources at
the same time. Maintaining code that takes advantage of
(multiple) inheritance is nontrivial, since in order to understand how
a class works, one needs to study all of its parents (and the parents
of the parents, recursively).

That means that there is a strong coupling of the code:
changing any method in any
ancestor has an effect on the class. To some extent this
is inevitable, since the other face of code reuse is code coupling
(you cannot have one without the other) and one has to cope with that.
Also, you have the same problem
even with single inheritance, when you have a deep hierarchy.
However, multiple inheritance adds another level of complication.

For instance, the order of the parents
is significant: a class C1 inheriting from P1 and P2 does
not necessarily behave the same as a class C2 inheriting from P2 and
P1 where the order of the parents is inverted.
The reason is that for common methods, i.e. methods with the same name,
the methods of P1 have the precedence over the
methods of P2 for the class C1(P1,P2), but not
for the class C2(P2,P1).
Since the common methods are silently overridden and programmers are not
very good at remembering the ordering, that may give rise to subtle bugs.
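
To make the point concrete, here is a minimal Python sketch. The class names follow the text; the hello method is made up for illustration.

```python
class P1:
    def hello(self):
        return "hello from P1"


class P2:
    def hello(self):
        return "hello from P2"


class C1(P1, P2):  # P1 is listed first, so P1.hello silently wins
    pass


class C2(P2, P1):  # same parents, opposite order: P2.hello silently wins
    pass


print(C1().hello())  # hello from P1
print(C2().hello())  # hello from P2
```

Nothing warns you that one of the two hello methods has been shadowed: the only clue is the order of the names in the class statement.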

The situation is worse if one looks at the higher order ancestors:
the order of overriding (the so called MRO, Method
Resolution Order) is definitely non trivial:
I actually wrote a long essay on the subject, describing
the Python MRO and I refer to that reference for the details.
While that reference is Python specific, the concept of method
resolution (also called linearization in the Lisp world) is
general and applies to many languages, including Dylan and Common
Lisp.
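
As a small illustration of how the linearization depends on the whole hierarchy, here is a sketch in Python with a diamond of made-up classes A, B, C, D and E. The helper mro_names is introduced here only for display purposes.

```python
class A:
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


class E(C, B):
    pass


def mro_names(cls):
    """Return the names of the classes in the MRO of cls, in lookup order."""
    return [klass.__name__ for klass in cls.__mro__]


print(mro_names(D))  # ['D', 'B', 'C', 'A', 'object']
print(mro_names(E))  # ['E', 'C', 'B', 'A', 'object']

# D wants B before C, E wants C before B: no consistent order exists,
# so Python refuses to create a class inheriting from both.
conflict_rejected = False
try:
    class F(D, E):
        pass
except TypeError:
    conflict_rejected = True

print(conflict_rejected)  # True
```

Even in this tiny example you have to look at the grandparents to predict which method is going to be called.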

If you want to know more about the linearizations
of Dylan and Common Lisp, you should look at this paper. On the other
hand, if you are a reader of my Scheme series or a Scheme practitioner,
I suggest you read the
paper Scheme with Classes, Mixins, and Traits, which describes
the object system of PLT Scheme, which supports both cooperative
mixins and traits in the Squeak sense.

Scala does not support full multiple inheritance, but its traits
are nearly as powerful (the only difference between a trait and a
regular class is that the trait does not define a constructor)
and nearly as complicated, therefore I would put Scala in
the same class as Python and Common Lisp. If you
want to look at how Scala works, you may look at the Scala Overview
paper. Theoretically, the Python MRO is the best one,
since it is monotonic, but in practice all MROs are quite complicated.

The point to notice is that the complication of the MRO is by design:
languages with a non-trivial MRO were designed this way
to make method cooperation via super calls possible. That means
that if both parents P1 and P2 define a method m,
a child class C can override it and still have access to
the m methods of the parents via super: C.m will
call first P1.m and then P2.m, if P1.m features
a super call itself.
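
Here is how that looks in Python, as a minimal sketch. The class Base is my own addition: it is an assumption needed to give the chain of super calls a place to stop, and the methods return lists only to make the call order visible.

```python
class Base:
    def m(self):
        return ["Base.m"]  # end of the chain: no super call here


class P1(Base):
    def m(self):
        return ["P1.m"] + super().m()


class P2(Base):
    def m(self):
        return ["P2.m"] + super().m()


class C(P1, P2):
    def m(self):
        return ["C.m"] + super().m()


print(C().m())   # ['C.m', 'P1.m', 'P2.m', 'Base.m']
print(P1().m())  # ['P1.m', 'Base.m']
```

Notice that the super call inside P1.m dispatches to P2.m for an instance of C and to Base.m for an instance of P1: the target is decided by the MRO of the instance, not by the class where the call is written.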

Of course, this
is just one possible design: different languages may adopt different
designs. For instance the Eiffel language implements multiple
inheritance, but it reports an error when two methods with
the same name are present: the programmer is forced to specify an
explicit renaming (this is basically what happens for traits).

Years ago, I thought such a design to be simplistic (even stupid) and
very much inferior to the Python cooperative design: nowadays I have
had more experience with real life large object oriented systems using
multiple inheritance and I have come to appreciate "stupid"
designs. Actually, nowadays I think Smalltalk made the right choice
thirty years ago, deciding to support neither multiple inheritance nor
mixins.

In practice, the overriding problem is not very frequent (it is serious
when it happens, but it rarely happens) since usually frameworks are
designed to mix independent sets of functionality. Usually one does
not need the full power of multiple inheritance: mixins or traits are
powerful enough to implement most frameworks.

In a language with multiple inheritance it is natural
to implement mixins as classes.
However, this is not the only solution. In general, we can speak
of mixin programming in any language where it is possible to inject
methods in the namespace of a class, either statically before class creation
or dynamically after class creation.

For instance,
Ruby does not support multiple inheritance, but it does support mixins
since it is possible to include methods coming from a module:

class C_with_mixin < C
  include M # M is a module
end

There is an advantage in this approach: modules have no parents and
there is no concept of method resolution order, so it is much easier
to figure out what a mixin does, compared to figuring out what
a mixin implemented as a class in a multiple inheritance hierarchy
does. On the other hand, there is no method cooperation in the sense
of Python or Scala super or CLOS call-next-method.
There is a limited
cooperation between parent and children only, since Ruby super
(like Java super) is able to dispatch to the parent
class only.
This is not
necessarily a bad thing, though.

Ruby mixins are much simpler than Scala traits or Python mixins,
but they still suffer from the ordering problem: mixing the module M1 and the
module M2 is different from mixing the module M2
and the module M1: if the modules contain methods with
the same name, changing the composition order affects the
resulting class.

Traits were invented just to solve this problem: common methods
raise an error unless the programmer specifies the precedence
explicitly, or she renames the methods. After that, traits commute.
Traits are therefore the most explicit and safest technique,
whereas multiple inheritance is the most fragile technique,
with mixins in between.
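
The idea can be sketched in a few lines of Python. This is not the code of the strait module: here a trait is assumed to be just a dictionary of functions, and the names compose and rename are hypothetical helpers invented for the example.

```python
def compose(*traits):
    """Merge dictionaries of methods, refusing any silently overridden name."""
    merged = {}
    for trait in traits:
        common = merged.keys() & trait.keys()
        if common:
            raise TypeError("conflicting methods: %s" % sorted(common))
        merged.update(trait)
    return merged


def rename(trait, old, new):
    """Return a copy of trait where the method called old is exposed as new."""
    renamed = dict(trait)
    renamed[new] = renamed.pop(old)
    return renamed


T1 = {"save": lambda self: "T1.save", "load": lambda self: "T1.load"}
T2 = {"save": lambda self: "T2.save"}

try:
    compose(T1, T2)
except TypeError as exc:
    print(exc)  # conflicting methods: ['save']

# once the conflict is resolved explicitly, the order does not matter
T2_fixed = rename(T2, "save", "save_as_t2")
assert compose(T1, T2_fixed) == compose(T2_fixed, T1)
```

The conflict is reported at composition time, and after the explicit renaming the two compositions give the same set of methods, which is what "traits commute" means.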

A proper implementation of traits should also include introspection
tools such that a class can be seen both as a flat collection of
methods and as a composite entity (the original paper about traits
explains this point pretty well). That should help with the namespace
pollution problem by giving the developer the ability to see the
class as a composition of traits (one could argue that in Python pydoc
allows you to see the origin of the methods as coming from parent
classes, but that support is insufficient to manage situations with
complicated inheritance hierarchies and lots of methods).
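
To give an idea of what a "composite view" could look like, here is a rough sketch based on the standard library function inspect.classify_class_attrs. It groups the methods of a class by the class that defines them, which is roughly the information pydoc displays. The helper methods_by_origin and the example classes are made up for illustration; note that this only works for methods coming from parent classes, so a real traits implementation would have to record the trait of origin by itself.

```python
import inspect
from collections import defaultdict


def methods_by_origin(cls):
    """Group the plain methods of cls by the name of the defining class."""
    groups = defaultdict(list)
    for attr in inspect.classify_class_attrs(cls):
        # skip data attributes and everything inherited from object
        if attr.kind == "method" and attr.defining_class is not object:
            groups[attr.defining_class.__name__].append(attr.name)
    return {origin: sorted(names) for origin, names in groups.items()}


class Persistent:
    def save(self): ...
    def load(self): ...


class Printable:
    def render(self): ...


class Document(Persistent, Printable):
    def title(self): ...


for origin, names in sorted(methods_by_origin(Document).items()):
    print(origin, names)
```

The flat view is what dir(Document) gives you; the grouped view tells you which piece of functionality each method belongs to.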

In Python you can also implement mixins without inheritance simply by
dynamically adding methods to a class, starting from a method dictionary
M:

class C_with_mixin(C):
    pass

for name in M:  # M is a dictionary of methods
    setattr(C_with_mixin, name, M[name])

Implementing Ruby mixins in Python is therefore trivial: you can
just read the methods from a module dictionary. Implementing
traits is a bit less trivial, since you must check for common
names and raise an error in that case. Moreover you must be
careful with ordering issues: the traits paper says that
methods coming from a trait must take the precedence over
methods coming from the base class, but they must not take
the precedence over methods defined in the class.
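
A minimal sketch of those precedence rules follows. It is not the implementation used by the strait module: the decorator name with_traits is hypothetical, and traits are again assumed to be plain dictionaries of functions.

```python
def with_traits(*traits):
    """Class decorator (hypothetical name) injecting trait methods into a class."""
    def decorate(cls):
        injected = {}
        for trait in traits:
            for name, func in trait.items():
                if name in injected:
                    raise TypeError("method %r comes from two traits" % name)
                injected[name] = func
        for name, func in injected.items():
            # vars(cls) contains only the names defined in the class body,
            # so inherited methods are overridden, local ones are kept
            if name not in vars(cls):
                setattr(cls, name, func)
        return cls
    return decorate


class Base:
    def describe(self):
        return "Base.describe"

    def greet(self):
        return "Base.greet"


Polite = {
    "greet": lambda self: "Polite.greet",
    "describe": lambda self: "Polite.describe",
}


@with_traits(Polite)
class C(Base):
    def describe(self):  # defined in the class: wins over the trait
        return "C.describe"


c = C()
print(c.greet())     # Polite.greet
print(c.describe())  # C.describe
```

The trait method greet overrides the one inherited from Base, whereas describe is left alone because the class defines it directly.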

I took some liberty with my own implementation of traits,
which was just inspired by the Squeak implementation, but
it is not the same. In particular, I added some support
for cooperation of traits, i.e. there is a kind of super,
but its functionality is limited with respect to the regular
super, and it is a bit more awkward to use: this is on purpose,
to discourage designs based on method cooperation, which I
think are fragile and not to be recommended (again, see the
third article of my series about super).
Still, in the
very few special cases where one wants cooperation, that is
possible indeed.

All those points are explained in the documentation of the strait module,
so you should look there, if you are really interested in the subject.
Here I will just add that I still prefer generic functions to traits.
Nevertheless, traits may have a role if you want to follow the traditional
route of having methods inside classes, and they are a smaller leap
from traditional object oriented programming. Moreover, it is much
easier to convert a pre-existing framework from using mixins to
using traits than to convert it to generic functions.

If you want to play with traits in Python, you are welcome to try the
strait module. Enjoy!

Post scriptum. I realize that this post may be misinterpreted, so
let me make clear a couple of points:

I am not asking for removing multiple inheritance in Python and
replacing it with traits. However, I am saying to people writing
new languages: think twice before adding multiple inheritance.
Certainly it is more difficult to implement than traits; moreover,
I am arguing that it makes life more difficult for your users too.

I do not think traits are the best thing since sliced bread. They
are a bit better than multiple inheritance, but I still recommend
keeping things simple and using both (single) inheritance and
traits as little as possible.

Michele, I've been following your posts via the Artima RSS feed. When I first read your posts on mixins being harmful I was initially dismissive. I generally don't like arguments against language features based on the fact that you *could* shoot yourself in the foot. Being someone who formerly developed in Java and currently does so in Python, I know the pain of getting to a leaf case in programming, aka the exception to the rule, where it does make sense to use features/approaches that can be otherwise abused and not being able to do so because the technology I am using prevents me from doing so. Mixins, like GOTO, are easily and often abused, but with experience comes the ability to recognize the cases when using these techniques makes sense. I don't like the idea of forcing people with such abilities to work harder in order to accommodate those that don't already have such experience. I'd argue that it's the responsibility of those on a development team to mentor those with less experience to learn the acceptable uses of such techniques.

That being said, I'm growing more interested in your ideas after reading this post. It almost seems like you're starting to develop a new programming paradigm based on OOP. I find this much more intriguing and constructive than typical complaints about OOP on the internet that come across more as a rant along the lines of "A guy I worked with abused this language feature which made my life difficult so no one should ever use it". I myself have been guilty of making such generalizations in the past, and admittedly, I dismissed your first post about mixins as being along the same line, though the 'Considered Harmful' title did make it easy to be dismissive. I am now starting to come around, and I'm looking forward to hearing more of your ideas. I've always wished that there was a language/programming paradigm whose very nature discouraged feature abuse like you've noted about mixins but still made them easy to use in the cases that they are warranted. I don't know that this is a goal of yours or if it's even possible, but I can always hope.

I've been developing a model that's closely related to your traits model, but with the following differences:

1) The methods directly defined in the class constitute a trait, so if there are name conflicts between methods incorporated from a trait and methods in the class, it's an error, just as in the case of methods incorporated from two different traits. Furthermore, all classes smell the same: any class can be incorporated as a trait into any other class, provided there are no loops in the graph.

2) Instance variables, unlike methods, are trait-local, and a compiler should warn if a method does not access at least one instance variable (which conceptually means that it belongs in a different trait, perhaps one with no instance variables).

3) There is no such concept as a base class, although there is subtyping in the style of Java interfaces: a class may declare that it is a subtype of one or more classes, which means that its public methods are a superset of the public methods of the supertype classes (but do not in any way share implementations with them unless they are explicitly incorporated as traits).

4) There is nothing like super; if you want to call a method in a trait that has the same name as you do, rename it and call it explicitly.

I have not yet decided if incorporating a trait is transitive or not: what's your view?

I saw this coming a mile away. "Mixins blew up on me once, so let's overcorrect with Traits!". I've seen it before, and I'm sure we'll see it again.

Anyway, the problem with Traits is that all you're doing is replacing static typing with static composition. In a static language like Scala that's fine. But in a dynamic language, it's completely out of place.

> It almost seems like you're starting to develop a new programming paradigm based on OOP.

No, no, I am not doing anything original or particularly deep, I am just popularizing ideas which have been around for a while. Traits are not a new paradigm, they are a kind of multiple inheritance for the poor man, but I am arguing that they are enough.

> I've always wished that there was a language/programming paradigm whose very nature discouraged feature abuse like you've noted about mixins but still made them easy to use in the cases that they are warranted. I don't know that this is a goal of yours or if its even possible, but I can always hope.

Me too. The strait module is marked as experimental. It is there for people who are working on a framework based on mixins and want to try traits instead. Will traits be enough for them? Will traits make a difference for users of that framework? I hope so, but only real life code can give us the answer. So, please try it and give me feedback!

> Anyway, the problem with Traits is that all you're doing is replacing static typing with static composition. In a static language like Scala that's fine. But in a dynamic language, it's completely out of place.

Honestly, I haven't a clue about what you are talking about. Care to be more explicit?

> I have not yet decided if incorporating a trait is transitive or not: what's your view?

I am not sure of what you mean by transitivity: perhaps that if a class B incorporates a trait T and C is a subclass of B, then C incorporates T as well? That seems sensible behavior to me and my implementation works this way. My approach is not symmetric though: I can compose a base class with one or more traits, but what happens when I try to compose two traits without a base class is undefined (I may decide to raise an error, I haven't decided yet). So I do not worry about questions like "if trait T1 incorporates trait T2 and T2 incorporates T3, does T1 incorporate T3 as well?" Anyway, in a symmetric system like the one you are suggesting I would expect an affirmative answer.

> I am not sure of what you mean by transitivity: perhaps that if a class B incorporates a trait T and C is a subclass of B, then C incorporates T as well? That seems sensible behavior to me and my implementation works this way.

That seems reasonable for you. For me, there is no base class - superclass relationship with inheritance: incorporating a trait is the only kind of inheritance there is, and superclasses are just supertypes.

> My approach is not symmetric though: I can compose a base class with one or more traits, but what happens when I try to compose two traits without a base class is undefined (I may decide to raise an error, I haven't decided yet). So I do not worry about questions like "if trait T1 incorporates trait T2 and T2 incorporates T3, does T1 incorporate T3 as well?"

Fair enough.

> Anyway, in a symmetric system like the one you are suggesting I would expect an affirmative answer.

The objection is that just because you bring in the methods defined in T1 *considered as a trait*, you don't necessarily want the methods that T1 *considered as a class* incorporates into itself. So I think I'll say that incorporating T1 brings in just the methods defined in T1, and that if you want the methods of T2, you know where to find them.

The strait module looks very interesting, and I am intrigued. But I am also puzzled. In reading the documentation for the strait module, I am having trouble seeing how it will help with name-space pollution. Taking the Plone example you gave, if that system were redone using traits, would not the end result still be an object with 600+ attributes, methods, etc.?

> if that system were redone using traits, would not the end result still be an object with 600+ attributes, methods, etc.?

Yes, traits are a compromise for people wanting something similar to multiple inheritance (even though simpler and more controlled). They do not solve the namespace pollution problem. To solve that, you need an approach based on composition (see my posts about mixins).

> Yes, traits are a compromise for people wanting something similar to multiple inheritance (even though simpler and more controlled). They do not solve the namespace pollution problem. To solve that, you need an approach based on composition (see my posts about mixins).

From what I've studied so far, traits are better than a compromise. ;)

Thanks for the tip about using import inside the class structure -- I had not realized I could do that. I love Python!