A recent email thread that was brought to my attention suggested adding
greenlet-style coroutines to the Python standard library. I felt like
this would be a good time to go into why coroutines are a bad idea.

"Readability counts."

Many keyboards have been worn out debating how to make code more
readable, and what affects readability. One of the reasons I've
enjoyed using Python so much is that it doesn't fight (much) my
efforts to write code that's easy to read. Proponents of coroutines,
as used in libraries such as gevent, have claimed that a major
advantage is that they make networking code easier to read, compared
to other concurrency mechanisms such as generators or callbacks. I am
going to argue instead that coroutines make code harder to
read. Before I get into that, I'm going to propose this definition of
readability:

A program is readable when you can look at its code and
understand what it does.

Note particularly that this is different from looking at code and
understanding what the author intended the program to
do. Readability counts most when you're reading code that doesn't work
(such as when debugging) or code that might not work the way it should
(such as when doing a security audit). Designing for readability
means designing for adversarial review of code.

As Mark Miller and Dave Herman have pointed out, when first learning
to program in a language like Python, there are
some basic assumptions we make about control flow. The main one I want
to talk about here is that it's possible to understand what happens
when you call a function by reading the code of the function.

Consider this trivial example.

self._foo.a = self._foo.b
self._foo.b = baz()

Suppose you want to determine whether any code can see
self or self._foo while its internal attributes are
disarranged — in this case, the time during which its a and
b attributes are set to the same value. Normally in Python
we'd be able to answer this question by reading the source for
baz. However, in the presence of coroutines this isn't
sufficient! If baz, or anything it calls, invokes
something that causes the current coroutine to suspend, then any other
code can be invoked at that point, thus making it impossible to keep
this internal mutation from being exposed.
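
To make that concrete, here is a rough sketch of the hazard. greenlet is a third-party package, so as an assumption this uses a plain OS thread and two events to force a switch at exactly the point where baz would suspend; the Foo class, the observer function and the body of baz are all hypothetical stand-ins, not anything from a real library.

```python
import threading

class Foo:
    # Hypothetical object whose a/b attributes are updated together.
    def __init__(self):
        self.a = "old-a"
        self.b = "old-b"

foo = Foo()
suspended = threading.Event()   # set when baz() "suspends"
resumed = threading.Event()     # set when the observer lets baz() continue
seen = []                       # what other code saw in the meantime

def baz():
    # Stand-in for a call that switches away somewhere deep in a library.
    suspended.set()
    resumed.wait()
    return "new-b"

def update():
    foo.a = foo.b
    foo.b = baz()   # nothing on this line says control may go elsewhere

def observer():
    # "Any other code" that happens to run while baz() is suspended.
    suspended.wait()
    seen.append((foo.a, foo.b))
    resumed.set()

t = threading.Thread(target=observer)
t.start()
update()
t.join()

print(seen)           # [('old-b', 'old-b')]
print(foo.a, foo.b)   # old-b new-b
```

The observer sees a and b set to the same value, and nothing in the two lines of update() hints that this was possible.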

"In the face of ambiguity, refuse the temptation to guess."

There are many different situations where this sort of problem
arises. In general, any kind of imperative code needs to be able to
preserve invariants for its data structures, while still being able to
do work that might temporarily violate those invariants. This is why
Python has the with and
try/finally structures; being able to
express some level of transaction-like behaviour is useful, so you can
worry about cleanup and invariants at a single place.

These are only useful for operations that aren't extended in time,
however. When using coroutines, it's possible to write code where
finally blocks don't get a chance to run before something
in another coroutine interferes. More distressingly, the
finally block may not run at all! When a coroutine is
suspended, there's no guarantee it will be resumed before the program terminates.
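
A small sketch of the finally problem, under the assumption that a generator is a fair stand-in for a suspended coroutine (with greenlet the switch would be hidden inside a call rather than spelled yield). The state dictionary and the worker function are hypothetical.

```python
state = {"locked": False}
events = []

def worker():
    state["locked"] = True
    try:
        events.append("worker: suspended mid-operation")
        yield   # stand-in for a hidden coroutine switch
        events.append("worker: resumed")
    finally:
        state["locked"] = False
        events.append("worker: cleanup ran")

task = worker()
next(task)   # run up to the suspension point

# Other code runs now and finds that the cleanup has not happened.
events.append("other code sees locked=%s" % state["locked"])

# Cleanup only happens once something resumes or closes the task.
task.close()
events.append("after close locked=%s" % state["locked"])
```

If nothing ever resumes or closes the task, whether the finally block runs is up to the runtime. CPython will usually close a generator when it is garbage-collected, but that is not guaranteed for objects still alive when the interpreter exits.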

If this sounds a lot like using threads, it's because it
is. Coroutines are a form of threads; they're the foundation for what
are called "green threads" in some language runtimes, such as early
versions of Java and Ruby. The problems with threads are well
documented, and various tools, such as mutexes, locks, and queues, have
been developed to deal with them. Not all coroutine
libraries provide these tools, and the ones that do don't
encourage their pervasive use. The only salient difference in behavior
is that OS-provided threads can be interrupted at more points. On the
other hand, OS threads can be scheduled on multiple processors at
once, providing parallelism. So, in conclusion: coroutines are
strictly worse than threads, because they have the same kinds of
problems (non-determinism, loss of code readability) and do not offer
any unique advantages.

Superior options for concurrency are Deferreds, used to manage
callbacks, and generators. The primary historical objection to
callbacks is the "pyramid of doom", where functions get nested to
ridiculous depths. Deferreds make callback-invoking code composable,
and help flatten out the functions used, as
David Reid has ably shown. Use of callbacks/Deferreds lets you
keep all your normal assumptions about control flow. Invoking a
function can return a Deferred, but it can't do anything to suspend
your code calling it. Once a function is exited, it can't be
re-entered without calling it again. So in a very useful sense,
Deferreds make concurrent code much more readable.
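
As a rough illustration of that control-flow guarantee, here is a toy sketch. MiniDeferred is a hypothetical, deliberately tiny stand-in written for this example; it is not Twisted's Deferred and leaves out errbacks and everything else that makes the real thing useful. The pending list plays the part of an event loop.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred; not Twisted's API."""

    def __init__(self):
        self._callbacks = []

    def add_callback(self, fn):
        self._callbacks.append(fn)
        return self   # allows flat chaining instead of nesting

    def fire(self, result):
        # Each callback runs to completion and feeds the next one.
        for fn in self._callbacks:
            result = fn(result)
        return result

log = []
pending = []   # hypothetical event loop's outstanding requests

def fetch_value():
    d = MiniDeferred()
    pending.append(d)
    return d   # returns immediately; the caller is never suspended

def update(foo):
    def apply(value):
        # Nothing can interleave between these two assignments.
        foo["a"] = foo["b"]
        foo["b"] = value
        return value

    def record(value):
        log.append("applied %r" % value)
        return value

    d = fetch_value().add_callback(apply).add_callback(record)
    log.append("update returned")
    return d

foo = {"a": 1, "b": 2}
update(foo)
before = dict(foo)      # other code runs here and sees a consistent foo
pending.pop().fire(3)   # the "event loop" delivers the result later
```

The mutation lives entirely inside apply, which cannot be suspended, so reading update and apply is enough to know what other code can observe.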

Generators let you keep most of your assumptions, but they add an
extra rule: a function can be suspended and (maybe) later re-entered
when a yield keyword is encountered. This provides the
same amount of information as callbacks, but does enable some cases
that require a good bit more squinting and head-scratching to figure
out.
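
For comparison, a sketch of the earlier a/b update written as a generator. The Foo class and the "need-baz" request value are hypothetical, and the few lines driving the generator stand in for whatever scheduler would normally do it.

```python
class Foo:
    # Hypothetical object whose a/b attributes are updated together.
    def __init__(self):
        self.a = "old-a"
        self.b = "old-b"

def update(foo):
    # The only place this function can be suspended is the yield below,
    # so all of the mutation is kept on the far side of it.
    new_b = yield "need-baz"
    foo.a = foo.b
    foo.b = new_b

foo = Foo()
task = update(foo)
request = next(task)        # runs up to the yield
snapshot = (foo.a, foo.b)   # other code runs here; foo is still consistent
try:
    task.send("new-b")      # resume with the value baz would have produced
except StopIteration:
    pass
```

The suspension point is visible in the body of update, which is what lets the author arrange for the object to be consistent whenever control leaves the function.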

I believe that better syntax can provide the convenience of generators
and the clarity benefits of Deferreds. More about that in a future
post.