Summary
A response to a blog post by Juergen Brendel pleading for the removal of the GIL.

Yesterday, Juergen Brendel blogged at length about the disadvantages of the GIL. He claims it is an architectural decision I'm making that limits his productivity.

I don't expect that this or any response will stop the requests for the GIL's removal, despite a well-reasoned FAQ entry about the issue. But I also don't expect it to go away until someone other than me goes through the effort of removing it, and showing that its removal doesn't slow down single-threaded Python code.

This has been tried before, with disappointing results, which is why I'm reluctant to put much effort into it myself. In 1999 Greg Stein (with Mark Hammond?) produced a fork of Python (1.5 I believe) that removed the GIL, replacing it with fine-grained locks on all mutable data structures. He also submitted patches that removed many of the reliances on global mutable data structures, which I accepted. However, after benchmarking, it was shown that even on the platform with the fastest locking primitive (Windows at the time) it slowed down single-threaded execution nearly two-fold, meaning that on two CPUs, you could get just a little more work done without the GIL than on a single CPU with the GIL. This wasn't enough, and Greg's patch disappeared into oblivion. (See Greg's writeup on the performance.)
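As a rough illustration of that arithmetic (a back-of-the-envelope sketch, not a reconstruction of Greg's benchmark): the slowdown figure of 1.9 below is only a stand-in for "nearly two-fold", the function name is made up for this example, and perfect scaling across CPUs is assumed, which is optimistic.

```python
def relative_throughput(n_cpus, single_thread_slowdown):
    """Work done per unit time, relative to a GIL build running on one CPU.

    Assumes the GIL-free build scales perfectly across CPUs; real programs
    would also pay for contention on the fine-grained locks.
    """
    return n_cpus / single_thread_slowdown


# Illustrative stand-in for "nearly two-fold"; not a measured value.
ASSUMED_SLOWDOWN = 1.9

one_cpu = relative_throughput(1, ASSUMED_SLOWDOWN)
two_cpus = relative_throughput(2, ASSUMED_SLOWDOWN)

print(f"GIL-free build, 1 CPU:  {one_cpu:.2f}x the GIL build")
print(f"GIL-free build, 2 CPUs: {two_cpus:.2f}x the GIL build")
```

Under these assumptions the second CPU buys back the lost single-threaded speed and only a few percent more.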

I'd welcome it if someone did another experiment along the lines of Greg's patch (which I haven't found online), and I'd welcome a set of patches into Py3k only if the performance for a single-threaded program (and for a multi-threaded but I/O-bound program) does not decrease.

I would also be happy if someone volunteered to maintain a GIL-free fork of Python, in case the single-threaded performance goal can't be met but there is significant value for multi-threaded CPU-bound applications. We might even end up with all the changes permanently part of the code base, but enabled only on request at compile time.

However, I want to warn that there are many downsides to removing the GIL. It complicates life for extension modules, which can no longer expect that they are invoked in a "safe zone" protected by the GIL -- as soon as an extension has any global mutable data, it will have to be prepared for concurrent calls from multiple threads. There might also be changes in the Python/C API necessitated by the need to lock certain objects for the duration of a sequence of calls.

While it is my personal opinion, based upon the above considerations, that there isn't enough value in removing the GIL to warrant the effort, I will welcome and support attempts to show that times have changed. However, there is no point in pleading alone -- Python is open source and I have my hands full dealing with the efforts to produce a quality 3.0 language definition and implementation on time. I want to point out one more time that the language doesn't require the GIL -- it's only the CPython virtual machine that has historically been unable to shed it.

Thanks for posting this. Pointing out again that this is not a limitation of the language itself, but rather a limitation of the cPython interpreter "reference" implementation, may help some people realize that this is more appropriate for implementation in a project outside of the "core" cPython team.

Guido, this is a well reasoned response. Please do not become distracted by calls for feature Y or change Z or by criticism X. Too many times, some of us in the community get carried away with marketing hype for feature XYZ that Python lacks, and that language or implementation DEF has, and we fail to notice that it is often the good decisions on what to focus on (and thus what not to focus on) that make Python better over all than the alternatives.

Do we notice that the cPython implementation runs relatively fast and efficiently, that it scales, that it is rock solid? Do we take comfort in the fact that the language is so well defined that it has numerous excellent alternative implementations, each of which offers selective advantages for some uses (Jython, Stackless, PyPy, IronPython, and others)?

Maybe we do appreciate all of the above, but we also at times want Python core development to have unlimited resources and we lust after a language that is clearly superior in every way for every application.

I think this is a situation where there's a leap to implementation details rather than getting clear about the problem.

The problem is that cpython can't use more than one processor at a time, and is thus passing up what might be the biggest opportunity to eliminate the speed argument against Python.

I actually don't think removing the GIL is a good solution. But I don't think threads are a good solution, either. They're too hard to get right, and I say that after spending literally years studying threading in both C++ and Java. Brian Goetz has taken to saying that no one can get threading right.

We do need some kind of solution, but it probably shouldn't be threads. I think a process-based approach is probably best. I'd like to see if it's possible to, from within one cpython instance, easily start up a second one in a different process and easily communicate between them. Then you could use an agent system and the programming would become very easy and safe, while effortlessly making use of multiple processors. And no GIL removal would be necessary.
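A minimal sketch of that idea, using the standard library's subprocess module: one interpreter starts a second one in a separate process, sends it a request over a pipe, and reads the reply. This is a one-shot request/reply, not the agent system described above, and the names WORKER_SOURCE and sum_of_squares_in_child are made up for the example.

```python
import json
import subprocess
import sys

# Source for the worker interpreter: read one JSON request from stdin,
# do some CPU-bound work, and write one JSON reply to stdout.
WORKER_SOURCE = """
import json, sys
request = json.load(sys.stdin)
reply = {"total": sum(n * n for n in request["numbers"])}
json.dump(reply, sys.stdout)
"""


def sum_of_squares_in_child(numbers):
    """Run the work in a separate CPython process and return its answer."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],  # a second interpreter
        input=json.dumps({"numbers": list(numbers)}),
        capture_output=True,
        text=True,
        check=True,  # raise if the worker exits with an error
    )
    return json.loads(completed.stdout)["total"]


print(sum_of_squares_in_child(range(10)))
```

Because the two interpreters share no objects, each has its own GIL and the operating system is free to schedule them on different processors. The cost is that everything crossing the pipe has to be serialized.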

> We do need some kind of solution, but it probably shouldn't be threads. I think a process-based approach is probably best. I'd like to see if it's possible to, from within one cpython instance, easily start up a second one in a different process and easily communicate between them. Then you could use an agent system and the programming would become very easy and safe, while effortlessly making use of multiple processors. And no GIL removal would be necessary.

This begs the same question though: if you want this, come up with a proposed design and implementation. I am not an expert in this area so I need some help beyond "look at this other language's solution". (And arguably you should have posted this in response to Juergen's blog, since that's where the request to eliminate the GIL originated.)

FWIW, I don't think that solving the GIL issue will remove the speed argument against Python -- If a Python program is X times slower than a Java program, using N CPUs doesn't change the factor X -- both the Python version (with the GIL removed) and the Java version will run approximately N times faster, so the speed advantage of Java is still X.
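The same argument as a small calculation. The runtimes and the factor X are hypothetical numbers chosen only for illustration, and both implementations are assumed to scale perfectly across N CPUs.

```python
def runtime(single_cpu_seconds, n_cpus):
    """Idealised runtime, assuming perfect linear scaling over n_cpus."""
    return single_cpu_seconds / n_cpus


JAVA_SECONDS = 10.0               # hypothetical single-CPU Java runtime
X = 30.0                          # hypothetical Python-vs-Java slowdown factor
PYTHON_SECONDS = X * JAVA_SECONDS

cpu_counts = (1, 2, 4, 8)
ratios = [
    runtime(PYTHON_SECONDS, n) / runtime(JAVA_SECONDS, n)
    for n in cpu_counts
]

for n, ratio in zip(cpu_counts, ratios):
    print(f"N={n}: Python/Java runtime ratio = {ratio:.1f}")
```

The N cancels out of the ratio, so adding CPUs to both sides leaves the gap at X.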

I dislike the GIL, but can understand the reasons for its presence, and why it is hard to get rid of. Ruby would not be an example to hold up, in any case. When I first heard about it, I discarded it almost immediately as it actually used cooperative multitasking (this may have changed since). Erlang would be a better source of inspiration.

As Bruce says, we need to find a solution to the concurrency problem, now that Moore's law is running out of steam for single-thread execution and the trend is moving towards multicore CPUs.

Python needs better batteries-included IPC, i.e. easy to use and included out of the box. Something like the Queue module, except that it allows multiple processes to communicate. There are a number of middleware options available (I use omniORB) but none of them is seamless or included in the standard distribution.

We also need a better inter-process object sharing mechanism. POSH would be great but seems orphaned, and the last time I tried to use it, it would just segfault on me. PyLinda is simple, but requires a server process rather than shared memory.

> If you want this, come up with a proposed design and implementation. I am not an expert in this area so I need some help beyond "look at this other language's solution".

I have a similar problem, in that I have a fair bit of understanding of concurrency but not much understanding of Python internals. I wonder if there would be enough interest to try to organize a small conference around this topic to bring together the different fields of expertise that might solve the problem. I've gotten good at organizing small conferences, but not publicizing them, so I'd need help with that.

At the very least, a sprint at the next Pycon, which I'd be willing to organize if there's enough interest.

This issue arises from time to time, because a lot of people are using Python with Pylons or Django, and when they hit a performance barrier they (understandably) hope that putting the software on a shiny new quad-CPU server will solve their problems. Only then do they learn about the GIL and get frustrated by it.

OTOH, maybe a multi-process, shared-memory Queue implementation in the stdlib would settle these discussions once and for all?

> > We do need some kind of solution, but it probably shouldn't be threads. I think a process-based approach is probably best. I'd like to see if it's possible to, from within one cpython instance, easily start up a second one in a different process and easily communicate between them. Then you could use an agent system and the programming would become very easy and safe, while effortlessly making use of multiple processors. And no GIL removal would be necessary.

There is a very nice solution that is also mentioned in Juergen's blog and that I started to experiment with some time ago: Parallel Python (http://www.parallelpython.com/), which does the described job (distributing work across two or more cores) quite nicely.

I want to reply to lots of comments, but here goes with responses to two of them:

Ron Stephens writes: "Please do not become distracted by calls for feature Y or change Z or by criticism X."

I concur with Ron! I tend to err on the side of harsh criticism of new features in Python as it is, but the open letter and surrounding "give me it" chorus from people who claim they have credentials (but don't seem to translate them into code) is not far short of posturing for posturing's sake. If they abandon Python for some other language over a check/tick in some feature box, ignoring Jython, IronPython, PyPy (and so on) in the process, then so be it; if their judgement is that impaired we shouldn't be listening to them anyway.

anthony boudouvas writes: "I would also think that such a module maybe a candidate to be included in next Python version and thus forget about the GIL for a long time..."

Can I suggest a half-way point to removing the GIL (again:-)?

Currently the C API design is dependent on the GIL. For example, PyArg_Parse(..., "s", ...) requires the GIL, because otherwise the pointer returned might get garbage-collected before your code has actually looked at it.

By balancing all such calls (i.e. PyArg_ParseGILSafe() would return a cookie that you would eventually pass to PyArgParseGILSafeDone()) C developers could start writing extension modules that are GIL-safe.

(Probably superfluous explanation: the first call would incref all objects to which borrowed references are returned, the second call would decref those. In a non-GIL-free interpreter a bit of preprocessor magic would make these calls do the current thing).

Offer: I'm still looking for a masters project, and while this is rather out-of-scope for my normal work I'd be willing to check whether anyone at the VU could be found to supervise this, but only if it stands a chance of getting accepted into the mainstream....

> Can I suggest a half-way point to removing the GIL (again:-)?
>
> Currently the C API design is dependent on the GIL. For example, PyArg_Parse(..., "s", ...) requires the GIL, because otherwise the pointer returned might get garbage-collected before your code has actually looked at it.

Not really -- your *caller* is holding on to your arguments, and it won't be resumed until you return.

> By balancing all such calls (i.e. PyArg_ParseGILSafe() would return a cookie that you would eventually pass to PyArgParseGILSafeDone()) C developers could start writing extension modules that are GIL-safe.

I'd be worried about error returns forgetting to do the cleanup, causing the GIL to be held forever (unless you modify the caller to release that lock if it's still held when your code returns).

> (Probably superfluous explanation: the first call would incref all objects to which borrowed references are returned, the second call would decref those. In a non-GIL-free interpreter a bit of preprocessor magic would make these calls do the current thing).
>
> Offer: I'm still looking for a masters project, and while this is rather out-of-scope for my normal work I'd be willing to check whether anyone at the VU could be found to supervise this, but only if it stands a chance of getting accepted into the mainstream....

*This* particular approach doesn't sound quite right, but a project to add GIL-free threading to Python might work. I recommend looking beyond just getting rid of the GIL while keeping the existing thread/threading API though; Bruce Eckel's suggestion of introducing a new API for dealing with threads (actor-based?) might be more promising (and is more likely to find a sponsor amongst your professors :-).