Python has a GIL, and lots of complainers

If you don’t know who Brett is, you’re probably not a heavy Python user. Brett is a very important Python core developer who has been around for a while and who does a great job at it. His post, though, makes me a bit sad.

Brett points out that there are two types of personalities which do not contribute to open source. The first one he defines as:

The first type is the “complainer”. This is someone who finds something they don’t like, points out that the thing they don’t like is suboptimal, but then offers no solutions.

And the second one is defined as:

(…) This is someone who, upon finding out about a decision that they think was sub-optimal, decides to bring up new ideas and solutions. The person is obviously trying to be helpful by bringing up new ideas and solutions, thinking that the current one is simply going to flop and they need to stop people from making a big mistake. The thing is, this person is not helping. (…)

This, in itself, is already shortsighted. If you’re tired of hearing the same arguments again and again for 10 years, from completely different people, there’s a pretty good chance that there’s an actual issue with your project, and your users are trying in their way to contribute and interact with you in the hope that it might get fixed.

This is really important: They are people who use your project and are trying to improve it. If you can’t stand that, you should stop maintaining an open source project now, or pick something which no one cares about.

The other issue which caught my attention in his post is his example: the Python GIL. Look at the way in which Brett dismisses the problem:

(I am ignoring the fact that few people write CPU-intensive code requiring true threading support, that there is the multiprocessing library, true power users have extension modules which do operate with full threading, and that there are multiple VMs out there with a solution that have other concurrency solutions)

I can understand why you think this way, though. Guido has expressed the same kind of feeling about the GIL for a very long time. Here is one excerpt from a mail thread about it:

Nevertheless, you’re right the GIL is not as bad as you would initially think: you just have to undo the brainwashing you got from Windows and Java proponents who seem to consider threads as the only way to approach concurrent activities.

Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.

I apologize, but I have a very hard time reading this and not complaining.

In my world, the golden days of geometric growth in vertical processing power are over, multi-processor machines are here to stay, and the amount of traffic flowing through networks is just increasing. It feels reasonable to desire a less naïve approach to deal with real world problems, such as executing tasks concurrently.

I actually would love to not worry about things like non-determinism and race conditions, and would love even more to have a programming language which helps me with that!

Python, though, has a Global Interpreter Lock (yes, I’m talking about CPython, the most important interpreter). Within one process, only one thread executes Python code at a time. No Fork/Join frameworks, no coroutines, no lightweight processes, nothing. Your Python code will execute in sequence if it lives in the same process space.
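
To make that concrete, here is a minimal sketch (not taken from Brett’s post or the mail thread) that runs the same CPU-bound function sequentially and then on a thread pool. The names count_primes, timed, run_sequential and run_threaded are made up for this illustration, and the timings are whatever your machine gives. On a CPython build with the GIL, I would expect the threaded run to take about as long as the sequential one.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(label, fn):
    """Run fn(), print its wall-clock time, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result


def run_sequential(limits):
    # Plain loop: one task after another on the calling thread.
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    # One thread per task; under the GIL they take turns rather than overlap.
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [50_000] * 4
    timed("sequential", lambda: run_sequential(limits))
    timed("threaded", lambda: run_threaded(limits))
```

Both runs return the same counts; only the wall-clock time is of interest here.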

The answer from Brett and Guido to concurrency? Develop your code in C, or write your code to execute in multiple processes. If they really want people to get rid of non-determinism, locking issues, race conditions, and so on, they’re not helping at all.

I know this is just yet another complaint, though. I honestly cannot fix the problem either, and would rather just talk about it in the hope that someone who’s able to do it will take care of it. That said, I wish that the language maintainers would do the same, and tell the world that it’s an unfortunate problem, and that they wished someone else would go there and fix it! If, instead, maintainers behave in a ridiculously dismissive way, like Guido did in that mail thread, and like Brett is doing in his post, the smart people that could solve the problem get turned away. People like to engage with motivated maintainers; they like to solve problems that others are interested in seeing solved.

Perhaps agreeing with the shortcomings won’t help, though, and no one will show up to fix the problem either. But then, at least users will know that the maintainers are on the same side of the fence, and the hope that it will get fixed survives. If the maintainers just complain about the users who complain, and dismiss the problem, users are put in an awkward position. I can’t complain.. I can’t provide ideas or solutions.. I can’t fix the problem.. they don’t even care about the problem. Why am I using this thing at all?

36 Responses to Python has a GIL, and lots of complainers

I share your sentiment about the GIL, and even wrote an open letter to Guido once. Anyway, that’s one of the reasons that RESTx (an open source project for the easy creation of RESTful resources) is based on Jython. That kills two birds with one stone. Three, actually:
(1) The GIL is not a problem. If you have two Python threads, they actually run at the same time.
(2) You can drop into Java for optimization of CPU intensive code. This is actually pretty neat, since Java/Python integration in Jython is incredibly simple and Java performs surprisingly well.

For RESTx this also means that you can write your custom components in the language of your choice (or which is best for the task).

I think the point that Brett wanted to make (or at least the point I wanted Brett to make) is that you’re welcome to complain, but you shouldn’t be surprised if your complaints, without any offer of work or assistance on your part, aren’t taken very seriously — *especially* when there are solid (and well developed) reasons why the complaints haven’t been addressed by the existing developers. Yes, I see you disagree with those reasons, but I don’t think it’s fair to argue that they must change their minds just because you disagree.

This is open source. For better or for worse, if you want input, you have to do the work. To do otherwise is to disrespect the people that have put in the time, and work, to deliver their free, high quality software to you (wrt Python, at least ;).

Please, let’s not refer to the GIL as a “design flaw”. It is a design decision.

If I have the history correct, at one time Greg Stein produced a prototype non-GIL implementation but it was not felt to justify the performance penalty. Times have changed, and it’s unlikely that anyone is going to do a similar project again. But I wasn’t following the dev list back in those days (and don’t always now).

If you have 16 processors then being able to use them all at 60% of normal speed is still an overall win against one processor at 100%, but alas it’s not just an option you can set or unset at run-time.

It’s actually encouraging that people are using Jython (and IronPython) for these tasks. That keeps the ecosystem healthy and ensures biodiversity.

Yes, I see you disagree with those reasons, but I don’t think it’s fair to argue that they must change their minds just because you disagree.

You miss the point of my post, Titus. I don’t disagree with the reasons why the problem has not been fixed, and I don’t want you or Brett to fix the problem, if you don’t want to. What I do take issue with, as stated in the post, is the way that Guido and Brett dismiss the wart, and dismiss the people who find the wart unfortunate enough to present ideas, solutions, or just dissatisfaction. These people are your users.

This is open source. For better or for worse, if you want input, you have to do the work. To do otherwise is to disrespect the people that have put in the time, and work, to deliver their free, high quality software to you (wrt Python, at least).

I’m one of those people, Titus, and I appreciate the people that care enough about Python to complain about its warts. Ignoring an email, or responding with a standard template, is pretty cheap in comparison to a blog post, or to losing developers.

You can pick and choose your posts. It’s also been said that we’d be happy to consider any patch to remove the GIL which maintains performance for single threaded programs and doesn’t break the extension API. So far, it hasn’t been done, so more complaining and meta-complaining ensues…

Go doesn’t exactly use coroutines. They call them “goroutines” for a reason.

And there are coroutines in CPython: greenlet.

Go’s notion of coroutines actually executes in parallel, Scott, and greenlets don’t seem to get even close to this:

Greenlets can be combined with Python threads; in this case, each thread contains an independent “main” greenlet with a tree of sub-greenlets. It is not possible to mix or switch between greenlets belonging to different threads.

Admittedly I was a little flippant with the GIL remarks, but they were meant as an example, not to flat-out say that I or other core developers do not know the GIL is an issue.

My point is that we have acknowledged the GIL is a problem and that we do not have a good, backward-compatible solution to solve it. That point has been made for years, over and over again. It has been made on python-dev, it has been made in blog posts. The problem is known, but we do not have a solution.

And yet people keep coming to us telling us it’s a problem. That does not contribute to the conversation. The core developers have said we think improving the language and the standard library is more important than trying to come up with a GIL solution actively. Obviously if someone is inspired to derive one that’s great, but we are not going to make it a goal for some specific release, nor worry about an approach if it cannot be made to work with existing extensions. Obviously some people disagree with us, but this is a subjective judgement call the core developers all made individually.

So when people come to us — sometimes calmly, sometimes spitting venom — saying that we have made the wrong call, that doesn’t really move the conversation forward. We all know what the benefits of losing the GIL are. We know that multi-core chips are becoming more and more common. We know that OSs are taking scheduling more seriously. In the face of all of this, we still feel that there are more important things to worry about and focus our time on.

I guess this boils down to subjective vs. objective. Subjectively people can disagree with us about the importance of the GIL, but we can subjectively agree to disagree as well. When you reach that kind of impasse, you need an objective solution to move things forward, else it just comes off as people complaining about an acknowledged problem.

Here’s another analogy to go with: kids wanting more allowance. Your child might go “I want more money.” As a parent you might legitimately say “I don’t have the extra money”. The kid might then retort that they need more money for this, that, and the other thing. That’s fine, but that doesn’t change the fact that you as a parent still are not suddenly making more money. Sure, you could probably shift some spending around to get more money for your kid, but that does not necessarily line up with your priorities (e.g. mortgage comes first). So having your kid constantly trying to justify why they need a larger allowance just doesn’t constructively help you find more money for them while continuing to hold to what you consider more important uses for the money you are currently making.

That’s what the GIL example was meant to get across: asking for something that you deem important does not mean I think it warrants shifting my priorities to address your issue. Unless you can come up with a solution which addresses my concerns or can come up with some fundamentally new and groundbreaking reason for me to change my priorities (which rarely happens past the initial discussion of an idea in my experience), the conversation is simply not going to move forward in any constructive way. In that case the conversation is moving in a parallel fashion and that’s just a waste of everyone’s time and energy, especially when that time is very finite and in short supply.

Please, let’s not refer to the GIL as a “design flaw”. It is a design decision.

Sure, Steve. We can even call it a feature if you want. :-)

If I have the history correct, at one time Greg Stein produced a prototype non-GIL implementation but it was not felt to justify the performance penalty. Times have changed, and it’s unlikely that anyone is going to do a similar project again. But I wasn’t following the dev list back in those days (and don’t always now).

Thanks for bringing this up. I remembered these conversations, but didn’t recall who the proposer was.

Thanks for your feedback, Brett. I appreciate it, and do understand and agree with most of what you say. Here is just one remark, which maps well to what I try to point out in my post, using your analogy:

If you get more children, do expect to teach them again, even if you have done a great job with your first child. You’ll be more experienced after a few of them, and may give better answers without much effort after a while, but you’ll have to teach them again.

Some parents may one day even be supported by their children. Go figure.

But your kids do grow up, meaning they get better over time. We don’t get that benefit online. Best we can do is ask people for due diligence by using Google to see if their issue has come up before. For something like the GIL, that’s easy enough to find. So having it come up consistently really wears on you. This is why you have to get new blood into a project; you have attrition because people burn out from people yelling at them all the time for the same stuff.

Maybe this is why I have no desire at the moment to have children (and that might be considered a good thing by some). =)

Brett, Python developers definitely get better over time as well. You’ll get more of them, though (hopefully), and that’s what causes the repetition you unfortunately see. That said, the ideal scenario is that, one day, one of those whiners will really fix the problem in an elegant way. If you dismiss these attempts, you may be throwing away that chance.

You seem to think that someone can just come along and fix it – as if no one has tried. The reality is that many really smart people have looked at eliminating the GIL over the past decade and failed.

I don’t just think it, James. I’m sure about it. There are tons of languages out there without a GIL. It may need a significant effort, and it may need compatibility changes, but it’s far from being an unsolvable problem. It might actually already be fixed by now if we hadn’t been telling people that they were brainwashed by Windows and Java proponents.

Removing the GIL has been done, so technically, it has been solved. It was just determined to not be as useful/as good as the existing with-GIL implementation. If someone is able to do it better, awesome.

That said, as someone who has written threaded software heavily, pretty much all of my threaded software that I don’t hate is written using queues and messages. I’d be willing to wager that this is not an uncommon experience. And even better, queues and messages can be mapped into the multiprocessing library.
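
The queues-and-messages style described above can be sketched roughly as follows. This is an illustration, not code from any real project: the worker function, the None sentinel, and the squaring task are assumptions chosen to keep it short. It uses threading and queue.Queue so that it runs anywhere; multiprocessing.Queue and multiprocessing.Process offer a similar put/get and start/join interface, which is what makes the mapping plausible.

```python
import queue
import threading

STOP = None  # sentinel message telling a worker to shut down (hypothetical convention)


def worker(tasks, results):
    """Receive numbers from tasks, send (number, square) messages to results."""
    while True:
        item = tasks.get()
        if item is STOP:
            break
        results.put((item, item * item))


def square_all(numbers, n_workers=3):
    """Fan numbers out to workers over a queue and gather their replies."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    # All workers have exited, so every result is already in the queue.
    collected = {}
    while not results.empty():
        n, sq = results.get()
        collected[n] = sq
    return collected


if __name__ == "__main__":
    print(square_all(range(5)))
```

The workers share nothing except the two queues, so there is no lock in the code that does the actual work.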

Indeed, the multiprocessing library isn’t a 100% solution (probably closer to an 80-90% solution), but I would personally argue that a no-GIL Python is not a 100% solution either (maybe 90-95%), primarily because it allows us to do all of the stupid things that we were doing with threads in the first place (I’ve done just about everything stupid with threads that there is to do, and have learned to stop doing them). But what makes multiprocessing a great solution is that it is available today. No waiting, no patching, no worrying, it’s right there in the standard library just begging to unleash the power of your 8-core i7.

I’m not trying to discount the desire to go GIL-less. I want it too. But having gone down the rabbit hole to try to design something different, having looked at fine-grained locking, … I personally don’t have the drive to worry about it anymore. I just build my software and systems with the features of the CPython runtime in mind.

One thing to consider is that there are more variables than simply how many processors a process can use. If a language can hide the fact that each thread can run on different processors that is very helpful, but when you have to scale across multiple machines (for any variety of reasons including load, cost, locale, etc.) the GIL doesn’t really matter. I would argue as well that the situation where the GIL matters (computing on a single machine) is not necessarily where the majority of users actually develop. I know there are plenty of cases where Python is used on the desktop, but my gut tells me more people use Python for things like web development than writing the next Photoshop or ProTools. In the case of web development, the protocol requires a stateless design, which makes things like Python application servers well suited for such tasks. The GIL in fact (I’d argue) is a feature in that it forces the developer to consider the application state, and in turn complexity, within the confines of a single process.

I’m not suggesting I have empirical evidence or a mass of knowledge about the majority of Python developers. But, in my own work I’ve never really hit a problem with the GIL. There has been plenty of data and processing to do where improving the speed is helpful, but in every situation, the answer has been completely achievable by improving the algorithms and design. Again, I’m not saying it isn’t a problem, just that in terms of priority, it doesn’t rank very high.

One more thing to consider is the mobile space. I feel a lot of conversations regarding the GIL revolve around the fact that machines are getting more processors. But when you consider most servers end up needing a design that doesn’t depend on a single process and the mobile platforms will most likely be single processor for the foreseeable future, it makes me question where the problem really is.

Finally, I really don’t think people choose to use open source projects based on some sense of how the community reacts to problems. Most of the time people use projects based on what gets the job done. Python excels at this kind of practicality. While I can see your point that in a perfect world a project would be able to accurately calculate needs based on how loud users are complaining and cross reference that with real world use cases and history, the fact is that is not going to happen. As a design decision, the GIL has been really successful. It has simplified many things and while we are entering a different world of computing, focusing on the GIL as the real problem doesn’t address a real world need as much as it reflects programmers’ personal desires for speed. The “premature optimization is the root of all evil” quote I believe hits home because there is probably not a programmer around that can’t look back at a project and realize how some silly concern for speed was actually a productivity killer.

My theory is that when the GIL becomes a real problem for enough people, whatever pain it will take to find a solution will be endured for something better. The fact is I don’t think we’ve hit that point and thanks to new platforms like the iPad and its obvious tie into online services, the pain may never come. Personally, I’m totally fine with that.

Just to be clear, I’m no one important in the world of Python. I use it every day (thankfully!) and that is about it. These are just my experiences and opinions that anyone is free to ignore because, as this entire topic focuses on, I don’t contribute any actual code.

tl;dr: Just because the squeaky wheel gets oil doesn’t mean we should just oil the wheels. Let’s make a freaking flying car or space transporter system and leave the oil in the ground ;)

The problem is known, but we do not have a solution.
And yet people keep coming to us telling us it’s a problem.
That does not contribute to the conversation.

I think it does in some way: it helps the community not to forget about an acknowledged problem. And the solution shall come one way or another! :-)

Users should use, help as they can, complain and be grateful. Developers should develop, prioritize responses to complaints and be happy about the popularity of their product. They should also tolerate complainers to the extreme. The number of complaints is, after all, both an indicator of popularity and a problem.

To quote David Beazley:

Improving the GIL is something that all Python programmers should care about.

To put Brett’s post in a narrow context: there was a particular person on the mailing lists sucking up everyone’s time and _demanding_ that the hundreds of volunteers in the Python community immediately scratch his itches in the _just so_ way he likes (you subscribe to those lists, and probably saw it). Even if he was 100% correct he was 100% counterproductive.

To put it in a broader context: Poseurs love difficult to solve problems, perennial problems even better, and impossible problems the best. That way they can learn one thing to gripe about and they can always be in the right by declaiming it without ever having to do anything or – God forbid – providing a novel solution. So things like the GIL get a disproportionate share of shouting from the know-nothings. I still see crazed Ruby fanbois (who know almost nothing about Ruby either) proclaim that Python isn’t OO.

I agree with Guido.
Threading to do CPU-intensive work in Python is something which I do not find myself hitting very often – if ever. I think that if you want to do that kind of threading – then you probably should write it in C and use something like OpenMP to start with. I do see the point in having, say, a ‘light for cpu intensive process thread’ added to Python.

I should also add: I have coded in C using OpenMP and in Java – with plain Java threads – (both) for some time now.
Sure, I can handle mutexes, semaphores, race conditions, debugging threading, deadlocks, sleeping blocking threads, livelocks, ….. etc. – but can you? Do you really want to?

Eric, thanks for your comment. You make some good points there, and I think it reflects the feeling of many Python users.

(…) but when you have to scale across multiple machines (for any variety of reasons including load, cost, locale, etc.) the GIL doesn’t really matter. I would argue as well that the situation where the GIL matters (computing on a single machine) is not necessarily where the majority of users actually develop. (…)

This is certainly true, but please realize that it reflects a biased view. The existing practices around Python reflect the capacity of the interpreter. Just as a useful exercise, look at Erlang and try to understand the relationship that developers have to concurrency, and the designs that are used even in web servers.

But when you consider most servers end up needing a design that doesn’t depend on a single process and the mobile platforms will most likely be single processor for the foreseeable future (…)

One thing to keep in mind: the GIL is an integral part of the CPython extension API. Sure, we could stub out the GIL API calls. But removing the GIL really means replacing it with something else. That new facility would have its own API, and extension modules would have to use the new API or they wouldn’t work. I don’t think we could break every extension in a point release. So this suggests GIL removal will have to wait for Python 4.

At which point I bet we’ll also get rid of reference counting and go with strict garbage collection. And maybe by then PyPy will be the reference implementation. And we’ll all have flying cars.

Brett,
I have seen the prior arguments about concurrency and the GIL. Personally I am wary of threading and do my parallel work via xargs and Platform Computing’s LSF.

I would just like to say again that you guys do a great job developing Python and it *is* appreciated. It isn’t normal to post praises, but I am sure you have a lot of silent admirers of your work who agree with not bleating unless they think they are part of a solution.

It looked very promising. From the outside it looked like the author of the other solution wasn’t happy to step aside when a better solution came along.

In its current form, it seems like the GIL doesn’t actually perform properly, sucking away cycles on multicore.

The whole thing is disappointing. I’m looking forward to a time when we can run fast Python in PyPy and move on from saying that certain kinds of program are not suited to the language: the language should be fixed, not the programs changed.

I’m amazed how many people still think that shared mutable state is the way to go, although there are zillions of projects out there in hundreds of languages with the same problems.
There is a good chance that 20+ cores don’t have one big shared memory and then the so-called GIL problems vanish anyway.

I’m amazed how many people still think that shared mutable state is the way to go, although there are zillions of projects out there in hundreds of languages with the same problems.

The GIL is far from solving these problems, because it does not prevent shared mutable state between threads. What solves these problems is designing applications and frameworks which enable people to more easily reason about concurrency, and there are very good patterns evolving which enable people to reason more easily about concurrency in the same process space. When I see people asking for concurrency, I don’t naïvely assume they’re designing something improperly.
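
As a rough illustration of that first point, the sketch below forces a lost update on a shared counter even though the GIL is in place. It is not from any real code base: the function names and the use of threading.Barrier to line the threads up are assumptions made so that the race happens every time instead of occasionally.

```python
import threading


def _run_threads(target, n_threads):
    """Start n_threads threads running target and wait for all of them."""
    threads = [threading.Thread(target=target) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def lost_update_demo(n_threads=4):
    """Each thread reads the counter, waits for the others, then writes."""
    state = {"counter": 0}
    barrier = threading.Barrier(n_threads)

    def unsafe_increment():
        seen = state["counter"]      # read shared state
        barrier.wait()               # every thread has read before anyone writes
        state["counter"] = seen + 1  # write back a stale value

    _run_threads(unsafe_increment, n_threads)
    return state["counter"]


def locked_update_demo(n_threads=4):
    """Same increment, but the read-modify-write is guarded by a lock."""
    state = {"counter": 0}
    lock = threading.Lock()

    def safe_increment():
        with lock:
            state["counter"] += 1

    _run_threads(safe_increment, n_threads)
    return state["counter"]


if __name__ == "__main__":
    print("without lock:", lost_update_demo())
    print("with lock:", locked_update_demo())
```

The barrier only makes the interleaving deterministic; without it the same stale write can still happen, just less predictably, which is why the application-level lock is needed at all.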

There is a good chance that 20+ cores don’t have one big shared memory and then the so-called GIL problems vanish anyway.

This statement only makes any difference if you think that we’re going back to a model where each core works in isolation, but the reality is that having several cores with access to shared memory isn’t going away.

So, wouldn’t it be possible to have ‘new-style’ C extensions that work without a GIL, and just have it be known that you can’t use new-style and old-style C extensions in the same program, so that when one detects the other, the program barfs? Everyone loves new-style, and begins to rewrite their extensions in new-style. Eventually all the important stuff is new-style, and everyone’s happy.

Or a GIL-less version of CPython that needs new-style C extensions. Only the most-used C extensions are ported at first, but more catch on, distros ship both, and soon the T(hreadable)CPython becomes the de-facto one.

I think the error in this thinking is “Would you rather have users, or have no complainers” which is a fallacy of the undistributed middle. Truth is – you can have users without tolerating a culture of complaint.

And when did core developers downplay or dismiss the limitations of the GIL? This has been well documented and discussed endlessly.

Complaining is NOT contributing. Neither is trotting into the middle of something and just dropping off your ideas. That is not helpful. I’ve been involved in Open Source for over a decade and I’m solidly with Brett regarding his points.

In my experience [and opinion] people who are serious about contributing to a project – and they are the ONLY PEOPLE WHO MATTER – will make an effort to discover the proper channels for contributing. Casual interlopers, big-thinkers, and visionaries are nothing more than distractions; they fatigue discussions and give the discussions an exaggerated sense of negativity, in addition to just soaking up time [replying to their messages].

And when did core developers downplay or dismiss the limitations of the GIL?

This very post presents when.

In my experience [and opinion] people who are serious about contributing to a project – and they are the ONLY PEOPLE WHO MATTER

I respect the fact that you have a different opinion and experience than I do. I value people’s time a lot more, and have great experience with receiving contributions from people that at first were unable to help. These facts are probably related.

Complainers are important for a couple reasons you do not mention. Complaints raise awareness about design decisions and the impact they might have. End users are exposed to the underlying design of any system they might never have seen. Secondly, without user feedback, any application becomes what it is based only on the developers’ view of the world. If users show dissatisfaction with a design decision, it can reset the developers’ world view and influence the world view of developers on other projects even if the subject of the original complaint is left unchanged. Would a new language today trying to compete in Python’s space be competitive if it too followed the design decisions of Python? New languages arise because they address the complaints of existing ones.

=====
If the maintainers just complain about the users who complain, and dismiss the problem, users are put in an awkward position. I can’t complain.. I can’t provide ideas or solutions.. I can’t fix the problem.. they don’t even care about the problem. Why am I using this thing at all?
=====

I completely agree with the point. And I have seen this “you-actually-don’t-want-it” so many times in my life that I don’t even want to comment on comments.

Again, that’s a very good point: if so many people are complaining about it — probably there should be one more ad “we’re in trouble, but solutions are welcome” at some visible place.

Guido, as usual, wrote a well-reasoned post that can be a good example.