Pages

Tuesday, September 6, 2016

Solving Multi-Core Python

tl;dr I proposed a solution for Python's weak multi-core story (note I didn't say "support"), but didn't have time to follow through. I still have hope, both for the project as a whole and for several standalone parts.Contents: Backstory The Proposal The Outcome The Details

Backstory

For the longest time I've heard the argument that multi-processing is the Pythonic way to get multi-core. Threads are an anti-pattern, some say. They are a thing only because Microsoft made them a thing, some say. I've heard anecdotal evidence from people that actually doparallel (not just concurrent) computing (e.g. scientific, render farms) that they favor multi-processing approaches, especially because they typically go multi-host anyway.You could also argue that in practice nearly all concurrent programming is IO-bound and that asyncio solves that for us. Recent-ish discussions (e.g. PEP 492) lead me to believe that it's not nearly that simple.Ultimately the merits of multi-processing and asyncio over multi-core threading, regardless of whether or not a valid or sufficient argument, do not mitigate the popular and pervading *sentiment* that Python is weak when it comes to leveraging multi-core and handling concurrency (or more accurately computational parallelism). And perception ""is 9/10th of the law"" (or arguably higher).Folks looking for a solution are going to search for one that matches the model in their brain. Much like with organic molecules, if the conceptual bind points don't line up they aren't going to connect with what Python is offering. The power of Python is that it maps well onto our brains. Though concurrency/parallelism isn't very suited to our brains, it is one key place where Python doesn't do a good job of matching conceptual expectations at large.

The Proposal

In short, Python's multi-core story is murky at best. Not only can we be more clear on the matter, we can improve Python's support. The result of any effort must make multi-core (and concurrency/parallelism) support in Python obvious, unmistakable, and undeniable (and keep it Pythonic).Early in 2015 I'd reached my limits with all the criticisms and misunderstandings. So in the spirit of open source I resolved to do something about it. Since this was not an area of expertise for me I did a lot of reading and reached out to experts I know (thanks to Guido, Nick, Sarah, Graham, and others). In a few months I felt like I had a good enough understanding and a good solution.In June of 2015 I introduced my solution on the python-ideas mailing list. The gist is to use CPython's existing subinterpreters to isolate GIL-free execution threads with a CSP front end. My hope was to finish the first stage of work in time for Python 3.6 (i.e. right about now). The reception was generally positive. There was even discussion on reddit. I was encouraged.

The Outcome

Going in I knew it would be a challenging project. However, the solution I proposed was tractable in the desired time frame, building on a lot of existing parts and decomposing into manageable stages. The blockers were well understood, mainly involving subinterpreter bugs and PEP 432. I also received several solid offers for help. In October I even went to PyCon UK to coordinate some of the efforts.At the same time it became clear that my life was getting too busy to make much progress. In early 2016 I decided to table the project. I wasn't giving up yet and hoped to get back to it. Furthermore, there were also many parts of the project that stand on there own as useful features. That's about where things are at right now.

The Details

This is where I try to summarize my proposal and relevant information.