Hi all,

Just a quick note to tell you that we are progressing on the
JIT front. Here are the running times of the richards
benchmark on my laptop:

8.18 seconds with CPython 2.5.2;

2.61 seconds with pypy-c-jit (3x faster than CPython);

1.04 seconds if you ignore the time spent making assembler (8x faster than CPython);

1.59 seconds on Psyco, for reference (5x faster than CPython).

Yes, as this table shows, we are spending 1.57 seconds in the JIT
support code. That's too much -- even ridiculously so -- for anything but a
long-running process. We are working on that :-)
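
Those speedup figures follow directly from the timings; here is a quick arithmetic sanity check, using only the numbers quoted above:

```python
# Sanity check of the speedup ratios quoted in the post (times in seconds).
cpython = 8.18          # CPython 2.5.2
pypy_jit = 2.61         # pypy-c-jit, total time
pypy_asm = 1.04         # pypy-c-jit, assembler execution only
psyco = 1.59            # Psyco

print(round(cpython / pypy_jit, 1))   # ~3.1x faster than CPython
print(round(cpython / pypy_asm, 1))   # ~7.9x faster than CPython
print(round(cpython / psyco, 1))      # ~5.1x faster than CPython
print(round(pypy_jit - pypy_asm, 2))  # 1.57 s spent in JIT support code
```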

as usual, wait a long time (and be sure you have more than 1GB of RAM).

For now pypy-c-jit spews a lot of debugging output and
there are a few known
examples where it crashes. As we like to repeat, however, it's a complete JIT:
apart from the crashes (the bugs are probably in the JIT support code), it supports the whole Python language from the start -- in the sense of doing correct things. Future work includes
Python-specific improvements, e.g. tweaking the data structures used to store Python objects so that they are more JIT-friendly.

EDIT: Oh yes, fijal reminds me that CPython 2.6 is 30% faster than CPython 2.5 on this benchmark (which is mostly my "fault", as I extracted a small part of PyPy and submitted it as a patch to CPython that works particularly well for examples like richards). It does not fundamentally change the fact that we are way faster though.

41 comments:

At this point, it would be a really good idea for the PyPy team to prepare downloadable binaries, setup tools, or eggs to make it extremely easy for a new user to try it out. Now that the performance is starting to become interesting, many more people will want to experiment with it, and you don't want that enthusiasm hampered by a somewhat involved setup process.

This particular benchmark happens to be the one we use; there is no deep reason besides its relative simplicity (but it is not a microbenchmark, it's really computing something). We will of course make more tests with a better range of benchmarks when things start to settle a bit. Right now we are busy developing, and the numbers change every week.
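
For anyone who wants to run comparisons of this kind themselves, here is a minimal timing harness; the workload is a stand-in for illustration, not richards.py itself:

```python
import time

def time_call(fn, repeat=3):
    """Return the best wall-clock time (in seconds) of `repeat` runs of fn()."""
    best = float("inf")
    for _ in range(repeat):
        t0 = time.time()
        fn()
        best = min(best, time.time() - t0)
    return best

def workload():
    # Stand-in workload: a small pure-Python loop. richards.py does much
    # more; this is just enough to exercise the harness.
    total = 0
    for i in range(100000):
        total += i * i
    return total

print("best of 3: %.3f seconds" % time_call(workload))
```

Taking the best of several runs is the usual way to factor out warm-up effects, such as the time a JIT spends generating assembler on the first pass.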

It's also for this reason that there is no nicely-packaged release, sorry :-) Note that translating your own pypy-c-jit is not a lot of work for you. It is just a lot of work for your CPU :-) You just do "svn co", "cd pypy/translator/goal" and "./translate -Ojit".

Perhaps there is some way to store generated assembler code? I don't know too much about assembler or the JIT backend, but I assume that it'd be possible to stick the generated assembler code into a comment (or, if those don't exist, a docstring) in the .pyc file, so that a library that is commonly imported won't have to waste time generating assembler.

@della: I use a 32-bit chroot on my own x64 machine. I don't know if that counts as "too much effort" (certainly it theoretically shouldn't require that), but it has been for me the most painless way to do it.

Here are prebuilt C sources (in which "tracing" time was reduced by 20-30% since the blog post):

http://wyvern.cs.uni-duesseldorf.de/~arigo/chain.tar.bz2

Linux x86-32 only. You still need a svn checkout of PyPy, and you still need to compile them with gcc -- but it does not take too long: edit the first entry of the Makefile to point to your checkout of PyPy and type "make". This still assumes that all dependencies have been installed first. Don't complain if the #includes are at the wrong location for your distribution; you would get them right if you translated the whole thing yourself. In fact, don't complain if it does not compile for any reason, please :-) C sources like that are not really supposed to be portable, because they are just intermediates in the translation process.

You are probably missing a dependency. See http://codespeak.net/pypy/dist/pypy/doc/getting-started-python.html#translating-the-pypy-python-interpreter

Dear Armin, it seems like this document should mention libexpat1-dev and libssl-dev as dependencies, too. Anyway, I managed to build pypy-c, and here are the results for some small benchmarks I wrote. (Is there a way here at blogger.com to not break the formatting?)

Thanks for the missing dependencies; added to the development version of the page. Thanks also for the numbers you report. The next obvious thing we miss is float support (coming soon!), which shows in some of your benchmarks.

Dear PyPy developers, is it possible to switch off the very aggressive JIT logging in pypy-c? First, this could make pypy-c a drop-in replacement for CPython. (Many more beta-testers.) Second, the logging itself seems to be somewhat resource-intensive.

Thanks for the report, della. Fixed, if you want to try again. Parsing gcc output is a bit delicate as the exact set of operations used depends on the specific version and command-line options passed to gcc.

Since the blog post, here are the updated numbers: we run richards.py in 2.10 seconds (almost 4x faster than CPython), and only spend 0.916 seconds actually running the assembler (almost 9x faster than CPython).

I was having the same problem as della, and your fix seems to work, but it's breaking somewhere else now. I don't think I have a dependency problem, I can build a working pypy-c without jit. Running make manually produces heaps of warnings about incompatible pointers, some probably harmless (int* vs long int*, these should be the same on x86-32), but others worry me more, like struct stat* vs struct stat64*, or struct SSL* vs char**. I put the complete output of a manual run of make online.

First off - congratulations and good job on the great progress. I've been watching this project since the 2007 PyCon in DFW and it's great to see these promising results.

That said, while I know there's still a lot of work to do and this is very much an in-progress thing, I'm very much looking forward to an excuse to try this stuff out in anger -- real practical situations. For me that means some statistical calculation engines (Monte Carlo analysis) fronted by web services. In both situations this brings up two constraints: a) it must support 64bit (because our data sets rapidly go above 4GB RAM), and b) it must not be overly memory hungry (because any significant incremental overhead really hurts when your data sets are already over 4GB RAM).

For now we use Psyco for small stuff but have to re-implement in C++ once we hit that 32-bit limit. PyPy is very exciting as a practical alternative to Psyco because of its anticipated 64bit support. I wonder if, given that Psyco already exists, PyPy shouldn't focus on 64bit first?

Few things would speed up progress more than getting PyPy used out in the wild -- even if only by those of us who appreciate that it's very much in flux but still understand how to benefit from it.

I understand you guys have your focus and goals and encourage you to keep up the good work. Just thought I'd throw this out as an idea to consider. I'm sure there are a lot like me anxious to give it a spin.

Andrew: can you update and try again? If you still have the .c files around it is enough to go there and type "make"; otherwise, restart the build. It should still crash, but give us more information about why it does.

There isn't a bug in the -Ojit translation process; I was just missing a dependency that I could've sworn I'd installed before.

The translation process takes less than 1GB of memory if done without any options. Attempting to translate with the -Ojit option takes at least 2.5GB of RAM: when I tried last night (with it as the only running process), it consumed my swap file and ran out of memory.

Is there any documented way to use a translated pypy binary to build other pypy translations? That might help reduce the build requirements, and would also be mighty cool.

Michael: It is written in RPython (a subset of Python) but then translated to C. By convention we therefore call the executable pypy-c. If the executable also contains a JIT, we call it pypy-c-jit.

Ben Scherrey: 64bit support might happen in the not-too-distant future. Not using too much memory is a different problem that might take a while longer. It has two aspects: one is that the JIT itself uses way too much memory at the moment. We will work on that soon.

The other aspect is making sure that your dataset does not take too much heap. It depends a bit on which data structures you use, but it's not likely to be that great right now. That might change at some point -- I have some ideas in that direction, but not really time to work on them soon.