Friday, November 26, 2010

PyPy 1.4: Ouroboros in practice

We're pleased to announce the 1.4 release of PyPy. This is a major breakthrough
in our long journey, as PyPy 1.4 is the first PyPy release that can translate
itself faster than CPython. Starting today, we are using PyPy more for
our everyday development. So may you :) You can download it here:

What is PyPy

PyPy is a very compliant Python interpreter, almost a drop-in replacement
for CPython. It is fast (PyPy 1.4 and CPython 2.6 comparison).

New Features

Among its new features, this release includes numerous performance improvements
(which made fast self-hosting possible), a 64-bit JIT backend, as well
as serious stabilization. As of now, we can consider the 32-bit and 64-bit
Linux versions of PyPy stable enough to run in production.

More highlights

PyPy's built-in Just-in-Time compiler is fully transparent and
automatically generated; it now also has very reasonable memory
requirements. The total memory used by a very complex and
long-running process (translating PyPy itself) is within 1.5x to
at most 2x the memory needed by CPython, for a speed-up of 2x.

More compact instances. All instances are as compact as if
they had __slots__. This can give programs a big gain in
memory. (In the example of translation above, we already have
carefully placed __slots__, so there is no extra win.)
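
For readers unfamiliar with __slots__, here is a minimal sketch of what the comparison refers to. The class names and the helper function are hypothetical, made up for illustration only. On CPython, a regular instance carries its own __dict__, while a class that declares __slots__ stores a fixed set of attributes without one. The claim above is that PyPy 1.4 gives regular instances a similarly compact layout without the declaration; this sketch only shows the visible language-level difference, not PyPy's internal representation.

```python
class PlainPoint(object):
    """Regular class: attributes can be added freely at runtime."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint(object):
    """Declares a fixed attribute set, so instances have no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def accepts_new_attribute(obj):
    """Return True if an undeclared attribute can be set on obj."""
    try:
        obj.z = 0
    except AttributeError:
        return False
    return True


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

print(hasattr(plain, "__dict__"))      # regular instances expose a __dict__
print(hasattr(slotted, "__dict__"))    # slotted instances do not
print(accepts_new_attribute(plain))
print(accepts_new_attribute(slotted))
```

The trade-off of __slots__ is that undeclared attributes cannot be added later, which is why getting the same compactness automatically is attractive.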

Is there a -j <number-of-cores> option for the translation process? It's a bit unfortunate that 15 cores on the nice machine I'm using can't be put to use making it translate faster. (Or unfortunate that I didn't read the documentation, maybe.)

@Anonymous: A 5x improvement is not a well-defined goal, though it makes for good marketing. PyPy is 2x faster on translation, 60x faster on some benchmarks, and slower on others. What does it mean to be 5x faster?

This is awesome. PyPy 1.4 addresses the 2 slowest benchmarks, slowspitfire and spambayes. There is no benchmark anymore where PyPy is much slower than CPython.

To me, this marks the first time you can say that PyPy is ready for general "consumption". Congratulations!

PS: The best comparison to appreciate how much of an improvement 1.4 has been is: http://speed.pypy.org/comparison/?exe=2%2B35,1%2B41,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=2%2B35&chart=normal+bars

@maciej: In an old thread (have tracing compilers won?) you replied to Mike Pall saying that PyPy was in a way a middle ground, and that it didn't offer as many opportunities for micro-optimizations as LuaJIT.

You were discussing keeping high-level constructs from the user program in order to perform more tricks.

Has the situation changed? Do you really think now that you'll get there?

Anyway, LuaJIT has more options for micro-optimizations simply because Lua is a simpler language. That doesn't make it impossible for PyPy; it simply makes it harder and more time-consuming (but it's still possible). I still think we can get to where LuaJIT is right now (though predicting the future is hard), but racing Mike would be a challenge that we might lose ;-)

That said, even in simple loops there are obvious optimizations to be performed, so we're far from being done. We're going there, but it's taking time ;-)

Congrats to all PyPy developers for making huge contributions to Python performance, JIT and implementation research and delivering an end product that will help many developers to get more done.

IIUC, we still have ARM, jit-unroll-loops, more memory improvements, Python 2.7 (Fast Forward branch) and a bunch of other cool improvements in the works, besides some known interesting targets that will eventually be tackled (e.g. JITted stackless).

I wish more big Python apps and developers would play with PyPy and report the results.

Congratulations. However, you suggest people use it in a production environment. Please give us a version compatible with at least CPython 2.6. I hope you are planning that, and that you simply wanted a stable and fast base first. :)