Thursday, December 22, 2011

PyCon 2012 is coming up in just a few short months, and PyPy will be well
represented there. We'll be delivering a tutorial and two talks, and we'll be
around for the sprints.

Here are the abstracts for the tutorial and talks:

How to get the most out of your PyPy, by Maciej Fijalkowski, Alex Gaynor
and Armin Rigo: For many applications PyPy can provide performance benefits
right out of the box. However, little details can push your application to
perform much better. In this tutorial we'll give you insights on how to push
PyPy to its limits. We'll focus on understanding the performance
characteristics of PyPy, and learning the analysis tools in order to maximize
your application's performance. This is the tutorial.

Why PyPy by example, by Maciej Fijalkowski, Alex Gaynor and Armin Rigo:
One of the goals of PyPy is to make existing Python code faster; however, an
even broader goal was to make it possible to write things in Python that
previously would have needed to be written in C or another low-level language. This
talk will show examples of this, and describe how they represent the
tremendous progress PyPy has made, and what it means for people looking at
using PyPy.

How the PyPy JIT works, by Benjamin Peterson: The Python community is
abuzz about the major speed gains PyPy can offer for pure Python code. But how
does the PyPy JIT actually work? This talk will discuss how the PyPy JIT is
implemented. It will include descriptions of the tracing, optimization, and
assembly generation phases. I will demonstrate each step with an example loop.

If you have any questions let us know! We look forward to seeing people at
PyCon and chatting about PyPy and the entire Python ecosystem.

See you there,
Maciej Fijalkowski, Alex Gaynor, Benjamin Peterson, Armin Rigo, and the entire PyPy team

Thursday, December 8, 2011

Big fat warning: this is just a proof of concept. It barely works. There are
missing pieces left and right, which were replaced with hacks so I could get this
to run and prove it's possible. Don't try this at home, especially your home.
You have been warned.

There has been a lot of talk about PyPy not integrating well with the
current scientific Python ecosystem, and numpypy (a NumPy reimplementation
on top of PyPy) was dubbed "a fancy array library". I'm going to show that
integration with this ecosystem is possible with our design.

You need a PyPy without cpyext, because I did not find a linker that would
support overriding symbols. Right now there are no nightlies like this, so you
have to compile it yourself, like this:

./translate.py -Ojit targetpypystandalone.py --withoutmod-cpyext

That gives you a PyPy that's unable to load some libraries like PIL, but
works perfectly otherwise.

Speaking of which, you need a reasonably recent PyPy.

The approach is generally portable; however, the implementation has been
tested only on 64-bit Linux. A few tweaks might be required elsewhere.

You need to install python2.6, the python2.6 development headers, and have
numpy and matplotlib installed on that python.

You need a checkout of my hacks directory, with embedded put on your
PYTHONPATH; your pypy checkout also has to be on the PYTHONPATH.

Er wait, what happened?

What didn't happen is that we reimplemented matplotlib on top of PyPy. What
did happen is that we embedded CPython inside of PyPy using ctypes. We
instantiate it and follow the embedding tutorial for CPython. Since numpy arrays
are not movable, we're able to pass around an integer that represents the memory
address of the array data and reconstruct the array in the embedded interpreter.
Hence, with relatively little effort, we managed to reuse the same array data on
both sides to plot an array. Easy, no?
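
The address-passing trick can be illustrated with ctypes alone. This is only a minimal sketch, not the code from the hacks directory: the helper names `export_buffer` and `view_from_address` are made up for illustration, and a plain ctypes array stands in for the numpy array data. It assumes both sides live in the same process, so the integer address is valid on both.

```python
import ctypes


def export_buffer(values):
    """Copy values into a C double array; return (array, address, length).

    Hypothetical helper: plays the role of the side that owns the data.
    """
    buf = (ctypes.c_double * len(values))(*values)
    return buf, ctypes.addressof(buf), len(values)


def view_from_address(address, length):
    """Rebuild a view over the same memory from a plain integer address.

    Hypothetical helper: plays the role of the embedded interpreter,
    which receives only integers and no Python objects.
    """
    return (ctypes.c_double * length).from_address(address)


# Owning side: `buf` must stay alive for as long as the view is in use,
# because the view does not hold a reference to it.
buf, address, length = export_buffer([1.0, 2.0, 3.0])

# Receiving side: reconstruct the data from the two integers alone.
view = view_from_address(address, length)

# Writing through the view changes the original buffer: nothing was copied.
view[0] = 10.0
```

Because the receiving side needs only an address and a length, nothing has to be converted between the two interpreters' object models. The cost is that the owner is responsible for keeping the memory alive and in place, which is why the non-movable array data matters.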

This approach can be extended to support anything that's not too tied to
Python objects. SciPy and matplotlib both fall into this category,
but the same strategy can probably be applied to almost anything, like GTK or Qt.
It's just a matter of extending a hack into a working library.

To summarize, while we're busy making numpypy better and faster, it seems
that all external libraries on the C side can be supported using an embedded
Python interpreter with relatively little effort. To get to that point, I spent
a day and a half learning how to embed CPython, with very little prior
experience in the CPython APIs. Of course you should still keep as much as
possible in PyPy to make it nice and fast :)

Cheers,
fijal
