Monday, November 15, 2010

Running large radio telescope software on top of PyPy and Twisted

Hello.

As some of you already know, I've recently started working on a
very large radio telescope at SKA South Africa. This telescope's
operating software is written almost exclusively in Python (several high-throughput
pieces are in C or CUDA, or are executed directly by FPGAs). Some cool telescope pictures:

(photos courtesy of SKA South Africa)

Most of the operations software uses the KatCP protocol to communicate between devices.
The implementation currently in use is open source software with a custom, home-built
server and client. As an experiment, I implemented a Twisted-based
version and ran both the default implementation and the Twisted-based one
on top of CPython and PyPy to see how they perform.

There are two testing scenarios: the first tries to saturate the connection
by setting up multiple sensors that report their state every 10 ms; the second
measures the round trip between sending a request and receiving the response.
Both numbers are the number of requests per 0.2 s, so higher is better. The X axis shows the number of simultaneously connected clients.
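
To give a rough idea of what the second scenario measures, here is a minimal sketch using only the Python standard library. It is not the KatCP benchmark code: the echo server, the `ping` message and the function names are placeholders made up for illustration, and it only counts how many request/response round trips a single client completes in a 0.2 s window.

```python
import socket
import threading
import time


def start_echo_server():
    """Start a one-connection echo server on a free local port (hypothetical helper)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # client closed the connection
                    break
                conn.sendall(data)  # echo the request back as the "response"
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return port, thread


def count_round_trips(port, window=0.2):
    """Count request/response round trips completed within `window` seconds."""
    count = 0
    with socket.create_connection(("127.0.0.1", port)) as client:
        # Disable Nagle's algorithm so small requests are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        deadline = time.monotonic() + window
        while time.monotonic() < deadline:
            client.sendall(b"ping\n")  # placeholder request, not a real KatCP message
            reply = b""
            while not reply.endswith(b"\n"):  # read until the full line is back
                chunk = client.recv(4096)
                if not chunk:
                    raise ConnectionError("server closed the connection")
                reply += chunk
            count += 1
    return count


if __name__ == "__main__":
    port, thread = start_echo_server()
    print("round trips per 0.2 s:", count_round_trips(port))
    thread.join(timeout=1)
```

The real benchmark runs this kind of loop against the actual server implementations and with a varying number of connected clients; the sketch only shows the shape of the measurement.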

All benchmark code is available in the KatCP repository.

The results are as follows:

As you can see, in general Twisted has a larger overhead for a single client
but scales better as the number of clients increases. I think that's expected,
since Twisted has extra layers of indirection. The round-trip degradation of
Twisted still has to be investigated, but for us scenario 1 is by far the more important one.

Across the board, PyPy performs much better than CPython for both the
Twisted and the home-made implementation, which I think is a pretty good result.

Note: we haven't rolled this setup into production yet, but there is a good
chance that both Twisted and PyPy will be used in the near future.

Cheers,
fijal

You say that you mostly use Python and sometimes C, CUDA or FPGAs. I am writing my master's thesis in the Netherlands; it is about the efficient implementation of a beamforming algorithm (the one used by LOFAR) on modern GPUs using CUDA and OpenCL. Do you have any papers or other material about the telescope software? I would be really interested in citing it in the related work section.

I have a program using Python and Twisted in which I load tested both server and client connections (the program can speak both the server and the client side of the protocol). I tested both types up to 100 connections (at 50-millisecond polling intervals) while measuring CPU load.

What I found was that when acting as a server it scaled fairly linearly. When acting as the client, however, load rose to a peak at about 60 clients, then fell by a third until 80 clients, and then rose again until at 100 clients it reached the same load level as at 60. If you have a similar situation you may need to watch out for this phenomenon.

I also found that using the epoll reactor on Linux made a *big* difference to capacity in my applications, much more so than any normal program optimization efforts that I made. I have multiple clients and multiple server ports all running simultaneously, so I'm not sure how this may translate to your application if you are only using Twisted as a server.
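
For anyone who wants to try the same thing, a minimal sketch of selecting Twisted's epoll reactor might look like the following. It assumes Twisted is installed and that the platform provides epoll (Linux); the helper name `install_epoll_reactor` is made up for this example, and it simply falls back to whatever the default reactor is when epoll is not available.

```python
def install_epoll_reactor():
    """Try to install Twisted's epoll reactor; return True on success.

    Hypothetical helper for illustration. It must be called before
    anything imports `twisted.internet.reactor`, because that import
    installs the default reactor.
    """
    try:
        from twisted.internet import epollreactor
    except (ImportError, AttributeError):
        # Twisted is missing, or this platform has no epoll support.
        return False
    try:
        epollreactor.install()
    except Exception:
        # Most likely a reactor has already been installed.
        return False
    return True


if __name__ == "__main__":
    if install_epoll_reactor():
        print("using the epoll reactor")
    else:
        print("falling back to the default reactor")
```

How much this helps will depend on the number of open connections, so it is worth measuring rather than assuming.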

Here's a link to my project web site where I show the connections versus CPU load chart (first chart):

http://mblogic.sourceforge.net/mblogichelp/general/Capacity-en.html

I haven't tested this with PyPy as I don't have a combination of anything that is both 32-bit *and* new enough to run a recent version.

I also made the previous anonymous post on the 21st. I haven't been able to get the 64 bit JIT version to run or build. That may be my fault, but I haven't been able to test it (this isn't a problem that I want to waste your time on however).

I have tested the non-JIT PyPy using a simplified version of my server and client programs, using asyncore instead of Twisted. The server and client use a standard industrial automation protocol to talk to each other over a TCP socket. The programs also make heavy use of list slicing and struct.

The non-JIT version passes all the tests I have for the server, and runs my application performance test at roughly 1/3 the speed of CPython 2.6. This is very impressive, as I have never been able to get either IronPython (on Mono) or Jython to even run the programs, let alone pass my functional tests. The fact that PyPy (non-JIT) can run these programs perfectly without changes is something that I find very promising.

Please continue the good work, and thank you for what you've done so far!

I would be very interested if someone could provide some info on how to get Twisted working on PyPy. I have managed to install Twisted in the PyPy setup, but starting it produces: AttributeError: 'module' object has no attribute 'load_dynamic' coming from zope