In this post I will look at synchronous vs asynchronous programming with Ruby’s EventMachine, to show that asynchronous does not always mean that your code will run faster.

In part 1 of this series on Ruby’s EventMachine I discussed the benefits of event-based programming in general. I am a big fan of event-based programming, as you will see in these posts, but I wanted to flip the coin over and look at one of the down-sides of event-based programming.

The Cost

Managing events does not come for free. There is overhead in wrapping code in callbacks, stashing context, queuing and deleting events, managing timer events, and communicating with the operating system.
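To make one of those costs concrete, here is a micro-benchmark of a single slice of event overhead: invoking work through a stored callback (a Proc) versus calling it inline. The numbers will vary by machine and Ruby version; this is a sketch of the idea, not a rigorous benchmark.

```ruby
require 'benchmark'

# Compare inline work against the same work dispatched through a callback.
N = 1_000_000

direct   = Benchmark.measure { N.times { |i| i + 1 } }
callback = proc { |i| i + 1 }
wrapped  = Benchmark.measure { N.times { |i| callback.call(i) } }

puts format('direct:  %.4fs', direct.real)
puts format('wrapped: %.4fs', wrapped.real)
```

The callback version pays for an extra method dispatch and a Proc frame on every call; an event loop pays costs like this on every event it processes.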

Exhibit A

With the example I am using for this post, we talk over a TCP network connection and perform a high number of transactions in a short period of time. If you read part 1, then you will know that this sounds like an ideal use-case for EventMachine to really shine.

The example I am going to use is Memcached. Memcached is fast. If you have a low-latency network connection to Memcached, then it is really fast.

Memcached, as its name implies, runs in memory, so the only things that are going to slow it down are being overwhelmed with network requests or some inefficiency in its algorithms (which are CPU-bound). Personally, I have never hit the upper bound of either of these, as there is always something else in my architecture which croaks first.

The Test

I wrote a test to see which would be faster: asynchronous Memcached communication using EventMachine, or synchronous Memcached communication using the memcached gem.
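The shape of the harness was roughly the following. This is a sketch: the real test issued many Memcached operations, but `sync_workload` and `async_workload` here are hypothetical stand-ins so the structure runs without a memcached server or the memcached/eventmachine gems.

```ruby
require 'benchmark'

ITERATIONS = 10_000

# Stand-ins for the two code paths under test.
sync_workload  = -> { ITERATIONS.times { |i| i.to_s } } # blocking memcached-gem calls would go here
async_workload = -> { ITERATIONS.times { |i| i.to_s } } # EM.run plus an async client would go here

Benchmark.bm(12) do |bm|
  bm.report('synchronous')  { sync_workload.call }
  bm.report('eventmachine') { async_workload.call }
end
```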

The Results

This report shows the user CPU time, the system CPU time, the sum of the user and system CPU times, and the elapsed real time. The unit of time is seconds. (from the Benchmark docs)
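For anyone unfamiliar with Benchmark's output, here is a minimal example producing those same four columns:

```ruby
require 'benchmark'

# The four values in a Benchmark report: user CPU time, system CPU time,
# their sum ("total"), and elapsed wall-clock ("real") time, in seconds.
tms = Benchmark.measure { 200_000.times { |i| Math.sqrt(i) } }

puts format('user=%.4f system=%.4f total=%.4f real=%.4f',
            tms.utime, tms.stime, tms.total, tms.real)
```

Note that `total` is CPU time, not wall-clock time; a process that spends most of its life waiting on the network can have a `real` value far larger than its `total`.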

So we can see from the above that the EventMachine-based version took about twice as long to run, used 8 times as much user CPU time, and used over 4 times as much system CPU time. That is quite significant.

Avoid EventMachine With Memcached?

This test needs to be put in context. The test was run on the same machine as Memcached, so the network latency was extremely low. EventMachine was not being used for anything else, and this script had no other tasks to perform in-between sending requests to Memcached and receiving the responses, so blocking was not an issue.

I could benchmark this and conclude that synchronous Memcached usage was the way to go. I would then roll it out to production, where Memcached is running on a different machine in a different data-center (please do not do this), and the latency would kill this synchronous script. Where you have latency and many requests, asynchronous event-based programming is usually going to win.
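Some back-of-envelope arithmetic shows why latency decides this. The round-trip times below are hypothetical but plausible: roughly 0.1 ms on localhost or a LAN, roughly 20 ms between data-centers.

```ruby
requests = 10_000
lan_rtt  = 0.0001 # seconds per round trip (hypothetical LAN figure)
wan_rtt  = 0.020  # seconds per round trip (hypothetical cross-data-center figure)

# Synchronous: every request waits for the previous response to return.
sync_lan = requests * lan_rtt # ~1 second
sync_wan = requests * wan_rtt # ~200 seconds

# Asynchronous, fully pipelined (ignoring processing time): the requests
# overlap in flight, so the total wait collapses to roughly one round trip.
async_wan = wan_rtt # ~0.02 seconds

puts format('sync LAN: %.1fs  sync WAN: %.1fs  async WAN: %.2fs',
            sync_lan, sync_wan, async_wan)
```

On localhost the synchronous penalty is negligible, which is exactly why the benchmark above favoured it; move the server one data-center away and the synchronous total explodes while the pipelined total barely moves.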

Therefore, if your context is one where this kind of synchronous model works better for you, performance is important, and you can be sure that things are not going to change, then maybe it is worth considering.

Suck It Up

Nearly every application I write now is event-based. I use EventMachine in Ruby and Tornado’s io_loop in Python. I write high-performance code and do everything non-blocking, because I do not want anything to halt my event-loop, ever.

I will gladly take a little overhead and fire up a new process or a new machine, if necessary, if it means that an external service like Memcached will not bring my event-loop to a halt when it has issues. It may be fast now, but one network glitch or Memcached crash may render my event-loop defunct. I might go from processing 10k requests per second to processing 1 per second, if I have a timeout of 1 second on one blocking network connection. So, yes, I will gladly suck up this asynchronous overhead in the short-term to protect from [expected] unexpected issues in the future.
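The arithmetic behind that "10k per second down to 1 per second" claim is worth spelling out: a single-threaded event loop that blocks inside a call with a 1-second timeout can complete at most one such call per second, no matter how fast it was before.

```ruby
timeout_seconds = 1.0
healthy_rate    = 10_000                # requests/second when nothing blocks
blocked_rate    = 1.0 / timeout_seconds # requests/second while every pass blocks
slowdown        = healthy_rate / blocked_rate

puts format('throughput drops %dx while the loop is blocking', slowdown)
```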

Can EventMachine Be Faster?

I am a believer that anything can be better, faster, stronger. EventMachine’s core is already heavily written in C++, which is itself a clear sign that its overhead is CPU-bound.

There is a pure-Ruby implementation of EventMachine. You can play with this to compare against the performance of the C++ implementation. In basic tests, you are unlikely to see a difference. The general benefit you get from an event-based system, when dealing with latency in disk and network I/O, far outweighs the overhead of the event-system itself. It is only when you start to hit it extremely hard that you will see the differences.
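One way to try the pure-Ruby reactor is shown below. This assumes EventMachine checks the `EM_PURE_RUBY` environment variable when it is first required, which has been the case in versions I have used, but verify it against your installed gem. The require is guarded so the snippet still runs when the gem is absent.

```ruby
# Opt into the pure-Ruby reactor before EventMachine is loaded.
ENV['EM_PURE_RUBY'] = 'true'

begin
  require 'eventmachine'
  puts EM.library_type # expected to report :pure_ruby under this setting
rescue LoadError
  warn 'eventmachine gem not installed'
end
```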

A faster EventMachine would be great, but it will make little difference when you are comparing asynchronous code against synchronous code. You can never fully escape the overhead that asynchronous code adds. Therefore, synchronous code will continue to look much faster in examples like the one above.

Event-based programming enables your application to utilize 100% of the CPU, because anything not CPU-bound can be passed off to the operating system. Therefore, if your code-base is truly event-based, you would only see the benefit of a faster EventMachine once you hit 100% utilization of the CPU.

Side note: I have hit some bugs when using the pure-Ruby implementation in my code, and the EventMachine test-suite was not passing for me when using pure Ruby. The test-suite is now passing with the latest HEAD of the git repository, so these might have been temporary issues, but it highlighted to me the order of priority between the C++ and pure-Ruby implementations.

It’s a very good example that EM is still far from being perfect.
This is mostly due to the poor memcached protocol implementation, which adds all these callbacks to an array and pops them off one by one.

There’s also a problem in the test itself: in evented style you never know which operation will finish first, and you cannot be sure that DEL is sent after SET and GET. So this should be better: