High Level Concurrency with JRuby and Akka Actors

Many developers are used to low-level concurrency primitives, such as locks, monitors and semaphores. Java also has higher level concurrency utilities such as Atomic Objects and Fork/Join framework. Such primitives still require a lot of attention to shared variables, and are very easy to get wrong. Ilya Grigorik recently discussed other models of concurrency on his recent post Concurrency with Actors, Goroutines & Ruby, where he even introduced a ruby port of Go‘s concurrency mechanism. These models attempt to make concurrent programming easier.

The actor model is another very simple high level concurrency model: actors can’t respond to more than one message at a time (messages are queued into mailboxes) and can only communicate by sending messages, not sharing variables. As long as the messages are immutable data structures (which is always true in Erlang, but has to be a convention in languages without means of ensuring this property), everything is thread-safe, without need for any other mechanism. This is very similar to request cycle found in web development MVC frameworks.

Scala is famous for coming with an Actor library built-in. However, using Scala libraries in Ruby is not easy[1]. Akka is another great project that implements Actors, however it has a java Api, which makes the JRuby integration easier. Why JRuby? Not only to access Akka’s actor library, but also because JRuby is one of the few ruby implementations that doesn’t have the GIL, therefore it allows true concurrency for all types of applications (IO bounded or not).

This simple actor will just output any message it receives prefixed with “!!! Acted on: “, and will throw exception on any message that is not a string. This example show how simple it is to define an actor: just define a onReceive method that is called whenever a message is sent.

The first gets an actor reference and starts it. It is important to note that you cannot create an actor just by invoking new. Not in java or scala (we can solve this in Ruby). This is because there is a lot of AOP going on the background[2]. The second line just sends the message to the actor asynchronously. There two other ways of sending messages, which I’ll not cover, but you can read more in Akka’s documentation.

The last two lines just give time to the message reach the actor (remember, the sendOneWay method is non-blocking), and stops all actors on the system. Pretty simple right? Let’s see how we can do the same in JRuby. Setting up the stage:

Here we have our first differences. The onReceive is just a cleaner version of the Java one. No type annotations, no type checking and a simpler string output. However, we have to define a classmethod called create, which just invokes new. This method seems to be created by the AOP part of Akka, which doesn’t seem to work on Ruby subclasses of UntypedActor. However, we can defined it ourselves. Now to actually using the actor:

Pretty much the same four lines as on the Java version, with a little less parenthesis, and a terser sleep method. The ruby code can be found on this page, and Java code here.

Fixing the Ruby Interface

Much of the code in the former example is infrastructure, but we can work around the static nature of Java classes in ruby. As factored out in akka.rb, we can gather this functionalities into a base class, and rewrite the first example in 8 lines:

But every Object is an Actor!

I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea. The big idea is “messaging”.

This is one of the reasons that Erlang with its actors form an object oriented language[3]

If we look into the resemblance of sendOneWay and the reflective method invocation, which in Ruby is made through send or __send__, it is quite easy to adapt the ruby method invocation to Actor message sending. We start with a simple delegator:

The important part is the onRecieve message, which takes a MethodParameters object and invoke on the target. The caveat here is that we need to override the new method, because, for some reason, ruby subclasses of Akka UntypedActors will not invoke the initialize method with the arguments passed[4]. However, by turning any object into an actor, this can be the only place such hack is needed.

Which is a pretty standard implementation of message forwarding in ruby: remove all instance methods (except the really private ones, such as __send__), making sure all method calls are forwarded to method_missing.

Making it faster

In the heart of all of this lies the problem: making code runs faster by using the machine’s cores more effectively. Here we build the good old canonical map-reduce example: word count. We will count 5.4 MB of Shakespeare‘s texts. The example consists of 3 types of actors: a producer, mappers, and one reducer. The producer generates the chunks of lines to the mappers, which count the words on each chunk and generate a hash of word:count pairs, which the reducer aggregates into a hash of its own.

This code setups up the code to later define WorkStealer, so that our map actors can share the same message queue. We also load the file in memory, split into 500 lines chunks. If the chunks are too small, the mappers will receive too many messages, which makes the code go slow. If the chunks are too big, the job will not be split evenly among the mappers[5].

These lines define the producer, start all actors, set the start time, and send the producer actor a message, which begins the map-reduce chain. It is important to note that the program’s main thread finishes on the last line. This example shows how It is possible to make it wait for the result and then resume the main thread (it requires using the other types of message sending methods, thus I’ll not cover it in detail).

Results: The sequential version runs on my machine (which has 2 cores) in about 4 seconds. This one with map-reduce actors take about 3 seconds, which yields a 25% improvement[6].

Conclusion

This post showed how it is easy to use Akka actors with JRuby and that they can easily enable thread-safe and easy to reason multicore programming. The Akka project has many other tools to help with distributed/parallel programming, such as remote actors, software transactional memory, and integrations with all sorts of persistence/queue systems. This post barely scratches the surface.

All the code on this blog post can be found on github, where all dependencies are easily available, and instructions on how to easily run the code. Give it a try, and see if you agree (or not) with others that writing parallel code can be much easier and fun.

Actually it’s a kind of 180 degree turn because I wrote a blog article that said “Why object-oriented programming is silly” or “Why it sucks”. I wrote that years ago and I sort of believed that for years. Then, my thesis supervisor, Seif Haridi, stopped me one day and he said “You’re wrong! Erlang is object oriented!”

This question is very application specific. It might be much more suitable to use Clojure or Scala, but it might not. Depends on what you are doing. And JRuby JITs to Java (bytecode actually), which JITs to machine code itself, so I’d do some testing before making any decision. The good part of actors is that they are very loose coupled, so you can mix and match different actors in different languages, only optimizing those that need it.

Haven’t had the time to do some tests concerning the performance of JRuby. Even with JRuby JIT it seemed sooo much slower than statically typed languages on the JVM (the benchmarks i found where 2 years old though; got any newer?). Dynamic invocation might change that picture though (https://twitter.com/#!/headius/statuses/63612212761198592), but there’s still the question wether it makes sense to use Ruby on JRuby when writing a messaging backend trimmed for speed. For wiring up the view it would make perfect sense though. Thanks again for your article!