Is Erlang actually fast? Why not just stick with C?

1082009

What’s the point of learning Erlang when I already know {your language here}? It will take a long time to learn the language and even longer to become good with it… I don’t want to learn a whole new library… Functional programming is hard… My text editor is super awesome at {your language here} and I don’t want to use new tools…

Would you like some cheese with that whine?

I’m going to show you a few ways Erlang excels, with a special focus on concurrent applications. I’ll even include some benchmarks to prove it. So sit down, be quiet, and give Erlang a chance!

As I mentioned a few posts ago, Erlang has many features that make it well suited to high concurrency. For large concurrent systems to be effective, a few things need to happen:

Fast thread/process spawning

Quick and efficient communication between threads and/or processes

No deadlocks or race conditions

Minimization of shared resources

Luckily for you, Erlang can do all of this easily!

Fast Thread/Process Spawning

When working in Erlang, you don’t use threads; you use Erlang processes. However, a distinction needs to be made. An Erlang process is not an operating system process, but rather is managed by the Erlang virtual machine. This means that Erlang processes are independent of the underlying operating system, so creating one avoids most of the overhead of asking the OS for a thread or process. The process itself is also much simpler, since it doesn’t have to deal with the OS, just the Erlang VM.

Spawning a process in Erlang is simple. Just do:

Pid = spawn(module, function, [arguments]).

That’s pretty simple, right? Compare that to calling CreateThread on Windows or pthread_create on Unix platforms from C. In addition to the more complicated function calls, you need to include headers and, for pthreads, pass a special argument to the compiler. *Blech* Erlang is much easier to deal with.
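To make the one-liner above concrete, here is a minimal sketch of a complete module that spawns a process and waits for its answer. The module name spawn_demo and the function greet/2 are made up for illustration; only spawn/3, self/0, and the ! operator come from Erlang itself.

```erlang
-module(spawn_demo).
-export([start/0, greet/2]).

%% Spawn a process running greet/2, then wait for its reply.
start() ->
    Parent = self(),
    Pid = spawn(spawn_demo, greet, [Parent, "world"]),
    receive
        {Pid, Greeting} -> Greeting
    after 1000 ->
        timeout
    end.

%% Runs inside the new process: build a greeting and send it back.
%% It has to be exported because spawn/3 looks it up by name.
greet(Parent, Name) ->
    Parent ! {self(), "Hello, " ++ Name}.
```

After compiling the module, calling spawn_demo:start() from the shell should hand back the greeting string. No headers, no compiler flags.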

Code clarity is one thing, but that’s not really important if it’s 10 times slower. Well, I prepared a little demo comparing Erlang and C. I’m going to use pthread_create to make threads rather than processes. Threads are lighter weight than processes, so this is actually better for C. You can find the demo files here.

The general point of this demo was that each program spawns 10,000 Erlang processes or 10,000 pthreads. Each process/thread then prints out “I’m a thread!” (It should really print ‘I’m an Erlang process’ for the Erlang test, but the extra characters would slow down the program, making the comparison unfair.) I recorded how much time it took to do this in C and in Erlang.
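If you want to try something similar yourself, the Erlang half of such a test could look roughly like the sketch below. This is not the original demo file, just my assumption of its general shape: the module name spawn_bench and its helper functions are hypothetical, and timer:tc/1 does the timing.

```erlang
-module(spawn_bench).
-export([run/1]).

%% Spawn N processes, wait until every one has reported back,
%% and return the elapsed time in microseconds.
run(N) ->
    {Micros, ok} = timer:tc(fun() -> spawn_and_wait(N) end),
    Micros.

spawn_and_wait(N) ->
    Parent = self(),
    lists:foreach(fun(_) -> spawn(fun() -> worker(Parent) end) end,
                  lists:seq(1, N)),
    wait_for(N).

%% Each process prints its line and then tells the parent it is done.
worker(Parent) ->
    io:format("I'm a thread!~n"),
    Parent ! done.

%% Collect exactly N 'done' messages.
wait_for(0) -> ok;
wait_for(N) ->
    receive
        done -> wait_for(N - 1)
    end.
```

Calling spawn_bench:run(10000) gives a number you can compare against a pthread version on your own machine. Expect it to differ from my figures, since it depends on hardware and on how fast the terminal can print.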

You can view the files that I used in the demo and the results I got here. The bottom line is this:

C: 4.91 seconds

Erlang: 2.07 seconds

Erlang is more than twice as fast! Wow! So in the category of creating threads, Erlang takes the cake.

Creating threads isn’t the only metric that needs to be considered though. We also need to take a look at how they perform. That’s what we’ll tackle next.

Message Passing

Threads are good if they can operate completely independently of each other, but what if you need them to communicate? As an example, let’s say you have a server with a front-end thread that receives data. This thread does a little processing and then passes the data along to another thread, based on the type of data. The data itself can be stored in a file, but to tell the proper thread to begin processing, we need to alert it by passing it a message.

In Erlang, passing a message is really easy. Are you ready for it?

ProcessID ! Message.

Dead simple. Can it really get any easier? But I hear a few of you saying, that’s great, but it’s a pain in the butt to get hold of a process ID!

Agreed, so why not give the process a name? You can do that with:

register(theMainThread, self()).

From now on, you can send messages to theMainThread, rather than a process ID.
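Here is a small sketch that puts sending, receiving, and registering together. The module name register_demo, the worker/0 function, and the {data_ready, ...} message shape are all invented for this example; register/2, unregister/1, and receive are the real thing.

```erlang
-module(register_demo).
-export([start/0, worker/0]).

%% Register the calling process under a name, start a worker,
%% and wait for the worker to report in.
start() ->
    true = register(theMainThread, self()),
    spawn(register_demo, worker, []),
    Reply = receive
                {data_ready, Info} -> Info
            after 1000 ->
                timeout
            end,
    %% Free the name again so start/0 can be called more than once.
    unregister(theMainThread),
    Reply.

%% The worker never sees a pid; it only knows the registered name.
worker() ->
    theMainThread ! {data_ready, "file.dat"}.
```

Notice that the worker is not handed a process ID at any point. The name is enough, which is handy for long-lived processes that lots of other processes need to find.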

I thought about doing another demonstration comparing runtimes of message passing, but Vijay Kandy has already done a good job here. His results are using Java, a language which also uses a virtual machine. As you can see by his results, Erlang is MUCH faster at passing messages around.

No deadlocks/race conditions

Deadlocks and race conditions are a huge source of headaches when building concurrent systems. In Erlang, you almost have to try to put them in your programs. This is because Erlang processes don’t share state (no global variables and such). Each process keeps its own data and talks to the others only through messages, which makes deadlocks and race conditions almost a non-issue.


I’m not going to lie though, there are a few cases where deadlocks and race conditions can occur. When you are working with the OS for file operations or I/O devices, these introduce possibilities of a deadlock or race condition, but this can be dealt with pretty easily. A mutex or semaphore is almost trivial to implement and can solve these issues.
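To back up the “almost trivial” claim, here is one way a mutex could be written as a plain Erlang process. Treat it as a sketch under my own assumptions: the module lock_demo and its message names are hypothetical, and it does not handle an owner that crashes while holding the lock (you would want a monitor for that).

```erlang
-module(lock_demo).
-export([start/0, with_lock/2, free/0]).

%% Start the lock process in its unlocked state.
start() ->
    spawn(lock_demo, free, []).

%% Unlocked: grant the lock to the first process that asks.
free() ->
    receive
        {acquire, Owner} ->
            Owner ! {granted, self()},
            held(Owner);
        stop ->
            ok
    end.

%% Locked: only the owner's release matches. Other acquire requests
%% simply wait in the mailbox until the lock is free again.
held(Owner) ->
    receive
        {release, Owner} -> free()
    end.

%% Run Fun while holding the lock, releasing it even if Fun fails.
with_lock(Lock, Fun) ->
    Lock ! {acquire, self()},
    receive
        {granted, Lock} -> ok
    end,
    try
        Fun()
    after
        Lock ! {release, self()}
    end.
```

You would wrap the risky file or device access in a fun and pass it to lock_demo:with_lock/2. The selective receive in held/1 does the queuing for you, so there is no explicit wait list to manage.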

Minimization of shared resources

In some programs you’ve written in another language, I’m sure you’ve used global variables or even *gasp* memory-mapped some piece of shared memory. This sure does make it easier to get your program to work, right?

Yes, but at what cost?

For each shared resource, you need a mutex, semaphore, or some other sort of locking mechanism. This introduces a ton of overhead that you don’t want to deal with. Shared memory also introduces the dangers of deadlocks and race conditions, as I mentioned above.

Due to the nature of functional programming, you rarely have any shared resources. This is because, at least in Erlang, variables are immutable. That is, once you assign a value to a variable, you cannot change its value. (Not really ‘variable’, is it?)

This sounds strange, but it is actually a very powerful concept. Instead of using a shared variable to pass information between threads, you would just send that value to another process using a message. You could either do this synchronously, by having the other process send the main process a request, or you could just send messages to the other process at a constant interval.

You can keep track of the data by having a recursive function carry it along in its arguments.
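A counter is about the smallest example of this pattern I can think of. The sketch below is my own illustration, so the module name counter_demo and the message names are assumptions, but the trick is the standard one: the loop function calls itself with the new value instead of overwriting a variable.

```erlang
-module(counter_demo).
-export([start/0, increment/1, value/1, loop/1]).

%% Start a counter process whose state begins at 0.
start() ->
    spawn(counter_demo, loop, [0]).

%% Asynchronous: fire off a message and carry on.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Synchronous: send a request and wait for the answer.
value(Pid) ->
    Pid ! {get, self()},
    receive
        {count, Pid, N} -> N
    after 1000 ->
        timeout
    end.

%% The "state" is just the argument of the recursive call.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {get, From} ->
            From ! {count, self(), Count},
            loop(Count);
        stop ->
            ok
    end.
```

Count is never modified. Each pass through loop/1 gets a fresh Count, and since only the counter process can see it, there is nothing to lock.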

This post focused on Erlang’s strengths as a concurrent language. It provides a better environment than Java or C for concurrent applications. Agreed, Erlang does require a different thought pattern and some added effort to learn the language, but the performance payoff is worth it.