An Introduction to Erlang

These days, the functional languages are all the rage. You see more and more hackers from the traditionally vanilla languages trying out things like Haskell or Scheme or OCaml. Breaking away from an imperative tradition forces us to think in a different way, which is always a good thing.

Recently, I've heard a lot about Erlang, especially from curious members of the Ruby community. This article is the result of my quick dive into the language, and will hopefully serve as a starting point for anyone else who's been hearing the buzz, but hasn't taken the plunge yet.

For this article, I'm assuming you're familiar with the general ideas behind functional programming, and that you have at least a conceptual grasp of concurrency. I'm also counting on you having at least an intermediate level of experience programming in any language. However, I'm not assuming you know any Erlang, so don't worry if this is the first time you've ever heard of the language.

Why Learn Erlang?

There are a lot of functional languages out there. All of them have their strong points and their foibles. Some seem wholly academic, while others are as pragmatic as the best object oriented languages out there. Choosing which languages to study is really a matter of figuring out what concepts you want to learn.

Erlang is known for an extremely elegant concurrency model. I've always been one to cringe at the mention of things like mutex locks, race conditions, and the entire motley crew of conceptual baggage that typically comes along with any sort of parallel programming.

Joe Armstrong claims in "Programming Erlang" that because Erlang is designed from the ground up for concurrency, it makes life a lot easier. This, along with the promise of a small, efficient, and well-thought-out language implementation, was enough to get me interested.

We'll start by going through the nuts and bolts of the language, and eventually ramp up to a simple concurrent program in Erlang that implements a basic chat system. Though it's far from fancy, it will show you how spinning off a few processes and getting them to communicate is almost trivial in Erlang.

Hello Erlang

It seems like screen I/O is something that often comes in the middle of functional programming texts, and though that's not necessarily a bad thing, a lot of us are used to the instant gratification of a "Hello World" program.

It's really nothing fancy, and if you run it as an Escript, it's quite simple:

#!/usr/bin/env escript
main(_) ->
    io:format("Hello World\n").

From the *nix shell:

$ chmod +x hello
$ ./hello
Hello World
$

What you're seeing there is a function definition for main(), which escript calls when the script is run. The call to io:format() is what does our screen output.

Of course, none of this is exciting. Let's take a look at a more functional-friendly primer: defining factorial.

example.erl
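A minimal sketch of such a module, assuming it is saved as example.erl and that factorial is written in the usual recursive way (the exact body is my assumption):

```erlang
-module(example).
-export([fact/1]).

%% Piece-wise definition: the base case first, then the recursive case.
%% Only meant for non-negative integers; a negative argument would never
%% reach the base case.
fact(0) -> 1;
fact(N) -> N * fact(N - 1).
```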

This code represents a trivial Erlang module, which gives functions a namespace to live in. From this example, you can see that Erlang allows for piece-wise function definition, which makes the syntax quite pleasing to the eye for most folks who have an interest in mathematics.

To run this code, we'll make use of the Erlang shell, compiling the module and then running some basic tests:
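Assuming example.erl sits in the directory where the shell was started and exports a standard recursive fact/1, a session might look roughly like this:

```erlang
1> c(example).
{ok,example}
2> example:fact(5).
120
3> example:fact(10).
3628800
```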

What we've done here is actually compile our module and then run some examples, live in the shell, against the compiled code. Being able to easily build your modules from within the Erlang shell is very nice; it makes you almost forget that you're working with a compiled language.

When we call c(example), Erlang looks for a file called example.erl in the current directory. When it finds it, it compiles it and writes the object code to a file called example.beam. You can then use the code freely from the shell, and can even recompile when you make changes.

In this example you can see we call example:fact(n) and get the proper results back. Seeing how the function is invoked, it's easier to explain what the export() line in our file means:

-export([fact/1]).

Here we are telling the compiler that we want the function fact() with one argument to be callable from outside the module definition. Without this line, we would not be able to call the function from the shell, because it would be private to the module.

In Erlang, a function definition with a different number of arguments is actually an entirely separate function, which is why we need to specify the number of arguments when exporting.

For example, this function shares nothing with our original fact definition but its name:
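As a sketch, here is example.erl with a purely illustrative fact/2 added next to fact/1. The two-argument function and what it returns are made up for this example; the point is only that the compiler treats fact/1 and fact/2 as unrelated functions, each needing its own export entry.

```erlang
-module(example).
-export([fact/1, fact/2]).

%% fact/1: the factorial function.
fact(0) -> 1;
fact(N) -> N * fact(N - 1).

%% fact/2: a hypothetical, unrelated function that just happens to share
%% the name. It wraps its two arguments in a tagged tuple.
fact(Subject, Claim) -> {fact, Subject, Claim}.
```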

Of course, we're starting to get ahead of ourselves. Let's now take a look at some of the common language elements and control structures in Erlang, so that we can begin to do something useful with them.

Elementary Erlang

Though we've looked at a couple of trivial examples of Erlang code and played around with the shell a bit, we should really go over the core set of features you need to know about when working with any programming language. Erlang definitely does some things differently, so it's worth mentioning some of the surprises, too.

Erlang Variables Are Not-So-Variable

Once you bind a value to a variable in Erlang, it cannot be changed.

In practice, this is what that means:

X = 3.

When you execute this code in Erlang, what actually happens is that the = operator compares the left-hand side to the right-hand side and tries to find a pattern that matches them. It finds that the left-hand side is an unbound variable and the right-hand side is a value, so it binds the variable to that value.

You'll actually get an error message if you try something like this:

X = 10.

You cannot change the state of a variable once it has been set. This prevents Erlang programs from relying on mutable state. Essentially, Erlang's notion of variables is closer to that of algebra than it is something like C or Java.

One limitation this imposes is that you can't do something like this in Erlang:

X += 1

Most state transformations that are done in functional languages are done by creating new values, often building them up recursively, so this issue isn't as much of a concern as it might seem at first.

Further Down The Bunny Hole, The = Operator Doesn't Mean Assignment

In Erlang, assignment is just one of the many things that you do with =.

In fact, when you write something like X = 3., it is interpreted as a pattern match, and the left and right sides are analyzed. On the left-hand side we find an unbound variable, and on the right side a value. Erlang figures out that the smart thing to do is bind that variable to the value, and that is when assignment actually happens. However, when you attempt to assign a new value to X, the pattern is different. In this case the left-hand side is recognized as a bound variable, and the match fails with an error unless the value on the right is exactly the one the variable is already bound to.
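Seen in a fresh shell, the sequence looks roughly like this. Matching X against 3 a second time is fine, because the bound value and the right-hand side agree; matching it against 10 is what fails:

```erlang
1> X = 3.
3
2> X = 3.
3
3> X = 10.
** exception error: no match of right hand side value 10
```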

We'll revisit this topic when we look at some other structures, but it's worth keeping in the back of your mind that when you see = in Erlang, you shouldn't immediately assume that it means assignment.

Atoms and Tuples

Atoms in Erlang are somewhat similar to symbols in Ruby, and are simply named elements that you can use as labels in your code. An unquoted atom starts with a lowercase letter and can contain alphanumeric characters, plus @ and _. The following are all examples of atoms.

1> A = foo.
foo
2> B = foo_bar.
foo_bar
3> C = foo@bar.
foo@bar

You can do equality comparisons on atoms:

4> A =:= foo.
true
5> B =:= foo.
false
6> C =:= foo.
false

However, they are mostly used in conjunction with tuples, which are fixed-size, ordered groupings of arbitrary values.

For example, we may have an address tuple, made up of other tuples and some atoms and text:
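As a sketch (the module name address_demo and all of the names and values in it are made up for illustration), such a tuple and a function that picks it apart might look like this:

```erlang
-module(address_demo).
-export([sample/0, city/1]).

%% A made-up address: nested tuples, each tagged with an atom.
sample() ->
    {address,
     {name, "Jane Example"},
     {street, "123 Example St"},
     {city, "New Haven"},
     {state, ct}}.

%% Pattern matching on the atom tags pulls out a single field.
city({address, _Name, _Street, {city, City}, _State}) -> City.
```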

In this example, the atoms are just used to help make it easier to know what the structure represents, and to make the patterns clearer. Though we won't cover them in this article, you'll want to have a look at Erlang's Records if you have a bunch of data with associative attributes.

Though you'll be sure to use plenty of atoms and tuples in your code, virtually all of your code will also make use of lists, so let's take a look at some of the common operations you can do with them.

Lists

Lists are collections of objects that have a head and a tail. The head of a list is the first element of the list in sequential order, and the tail is everything that's left over.
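Erlang's [H|T] pattern matches exactly that split. Here is a small sketch in a hypothetical module list_demo, with one function that returns the head and tail of a non-empty list and one that uses the same pattern to sum a list recursively:

```erlang
-module(list_demo).
-export([split/1, sum/1]).

%% Split a non-empty list into its head and its tail.
split([H | T]) -> {H, T}.

%% Walk the list recursively: the empty list is the base case.
sum([]) -> 0;
sum([H | T]) -> H + sum(T).
```

For example, list_demo:split([1,2,3]) gives {1,[2,3]}, and list_demo:sum([1,2,3]) gives 6.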

In a similar vein, you can also create anonymous function references to named functions:

6> example:call_with_five(fun example:fact/1).
120
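For that call to return 120, call_with_five only needs to apply whatever Fun it is handed to the number 5. A minimal sketch under that assumption, using a hypothetical module fun_demo rather than the example module itself:

```erlang
-module(fun_demo).
-export([call_with_five/1, fact/1]).

%% Apply the given Fun to 5 and return the result.
call_with_five(F) -> F(5).

%% Factorial, so there is a named function to pass around.
fact(0) -> 1;
fact(N) -> N * fact(N - 1).
```

Both fun_demo:call_with_five(fun fun_demo:fact/1) and an inline Fun such as fun(X) -> X * 2 end work as arguments here.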

You can actually do a lot more with Funs, but we'll stop here for now. It is worth noting that it is possible to create higher-order procedures by creating Funs which return Funs, but that's a bit out of the scope of this article.

For now, let's get back to list manipulations, where we'll use Funs to apply functions across all the elements of a list.

List Manipulations

Erlang comes with a bunch of handy ways to iterate over a list and do something with it. We'll take a look at how to loop over each element and perform an action, how to filter a list based on a boolean function, and also how to transform a list into a new list by applying a function to it element by element. From there, we'll take a look at List Comprehensions, which provide a more concise way to do several of these operations at once.

Basic Lists Module Features

Coming from any language with iterators, it's nice to be able to avoid manually traversing lists with indices. Erlang is no exception and makes such a task quite simple:
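Here is a sketch of the three operations using lists:foreach/2, lists:filter/2, and lists:map/2 from the standard lists module. The wrapper module lists_demo and the sample data are assumptions made for illustration:

```erlang
-module(lists_demo).
-export([run/0]).

run() ->
    L = [1, 2, 3, 4, 5],
    %% Perform an action on each element (here, print it).
    lists:foreach(fun(X) -> io:format("~p~n", [X]) end, L),
    %% Keep only the elements for which the Fun returns true.
    Evens = lists:filter(fun(X) -> X rem 2 =:= 0 end, L),
    %% Build a new list by applying the Fun to every element.
    Doubled = lists:map(fun(X) -> X * 2 end, L),
    {Evens, Doubled}.
```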

Mapping essentially says: for each element X in the list, give me the result of f(X). Many tasks will involve mapping a list of data based on a series of transformations, and this feature comes in handy for that.

We'll now take a look at a very powerful and syntactically elegant approach to the same problems, called List Comprehensions. These can help simplify your code and make it easier for you to quickly filter and transform a list.

List Comprehensions

A list comprehension allows us to simultaneously transform our data and filter it. Here are a few examples:
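As a sketch, wrapped in a hypothetical module lc_demo so the comprehensions can be compiled and called (the function names are mine):

```erlang
-module(lc_demo).
-export([doubled/1, evens/1, big_squares/1]).

%% Transform only: double every element.
doubled(L) -> [E * 2 || E <- L].

%% Filter only: keep the even elements.
evens(L) -> [E || E <- L, E rem 2 =:= 0].

%% Both at once: square the elements greater than 2.
big_squares(L) -> [E * E || E <- L, E > 2].
```

With the list [1,2,3,4,5], these return [2,4,6,8,10], [2,4], and [9,16,25] respectively.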

In each of these, you can see the general form: "f(E) for each E in L where the conditions are satisfied." The result of this is a filtered down and transformed list that doesn't require explicit creation of Funs or clever ways of constructing your code for efficient list processing.

List comprehensions are actually even smarter than what you see here; you can use a large range of predicates that help constrain your datasets. A simple example is shown here, but see the Erlang documentation for more details:

1> [E + 1 || E <- [1,"Kitten","Batman",5], is_integer(E) ].
[2,6]

You can actually also use the pattern for elements as a filter, as you can see here:

1> [ A || {A,B} <- [{a,b},{c,d},1,tomato,{e,f}] ].
[a,c,e]

All values that don't match the pattern are simply ignored in your list comprehension code.

Now that we've covered the basics of Erlang's elemental constructs as well as simple ways to traverse and manipulate lists, we're ready to take a look at a simple concurrent program.

Concurrency and Erlang

Though we've taken the express lane to get here, this article has covered all of the language constructs necessary to build simple concurrent programs. In a moment, we'll take a quick look at a simple chat program, which consists of two modules, a room, and users who log in to the room where they can communicate with others. Before we dive into those details however, we should talk about the basics behind Erlang's concurrency model.

Unlike languages built around shared mutable state, which need things like mutex locks to keep that state consistent, Erlang's relatively strict functional model makes concurrency inherently simpler.

This is combined with a very pleasant way of handling events, which are interpreted in Erlang as messages in a queue. Lightweight processes communicate with each other via messages, with all of the lower-level scheduling being done under the hood.

The simplest concurrent applications in Erlang spawn a process which runs in parallel with the other running processes. These processes then typically enter a receive loop, attempting to match patterns against messages that have been sent to the process.

A trivial example of this is a simple process which just prints out the messages it receives:

trivial_process.erl
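A minimal sketch of such a module. The start() and loop() names come from the description of this example; the "Received:" output format is my assumption:

```erlang
-module(trivial_process).
-export([start/0]).

%% Spawn a new process running loop/0 and return its pid.
start() -> spawn(fun loop/0).

%% Wait for any message, print it, then wait for the next one.
loop() ->
    receive
        Any ->
            io:format("Received: ~p~n", [Any]),
            loop()
    end.
```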

In this code, our start() function spawns a process which kicks off the loop() function. The loop then waits to receive any message, which it prints out. After it has finished printing the message, it starts the loop again, allowing it to process further messages.

Let's fire up the shell, create a process, and send some messages to it:
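A session might look roughly like this, assuming trivial_process:start() returns the new pid and the process prints each message with a "Received:" prefix (an assumed format). The pid shown is only a placeholder, and the printed line may appear before or after the shell echoes the value of the send expression:

```erlang
1> c(trivial_process).
{ok,trivial_process}
2> Pid = trivial_process:start().
<0.85.0>
3> Pid ! hello.
Received: hello
hello
4> Pid ! {some, "tuple"}.
Received: {some,"tuple"}
{some,"tuple"}
```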

It's really this simple to spawn a process that listens for messages and responds accordingly. Of course, we're not putting any restrictions on who gets to send messages to the process or doing any error handling, but the simplicity of this code is still quite impressive.

Let's use these same basic ideas to take a look at and dissect my simple chat program, listed below:
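What follows is a condensed sketch of a chat system along these lines, not a definitive listing. To keep it compilable as a single file, the room and the user are folded into one hypothetical module, chat_sketch, and the helper names (start_room, new_user, say) and the {accept, Text} and {say, Text} message shapes are assumptions. The registered name room, the add_user function, and the {From, add, MyName} pattern follow the discussion of the code.

```erlang
-module(chat_sketch).
-export([start_room/0, new_user/1, say/2]).

%% Start the room with no users and register it under the name 'room'.
start_room() ->
    Pid = spawn(fun() -> room_loop([]) end),
    register(room, Pid),
    Pid.

room_loop(Users) ->
    receive
        {From, broadcast, Text} ->
            %% Send the text to every user except the sender.
            [U ! {accept, Text} || U <- Users, U =/= From],
            room_loop(Users);
        {From, add, MyName} ->
            add_user(From, MyName, Users)
    end.

%% Announce the newcomer, then loop with the new pid at the head of the list.
add_user(From, MyName, Users) ->
    [U ! {accept, MyName ++ " has entered the room"} || U <- Users],
    room_loop([From | Users]).

%% Create a user process and join it to the room (the room must be running).
new_user(Name) ->
    Pid = spawn(fun() -> user_loop(Name) end),
    room ! {Pid, add, Name},
    Pid.

%% Ask a user process to say something to the room.
say(User, Text) ->
    User ! {say, Text},
    ok.

user_loop(Name) ->
    receive
        {say, Text} ->
            room ! {self(), broadcast, Name ++ ": " ++ Text},
            user_loop(Name);
        {accept, Text} ->
            io:format("[~s sees] ~s~n", [Name, Text]),
            user_loop(Name)
    end.
```

One practical note if you split this into separate files: Erlang/OTP ships its own module called user, so a name such as chat_user is a safer choice for the user side.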

This code implements the bare minimum for a functional chat application, which includes a room that can accept new users and broadcast messages, and a user which can say things or accept messages from the room.

Typically, it's best to use the process id that spawn returns for your message passing needs, because it is private and secure. However, in certain situations, you might want to register a process with a label so it is easily accessible. We have done this with room, and it allows us to send messages directly to this process using room ! some_message without knowing its explicit process identifier.

Our room module only matches two patterns. The first is a message which initiates a broadcast, where each user in the current list of users (except for the sender) is sent a message asking him to accept the broadcast text. A list comprehension is used to walk a list of Pids that point to users who have joined the room.

The {From, add, MyName } pattern and associated add_user function simply broadcast a message to all present room members that a new user has entered the room, and then restart the loop after adding the new user to the head of the list.

This is really all it takes to get basic functionality out of our room. The only important feature that has been left out to keep the example short is disconnecting users from the room.

When we look at the user.erl code, it is somewhat similar in nature: a simple processing loop that handles accepting and sending messages to other users via the room. You can see that when a user is created, she is autojoined to the room, which means the room must be started before any users can be created.

You can see how the room is acting as a server, broadcasting messages to its users. You can also see how as more users are added, the code simply does the right thing. Although we're just running these from the console with the same IO streams, these could conceivably be passing messages over a network and the code would work just as well.

I consider myself someone with a middling grasp of concurrency at best, and though I'd likely be scratching my head trying to write this in many other languages, I hacked this together in about 20 minutes with only a beginner's level of experience in Erlang. This certainly says a lot for the possibilities of the language to make parallel programming very easy.

Beyond the Basics

In this article, I've tried to cover as much of the essentials as possible without spending too much time on any one topic. There are some important beginner-level topics I've skipped, such as Records and Guards. I've also skipped a number of the control structures, such as the case expression, and have not exposed many of the builtin functions that come in handy when programming in Erlang. I've also left out the details about a couple of other surprises that are in store for people new to Erlang.

My best suggestion is to pick up Programming Erlang by Joe Armstrong and work your way through it. That's where I got most of the ideas and working knowledge to write this article, and it has much more in-depth examples than what you've seen here. You can also check out the #erlang channel on Freenode for some helpful folks to ask for help. The users there convinced me to use this chat example instead of a more trivial timer example that I originally planned, and I hope that is for the better.

As always, I've made the source available for everything you've seen in this article, as well as the example that didn't make the cut (see bomb.erl and bomb2.erl).

Seasoned Erlang hackers, please feel free to speak up if there are things in this article that have room for improvement. I'm mostly still at the exploring stage, and though I think this article will be helpful to others who are in the same ballpark, some notes from the experts can never hurt.

Gregory Brown is a New Haven, CT based Rubyist who spends most of his time on free software projects in Ruby. He is the original author of Ruby Reports.