Erlang for C, C++ and Java Programmers

This brief tutorial aims to cover the main concepts and syntax of Erlang in a way which programmers of C type languages can relate to. It is not a complete description of the language by any means but it does try to give a feel for the main conceptual differences in Erlang.

All comments are welcome – with feedback from readers this might even end up being useful!

1. Erlang is a small and easy to learn language

Erlang was designed from the beginning to be small and easy to learn – Ericsson had to train many C/C++ programmers in Erlang so it was part of the core brief to make it as easy as possible.

There are only a small number of concepts to master and most will already be familiar to C programmers.

2. Fundamental differences in approach

The biggest difference between programming in C and Erlang is in how programs are structured and in the fact that data is immutable – that is once created, you can't change it. You can only copy it and make required changes as you copy. (If you don't need the original any more, it is up to the garbage collector to discover this and reclaim memory.)

This makes the typical way of programming in C — using arrays, whose entries you keep changing, or records, with pointers to things that you keep changing — either wrong or potentially very inefficient.

So what does a program like Wings3D do, for structures (such as the winged-edge data structure) that have constantly changing subpieces? It would be too slow to copy the entire structure every time, say, you moved a vertex. The answer: extensive use of gb_trees and gb_sets. These are binary trees for holding dictionaries and sets, with the very useful property that when you copy them (in order to modify, add, or delete entries), you only change about log(number of elements) pieces of data; the rest of the trees are unchanged and can share the old data without copying.

So, for instance, where a C programmer might have a pointer to an edge struct inside a face struct, wings puts an integer index for the edge there, and has a separate gb_trees dictionary to map edge# -> edge record.

While this is no doubt still slower than the equivalent functionality in C or C++, there are benefits, the main ones being that you avoid a whole class of bugs, and that "Undo" becomes very easy to implement.

3. Data Types

Erlang has obvious basic data types such as floats, integers etc. There is no need to explicitly type data as Erlang variables carry the type of data round with them.

4. Compound data types

Erlang has a small selection of compound data types some of which will be familiar and some not.

Mostly where a C program would use a large or variable sized array (or a linked list), an Erlang program uses a linked list. Many (most?) Erlang functions and programs simply iterate over data held in linked lists (or other more complex data structures like gb_trees). Fortunately Erlang has special syntax for lists and provides nice ways to move around them. A linked list might be:

[1, 1.5, "Wings3D"]

An empty list looks like:

[]

There is another basic way to group data together and that is the tuple. This is a fixed size compound data type typically used to group small amounts of data together. For example the obvious representation of a 3D vector in Erlang is:

{1.0, 2.0, 1.5}

The final way of grouping data together is the record. This is similar to a struct in C. They must be declared before use in a compiler directive (starts with - ):

-record(vec, {x
,y
,z}).

Creating a record at runtime is done like:

#vec{x = 1.0
,y = 2.0
,z = 1.5}

5. Atoms.

Atoms are a feature not found in many languages but they are very useful in Erlang. An atom may be though of as a fixed label. They normally start with a lower case letter so:

6. Variables and Matching.

Variable assignment looks pretty much the same as in other languages but there is one very useful conceptual difference. Variable names must start with a capital letter (this is not it – patience!) so the following snippet works fine:

The useful difference is that variables are only assigned to once within any function, and the = operator doesn't in fact mean assign, it means "match with". This idea of "matching" is one of the very few important new concepts to get the hang of as it crops up everywhere and gives Erlang much of its elegance.

If a variable is fresh (or unbound) then matching it with some data has the effect of irreversably binding the variable to that value. After that attempts to change it will result in a runtime exception (useful for asserts amongst other things).

Matching is very powerful and also covers compound data types, so you can do:

{X, Y, Z} = {1.0, 2.0, 1.5}.

Here we match the tuple {X,Y,Z} against our vector, and assuming X,Y,and Z were fresh variables beforehand, after this statement they will each be bound to the value in the matching position. In other words we have extracted the 3 values from our Vector.

An equivalent example with records would be

#vec{x=X, y=Y, z=Z} = #vec{x=1.0, y=2.0, z=1.5}

7. Case and If

Now we have grasped the concept of matching we can see how exactly the same rules apply in two further Erlang constructs. Case stements are similar in form to C switch statements but are much more generic. For example study the following code snippet:

The case statement tries to match {X,Y,Z} (which in this case is the same as saying {1.0,2.0,1.5} as all the values are bound) against each case clause in turn. The first one which matches both form and content is selected, and the whole case statement evaluates to what comes after the -> (We'll do a bit more on this later).

So, first of all it tries to match {1.0, 2.0, 1.5} against the zero vector and fails. Then it tries to match it against {A,A,A}. This one looks a little odd – A is not yet bound so we might expect that A will match each of the values 1.0, 2.0, and 1.5. But this would break the rule that a variable can only hold one value. What happens is the variables are matched left to right – The first A is matched with 1.0 and thus bound to this value, then the second A (now bound to 1.0) is matched against 2.0, which fails and therefore this whole case clause fails.

Finally we try to match against the fresh variables X1, Y1, and Z1 – this succeeds and the result of the case is a new vector
{2.0, 3.0, 2.5}.

If statements work very much the same but they are more limited. XXXXX

8. Functions

You might think it odd to leave functions until now but they take advantage of the matching concept in a very powerful way. Here we also see another radical departure from C like langauges: a function in Erlang may have multiple definitions, and the one to be executed is selected by exactly the same matching rules as we saw in the case statement.

The first level of matching is based on the number of parameters, so the following are two different Erlang functions with the same name.

test(A) -> 1.

test(A, B) -> 2.

the -> construct should be read as "evaluates to". The final value in the function is the equivalent of the return value in C so the first one might look like:

test(int a) {
return 1
};

The second level of matching is between multiple function definitions with the same number of parameters. We can re-write an exact equivalent of our case statement using a single function with multiple definitions:

This would be the idiomatic way to write this in Erlang, and in fact (in R8B) is evaluated more quickly than the case statement.

In this type of function all the definitions must appear straight after one another in the source code file and be separated by semicolons. The final one is terminated by a full stop.

We also see here how to return multiple values from a function. There is no need to pass in pointers to a set of output variables to an Erlang function if you want to return multiple values – you just return a tuple. (In fact you can't affect the value of any variable passed into the function – Erlang functions are "side effect free" apart from one case we will touch on briefly when we talk about threads and message passing).

9. Iteration and Recursion

Now we get to the meat and gravy (or dhal and rice) of the Erlang programming model.

There is one new feature of lists we need to understand first though. That is the | operator. In a matching context this means "split a list up into two parts: the first element, and the rest of the list", otherwise it means "insert item at start of list". It is this which allows us to extract values from lists and make new lists.

So for example after a match:

[Head|Tail] = [1,2,3].

Head would be 1 and Tail would be [2,3]

Then if we did:

L = [Head|Tail].

L would be [1,2,3]

Ok, so much for that diversion. Let us say now that we have a list of vectors and we wish to add 1 to all the values. In Erlang we could use a recursive function like the following:

add([]) -> [];
add([{X, Y, Z}|T]) ->
[{X+1, Y+1, Z+1}|add(T)].

This is the last complicated thing we really need to understand to write complex Erlang expressions, so lets look in detail at our add function by working through an example. We will call our function with:

add([{1,2,3}, {10,11,12}, {20,21,22}])

The first time through add([]) doesn't match so it uses the second definition. This matches and as we enter the function X is 1, Y is 2, Z is 3, and T is the list [{10,11,12}, {20,21,22}]

This evaluates to the construction of a list with {2,3,4} at the head and the result of add(T) appended afterwards – it is starting to look like the result we want – a list starting with {2,3,4}, followed by the result of add([{10,11,12}, {20,21,22}]).

This is classic recursion – next time though we get {11,12,13} followed by the result of add([{20,21,22}] , next time {21,22,23} followed by add([]).

Short aside: if you take all the entries out of a list you get left with the empty list, similarly if you make a list by appending the empty list to a list you just get the list.

So when we call add([]) the first defintiion is evaluated which results in [] and breaks the recursion. Our result is

[{2,3,4} | {11,12,13} | {20,21,22} | [] ]

which by the above rules is simply [{2,3,4}, {11,12,13}, {20,21,22}]

It is worth experimenting with the | operator at the Erlang shell to get a handle on what it does.

There are a number of other iteration patterns or ways to write the same function. One very common version of this (which has the advantage that it executes in constant memory space) uses an accumulator parameter to build the result:

This pattern creates the list backwards so at the end we need to reverse the list.

10. Odds and Sods

It is possible to augment matching with guards. These may be used in function headers, case and if statements and are placed before the ->. Guard expressions must return either the atom true or false and must not have side effects.

There is a full list of valid guard expressions in the Erlang documentation

There is also a set of comparison operators including == which also return the one of the atoms true or false.

11. Global Data.

You may have been wondering how a program maintains state and how it is possible to have global data in an Erlang program when Erlang functions cannot change any data outside their own execution context. The short answer is that Erlang programs cannot have global variables in the same way a C program can.

They can however maintain state, and the structure in which state is maintained is.. wait for it.. a thread!

An Erlang thread is modelled as a recursive function which just keeps on recursing. The clever part is that it can be made event driven so it does not use CPU capacity while it is waiting. The state is held as the parameter(s) of the function. A thread might look like:

While keypresses are being received by the thread it carries on handling the keypresses and going back to wait at the receive statement for the next event.

In this way the state of an erlang system is distributed around all the threads which make up the system.

Threads communicate by sending messages which will be matched by the receive statement of the receiving thread. All threads have a Process ID or they can be registered with a name (which must be an atom). The message sending statement looks like:

Pid ! {keypress, enter}

So for example the Erlang system might have some process receiving raw binary data from the operating system upon each keypress. This process would decode the raw data and send the application thread the nice message {keypress, enter}.

This is exactly the sort of thing the esdl application does for Wings3D.

12. Modules

Each Source code file makes up one module. It must be named module_name.erl and have a -module(module_name). definition in the file. Module names have the same syntax as atoms.

To call a function in another module the format is:

module_name:function(Par1, Par2).

There are a huge range of modules which come with the Erlang system but one is special. The module erlang has a bunch of functions which are globally scoped and may be called from any module using the local form. These include the guard expressions so is_integer/1 is the same as erlang:is_integer/1

Some useful modules are maths and gb_trees.

13. Funs

Erlang has a feature where a generic function can be defined and assigned to a variable. It may be called any time later and when it is it 'remembers' the state of the variables at the time it was defined.

For example to define a function to add 1 to a number we could do:

Add = fun(Num) -> Num + 1 end. % Why not try it in the erlang shell

Then to use the function it is just:

Add(10).

An example of where we could see that a fun remembers variables would be this function which makes a fun to add any number to the parameter. e.g

This is another way to maintain state, and is used in Wings3D in a quite advanced way to keep track of which mode the gui is in. Wings3D hold a list of funs which it uses as a stack. When you right click on a menu the menu fun is added to the head of the list which is called to do it's stuff and afterwards removed from the top of the list revealing the previous state. This is tricky stuff..

Funs are also useful in iterating over lists and other data types. We can re-write our vector adding function using a standard function from the lists module which does the iteration for us. So: