A Peek Inside the Erlang Compiler

Erlang is a complex system, and I can’t do its inner workings justice
in a short article, but I wanted to give some insight into what goes on
when a module is compiled and loaded. As with most compilers, the first
step is to convert the textual source to an abstract syntax tree, but
that’s unremarkable. What is interesting is that the code goes through
three major representations, and you can look at each of them.

Erlang is unique among functional languages in its
casual scope rules. You introduce variables as you go, without fanfare,
and there’s no creeping indentation caused by explicit scopes. Behind
the scenes that’s too quirky, so the syntax tree is converted into Core
Erlang. Core Erlang looks a lot like Haskell or ML with all variables
carefully referenced in “let” statements. You can see the Core Erlang
representation of a module with this command from the shell:

c(example, to_core).

The human-readable Core Erlang for the example module is written to example.core.

The next big transformation is from Core Erlang to code
for the register-based BEAM virtual machine. BEAM is poorly documented,
but it’s a lot like the Warren Abstract Machine
developed for Prolog (but without the need for backtracking). BEAM
isn’t terribly hard to figure out if you write short modules and examine
them with:

c(example 'S').

The disassembled BEAM code for the example module is written to example.S.
The key to understanding BEAM is that there are two sets of registers:
one for passing parameters (“x” registers) and one for use as locals
within functions (“y” registers).

Virtual BEAM code is the final output of the compiler,
but it’s still not what gets executed by the system. If you look at the
source for the Erlang runtime, you’ll see that beam_load.c is
over six thousand lines of code. Six thousand lines to load a module?
That’s because the beam loader is doing more than its name lets on.

There’s an optimization pass on the virtual machine
instructions, specializing some for certain situations and combining
others into superinstructions. To check if a value is a tuple of three
elements is accomplished with a pair of BEAM operations: is_tuple and is_arity. The BEAM loader turns these into one superinstruction: is_tuple_of_arity. You can see this condensed representation of BEAM code with:

erts_debug:df(example).

The disassembled code is written to example.dis. (Note that the module must be loaded, so compile it before giving the above command.)

The loader also turns the BEAM bytecode into threaded code:
a list of addresses that get jumped to in sequence. There’s no “Now
what do I do with this opcode?” step, just fetch and jump, fetch and
jump. If you want to to know more about threaded code, look to the Forth world.

Threaded code takes advantage of the labels as values extension of gcc. If you build the BEAM emulator with another compiler like Visual C++, it falls back on using a giant switch statement for instruction dispatch and there’s a significant performance hit.