3.26.2008

The *Real* Erlang "Hello, World!"

This *is not* it:

-module(hello).
-export([start/0]).

start() -> io:format("Hello, World!").

I propose that the purpose of a "Hello, World!" program is to communicate something essential about the programming language in a small space. The program above does not achieve this - relative to implementations in other languages - predominantly because it omits anything to do with the Actor model, which is a core part of what makes Erlang interesting.

I propose that the following should be considered The Real Erlang "Hello, World!":

-module(hello).
-export([start/0]).

start() -> spawn(fun() -> loop() end).

loop() ->
    receive
        hello -> io:format("Hello, World!~n"), loop();
        goodbye -> ok
    end.

Let's dissect this example to see why. To run this program, install Erlang, fire up the Erlang REPL (erl.exe) and follow along.

First we compile and load the program with the command c(). Note that we omit the ".erl" file extension when referring to the module. Also note that I started erl.exe in the directory containing hello.erl, so I did not need to type in the full path.

1> c(hello).
{ok,hello}

Erl responds with ok and the name of the compiled module.

The start() function is the only function we can invoke in the hello module, because it's the only one that is exported, as per the module's export attribute. This is how Erlang implements encapsulation: the exported functions form the public interface of the module. Each entry in the export list is of the form name/arity, where name is the name of the function and arity is a formal way of saying "the number of arguments it takes".
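To make the name/arity idea concrete, here is a minimal sketch. The module export_demo and all of its functions are hypothetical and are not part of the hello program; they only illustrate how the export list separates the public interface from private helpers.

```erlang
-module(export_demo).
%% Two distinct exported functions that share a name but differ in arity.
-export([greet/0, greet/1]).

%% greet/0 takes no arguments and delegates to greet/1.
greet() -> greet("World").

%% greet/1 takes one argument; to Erlang it is a different function from greet/0.
greet(Name) -> format_greeting(Name).

%% format_greeting/1 is not in the export list, so it is private to this module.
format_greeting(Name) -> "Hello, " ++ Name ++ "!".
```

After compiling with c(export_demo), both export_demo:greet() and export_demo:greet("Erlang") can be called from the shell, whereas export_demo:format_greeting("Erlang") fails with an undefined function error because it is not exported.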

Invoke the start() function within the hello module, assigning the return value to a variable called Pid:

2> Pid = hello:start().
<0.36.0>

The spawn function returns a Pid - a Process Identifier - which is a first-class Erlang data type. We assign this return value to a variable of the same name. (We could just as easily have assigned it to a variable named Foo, but using Pid is fairly common.) Note that variables in Erlang need to start with an uppercase letter.

Erl responds by pretty printing the process identifier <0.36.0>; all valid expressions in Erlang have a return value.

At this juncture, if you try to assign any other value to Pid, you will get a badmatch exception. Once a value has been bound to an identifier, it cannot change: Erlang is a single-assignment language. The benefits of this paradigm include the ability for the compiler and runtime to make fancy optimizations, and it also greatly eases debugging because variables are immutable.
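The same behaviour can be reproduced inside a module. The sketch below is illustrative only: rebind_demo and try_rebind/2 are made-up names, and the try/catch is there purely so the badmatch can be returned as a value instead of crashing the caller.

```erlang
-module(rebind_demo).
-export([try_rebind/2]).

%% Bind X to First, then attempt to "assign" Second to the same variable.
%% Because X is already bound, X = Second is a pattern match, not an assignment.
try_rebind(First, Second) ->
    X = First,
    try
        X = Second,              % succeeds only if Second equals the bound value
        {ok, X}
    catch
        error:{badmatch, Value} ->
            {badmatch, Value}    % Value is the right-hand side that failed to match
    end.
```

Calling rebind_demo:try_rebind(1, 1) returns {ok,1}, because matching a bound variable against an equal value is allowed; rebind_demo:try_rebind(1, 2) returns {badmatch,2}.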

The Sharp End

The spawn invocation starts an Erlang process which wraps the loop() function just below it. (Note that Erlang doesn't impose any order of definition on functions). Erlang processes are the essence of programming in Erlang, and the essential missing element in simpler "Hello, World!" examples. Processes are the Erlang implementation of the Actor model: extremely lightweight concurrency primitives that communicate purely by message-passing. They have nothing whatsoever to do with operating system processes, threads or similar, and are managed entirely by the Erlang runtime.
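One way to get a feel for how lightweight these processes are is to spawn a large number of them and wait for each to report back. This is only a sketch: the module many_demo and the function run/1 are hypothetical, and the count you pass in is up to you.

```erlang
-module(many_demo).
-export([run/1]).

%% Spawn N processes, have each send one message to the parent,
%% then collect exactly one message per spawned process.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), done} end)
            || _ <- lists:seq(1, N)],
    %% Wait for each process in turn; the pattern {Pid, done} only
    %% matches the message sent by that particular process.
    lists:foreach(fun(Pid) -> receive {Pid, done} -> ok end end, Pids),
    length(Pids).
```

For example, many_demo:run(10000) spawns ten thousand processes and returns 10000 once all of them have reported in, something that would be far more costly with one operating system thread per task.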

The process waits (semantically at the receive statement) for a message which matches one of its receive clauses.

We can send a message to the process using its Pid, an exclamation mark (the message-send operator) and then the message. We can see that the receive block has two clauses, which match hello and goodbye respectively.

We invoke the code within the 'hello' clause by sending the corresponding message to our cached Pid:

3> Pid ! hello.
Hello, World!
hello

As we expect, our process responds with "Hello, World!". And as noted before, Erlang returns a value for all valid expressions; the value of a send expression is the message itself, which is why we see hello printed out immediately following the output of io:format.
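Because the send operator only hands back the message that was sent, it never carries a reply from the process. If the sender wants an answer, the receiver has to send one. A minimal sketch of that pattern follows; the module echo_demo, the function ask/1, the tuple shapes and the one-second timeout are all assumptions made for illustration and do not appear in the hello program.

```erlang
-module(echo_demo).
-export([start/0, ask/1]).

start() -> spawn(fun() -> loop() end).

loop() ->
    receive
        {From, hello} ->
            %% Reply to whoever sent the message, tagging the reply with our pid.
            From ! {self(), "Hello, World!"},
            loop();
        goodbye ->
            ok
    end.

%% Send hello along with the caller's pid, then wait for the tagged reply.
ask(Pid) ->
    Pid ! {self(), hello},
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        timeout                  % give up if no reply arrives within one second
    end.
```

With this variant, Pid = echo_demo:start() followed by echo_demo:ask(Pid) returns the greeting to the caller as a value instead of printing it.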

The next expression in the hello clause is a tail-recursive call back to loop(). In case you aren't completely familiar with tail recursion, you should know that tail recursion is the Bombay duck of computer science: there is no recursion going on, at least not in the sense of anything being left on the stack. Tail recursion is a means of efficiently calling the current function, and is more akin to a goto or a jump instruction than the terminology would have you believe.
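The difference is easiest to see by putting a tail-recursive function next to one that is not. The sketch below uses a hypothetical module, tail_demo, with made-up function names; it sums the integers from 1 to N in both styles.

```erlang
-module(tail_demo).
-export([sum_to/1, sum_to_body/1]).

%% Tail-recursive version: the recursive call is the very last thing done,
%% and the running total travels along in the accumulator argument.
sum_to(N) -> sum_to(N, 0).

sum_to(0, Acc) -> Acc;
sum_to(N, Acc) when N > 0 -> sum_to(N - 1, Acc + N).

%% Body-recursive version for contrast: the addition happens after the
%% recursive call returns, so every call has to keep a stack frame around.
sum_to_body(0) -> 0;
sum_to_body(N) when N > 0 -> N + sum_to_body(N - 1).
```

Both tail_demo:sum_to(100) and tail_demo:sum_to_body(100) return 5050, but only the first runs in constant stack space, which is what allows loop() in the hello module to run indefinitely.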

So, given the tail-recursive call back to loop(), the process is once again put back into the wait state. We could send the hello message to Pid ad nauseam and the process would simply repeat.

Now we send the goodbye message:

4> Pid ! goodbye.
goodbye

The crucial difference between this clause and the clause that matches the hello message is that this clause does not include a tail-recursive call back to loop(). As a result, the process effectively dies. We can confirm this by attempting to invoke the code in the hello clause once again:

5> Pid ! hello.
hello

And we see that no output is generated.
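The absence of output is indirect evidence. A more direct check is the is_process_alive/1 built-in function. The sketch below is a self-contained illustration with a hypothetical module name, alive_demo; it uses a monitor, which the hello program does not, purely so that the check happens after the process has definitely exited.

```erlang
-module(alive_demo).
-export([run/0]).

run() ->
    %% A process that waits for goodbye and then simply returns.
    Pid = spawn(fun() -> receive goodbye -> ok end end),
    Before = is_process_alive(Pid),
    %% Ask the runtime to notify us with a 'DOWN' message when Pid exits.
    Ref = erlang:monitor(process, Pid),
    Pid ! goodbye,
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            {Before, is_process_alive(Pid), Reason}
    after 1000 ->
        timeout
    end.
```

alive_demo:run() returns {true,false,normal}: the process was alive while waiting, is gone after handling goodbye, and exited with the reason normal because its function simply returned.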

The last important detail that I have omitted is the type of hello and goodbye. These are Erlang atoms, an extremely simple data type whose value is itself. Atoms are used heavily in message-passing (and other pattern-matching contexts) and are very easy to work with: you simply declare and go!

Re-Entry Checklist

Although the explanation has been verbose, I hope you agree that this Erlang "Hello, World!" communicates some interesting essentials of the Erlang programming language. These essentials concern in particular how Erlang implements the Actor Model, which is the kernel of its message-passing semantics and a key enabler for Erlang's capability for massively concurrent processing.

The original purpose of "hello world" was to have a suitably trivial item with which to check that one could successfully write, save, compile and execute a program at all in some new and unfamiliar programming environment. A new OS, or such like.

"Hello world" began as a smoke test for your ability to do any software development at all, and not a language tutorial. The point being that the program itself should be about the least complex one that would actually cause an observable side-effect in the world, so one could focus on the oddities of the environment.

Unfortunately you are forgetting one thing: while the call to io:format may look like your typical "run of the mill standard sequential language do it all in one process function call", internally there is enough concurrency and actor stuff going on under the hood to make most people happy.

That's what "Hello, world" may have been invented for, but that has changed. In this day and age, we can reasonably expect the activities you cite to succeed without inordinate effort on the programmer's part.

I would say that "Hello world" has now rather become a means to understand how to perform those activities, and to provide some minimal insight into the programming language.

To quote Wikipedia (which we all know is the definitive source of All Things Correct ;-)

"Experienced programmers learning new languages can also gain a lot of information about a given language's syntax and structure from a hello world program."

Glad you enjoyed it. Happy travels in Erlang-land!

---

@Robert Virding: "Unfortunately you are forgetting one thing [...] there is enough concurrency and actor stuff going on under the hood to make most people happy."

Great to have you by, Robert.

[Readers: Robert is the author of an Erlang book that predates the Prags book, and the author of Lisp-Flavored Erlang, a Lisp syntax front-end to the Erlang compiler. That means he totally kicks Erlang ass.]

Thanks for the heads up; I wasn't aware that io:format() was so sophisticated under the covers.

However, I wouldn't expect most hackers to read the implementation of io:format() any more than they would read the printf() implementation for a "Hello world" in C.

Hi Edward, I would suggest making the example clearer by not giving the same names to functions and atoms. For example, call the atoms that hello() accepts "greet" and "leave" instead of "hello" and "goodbye" (and another benefit of that is that you can have the function print "Goodbye world!" to the screen when it receives the "leave" message).
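For readers who want to try this suggestion, the renamed version might look something like the sketch below. The module name hello2 is made up here so that it does not clash with the original hello module; the atoms greet and leave and the farewell text follow the suggestion above.

```erlang
-module(hello2).
-export([start/0]).

start() -> spawn(fun() -> loop() end).

loop() ->
    receive
        greet ->
            io:format("Hello, World!~n"),
            loop();                          % keep waiting for more messages
        leave ->
            io:format("Goodbye world!~n")    % no call to loop(), so the process ends
    end.
```

Usage mirrors the original: Pid = hello2:start(), then Pid ! greet as often as you like, and finally Pid ! leave.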

Yes, that is absolutely true. Messages that do not match any of the receive clauses are moved to the process's save queue. When the process next receives a message, and if that message matches, then all the messages in the save queue are put back into the mailbox in the order in which they were received and reprocessed. The reason is that the next received message can change the state of the process in such a manner that messages that did not previously match would subsequently match (think guards, for one).

The save queue works this way to enable flexibility with regards to message processing. And of course you would not design your system to send messages without expecting them to be processed at some point!
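The observable effect of this behaviour can be sketched in a few lines. The module selective_demo is hypothetical, and the code does not inspect the save queue itself; it only shows that a message skipped by one receive is still there for a later receive that does match it.

```erlang
-module(selective_demo).
-export([run/0]).

run() ->
    Self = self(),
    %% Send two messages to ourselves; b is queued ahead of a.
    Self ! b,
    Self ! a,
    %% This receive only has a clause for a, so b is skipped and kept.
    First = receive a -> a end,
    %% b was not lost: a receive that matches it picks it up now.
    Second = receive b -> b end,
    {First, Second}.
```

selective_demo:run() returns {a,b}, even though b was sent first.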

I believe that this is definitely a good hello world tutorial for Erlang. With the number of languages and the increasingly large number of features they have, it is better to have a more verbose hello world that demonstrates more of the language syntax as opposed to a simple print statement.

I propose that hello world be broken into a two-step process. The first step would follow the original hello world standards, as in just printing something. This shows the basics and how to compile something in the language. The second step would be what you demonstrated here, a much more in-depth explanation of the language that has much more practical value to those who may actually use the language and stick with it.

I definitely found this to be of value, and it was much easier to understand than another Erlang tutorial I previously found.