I find postings like Joe’s highly valuable. Few among us are language designers, so rarely do those of us who aren’t get a first-hand account from someone who is as to why and how something within a given language’s design came to be. Joe describes the distribution primitives that Erlang provides as well as their composability; it might seem simple, but anyone who’s written non-trivial distributed computing infrastructure knows that choosing the right primitives and making the right design trade-offs is anything but simple. This explains why I continue to be so impressed with the design choices and trade-offs Joe and crew made for Erlang — I’ve simply never seen any distributed computing infrastructure so elegant and yet so practical and capable.

Responses

You’re right. Actually choosing the primitives was not simple. We tried (and implemented) dozens of different primitives and mechanisms for fault tolerance. Often things looked very good on paper, but when we implemented them we found that they were subtly wrong. Some things we thought were good turned out to be impossibly difficult to use. Sometimes mechanisms worked, but the implementation was so complex that we decided the benefit wasn’t worth the cost, so we threw away the implementation.

What survived this long process (which took five-odd years) were the mechanisms that were easy to implement and easy to use. We basically added new features all the time, keeping the features that users liked and removing features that nobody used or that were difficult to implement.

The result is a small set of primitives (spawn_link, trap_exits, etc.) that we know are useful and are relatively easy to implement.
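To make the two primitives named above concrete, here is a minimal sketch of how they compose. It assumes that "trap_exits" refers to the `process_flag(trap_exit, true)` call in standard Erlang; the module name `link_demo`, the function `run/0`, and the `worker_exited` tag are hypothetical names chosen only for this illustration.

```erlang
-module(link_demo).
-export([run/0]).

%% Hypothetical demo (not from the original comment): the calling
%% process traps exits and observes the crash of a linked worker
%% as an ordinary message instead of being killed by it.
run() ->
    %% Convert exit signals from linked processes into messages,
    %% remembering the previous flag value so it can be restored.
    Old = process_flag(trap_exit, true),
    %% Spawn a worker and link to it in one atomic step.
    %% The worker exits immediately with reason 'boom'.
    Pid = spawn_link(fun() -> exit(boom) end),
    Result =
        receive
            {'EXIT', Pid, Reason} ->
                {worker_exited, Reason}
        after 1000 ->
            timeout
        end,
    %% Restore the caller's original trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

Calling `link_demo:run()` should return `{worker_exited, boom}`. Without the `process_flag` call, the exit signal from the linked worker would terminate the caller too, which is the default behaviour that links provide.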

While these primitives work for Erlang, I doubt whether they work in general for other languages. When you look at the VM, things like GC, error handling, and process handling all interact in complex ways; having non-mutable state makes thread safety possible (not easy, but possible). How the non-thread-safe parts (ets, I/O) interact with the thread-safe parts also took many man-years to figure out (this was done by the OTP group members).

The trick is to make the difficult bits seem simple, but this is itself difficult and takes a long time to get right.

When I see statements on the net that “X has Erlang-like semantics” I think it usually means “X has lightweight processes and message passing” but probably not thread safety, cross-process distributed error handling, and dynamic code upgrade.

If and when these languages get to the point of handling errors and code upgrades in a manner similar to Erlang, they will be at the point where the Erlang community was about 15 years ago. All they will then need is a set of libraries that leverage the primitives and package them in useful forms.

This is also difficult. The OTP libraries were the third major rewrite of the set of infrastructure libraries that we put on top of the Erlang kernel. This is not done overnight.