2014.08.8 11:04

Understanding the difference between Erlang processes and OS processes can be a bit confusing at first, partly because the term “process” means something different in each case, and partly because the semantics of programming terms have become polluted by marketing, political and religious wars. A post to the Erlang questions mailing list asking why Erlang processes are so fast and OS processes are so slow reminded me of this today.

Erlang processes are more similar to the “objects” found in most OOP languages than the “processes” managed by an OS kernel, but have a proper message passing semantics added on in a way that abstracts the OS network, pipe and socket mechanisms. We wouldn’t be surprised if the Python runtime handled its objects with less overhead than the OS kernel handles a process, of course, and it should come as no surprise that the Erlang runtime handles its processes with less overhead than the OS kernel. After all, a Python “object” and an Erlang “process” are very nearly the same thing underneath.

Most OOP runtimes implement “objects” as a special syntactical form of a higher order function, one that forms a closure around its state, includes pointers to methods as a part of that state (usually with their own special syntax that abstracts the difference between a label, a pointer and a variable) and returns a dispatch function which manages access to its internal methods. Once you get down to assembly, this is the only way things work anyhow (and on von Neuman architectures there is exactly zero difference between pointers to data, pointers to data, instructions and pointers to a next instruction). If you strip that special syntax away there is no practical difference between directly writing a higher order function that does this and using the special class definition syntax.

Even in a higher language the higher-order functional nature of an “object”s class definition can be illustrated. For example, the following Python class and function definitions are equivalent.

We would be utterly unsurprised that both the class definition and the function definition return entities that are lighter weight than OS processes. This is not so far from being the difference between Erlang processes and OS processes.

Of course, the above code is ridiculous to do in Python either way. The whole point of the language is to let you avoid dealing with this exact sort of code. Also, Python has certain scoping rules which are designed to minimize the confusion surrounding variable masking in dynamic languages — and the use of a dictionary to hold the (X, Y) state is a hack to get around this. (A more complete example that uses explicit returns and reassignment is available here.)

For a more direct example, consider how this can be done in Guile/Scheme:

OOP packages for Lisps wrap this technique in a way that abstracts away the boilerplate and makes it less messy, but its the same idea. This can be done in assembler or C directly as well. Equivalent examples are a bit longer, so you’ll have to take my word for it. (A commented version of the Guile example above can be found here.)

While OOP languages typically focus on access to state and access to methods as state, Erlang focuses like a laser on the idea of message passing. Easy, universal access to state in OOP languages makes it natural to do things like share state, usually by doing something innocent like declaring a name in an internal scope that points to an independent object from somewhere outside.

Erlang forbids this, and forces all data to either be a part of a the definitions that describe the process (things declared in functions or their arguments), or go through messages. Combined with recursive returns and assignment in a fresh scope (akin to the last Python example in the extra code file) this means state is effectively mutable and side effects can occur without violating single assignment, but that everything that changes must change in an explicit way.

This restriction comes at the cost of requiring a sophisticated routing and filtering system. Erlang has an unusually complete message concept, going far beyond the “signals and slots” style found in some of the more interesting OOP systems. In fact, Erlang goes so far with the idea that it abstracts message, filters, a process scheduler and the entire network layer with it. And hence we have a very safe environment for concurrent processing — using “processes” that certainly feel like OS type processes, but are actually named locations Erlang’s runtime keeps track of in the same way an OOP runtime does objects, functions and other declared thingies. They feel like OS processes because of the way Erlang handles access to them in the same way that Java objects feel like my mother-in-law’s purse because of the way the JVM handles access to them — but underneath they are much more alike each other than either are to OS processes.

In the end, all this stuff is just long lines of bits standing in memory. The special thing is the rules we invent for ourselves that tell us how to interpret those bits. Within those rules we have various ways of declaring our semantics, but in the end the lines of bits don’t care if you think of them as “objects”, as “processes”, as “closures”, as “structs with pointers to code and data” or as “lists of lists with their own embedded processing rules”. OSes have particularly heavy sets of rules regarding how bits are accessed and moved around. Runtimes tend not to. Erlang “processes” are of a kind with Python “objects”, so we shouldn’t be surprised that they are significantly lighter weight than the “processes” found in the OS.