Erlang Style Concurrency for .NET Applications Part 1 - CCR

Erlang allows for massively scalable concurrency, often with millions of lightweight, thread-like components known as actors. Unfortunately, using Erlang requires rewriting all of your legacy code in a rather esoteric language. But there are other options, such as the little-known CCR library that was developed by Microsoft's robotics group.

Actor-based languages such as Erlang achieve high degrees of parallelism by using the Actor model. Under this model the fundamental unit of concurrency is not a thread or fiber, but rather something much smaller. Known as a "process" in Erlang, each unit of concurrency has a base overhead of about 1200 bytes on a 32-bit system. By comparison, a thread on the Windows operating system defaults to 1 MB just for the stack; additional space is also needed for bookkeeping and thread-local storage. At those sizes, a million processes carry roughly 1.2 GB of base overhead, while a million threads would reserve about 1 TB of stack space alone. Because processes are so lightweight, an application can spawn literally millions of them simultaneously.

At any given time most processes will be idle. When a process receives a message, the platform assigns a thread to it so that it can respond to the message. This response may include creating new processes, sending messages to other processes, and/or changing its own state. Once the message is handled, the process either dies or blocks while waiting for the next message.

High levels of concurrency and performance are achieved via the message passing system. Each message is sent asynchronously, allowing for a high degree of independence among the processes. The messages also allow the platform to know which process to awaken. Since a process can be executed on any thread, there is a greatly reduced need for relatively expensive context switching.

The .NET answer to Erlang's model is the CCR, or Concurrency and Coordination Runtime. The CCR, originally pioneered for robotics, is already finding acceptance in the wider marketplace. One developer at Siemens Automation was able to integrate CCR into their current Blackboard code base in only a few days. Blackboard, which routes mail using AI agents and conveyor belts moving at 10 meters per second, has millions of lines of legacy code. Tyco, a security company that works with everything from small stores to the White House, was also able to integrate CCR within a week. These were not solicited case studies; both Siemens and Tyco did the bulk of the work without the assistance or knowledge of Microsoft.

The core of CCR is a new API-level concept known as "ports". Instead of calling methods on classes, developers wanting to perform an action post messages to a port. An arbiter attached to the port reads the messages and, if they satisfy certain criteria, bundles each message into a task. That task is then placed on a dispatcher queue to be executed by the thread pool at a later time.
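To make the port, arbiter, and dispatcher queue relationship concrete, here is a minimal conceptual sketch. It is written in Python rather than C#, and it is not the CCR API: the class names Port, Receiver, and DispatcherQueue, and all of their methods, are hypothetical stand-ins chosen for illustration. It also assumes a single receiver per port and runs tasks on the calling thread instead of a real thread pool.

```python
from collections import deque


class DispatcherQueue:
    """Holds tasks until something runs them (stand-in for a thread pool)."""

    def __init__(self):
        self.tasks = deque()

    def enqueue(self, task):
        self.tasks.append(task)

    def run_all(self):
        # Run queued tasks on the calling thread; tasks may enqueue more.
        while self.tasks:
            self.tasks.popleft()()


class Receiver:
    """Hypothetical arbiter: turns matching messages into tasks."""

    def __init__(self, queue, handler, predicate=lambda m: True):
        self.queue = queue
        self.handler = handler
        self.predicate = predicate

    def try_consume(self, message):
        if not self.predicate(message):
            return False  # message stays on the port
        # Bundle handler + message into a task; it does not run yet.
        self.queue.enqueue(lambda: self.handler(message))
        return True


class Port:
    """Message queue with at most one attached receiver (an assumption)."""

    def __init__(self):
        self.pending = deque()
        self.receiver = None

    def attach(self, receiver):
        self.receiver = receiver
        self._offer()

    def post(self, message):
        self.pending.append(message)
        self._offer()

    def _offer(self):
        if self.receiver is None:
            return
        kept = deque()
        while self.pending:
            message = self.pending.popleft()
            if not self.receiver.try_consume(message):
                kept.append(message)
        self.pending = kept


queue = DispatcherQueue()
port = Port()
handled = []
port.attach(Receiver(queue, handled.append, predicate=lambda m: m % 2 == 0))
port.post(3)     # fails the predicate, so it stays on the port
port.post(4)     # satisfies the predicate, so it becomes a queued task
queue.run_all()  # only now does the handler actually run
```

The point of the sketch is the separation of concerns: posting never runs the handler directly, it only causes a task to be queued, which leaves the decision of when and where to run it to the dispatcher.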

Arbiters form the basis of coordination primitives. They can be as simple as a single port receiver or a join/choice across multiple ports. They can even be composed when more complex logic is needed. However they are constructed, they ultimately have one purpose: to wake up a bit of code when data is received.
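The difference between a join and a choice can be sketched in a few lines. Again this is a Python illustration of the idea and not CCR's own API: the Port class and the join and choice functions below are hypothetical, and for brevity the handlers run inline instead of being placed on a dispatcher queue.

```python
from collections import deque


class Port:
    """Simplified port: a message queue plus callbacks re-checked on each post."""

    def __init__(self):
        self.items = deque()
        self.watchers = []

    def post(self, message):
        self.items.append(message)
        for watcher in list(self.watchers):
            watcher()


def join(port_a, port_b, handler):
    """Call handler(a, b) whenever BOTH ports hold a message."""

    def check():
        while port_a.items and port_b.items:
            handler(port_a.items.popleft(), port_b.items.popleft())

    port_a.watchers.append(check)
    port_b.watchers.append(check)
    check()


def choice(branches):
    """branches is a list of (port, handler); the first message wins, once."""
    state = {"done": False}
    checks = []

    def make_check(port, handler):
        def check():
            if state["done"] or not port.items:
                return
            state["done"] = True
            # Detach every branch so the losing ports are left untouched.
            for other_port, other_check in checks:
                other_port.watchers.remove(other_check)
            handler(port.items.popleft())

        return check

    for port, handler in branches:
        check = make_check(port, handler)
        checks.append((port, check))
        port.watchers.append(check)
    for _, check in list(checks):
        check()
```

A typical use of a choice is pairing a result port with an error port, so that exactly one of the two handlers runs for a given request. A join is useful when a piece of code needs two independent inputs before it can proceed.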

The real power of CCR comes from combining it with C#'s iterator syntax, "yield return". Yield return is a type of continuation, a way to pause a thread of execution and continue it later without having to suspend a real thread. Normally it is used only for iteration, but with CCR it can be extended for any type of asynchronous operation. The real beauty of it is that it does so without drastically changing your code.

This code sample from the 2008 PDC shows making an asynchronous call using a Port. Rather than blocking or using the asynchronous call pattern with its explicit callbacks, one simply uses a yield return statement to leave the function. Once the data is received and a thread is available, the function continues on the next line as if nothing happened.
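The same pause-and-resume idea can be illustrated with Python generators, whose yield plays a role similar to C#'s yield return. This is not the PDC sample and not CCR code: the Scheduler and Port classes and the handle_request function are hypothetical names used only for this sketch, which assumes a single-threaded scheduler and at most one waiting function per port.

```python
from collections import deque


class Scheduler:
    """Resumes parked generators when the port they wait on gets a message."""

    def __init__(self):
        self.ready = deque()  # (generator, value to send into it)

    def spawn(self, gen):
        self.ready.append((gen, None))

    def run(self):
        while self.ready:
            gen, value = self.ready.popleft()
            try:
                port = gen.send(value)  # run until the next yield
            except StopIteration:
                continue  # the function finished
            if port.items:
                # A message is already waiting: resume on the next pass.
                self.ready.append((gen, port.items.popleft()))
            else:
                port.waiter = gen  # park it; no thread is blocked


class Port:
    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.items = deque()
        self.waiter = None

    def post(self, message):
        if self.waiter is not None:
            gen, self.waiter = self.waiter, None
            self.scheduler.ready.append((gen, message))
        else:
            self.items.append(message)


def handle_request(reply_port, log):
    log.append("request sent")
    reply = yield reply_port    # leave the function here
    log.append("got " + reply)  # resumes here once a reply is posted


scheduler = Scheduler()
replies = Port(scheduler)
log = []
scheduler.spawn(handle_request(replies, log))
scheduler.run()          # runs up to the yield, then parks the function
replies.post("200 OK")   # makes the parked function ready again
scheduler.run()          # the function continues on the line after the yield
```

Note that handle_request reads top to bottom like ordinary blocking code, yet between the two run calls nothing is executing and no thread is held.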

If you follow all of the concurrency articles on this site, you'll see that the trend in computing is headed toward exploiting multiple processors and distributed computers. It looks like this CCR will put a nice layer of abstraction over threads, which is nice. I think having that level of abstraction is necessary because developers probably won't know the "right" number of threads that should be running over a given processor (I certainly don't), and things like the CCR can decide that for us: we make actors; the CCR spawns the ideal number of threads to run our actors.

However, it doesn't look like it addresses the other trend: distributed computing. In languages like Erlang and Scala, it's just as easy for me to message an actor locally as it is to message an actor remotely. That isn't mentioned here.

And to be honest, I don't like the syntax I see above. Pattern matching in Erlang and Scala is SOOO much more elegant.

I'd be interested to see what people's thoughts are on how the CCR addresses shared memory. It is the actors pattern, so messages are being passed back and forth, but C# doesn't have immutable data, so you'd have to be especially careful with it.

It's not enough. What about collections? Even an array is a reference type, to say nothing of lists and the others. Collections should be memory-optimized to be efficiently immutable.

I'm mainly a .NET developer, but I also agree that Erlang is much more convenient for distributed/parallel computing. At least so far...