12.29.2007

So after some nudging by a friend, I started digging into Erlang over the summer. I do a lot of work trying to make software systems scale with high availability, so I was instantly seduced by the claims that there are operational Erlang systems that have achieved nine 9's of uptime. I've worked on many systems that had SLAs of 99.99% (four 9's) and seen telecommunication systems that had to support seven 9's (mostly facilitated with massively redundant hardware like Tandem NonStop servers, costing 6 or 7 figures each). However, Erlang has achieved nine 9's (that's about 3 seconds of system interruption every 100 system years) on relatively commodity hardware. How? Well, there are a number of features that make this possible: it's a side-effect-free functional language; you can do online upgrades to software modules without impacting existing operations (i.e. the old version continues to run while the new version handles new requests); and truly transparent distributed computing is possible through inter-node message passing. But on top of all this, Erlang is really a concurrency-oriented programming language that can take advantage of the coming CPU core densities.
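For anyone who wants to check the "nines" arithmetic, here is a minimal sketch in Python. The function name downtime_seconds and the 365.25-day average year are my own assumptions for illustration; nothing here comes from an Erlang tool or an SLA standard.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed average year, leap days included


def downtime_seconds(nines: int, years: float = 1.0) -> float:
    """Allowed downtime, in seconds, for an availability of `nines` nines."""
    unavailability = 10.0 ** -nines  # e.g. 4 nines -> 0.0001 (99.99% up)
    return unavailability * SECONDS_PER_YEAR * years


if __name__ == "__main__":
    print(f"four 9's, 1 year:    {downtime_seconds(4) / 60:.1f} minutes")
    print(f"seven 9's, 1 year:   {downtime_seconds(7):.2f} seconds")
    print(f"nine 9's, 100 years: {downtime_seconds(9, years=100):.2f} seconds")
```

Under those assumptions, four 9's allows a bit under an hour of downtime per year, seven 9's allows about 3 seconds per year, and nine 9's allows about 3 seconds per 100 years.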

To get a little deeper on Erlang and see if anyone was really using it beyond Ericsson (who commissioned the development of Erlang 20 years ago), I packed my bags and headed to Freiburg, Germany for the International Conference on Functional Programming, and especially the Erlang Workshop. What I found was that telcos, ISPs, startups, and financial institutions were all actively using Erlang. I also had the opportunity to meet Joe Armstrong (the father of Erlang and the author of the Erlang book). We had a 30 minute conversation on Erlang architecture patterns for large scale distributed systems across multiple sites. Joe is such a great guy, passionate and down to earth despite being rather brilliant.

The book, by the way, is excellent. I highly recommend that anyone with a Java or C# background who feels like they've conquered OO and multi-tier applications pick this up; it will open your eyes to another world of architectural possibilities.

Subsequent to the conference, I learned that Google is looking at Erlang, and Amazon is actively using it. Most recently I learned that Amazon has a limited beta of their Simple Database Service (SDS). While SDS is initially intended only for small scale DB requirements, if history is any gauge, just as EC2 now offers 8 core instances, I suspect SDS will have an 8-way option by 2009. This could be game-changing technology, just as EC2 and S3 are proving to be. Not only that, it's a huge deal for Erlang, since SDS is written in Erlang!

The most exciting thing to me about Erlang, though, is the potential for organic scalability in the face of an increasing density of likely slower CPU cores. At the conference, a guy from Intel who heads up developer relations said that we may expect 8 core and possibly 16 core CPUs in 2008, 32 cores in 2010, and 100 cores as soon as 2012. There's a programming crisis coming with this, as the mutex / lock-based models of concurrency in leading languages like C++, Java, and C# fall on their faces around 25-30 channels of concurrency, or at best result in programs full of bugs. If you don't believe me, check out the Wide Finder project, where you'll see how these conventional languages stack up against Erlang. This threshold will arrive much sooner than 32 cores, for along with greater core density comes a greater density of hardware threads. In the 8 core generation we can likely expect to see 4 hardware threads per core (32 hardware threads per CPU). I'm guessing we can expect to see dual 8-core CPU 1U boxes out next year in the $5000-7000 range, meaning these will become commodity boxes in 2009, and that is when we'll likely start seeing some really challenging programming problems.

Erlang avoids these problems in three ways: first, by being a functional programming language free of side effects; second, through the use of an atomic model of concurrency based on message passing to autonomous processes; and third, by having an SMP scheduler that shows it will likely have linearly scalable performance up to 100 cores (as overheard at the Erlang Workshop, even Erlang will have to do some work to get beyond the 100 core hurdle).
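To make the message passing idea a bit more concrete, here is a rough sketch of the style in Python rather than Erlang. It is only an analogy under my own assumptions: the names counter_process and run_demo and the message shapes are made up for illustration, Python threads are far heavier than Erlang processes, and queue.Queue does use a lock internally. The point is that the application code never takes a lock itself, because only one "process" ever touches the state.

```python
import queue
import threading


def counter_process(mailbox: queue.Queue) -> None:
    """Owns `count` privately; the only way to reach it is to send a message."""
    count = 0
    while True:
        message = mailbox.get()        # block until a message arrives
        if message[0] == "increment":
            count += message[1]
        elif message[0] == "get":
            message[1].put(count)      # reply on the sender's own queue
        elif message[0] == "stop":
            return


def run_demo(senders: int = 8, messages_each: int = 1000) -> int:
    """Start one owner and several senders, then ask the owner for the total."""
    mailbox: queue.Queue = queue.Queue()
    owner = threading.Thread(target=counter_process, args=(mailbox,))
    owner.start()

    def sender() -> None:
        for _ in range(messages_each):
            mailbox.put(("increment", 1))

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                       # all increments are now in the mailbox

    reply: queue.Queue = queue.Queue()
    mailbox.put(("get", reply))        # queued behind every increment (FIFO)
    total = reply.get()
    mailbox.put(("stop",))
    owner.join()
    return total


if __name__ == "__main__":
    print(run_demo())
```

With the defaults, run_demo() should return 8000 every time, since the counter is only ever modified by the one thread that owns it. In Erlang the same shape would be a spawned process with a receive loop, and each process would be cheap enough to have many thousands of them.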

It should be noted that Erlang currently powers one of the highest performance web servers, and powers the most performant and scalable SIP server and Jabber server in the world. In practice, however, Erlang is best suited for plumbing and systems development (whole system development, as opposed to being a general purpose web application development platform, for instance). I am likely to tackle an Erlang project in the next year and will share my experiences here.