Monday, May 28, 2007

Cicada: Erlang Style Message Passing for SISC

Erlang - Always a contender

I'm a big fan of the Erlang programming language. I just love how it can be used to build large systems with relative ease.

If you spend just a few minutes dabbling with Erlang, you quickly learn that its power lies in letting you view the world as loosely connected, totally independent processes. This is a good thing, because it's exactly these characteristics that make for a massively scalable system.

But, at the end of the day, when I dabble with personal projects, I usually choose Scheme, and specifically SISC, over Erlang. I do this for a bunch of reasons, one of the main ones being easy access to existing Java APIs. It's just too valuable to be able to take advantage of advanced linguistic features and get access to mainstream libraries and code.

Termite - the best of both worlds?

Then along came Termite - an implementation of an Erlang style language on top of Scheme. Suddenly, I didn't need to choose between the features of Erlang and Scheme - I could have them both. This was good.

Except for the fine print - Termite is only available for Gambit-C. Gambit, I'm sure, is a terrific implementation of Scheme. But, as I mentioned above, I'm not ready to give up my free Java access.

Cicada - Just a bit of Erlang and Termite

Which brings us to Cicada - a really quick and dirty implementation of Termite, but for, you guessed it, SISC.

First, the bad news:

SISC doesn't have particularly lightweight processes. That's one of the cool aspects of Erlang and Gambit-C - you can make millions of simultaneous processes. This means that if you are modeling huge numbers of processes, you probably should be steering clear of SISC in general. But, with that said, SISC delegates threading to the JVM, which no doubt is getting more flexible all the time. And a really quick Google search turned up a bunch of options for scaling Java threads.

I've only implemented the concept of local processes. Termite offers support for spawning processes across nodes and sending messages between nodes (read: machines). SISC has full support of serializing all sorts of goodies, including continuations, so should I (or someone else, hint, hint) need this functionality, adding it shouldn't be a problem.

There are other features that are missing too - like support for connected processes. Again, when I need this functionality, I'll no doubt add it in.

I threw this all together in a weekend. This is as beta as it gets. Feel free to play with it at your own risk. I'm quite certain there are still concurrency issues in the code.

Now for some good news:

It does indeed work! You can actually model problems the same way using Cicada as you would with Termite or Erlang. You can spawn processes and exchange messages. There's even support for pattern matching of received messages.
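To make the spawn, send, and receive idea concrete, here is a minimal sketch in Python rather than Scheme. It is not Cicada's API: the names Mailbox, spawn, pong_server, and ping_once are all invented for illustration, and the "pattern matching" is approximated with a plain predicate function. Like SISC, it leans on ordinary OS-level threads, so it shares the "not particularly lightweight" caveat from above.

```python
import queue
import threading


class Mailbox:
    """A message queue with Erlang-style selective receive (illustrative only)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._skipped = []  # messages seen by receive() that did not match yet

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._queue.put(message)

    def receive(self, matches=lambda message: True, timeout=None):
        """Return the oldest message accepted by `matches`; keep the others.

        Must only be called by the thread that owns this mailbox.
        `timeout` applies to each wait; queue.Empty is raised if it expires.
        """
        for index, message in enumerate(self._skipped):
            if matches(message):
                return self._skipped.pop(index)
        while True:
            message = self._queue.get(timeout=timeout)
            if matches(message):
                return message
            self._skipped.append(message)


def spawn(body, *args):
    """Run body(mailbox, *args) in a new thread and return that mailbox."""
    mailbox = Mailbox()
    threading.Thread(target=body, args=(mailbox, *args), daemon=True).start()
    return mailbox


def pong_server(inbox):
    """Answer every ("ping", sender) with ("pong", inbox) until told to stop."""
    while True:
        tag, sender = inbox.receive()
        if tag == "ping":
            sender.send(("pong", inbox))
        elif tag == "stop":
            return


def ping_once(timeout=2.0):
    """Spawn a pong server, ping it once, and return the tag of its reply."""
    me = Mailbox()  # mailbox standing in for the calling "process"
    server = spawn(pong_server)
    server.send(("ping", me))
    tag, _ = me.receive(lambda m: m[0] == "pong", timeout=timeout)
    server.send(("stop", me))
    return tag
```

Calling ping_once() spawns the server, exchanges one ping/pong pair, and shuts the server down again. The skipped-message list is what lets a receiver wait for one kind of message without losing the others that arrive first.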

I was able to trace my way through the paper on Termite and recreate many of the examples, including some of the more sophisticated ones, like updating code in a running process.
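The idea behind updating code in a running process can be sketched in a few lines: the server loop treats its own behavior as just another value, and a special message replaces it. The Python below is only an illustration of that idea, not the Termite paper's code or Cicada's; the message shapes ("call", "update", "stop") and the names server and demo are assumptions made up for this example.

```python
import queue
import threading


def server(inbox, handler):
    """Serve "call" requests with `handler` until it is replaced or stopped."""
    while True:
        message = inbox.get()
        if message[0] == "update":
            # Swap in the new behavior; the loop and its queue keep running.
            handler = message[1]
        elif message[0] == "call":
            _, argument, reply_to = message
            reply_to.put(handler(argument))
        elif message[0] == "stop":
            return


def demo():
    """Call the server, replace its handler while it runs, then call it again."""
    inbox, replies = queue.Queue(), queue.Queue()
    threading.Thread(
        target=server, args=(inbox, lambda x: x + 1), daemon=True
    ).start()

    inbox.put(("call", 10, replies))
    before = replies.get(timeout=2)

    inbox.put(("update", lambda x: x * 2))  # new code arrives as a message
    inbox.put(("call", 10, replies))
    after = replies.get(timeout=2)

    inbox.put(("stop",))
    return before, after
```

Because the server handles messages one at a time and in order, the update takes effect between two requests, so no caller ever sees a half-replaced handler.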

I suppose the best news about Cicada is that even in the short time I've played with it, it's given me the chance to think about solving problems from a different perspective. It's like playing with OO for the first time.

Getting Access

You can download the source code for Cicada here. I've written it against SISC 1.16.6.

Your best bet for getting started is to read through com/ideas2executables/concurrent/cicada-test.scm, as well as the source code itself in com/ideas2executables/concurrent/cicada.scm.

I may post some examples on the blog - so stay tuned.

What's in a name?

Why name the package Cicada? Well, for one thing, I wanted to carry on the naming convention started by Termite of using bugs for this sort of thing. After a bit of poking around, I settled on cicadas as my bug of choice. They are actually remarkable insects - seeing as they manage to cool themselves by sweating, and around here only pop out of the ground every 13 years. And, just like this software package is missing a few features, cicadas are missing some of their own - like a mouth, for example.

6 comments:

Regarding the threading issue: if SISC continuations are cheap enough, maybe they can be used to implement a threading system on top of them. That's how Gambit-C threads work.

Also, I heard that Marc Feeley (the author of Gambit-C) has a student who is working on a portable implementation of Termite. It should be available later this summer. I hope we'll be able to code distributed applications using heterogeneous Scheme systems this way.

I am a bit puzzled by the main advantage of SISC over Erlang ... I have been working with Java for years and my humble impression is that Java fails miserably on the libraries side (compared to other languages).

I'd have to say SISC Scheme's main advantage over Erlang is that it runs in a JVM.

That's the situation I'm in right now: I want to use a better language than Java, but I have to play the "deployable" game. SISC's conformance to R5RS and native Java interaction is really killer, such that I can make an app or library callable from Java, and no one is the wiser that I'm using Scheme.

I like Erlang's concurrency model, and the particular problem I'm trying to solve fits very nicely into the domain. I'd like to explore it to see how closely it maps, but I can't deploy Erlang on Java.

Thanks to Termite and Ben with Cicada...this is getting a lot closer. If I have free time (yeah right!) I may look into using UBF for serialization between Termite or Cicada nodes to see if that will reduce some of the marshalling overhead.