I work for Red Hat, where I lead JBoss technical direction and research/development. Prior to this I was SOA Technical Development Manager and Director of Standards. I was Chief Architect and co-founder at Arjuna Technologies, an HP spin-off (where I was a Distinguished Engineer). I've been working in the area of reliable distributed systems since the mid-80s. My PhD was on fault-tolerant distributed systems, replication and transactions. I'm also a Professor at Newcastle University and Lyon.

Friday, April 06, 2012

Transactions and parallelism and actors, oh my!

In just 4 years' time I'll have spent 3 decades researching and developing transactional systems. I've written enough about this over the years to not want to dive into it again, but suffice it to say that I've had the pleasure of investigating a lot of uses for transactions and their variations. Over the years we've looked at how transactions are a great building block for fault-tolerant distributed systems, most notably through Arjuna, which with the benefit of hindsight was visionary in a number of ways. A decade ago, using transactions outside of the database as a structuring mechanism was more research than anything else, as was using them in massively parallel systems (multi-processor machines were rare).

However, things have changed. As I've said several times before, computing environments today are inherently multi-core, with true threading and concurrency, and all that that entails. Unfortunately our programming languages, frameworks and teaching methods have not necessarily kept pace with these changes, often resulting in applications and systems that are unreliable or brittle in the presence of concurrent access and, worse still, unable to recover from the failures that result.

Now of course you can replicate services to increase their availability in the event of a failure. Maybe use N-version programming to reduce or remove the chances that a bug in one approach impacts all of the replicas. But whereas strongly consistent replication is relatively easy to understand, it has limitations which have resulted in weak consistency protocols that gain things like performance at the expense of ease of use and application-level consistency (e.g., your application may now need to be aware that data is stale). This is why transactions, either by themselves or in conjunction with replication, have been and continue to be a good tool in the arsenal of architects.

We have seen transactions used in other frameworks and approaches, such as the actor model and software transactional memory, sometimes trading off one or more of the traditional ACID properties. But whichever approach is taken, the underlying fundamental reason for using transactions remains: they are a useful, straightforward and simple mechanism for creating fault-tolerant services and individual objects that work well for arbitrary degrees of parallelism. They're not just useful for manipulating data in a database, and neither are they to be considered purely the domain of distributed systems. Of course there are areas where transactions would be overkill or where some implementations might be too much of an overhead. But we have moved into an era where transaction implementations are lighter weight and more flexible than they needed to be in the past. So considering them from the outset of an application's development is no longer something that should be eschewed.
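To make the idea of a lightweight transaction that gives up one of the ACID properties a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the API of Arjuna, JBoss or any STM library: `TransactionalCell`, `Transaction` and `transfer` are hypothetical names invented for this example. It assumes purely in-memory state, so it provides atomicity and isolation but deliberately no durability, and it uses simple locking rather than an optimistic STM scheme.

```python
import threading


class TransactionalCell:
    """Hypothetical in-memory object whose state can take part in a transaction."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.RLock()  # re-entrant so the owning thread can read

    def read(self):
        with self._lock:
            return self._value


class Transaction:
    """Commit on normal exit, roll back on exception. No durability (assumption)."""

    def __init__(self, *cells):
        # Acquire locks in one global order so concurrent transactions cannot deadlock.
        self._cells = sorted(cells, key=id)
        self._snapshots = {}

    def __enter__(self):
        for cell in self._cells:
            cell._lock.acquire()
            self._snapshots[cell] = cell._value  # remember state for rollback
        return self

    def write(self, cell, value):
        if cell not in self._snapshots:
            raise KeyError("cell is not enlisted in this transaction")
        cell._value = value

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is not None:
                # Atomicity: undo every write made inside the block.
                for cell, old in self._snapshots.items():
                    cell._value = old
        finally:
            for cell in reversed(self._cells):
                cell._lock.release()
        return False  # let the original exception propagate


def transfer(src, dst, amount):
    """Move `amount` between two cells, or leave both untouched on failure."""
    with Transaction(src, dst) as tx:
        tx.write(src, src.read() - amount)
        tx.write(dst, dst.read() + amount)
        if src.read() < 0:
            raise ValueError("insufficient funds")
```

Any number of threads can call `transfer` on the same cells: a failed transfer leaves both values as they were, and other threads never see a half-applied update. Making the result survive a crash would mean adding a log or a persistent store, which is exactly the kind of property an application can choose to include or leave out.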