An AspectJ library for fault tolerance

As a course project at McGill, I developed a little library for fault tolerance, written in AspectJ 5. It contains two components:

Support for automatic N-Version programming.

A full implementation of a recovery cache.

The fundamental concepts involved are explained in Joerg Kienzle’s lecture slides. N-Version programming lets you implement several different versions of the same algorithm. With the library, those versions are automatically run in parallel and synchronized, and a voter then decides the overall outcome of the concurrent computation (usually by majority vote). Here is some example code:

This snippet of code associates three versions with the same group. In the last line, where we kick off the computation, an around-advice will automatically start all three versions and then return the voted result instead of the original one.
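To make the run-in-parallel-then-vote idea concrete, here is a minimal sketch in plain Java, without any aspects. It is not the library's API: the class name `NVersionSketch` and the method `runAndVote` are made up for illustration, and the voter is assumed to be a strict majority vote that ignores versions which throw.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of N-Version programming; not the library's API.
final class NVersionSketch {

    // Runs all versions in parallel, waits for them, and returns the majority result.
    static <T> T runAndVote(List<Callable<T>> versions) {
        ExecutorService pool = Executors.newFixedThreadPool(versions.size());
        try {
            // invokeAll blocks until every version has finished (the synchronization point).
            List<Future<T>> futures = pool.invokeAll(versions);
            Map<T, Integer> votes = new HashMap<>();
            for (Future<T> future : futures) {
                try {
                    votes.merge(future.get(), 1, Integer::sum);
                } catch (ExecutionException e) {
                    // Assumption: a version that throws simply casts no vote.
                }
            }
            // Strict majority: more than half of all versions must agree.
            for (Map.Entry<T, Integer> entry : votes.entrySet()) {
                if (entry.getValue() * 2 > versions.size()) {
                    return entry.getKey();
                }
            }
            throw new IllegalStateException("no majority among versions");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for versions", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> versions = new ArrayList<>();
        versions.add(() -> 6 * 7); // version 1
        versions.add(() -> 42);    // version 2
        versions.add(() -> 41);    // version 3, deliberately faulty
        System.out.println(runAndVote(versions)); // prints 42
    }
}
```

In the library, the caller does not invoke anything like `runAndVote` explicitly; the around-advice does the equivalent work behind an ordinary method call.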

The recovery cache, on the other hand, lets you do checkpointing on the fly: just call the checkpoint method, make some state changes, and then, for instance when an exception is thrown, call restore(). All of the previous heap state will be restored for you.
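The checkpoint-and-restore cycle can be sketched in plain Java as an undo log. This is only an illustration under stated assumptions, not the library's implementation: `RecoveryCacheSketch`, `Cell`, and `set` are hypothetical names, and writes are routed through an explicit `set` method, whereas an aspect-based implementation could intercept ordinary field assignments instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of a recovery cache; not the library's API.
final class RecoveryCacheSketch {

    // A mutable cell standing in for a field on the heap.
    static final class Cell<T> {
        private T value;
        Cell(T initial) { value = initial; }
        T get() { return value; }
    }

    // Undo actions recorded since the last checkpoint, newest first.
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    // Starts a new recovery region by discarding older undo information.
    void checkpoint() {
        undoLog.clear();
    }

    // Writes through the cache: remember the old value, then update the cell.
    <T> void set(Cell<T> cell, T newValue) {
        T old = cell.value;
        undoLog.push(() -> cell.value = old);
        cell.value = newValue;
    }

    // Rolls back every write recorded since the checkpoint, newest first.
    void restore() {
        while (!undoLog.isEmpty()) {
            undoLog.pop().run();
        }
    }

    public static void main(String[] args) {
        RecoveryCacheSketch cache = new RecoveryCacheSketch();
        Cell<Integer> balance = new Cell<>(100);
        cache.checkpoint();
        try {
            cache.set(balance, 40);
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            cache.restore(); // undo the write made after the checkpoint
        }
        System.out.println(balance.get()); // prints 100
    }
}
```

Recording only the old values of locations that are actually written keeps the cost proportional to the number of changes rather than to the size of the heap.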

The library can be downloaded in the form of two Eclipse projects here: