Optimistic concurrency control differs from the use of locks fundamentally, in
that it is based on the assumption (observation) that in many systems,
the vast majority of operations are independent. To this end, it avoids the
expensive synchronisation and checkpointing necessary for transaction systems.
[Ref: Strom/Yemini].
The distributed system is divided into a number of logical recovery units.
Each of these periodically checkpoints its state, and continuously
logs the stream of messages arriving at it, in such a way as to be
able to roll back its state to the previous checkpoint.
The state intervals of the system are partially ordered
by a ``causal relation''. This is so that when there is a fault that
leads to some partial failure in the system, the task of rolling back
the whole system to a know state is a tractable one, since the system
is modularised.
The idea of optimistic concurrency control is that by analysing the
dependencies, one can generate ``commit guards''.
Committ guards are effectively the list of other recovery unit states that
this execution is dependent on. Roll back based on logged state is
only necessary if at some later stage it turns out one of the recovery
units failed, perhaps with a lost message. Usually, systems built
around this approach also engineer their networks to make message loss
a rare event. Thus this is really an approach for partitioning up the
recovery problem into smaller pieces so that the cost of keeping
enough information to carry out roll-back is not too high in any one
part..