Practical lock-freedom

This technical report is based on a dissertation submitted September
2003 by the author for the degree of Doctor of Philosophy to the
University of Cambridge, King’s College.

Abstract

Mutual-exclusion locks are currently the most popular mechanism for
interprocess synchronisation, largely due to their apparent simplicity
and ease of implementation. In the parallel-computing environments that
are increasingly commonplace in high-performance applications, this
simplicity is deceptive: mutual exclusion does not scale well with large
numbers of locks and many concurrent threads of execution.
Highly-concurrent access to shared data demands a sophisticated
‘fine-grained’ locking strategy to avoid serialising non-conflicting
operations. Such strategies are hard to design correctly and with good
performance because they can harbour problems such as deadlock, priority
inversion and convoying. Lock manipulations may also degrade the
performance of cache-coherent multiprocessor systems by causing
coherency conflicts and increased interconnect traffic, even when the
lock protects read-only data.

In looking for solutions to these problems, interest has developed in
lock-free data structures. By eschewing mutual exclusion it is hoped
that more efficient and robust systems can be built. Unfortunately the
current reality is that most lock-free algorithms are complex, slow and
impractical. In this dissertation I address these concerns by
introducing and evaluating practical abstractions and data structures
that facilitate the development of large-scale lock-free systems.

Firstly, I present an implementation of two useful abstractions that
make it easier to develop arbitrary lock-free data structures. Although
these abstractions have been described in previous work, my designs are
the first that can be practically implemented on current multiprocessor
systems.

Secondly, I present a suite of novel lock-free search structures. This
is interesting not only because of the fundamental importance of
searching in computer science and its wide use in real systems, but also
because it demonstrates the implementation issues that arise when using
the practical abstractions I have developed.

Finally, I evaluate each of my designs and compare them with existing
lock-based and lock-free alternatives. To ensure the strongest possible
competition, several of the lock-based alternatives are significant
improvements on the best-known solutions in the literature. These
results demonstrate that it is possible to build useful data structures
with all the perceived benefits of lock-freedom and with performance
better than sophisticated lock-based designs. Furthermore, and contrary
to popular belief, this work shows that existing hardware primitives are
sufficient to build practical lock-free implementations of complex data
structures.