Concurrency Improvements in HyperLevelDB

LevelDB is a popular data back end that Google originally developed as a
standalone key-value store library. Since Google open sourced LevelDB, it has
been widely adopted and modified by others, including Facebook, Basho, and us
at HyperDex. While LevelDB provides a solid foundation for building
data-intensive applications, there are many ways to improve its performance
for a variety of workloads.

In this article, we look at how some recent changes to HyperLevelDB, the
HyperDex fork of LevelDB, improve concurrency for multiple writers.

Improving Concurrency

Figure: LevelDB with one thread writing 128B values.

The various LevelDB forks provide decent performance with a single writer
thread. The single-thread graph shows the performance of each variant with one
writer that inserts 128-byte objects as fast as it can. HyperDex's and Basho's
forks reach about 275K operations per second with this single thread,
significantly higher than the more modest throughput of Google's LevelDB and
Facebook's RocksDB.

If you were building an application on LevelDB, you might be tempted to scale
it by adding threads. After all, modern servers, even virtual servers, have
multiple CPU cores available. Intuitively, we would expect a second writer
thread to increase the application's total throughput, but this is not the
case. The two-thread graph shows the aggregate throughput of two writer threads
inserting the same data used in the first benchmark.
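
To make the setup concrete, here is a minimal sketch of this kind of
multi-writer benchmark. It is not the benchmark behind the graphs:
SingleMutexStore, BenchResult, and RunWriters are hypothetical names, and the
store is a toy in-memory map guarded by one mutex rather than LevelDB itself.
It only illustrates how aggregate throughput across several writer threads can
be measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a key-value store whose writes are serialized by
// one global mutex. It is NOT LevelDB; it only mimics the locking pattern.
class SingleMutexStore {
 public:
  void Put(const std::string& key, const std::string& value) {
    std::lock_guard<std::mutex> guard(mu_);
    table_[key] = value;
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> guard(mu_);
    return table_.size();
  }

 private:
  std::mutex mu_;
  std::map<std::string, std::string> table_;
};

struct BenchResult {
  std::size_t total_ops;
  double ops_per_second;
};

// Runs `num_threads` writers, each inserting `ops_per_thread` distinct keys
// with 128-byte values, and reports the aggregate throughput.
BenchResult RunWriters(SingleMutexStore& store, int num_threads,
                       std::size_t ops_per_thread) {
  assert(num_threads > 0);
  const std::string value(128, 'x');
  std::vector<std::thread> writers;

  const auto start = std::chrono::steady_clock::now();
  for (int t = 0; t < num_threads; ++t) {
    writers.emplace_back([&store, &value, t, ops_per_thread] {
      for (std::size_t i = 0; i < ops_per_thread; ++i) {
        // Prefix keys with the thread index so writers never collide.
        store.Put(std::to_string(t) + ":" + std::to_string(i), value);
      }
    });
  }
  for (std::thread& w : writers) w.join();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;

  const std::size_t total =
      static_cast<std::size_t>(num_threads) * ops_per_thread;
  return BenchResult{total, static_cast<double>(total) / elapsed.count()};
}
```

Calling RunWriters with 1, 2, and 4 threads against the same kind of store
gives a rough feel for whether aggregate throughput scales. The numbers such a
toy produces say nothing about any real LevelDB variant.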

Figure: LevelDB with two threads writing 128B values.

The first three LevelDB variants actually show a decrease in overall throughput,
despite having twice the computing power available. Until recently, the
HyperLevelDB benchmark would have shown a similar decrease in throughput, but
as the graph shows, our recent optimizations increase throughput to over 350K
operations per second with the second thread.

As concurrency increases with additional threads, the other LevelDB variants
continue to see their performance degrade, while HyperLevelDB's performance
increases with each additional thread. The four-thread graph shows the
throughput when running four threads on our quad-core system. HyperLevelDB's
throughput is 2-4 times higher than that of the other variants.

Figure: LevelDB with four threads concurrently writing 128B values.

HyperLevelDB's performance stems from the following changes:

Reduce the time locks are held: Where possible, we hold locks for shorter
periods. The less time a thread spends holding a lock, the less likely it is
that another thread tries to acquire the same lock during that window.
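
As a rough illustration of shortening a critical section, the sketch below
appends records to a shared log in two ways. RecordLog and its methods are
hypothetical and are not taken from HyperLevelDB; the point is only that work
touching thread-local data (here, encoding the record) can be moved outside
the lock.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical append-only log; names are illustrative, not HyperLevelDB's.
class RecordLog {
 public:
  // Longer critical section: the record is encoded while the lock is held.
  void AppendHoldingLock(const std::string& key, const std::string& value) {
    std::lock_guard<std::mutex> guard(mu_);
    records_.push_back(Encode(key, value));
  }

  // Shorter critical section: encode first, then lock only to publish.
  void AppendShortLock(const std::string& key, const std::string& value) {
    std::string record = Encode(key, value);  // touches only local data
    std::lock_guard<std::mutex> guard(mu_);
    records_.push_back(std::move(record));
  }

  // Returns a copy of the log taken under the lock.
  std::vector<std::string> Snapshot() {
    std::lock_guard<std::mutex> guard(mu_);
    return records_;
  }

 private:
  // Length-prefixed encoding, e.g. ("k", "v") becomes "1:k1:v".
  static std::string Encode(const std::string& key, const std::string& value) {
    assert(!key.empty());  // this sketch assumes non-empty keys
    return std::to_string(key.size()) + ":" + key +
           std::to_string(value.size()) + ":" + value;
  }

  std::mutex mu_;
  std::vector<std::string> records_;
};
```

Both methods leave the log in the same state. The second simply holds the
mutex for the push_back alone, so competing writers spend less time waiting.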

Use fine-grained locking: LevelDB's design has a single mutex that protects
all internal state. Each thread acquires this mutex before modifying any
internal state, and only releases it when done. In HyperLevelDB, we have
switched to finer-grained locks so that more threads can manipulate the
internal state concurrently without any loss of safety.
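
One common way to apply fine-grained locking is to split state into shards,
each with its own mutex. The sketch below assumes a simple hash-sharded map;
ShardedMap, FillConcurrently, and the shard count of 16 are hypothetical
choices for illustration and do not describe how HyperLevelDB partitions its
internal state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical sharded map: writers that hash to different shards never
// contend on the same mutex.
class ShardedMap {
 public:
  void Put(const std::string& key, const std::string& value) {
    Shard& shard = ShardFor(key);
    std::lock_guard<std::mutex> guard(shard.mu);
    shard.table[key] = value;
  }

  bool Get(const std::string& key, std::string* value) {
    assert(value != nullptr);
    Shard& shard = ShardFor(key);
    std::lock_guard<std::mutex> guard(shard.mu);
    auto it = shard.table.find(key);
    if (it == shard.table.end()) return false;
    *value = it->second;
    return true;
  }

  // Locks one shard at a time, so the total is exact only when no writers
  // are running concurrently.
  std::size_t Size() {
    std::size_t total = 0;
    for (Shard& shard : shards_) {
      std::lock_guard<std::mutex> guard(shard.mu);
      total += shard.table.size();
    }
    return total;
  }

 private:
  static constexpr std::size_t kNumShards = 16;  // arbitrary for this sketch

  struct Shard {
    std::mutex mu;
    std::unordered_map<std::string, std::string> table;
  };

  Shard& ShardFor(const std::string& key) {
    return shards_[std::hash<std::string>{}(key) % kNumShards];
  }

  std::array<Shard, kNumShards> shards_;
};

// Usage example: several threads insert disjoint keys at the same time.
void FillConcurrently(ShardedMap& map, int num_threads, int keys_per_thread) {
  assert(num_threads > 0 && keys_per_thread >= 0);
  std::vector<std::thread> writers;
  for (int t = 0; t < num_threads; ++t) {
    writers.emplace_back([&map, t, keys_per_thread] {
      for (int i = 0; i < keys_per_thread; ++i) {
        map.Put(std::to_string(t) + ":" + std::to_string(i), "value");
      }
    });
  }
  for (std::thread& w : writers) w.join();
}
```

The trade-off is that operations spanning every shard, such as Size here, must
visit each lock in turn and no longer see a single atomic snapshot.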

Employ lock-free data structures: We've modified several internal
structures to be lock-free, eliminating the need for blocking
synchronization such as mutexes.
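
For readers unfamiliar with the idea, here is a minimal sketch of a lock-free
structure: a push-only linked list whose head is updated with an atomic
compare-and-swap instead of a mutex. LockFreeList and PushConcurrently are
hypothetical names, and this is not one of the structures changed in
HyperLevelDB. Nodes are never removed while the list is shared, which
sidesteps the memory-reclamation problems a full lock-free stack would have to
solve.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical push-only lock-free list (illustration only).
class LockFreeList {
 public:
  LockFreeList() : head_(nullptr) {}
  LockFreeList(const LockFreeList&) = delete;
  LockFreeList& operator=(const LockFreeList&) = delete;

  // Must run only after all other threads have stopped using the list.
  ~LockFreeList() {
    Node* node = head_.load(std::memory_order_relaxed);
    while (node != nullptr) {
      Node* next = node->next;
      delete node;
      node = next;
    }
  }

  // Publishes a new node at the head without taking any lock.
  void Push(int value) {
    Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
    // On failure, compare_exchange_weak stores the current head into
    // node->next, so the loop simply retries with the fresh value.
    while (!head_.compare_exchange_weak(node->next, node,
                                        std::memory_order_release,
                                        std::memory_order_relaxed)) {
    }
  }

  // Walks the list; safe alongside Push because published nodes are immutable.
  std::size_t Count() const {
    std::size_t n = 0;
    for (Node* p = head_.load(std::memory_order_acquire); p != nullptr;
         p = p->next) {
      ++n;
    }
    return n;
  }

 private:
  struct Node {
    int value;
    Node* next;
  };

  std::atomic<Node*> head_;
};

// Usage example: several threads push concurrently with no mutex involved.
void PushConcurrently(LockFreeList& list, int num_threads, int per_thread) {
  assert(num_threads > 0 && per_thread >= 0);
  std::vector<std::thread> writers;
  for (int t = 0; t < num_threads; ++t) {
    writers.emplace_back([&list, per_thread] {
      for (int i = 0; i < per_thread; ++i) list.Push(i);
    });
  }
  for (std::thread& w : writers) w.join();
}
```

A thread that loses the compare-and-swap race retries immediately rather than
blocking, so a stalled writer cannot hold up the others.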

Final Thoughts

HyperLevelDB is free and open-source. It provides an identical API to LevelDB,
and maintains compatibility with the on-disk format. If you are interested in
improving the performance of an application that uses LevelDB, you might want to
try dropping in HyperLevelDB.

Some additional resources that may be of interest:

level-hyper is the Node.js
HyperLevelDB wrapper. It enables you to use HyperLevelDB from within Node.js,
and take advantage of many of our optimizations. Internally, level-hyper uses
a thread pool to issue writes to the database and therefore takes advantage
of the concurrency improvements described above, even for single-threaded
Node apps.

HyperDex.org is the home of HyperDex, a distributed
key-value and document store built on top of HyperLevelDB. Many of our changes to
HyperLevelDB are driven by HyperDex.

If you like the improvements we're making to LevelDB, help us fund further
HyperLevelDB development. We provide support contracts for HyperLevelDB, so if
you use it in your application, consider supporting its continued development.