Caribou

In this project we look beyond logical specialization of storage nodes for data processing applications and explore how physical specialization can offer benefits in throughput, latency, and beyond. Thanks to hardware pipelining, complex application-specific processing can be pushed down to the storage without impacting performance. The resulting system, Caribou, provides a key-value store interface common to many data processing applications and has a modular architecture that allows plugging in different application-specific processing units (e.g., complex filtering predicates for SQL queries).

Caribou's key-value store API and its ability to push down application-specific processing make it suitable for prototyping different kinds of smart storage. Since the processing logic is plugged into the key-value store using simple streaming interfaces, the design of compute units is greatly simplified and data management is already taken care of. This lowers the "entry barrier" to exploring new ideas.
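As a rough software analogy of this plug-in model, the sketch below shows a toy in-memory key-value store whose read path can be routed through a streaming processing unit. It is not Caribou's actual interface: the names `ToyStore`, `ProcessingUnit`, and `make_filter` are hypothetical, and a real processing unit would be a hardware pipeline rather than a Python generator.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

# Hypothetical interface: a processing unit consumes a stream of values
# and produces a stream of values, without knowing how data is stored.
ProcessingUnit = Callable[[Iterable[bytes]], Iterator[bytes]]


def make_filter(predicate: Callable[[bytes], bool]) -> ProcessingUnit:
    """Build a streaming unit that forwards only values matching predicate."""
    def unit(stream: Iterable[bytes]) -> Iterator[bytes]:
        for value in stream:
            if predicate(value):
                yield value
    return unit


class ToyStore:
    """Toy key-value store; data management stays here, not in the unit."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes, unit: Optional[ProcessingUnit] = None) -> Optional[bytes]:
        # A missing key becomes an empty stream.
        value = self._data.get(key)
        stream: Iterator[bytes] = iter(() if value is None else (value,))
        if unit is not None:
            stream = unit(stream)
        return next(stream, None)

    def scan(self, unit: Optional[ProcessingUnit] = None) -> List[bytes]:
        # Stream every stored value through the optional unit.
        stream: Iterator[bytes] = iter(self._data.values())
        if unit is not None:
            stream = unit(stream)
        return list(stream)


store = ToyStore()
store.put(b"item1", b"price=5")
store.put(b"item2", b"price=50")

# The predicate is pushed down: only matching values leave the store.
cheap = make_filter(lambda v: int(v.split(b"=")[1]) < 10)
matches = store.scan(cheap)
```

The point of the sketch is the separation of concerns: the unit only sees a stream of values, so swapping in a different predicate or transformation does not touch the storage logic.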

Consensus in a Box

Caribou relies on our previous work on FPGA-based distributed key-value stores. We showed that by carefully tuning the design of the hash table to the underlying FPGA, it is possible to achieve an order of magnitude higher performance than the state of the art on x86 processors. An FPGA-based key-value store connected directly to the network can not only replace several regular servers in terms of performance, but also dramatically reduce round-trip latency and increase energy efficiency. We also show that it is possible to provide fault tolerance at scale in hardware, and that consensus (ZooKeeper's Atomic Broadcast, ZAB) can be removed from the critical path of performance by moving it to hardware. We implemented an all-hardware equivalent to ZooKeeper that uses both TCP and an application-specific network protocol. The design can be used to push more value into the network, e.g., by extending the functionality of middleboxes or adding inexpensive consensus to in-network processing nodes.
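To illustrate the kind of logic that moves into hardware, here is a heavily simplified software sketch of the broadcast phase of a leader-based atomic broadcast in the style of ZAB: the leader proposes a value, followers log and acknowledge it, and the leader commits once a majority of the cluster has acknowledged. The classes `Leader` and `Follower` are hypothetical, and the sketch omits leader election, recovery, and all networking, so it should not be read as the project's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Follower:
    alive: bool = True
    log: List[Tuple[int, bytes]] = field(default_factory=list)
    committed: List[bytes] = field(default_factory=list)

    def on_proposal(self, seq: int, value: bytes) -> bool:
        """Log the proposal and acknowledge it; a failed node stays silent."""
        if not self.alive:
            return False
        self.log.append((seq, value))
        return True

    def on_commit(self, seq: int) -> None:
        """Deliver the logged value for seq once the leader commits it."""
        if not self.alive:
            return
        for logged_seq, value in self.log:
            if logged_seq == seq:
                self.committed.append(value)


class Leader:
    def __init__(self, followers: List[Follower]) -> None:
        self.followers = followers
        self.next_seq = 0
        self.committed: List[bytes] = []

    def broadcast(self, value: bytes) -> bool:
        """Propose value; commit only if a strict majority acknowledges."""
        seq = self.next_seq
        self.next_seq += 1
        # The leader counts as one acknowledgement.
        acks = 1 + sum(f.on_proposal(seq, value) for f in self.followers)
        cluster_size = len(self.followers) + 1
        if 2 * acks <= cluster_size:
            return False  # no quorum, so the value is not committed
        self.committed.append(value)
        for f in self.followers:
            f.on_commit(seq)
        return True


# Three-node cluster with one failed follower: 2 of 3 is still a majority.
followers = [Follower(), Follower(alive=False)]
leader = Leader(followers)
first_ok = leader.broadcast(b"x")

# With a second failure only the leader remains, so no quorum is reached.
followers[0].alive = False
second_ok = leader.broadcast(b"y")
```

In software, each of these rounds costs a network round trip plus processing time on every node; the argument above is that running the same exchange in hardware keeps it off the critical path.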

Code

Caribou (and the consensus logic as part of it) is available on GitHub. The repository contains documentation and a guide on how to build the project. Please don't hesitate to email us with questions.