CoreOS Blog

Java and etcd: together at last, with jetcd

A reliable key-value store gives distributed systems a common substrate for consistent configuration and coordination. One such system is the etcd project, an open source key-value store created by CoreOS. It is the heart of many production distributed systems and is the data store for Kubernetes, among other projects.

Java has proven itself a popular distributed systems language, with notable use in the Hadoop ecosystem, the Cassandra datastore, and cloud infrastructure stacks. Further, it remains hugely popular in general, as its standing on Google Trends shows:

In terms of Google searches, Java remains more popular than Microsoft .NET and even JavaScript.

Given Java's popularity and its common use in distributed systems, we thought etcd should also be available as a backend for Java development. Enter jetcd, the new etcd client that brings the etcd v3 API to Java.

With jetcd, Java applications can cleanly interact with etcd using a smart API wrapping etcd’s native gRPC protocol. This API provides expressive distributed features available only on etcd. What's more, directly supporting more languages makes it easier to write new applications for etcd with new usage patterns, which in turn helps etcd become more stable and reliable.

Getting started

You can try out jetcd by building and running a small example program, jetcdctl, which uses jetcd to access etcd. The jetcdctl example is also a good starting point for further jetcd projects. To follow along, you'll need to have both Git and Java installed.

First, get the jetcd source by cloning the jetcd repository (https://github.com/coreos/jetcd), then build the jetcd-simple-ctl package using the included Maven script.

The jetcdctl example demonstrates jetcd’s basic functionality by getting and putting keys. Now let's take a closer look at writing code using jetcd.

Better watches

The jetcd API conveniently manages etcd’s underlying gRPC protocol. One example is streaming key events, where the client watches a key and etcd continuously sends back updates. The jetcd client manages a low-level gRPC stream, gracefully handles disconnects, and presents a seamless event stream back to the user.
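One common way for a client to present a seamless stream across disconnects is to remember the last revision it has seen and reopen the stream just past it. The sketch below illustrates that idea only; it is an assumption about the general technique, not a description of jetcd's internals. The Server and Client classes and their methods are hypothetical in-memory stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of resuming a watch from the last seen revision.
class ResumeWatchSketch {
    // Append-only event log; index i holds the event written at revision i + 1.
    static final class Server {
        private final List<String> log = new ArrayList<>();

        void put(String value) { log.add(value); }

        // Returns all events with revision >= startRevision, oldest first.
        List<String> eventsFrom(long startRevision) {
            int from = (int) Math.min(Math.max(startRevision - 1, 0), log.size());
            return new ArrayList<>(log.subList(from, log.size()));
        }
    }

    static final class Client {
        private final Server server;
        private long nextRevision = 1;            // first revision not yet seen
        final List<String> seen = new ArrayList<>();

        Client(Server server) { this.server = server; }

        // (Re)opens the stream at nextRevision and consumes everything available.
        void poll() {
            for (String value : server.eventsFrom(nextRevision)) {
                seen.add(value);
                nextRevision++;
            }
        }
    }

    static List<String> demo() {
        Server server = new Server();
        Client client = new Client(server);
        server.put("a");
        client.poll();     // connected: sees "a"
        server.put("b");   // written while the client is "disconnected"
        server.put("c");
        client.poll();     // reconnect: resumes at revision 2, sees "b" and "c"
        return client.seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b, c]
    }
}
```

Because the client tracks its position rather than relying on the connection, writes made during the gap are replayed instead of skipped.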

If a jetcd application wishes to receive all changes to a key, it creates a Watcher using the watch API:

Watcher watch(ByteSequence key)

The Watcher’s listen method reads WatchResponse messages from etcd. Each WatchResponse contains the newest sequence of events on the watched key. If there aren’t any events, listen blocks until there’s an update. The listen method is reliable; it drops no events between calls, even in case of disconnect:

WatchResponse listen() throws InterruptedException

All together, the client creates a Watcher, then calls listen in a loop to wait for events. A program watching the key abc this way prints the key and value of each event until listen throws an exception.
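The shape of that loop can be sketched without a running etcd cluster. The code below does not use the jetcd library: Watcher, Event, deliver, and drain are hypothetical in-memory stand-ins, and only the blocking listen call is modeled on the signature shown above. It is a minimal sketch of the consumer pattern, assuming events are queued in batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a streaming watch; not the jetcd implementation.
class WatchLoopSketch {
    static final class Event {
        final String key;
        final String value;
        Event(String key, String value) { this.key = key; this.value = value; }
    }

    static final class Watcher {
        // Batches wait here until the consumer asks for them, so none are
        // dropped between listen() calls.
        private final BlockingQueue<List<Event>> pending = new LinkedBlockingQueue<>();

        // Simulates the server side delivering one response with some events.
        void deliver(Event... events) { pending.add(Arrays.asList(events)); }

        // Blocks until at least one batch of events is available.
        List<Event> listen() throws InterruptedException { return pending.take(); }
    }

    // Consumer loop: reads the given number of batches and records "key=value".
    static List<String> drain(Watcher watcher, int batches) {
        List<String> seen = new ArrayList<>();
        try {
            for (int i = 0; i < batches; i++) {
                for (Event e : watcher.listen()) {
                    seen.add(e.key + "=" + e.value);
                }
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // stop on interrupt, keep the flag set
        }
        return seen;
    }

    public static void main(String[] args) {
        Watcher watcher = new Watcher();
        watcher.deliver(new Event("abc", "1"));
        watcher.deliver(new Event("abc", "2"), new Event("abc", "3"));
        for (String line : drain(watcher, 2)) {
            System.out.println(line);
        }
    }
}
```

A real client would loop indefinitely rather than for a fixed number of batches; the bound here only keeps the example terminating.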

Contrast this behavior with ZooKeeper, the Apache Software Foundation's etcd equivalent. As of ZooKeeper 3.4.10, watches are one-time triggers, meaning once a watch event is received, you must set a new watch to be notified of future changes. To stream key events, the client must contact the cluster to register a new watcher for each new event.

To continually print a key’s content as it updates, a ZooKeeper application first creates a Watcher to listen for WatchedEvent messages. The Watcher implements an event callback method, process, that is called when the key changes. To register interest in events, the Watcher attaches to the exists method, which fetches key metadata if there is any. When the key changes, the watcher’s process method calls getData to retrieve the key’s value, then registers the same Watcher again to receive future changes.

Unlike the jetcd example, the ZooKeeper code cannot guarantee that it observes all changes, because there is latency between the Watcher receiving an event and the client sending a request to set a new watch. For example, suppose an event arrives after process starts executing but before exists is called to register a new Watcher. Since no Watcher is registered at that moment, the event is never delivered and is lost.
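The lost-event window can be reproduced with a small single-threaded model. This is a sketch under stated assumptions, not ZooKeeper's client API: Store, Callback, register, and set are hypothetical names for a one-key store whose watch fires once and must then be re-armed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-key store with one-time-trigger watches.
class OneShotWatchSketch {
    interface Callback { void changed(String newValue); }

    static final class Store {
        private String value = "";
        private Callback armed;              // at most one pending one-shot watch

        void register(Callback cb) { armed = cb; }

        void set(String newValue) {
            value = newValue;
            Callback cb = armed;
            armed = null;                    // the trigger fires once, then is cleared
            if (cb != null) {
                cb.changed(newValue);
            }
        }

        String get() { return value; }
    }

    // Applies three writes; the second lands while no watch is registered.
    static List<String> observe() {
        Store store = new Store();
        List<String> seen = new ArrayList<>();
        Callback cb = v -> seen.add(v);
        store.register(cb);
        store.set("v1");     // delivered; the watch is now disarmed
        store.set("v2");     // arrives before re-registration: nobody is notified
        store.register(cb);  // the client re-arms too late to hear about v2
        store.set("v3");     // delivered
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe()); // prints [v1, v3]
    }
}
```

The write of v2 happened, but the observer's history skips straight from v1 to v3.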

Even assuming all events are delivered, the code can still corrupt the event stream. Without multi-version concurrency control like etcd offers, there’s no way to access historical values of a key. If the key's value changes between receiving the event and getting the data, the code will print the newest value, not the value associated with the watch event. Worse, events have no attached revision information, so there is no way to determine whether a fetched value belongs to the event or to a later update.
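To make the role of revisions concrete, here is a minimal sketch of a multi-version store for a single key. It is an illustration in the spirit of multi-version concurrency control, not etcd's implementation or API; the class and its put, getAt, and getLatest methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-key, in-memory model of multi-version storage.
class RevisionHistorySketch {
    private final TreeMap<Long, String> history = new TreeMap<>();
    private long revision = 0;

    // Each write gets a new, strictly increasing revision, which is returned
    // so that an event describing the write can carry it.
    long put(String value) {
        revision++;
        history.put(revision, value);
        return revision;
    }

    // Value as of the given revision, or null if nothing was written by then.
    String getAt(long rev) {
        Map.Entry<Long, String> entry = history.floorEntry(rev);
        return entry == null ? null : entry.getValue();
    }

    // Newest value, or null if the key was never written.
    String getLatest() {
        return history.isEmpty() ? null : history.lastEntry().getValue();
    }

    public static void main(String[] args) {
        RevisionHistorySketch store = new RevisionHistorySketch();
        long eventRevision = store.put("v1"); // the event reports this revision
        store.put("v2");                      // a later write lands before the read
        System.out.println(store.getLatest());         // prints v2
        System.out.println(store.getAt(eventRevision)); // prints v1
    }
}
```

A reader that only has getLatest sees v2 and cannot tell it apart from the value that triggered the event; a reader that holds the event's revision can ask for exactly the value the event described.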

Version 0.0.1 and beyond

As of v0.0.1, jetcd supports the primitives most applications need from a key-value store. These primitives can serve as building blocks for sophisticated patterns such as distributed queues, barriers, and more. In the future, jetcd will be able to use etcd’s native lock and leader election RPCs for cluster-wide standardized distributed coordination.

jetcd is designed to be simple to use while taking advantage of etcd’s advanced features under the hood. It is open source and under active development, and contributions and feedback from the community are always welcome. Find it on GitHub at https://github.com/coreos/jetcd.