Coordination and service discovery with Apache Zookeeper

Service-oriented design has proven to be a successful solution for a huge variety of different distributed systems. When used properly, it has a lot of benefits. But as number of services grows, it becomes more difficult to understand what is deployed and where. And because we are building reliable and highly-available systems, yet another question to ask: how many instances of each service are currently available?

In today’s post I would like to introduce you to the world of Apache ZooKeeper – a highly reliable distributed coordination service. The number of features ZooKeeper provides is just astonishing so let us start with very simple problem to solve: we have a stateless JAX-RS service which we deploy across as many JVMs/hosts as we want. The clients of this service should be able to auto-discover all available instances and just pick one of them (or all) to perform a REST call.

Sounds like a very interesting challenge. There could be many ways to solve it but let me choose Apache ZooKeeper for that. The first step is to download Apache ZooKeeper (the current stable version at the moment of writing is 3.4.5) and unpack it. Next, we need to create a configuration file. The simple way to do that is by copying conf/zoo_sample.cfg to conf/zoo.cfg. To run, just execute:

Windows: bin/zkServer.cmd
Linux: bin/zkServer

Excellent, now Apache ZooKeeper is up and running, listening on port 2181 (default). Apache ZooKeeper itself worth of a book to explain its capabilities. But brief overview gives a very high-level picture, enough to get us started.

Now, let’s do some code! We are developing simple JAX-RS 2.0 service which returns list of people. As it will be stateless, we are able to run many instances within single host or multiple hosts, depending on system load for example. The awesome Apache CXF and Spring Framework will backed our implementation. Below is the code snippet for PeopleRestService:

Very basic and naive implementation. Method init is empty by intention, it will be very helpful quite soon. Also, let us assume that every JAX-RS 2.0 service we’re developing does support some notion of versioning, the class RestServiceDetails serves this purpose:

Nice, at this moment the boring part is over. But where Apache ZooKeeper and service discovery fit into this picture? Here is the answer: whenever new PeopleRestService service instance is deployed, it publishes (or registers) itself into Apache ZooKeeper registry, including the URL it’s accessible at and service version it hosts. The clients can query Apache ZooKeeper in order to get the list of all available services and call them. The only thing services and their clients need to know is where Apache ZooKeeper is running. As I am deploying everything on my local machine, my instance is on localhost. Let’s add this constant to the AppConfig class:

private static final String ZK_HOST = "localhost";

Every client maintains the persistent connection to the Apache ZooKeeper server. Whenever client dies, the connection goes down as well and Apache ZooKeeper can make a decision about availability of this particular client. To connect to Apache ZooKeeper, we have to create an instance of CuratorFramework class:

Next step is to create an instance of ServiceDiscovery class which will allow to publish service information for discovery into Apache ZooKeeper using just created CuratorFramework instance (we also would like to submit RestServiceDetails as additional metadata along with every service registration):

Internally, Apache ZooKeeper stores all its data as hierarchical namespace, much like standard file system does. The services path will be the base (root) path for all our services. Every service also needs to figure out which host and port it’s running. We can do that by building URI specification which is included into JaxRsApiApplication class (the {port} and {scheme} will be resolved by Curator framework at the moment of service registration):

From Apache ZooKeeper it might look like this (please notice that session UUIDs will be different):

Having two service instances up and running, let’s try to consume them. From service client prospective, the first step is exactly the same: instances of CuratorFramework and ServiceDiscovery should be created (configuration class ClientConfig declares those beans), in the way we have done it above, no changes required. But instead of registering service, we will query the available ones:

Once service instances are retrieved, the REST call is being made (using awesome JAX-RS 2.0 client API) and additionally the service version is being asked (as payload contains instance of RestServiceDetails class). Let’s build and run our client against two instances we have deployed previously:

If we stop one or all instances, they will disappear from Apache ZooKeeper registry. The same applies if any instance crashes or becomes unresponsive.

Excellent! I guess we achieved our goal using such a great and powerful tool as Apache ZooKeeper. Thanks to its developers as well as to Curator guys for making it so easy to use Apache ZooKeeper in your applications. We have just scratched the surface of what is possible to accomplish by using Apache ZooKeeper, I strongly encourage everyone to explore its capabilities (distributed locks, caches, counters, queues, …).

3. JUnit Tutorial for Unit Testing

4. Java Annotations Tutorial3>

5. Java Interview Questions

6. Spring Interview Questions

7. Android UI Design

and many more ....

5 comments

Thank you for great post. This is exactly what I have been looking for. At my work I need to implement thrift rpc service that will be called by Php client. I’m planning to use zookeeper for auto discovery of my service. Do you know if it is possible to read list of hosts from php? Cheers Jarek

Good post, but I am just wondering in an enterprise situation would you not just use a VIP and add each server onto it. I know you don’t have auto discoverability and each instance would need its own ip, but are there other benifits of using zookeeper in this instance?

Thank you for your feedback. If you think about servers, you have a very good point: ZooKeeper might be not a right solution for that. But if you think about services, it’s different as one server can host dozens of them (if not hundreds) and some discovery mechanism could help a lot. Also, ZooKeeper is extremely feature-rich piece of software: it allows to do much, much more, one small post is not enough to demonstrate all its useful capabilities. As you could notice, I have tried to come up with simple but good enough example so to keep post reasonably short.

Newsletter

Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

Email address:

Recent Jobs

No job listings found.

Join Us

With 1,240,600 monthly unique visitors and over 500 authors we are placed among the top Java related sites around. Constantly being on the lookout for partners; we encourage you to join us. So If you have a blog with unique and interesting content then you should check out our JCG partners program. You can also be a guest writer for Java Code Geeks and hone your writing skills!

Disclaimer

All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners. Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. Examples Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.