My Life as a Sys Admin

Category Archives: Zookeeper

In my previous blog, i’ve described how to manage Docker cluster using Mesos and Marathon. It was using a single Mesos Master. But for production, we cannot go with a single Mesos master, as it will result in a sinlge point of failure. Mesos supports Multi master via Zookeeper. So in this blog i’m going to explain how to setup a Multi Master Mesos Cluster using Zookeper for Automatic promotion of Mesos master when the current Active Mesos Master fails. This time i’m going to setup 3 Zookeeper services and 2 Mesos Master on a single Vagrant box.

Here, since the 3 zookeeper services are running on the same host, i’ve assigned separate ports. If the zookeeper are running on separate instances, then we can have the same ports for all of the zookeeper nodes. There are two port numbers. The first followers use to connect to the leader, and the second is for leader election.

Now we can start the Zookeeper in Foreground using the ./bin/zkServer.sh start-foreground from each of the 3 zookeeper folders. netstat -nltp | grep 288 will display the port of the zookeeper service which is the current master among the cluster. We can also check the connectivity using the zookeepr client binary available in the zookeeper source folder. Once zookeeper cluster is UP, we can go ahead setting up Mesos.

Setting up Multiple Mesos Masters

Mesossphere team has already built packages for Mesos,Marathon and Deimos. So one of my Mesos master will be from the package. But i also wanted to play with the Mesos Source, so i decided to build Mesos from the source. So my second Mesos Master will be from the scratch.

Now we need to define the Zookeeper cluster details so that Mesos master and slave can connect. Edit /etc/mesos/zk and add zk://localhost:2181,localhost:2182,localhost:2183/mesos. More details about Zookeeper URL is available here. Also More details about installing Mesos from Mesosphere’s Debian package is available in their website.

So now we have one master ready, we can start the Mesos Master, Mesos Slave and Marathon. ANd esnure that they can connect to our Zookeeper cluster properly. We can also see the connection status in the stdout of the zookeeper process as they are running in foreground.

Now building the second Mesos master from scratch. First let’s download the Mesos Source Code.

This will launch a single container. We can check the status of the container via the Marathon UI as well as via RestAPI also. Now comes the critical part for Production, What happens when the master fails. Mesos Documentation says, if the current master fails, in a Multi master Mesos Cluster, Zookeeper will elect a new Master, in our case the second Mesos master service. And Mesos Claims that the running services will not get crashed and the status will be taken by the newly promoted master. In our case, we have a Docker container running as a long running service.

First i’ll stop one of the Mesos master service,

$ service mesos-master stop

Now immidiately on the stdout of the Zookeeper, we can see the logs corresponding to the Election process. Below is the same,

As per the above logs, the second Mesos Master running at port 5054 was promoted as New Mesos Master. This will be reflected to Marathon also. We can go to the Marthon UI and make sure that the Docker process that we have started using the old Mesos master is still running fine. We can even try making some scaling changes and can make sure that the new master can alter the process.

In my testing, i’ve tried stoping each of the service and ensured that only one Mesos master is running and also tried scaling the Apps from one Master service to make sure that the new Master was able to keep track of all these changes. The results were quite promising. Mesos + Marrthon + Docker indeed is killer combo. We can really built a cross vendor independent cluster with scaling capabilites.