Reconfiguring Mesos Agents (Slaves) with new resources

Problem:

You want to change the resources a Mesos Agent offers, for example to open new ports or restrict the number of CPUs. When you restart the Mesos Agent, it fails with an error like “Failed to perform recovery: Incompatible slave info detected.”

By default, a Mesos Agent (as of Mesos 0.23) tries to recover its state with the “strict” flag enabled. If strict=true, any and all recovery errors are considered fatal. Changing the resources makes the new agent info incompatible with the checkpointed one, so recovery fails.

Recovery is a nice thing to have, and it’s comforting to know that if a Mesos Agent restarts, things resume from a known state.

Solution:

When the state of the Mesos Agent does not matter, you can either restart the Mesos Agent with the “strict” flag set to false, or clear the state and start fresh, which also means killing any running Docker processes. To achieve the latter you can issue:
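A minimal sketch of the "clear the state and start fresh" route. The service name `mesos-slave`, the default work directory `/var/lib/mesos`, and the function names are assumptions for illustration; the path must match the `--work_dir` your agent was started with. The state is cleared by removing the `meta/slaves/latest` symlink under the work directory.

```shell
#!/bin/sh
# Sketch only: the service name and work directory below are assumptions.
WORK_DIR="${WORK_DIR:-/var/lib/mesos}"   # must match the agent's --work_dir

# Remove the symlink to the last checkpointed agent state, so the next
# start skips recovery and the agent registers as a new one.
clear_agent_state() {
    rm -f "$1/meta/slaves/latest"
}

# Stop the agent, kill running containers, clear the state, start again.
reset_agent() {
    service mesos-slave stop
    running=$(docker ps -q)
    if [ -n "$running" ]; then
        # Kills ALL running containers on this host, not only Mesos tasks.
        docker kill $running
    fi
    clear_agent_state "$WORK_DIR"
    service mesos-slave start
}
```

Run `reset_agent` as root on the agent host. Nothing is executed until the function is called, so the file can be sourced and `clear_agent_state` tried against a scratch directory first.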

Resources:

I was actually wondering when this problem would hit my local box on OS X, and I thought I would share one solution in case it’s useful for anyone searching for one. The problem started when I was building a new Docker image and got the “no space left on device” message.

Problem:

Monitor resource utilisation of Docker containers in a Mesos cluster. This is useful when deciding how much CPU and Memory to give to each container or for understanding when to scale up / down.

Solution:

cAdvisor is a simple-to-use monitoring tool for Docker containers. It ships as a Docker image ready to run on each of the Mesos slaves.

With Marathon and Mesos it is very easy to deploy a cAdvisor agent on each of the slaves. Marathon allows you to define constraints to make sure the cAdvisor container is distributed evenly across all the Mesos slaves.

Below is the body of the HTTP POST request to be made to Marathon to deploy cAdvisor.
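A minimal sketch of what such a request could look like, assuming Marathon listens on `http://localhost:8080`, the classic Marathon app format with a `DOCKER` container, and the `google/cadvisor` image. The app id, CPU and memory values, and function names are illustrative assumptions; the `hostname`/`UNIQUE` constraint places at most one instance per slave, so the instance count should equal the number of slaves.

```shell
#!/bin/sh
# Sketch only: URL, app id and resource sizes are assumptions.
MARATHON_URL="${MARATHON_URL:-http://localhost:8080}"

# Print the app definition; $1 is the number of Mesos slaves.
cadvisor_app_json() {
    cat <<EOF
{
  "id": "cadvisor",
  "cpus": 0.1,
  "mem": 128,
  "instances": $1,
  "constraints": [["hostname", "UNIQUE"]],
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "google/cadvisor:latest",
      "network": "BRIDGE",
      "portMappings": [
        {"containerPort": 8080, "hostPort": 0, "protocol": "tcp"}
      ]
    },
    "volumes": [
      {"containerPath": "/rootfs", "hostPath": "/", "mode": "RO"},
      {"containerPath": "/var/run", "hostPath": "/var/run", "mode": "RW"},
      {"containerPath": "/sys", "hostPath": "/sys", "mode": "RO"},
      {"containerPath": "/var/lib/docker", "hostPath": "/var/lib/docker", "mode": "RO"}
    ]
  }
}
EOF
}

# POST the definition to Marathon's /v2/apps endpoint.
deploy_cadvisor() {
    cadvisor_app_json "$1" | curl -s -X POST \
        -H "Content-Type: application/json" \
        -d @- "$MARATHON_URL/v2/apps"
}
```

For a three-slave cluster, `deploy_cadvisor 3` would submit the app. `cadvisor_app_json 3` on its own prints the body so it can be inspected before anything is sent.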