Tag Archives: Rolling Update

Making a change to live servers in production is something which has to be done with extreme care and planning. Several deployment types such as blue/green, canary, rolling update are in use today to minimize user impact. Ansible can be used to orchestrate a zero-downtime rolling change to a service.

A typical upgrade of an application, such as patching, might go like this –

disable monitoring alerts for a node

disable or pull out from load balancer

make changes to server

Reboot node

wait for node to be UP and do sanity check

put node back to load balancer

turn on monitoring of node

Rinse and repeat.

Ansible would be a great choice in orchestrating above steps. Let us start with an inventory of web servers, a load balancer and a monitoring node with nagios –

The web servers are running apache2, and we will patch apache and the kernel. For the patch to take effect, the servers need to be recycled. We will perform the patching one node at a time, wait for the node to be healthy and go to the next. The first portion of our playbook would be something like this –

I haven’t included the playbook tasks for disabling/enabling monitoring as well as removing/adding node to the load balancer. The procedures might differ depending on what type of monitoring system or load balancer technology you are using. In addition to this, the sanity check show is a simple port 80 probing, in reality a much more sophisticated validation can be done.