How does it work? Docker! Part 4: Control your Swarm!

TL;DR

I hacked another thing together, this time in order to install a highly available Docker Swarm cluster on CoreOS (yeah, Container Linux), using Ansible.

The whole subject was way too long for a single article. Therefore, I’ve divided it into 5 parts. This is episode 4, regarding the actual implementation of the local cluster’s Manager nodes, using Vagrant, CoreOS and Ansible.

Code, please!

Yay, code!

First of all, we will need the actual virtual machines where our cluster will run. We will create these using Vagrant. If you don’t know what Vagrant is, or why we are using it, check out this article.

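For this deployment, we will have three Manager nodes and three Worker nodes. Why three Manager nodes, you say? It may seem like overkill, but High Availability depends on Raft consensus, and Raft needs a majority of the Managers (a quorum) to agree before anything changes. With N Managers the quorum is floor(N/2) + 1, so three Managers survive the loss of one, while a fourth Manager raises the quorum to three without tolerating any additional failure. That is why an odd number of Manager nodes is the way to go. We will see this in action later on.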

We do not need an etcd cluster, since the key-value datastore is already included in the internals of the Docker Engine, and therefore, we will not include any machines for it.

First of all, I’ll describe the number of machines I want and their configurations as variables:

Just a little reminder: Container Linux is a lightweight Linux distribution that uses containers to run applications. It ships with basic GNU utilities so you can do all your business, and some other interesting things, like kubelet, Docker, etcd and flannel. We’ll only be using Docker for this part. In short, CoreOS’s Container Linux is an OS specially designed to run containers, and we’re going to profit from that in our context. If you wanna know more about CoreOS, check this article.

Easy. If you wanna take a look at the Vagrantfile, it’s right here. Moving on.

Next up, I’ll set up a Makefile with a Vagrant target, which will create all the virtual machines and generate a configuration file with the SSH configuration needed to interact with them. I’ll also add a clean target, which will destroy all the machines and delete the SSH configuration file, if it exists.

You will probably notice there is a `swarm-leader` group in the inventory, which contains a single host. Like I said in the first article, there might be many Managers in a cluster; nevertheless, there is only one Leader at any given moment. We will use the `swarm-leader` group to launch actions specific to the Leader, and the `swarm-manager` group for actions common to all Manager nodes. Actions destined for the non-Leader Manager nodes will target the `swarm-manager` group minus the `swarm-leader` group. This may seem complex, but it is actually super easy, you will see.
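To make the group layout concrete, here is a rough sketch of such an inventory, written in Ansible’s YAML inventory format (the real file may well use the INI format instead). The group names `swarm-leader` and `swarm-manager` come from the text above; the `swarm-worker` group and every host name are placeholders I made up, and they would have to match the `Host` entries in the generated SSH configuration file.

```yaml
---
# Inventory sketch. Host names and the swarm-worker group are assumptions.
all:
  children:
    swarm-leader:
      hosts:
        swarm-manager-01:   # the single Leader, also listed as a Manager
    swarm-manager:
      hosts:
        swarm-manager-01:
        swarm-manager-02:
        swarm-manager-03:
    swarm-worker:
      hosts:
        swarm-worker-01:
        swarm-worker-02:
        swarm-worker-03:
```

The Leader appears in both groups on purpose: that is what makes it possible to subtract one group from the other later on.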

No IP configurations over here, we’ll just use the SSH configuration file we generated earlier. In order to do that, we have to specify it in our ansible.cfg file. No ansible.cfg file? Just create it:

```ini
[defaults]
ansible_managed = Please do not modify this file directly as it is managed by Ansible and could be overwritten.
retry_files_enabled = false
remote_user = core

[ssh_connection]
ssh_args = -F ssh.config
```

We’ll also disable retry_files, and specify that we want to use the “core” user when connecting to the machines using SSH.

I’ve already said this before, but CoreOS only ships with basic GNU utilities, which means no Python. And no Python means no Ansible, except for the raw module, the script module and the synchronize module. So we’re going to install a lightweight Python implementation called PyPy using only those modules, and then use that Python implementation to execute the rest of our playbook. Neat, huh?

So basically, we’ve now got a role under roles/bootstrap/ansible-bootstrap, which has three files under the tasks directory: main.yml, configure.yml and test.yml. The configure.yml file holds all the tasks needed to install PyPy. The test.yml file verifies that Python is correctly installed by running `python --version`. The main.yml file wraps these two files, adding the `test` tag to the test.yml part:
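As a minimal sketch of that wrapper (not necessarily the exact file from the repo), main.yml could look like this. I’m assuming `import_tasks` here, which needs Ansible 2.4 or later, and the task names are mine. The final `setup` task reflects the fact gathering that happens at the end of the role, and it assumes the interpreter path has been pointed at PyPy elsewhere.

```yaml
---
# roles/bootstrap/ansible-bootstrap/tasks/main.yml (sketch, names are assumptions)
- name: Install PyPy so that regular Ansible modules can run
  import_tasks: configure.yml

- name: Smoketest the Python installation
  import_tasks: test.yml
  tags:
    - test          # lets us run only the smoketest with --tags test

- name: Gather facts now that a Python interpreter is available
  setup:
```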

I’ll follow this approach for each role on this project, so each role will have a smoketest, which will be enough to tell us if the component is correctly installed. This is pretty useful in order to test the already deployed infrastructure, as a conformance test, and check for deltas which might need to be corrected.

Now that we have our first role, it’s time to create a playbook. Since we’ll be deploying a Swarm cluster, I’ll just name it swarm.yml:

It’s quite straightforward so far: I’ll just launch the recently created role on each host, without gathering facts, since Python is not yet installed on the machines. Facts will be gathered at the end of the role though, as seen in the previous code snippet.
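A first version of the playbook described above could be sketched like this. The play name is mine, and I’m assuming the role can be referenced by its path relative to the roles directory.

```yaml
---
# Playbook sketch: bootstrap Python everywhere before anything else.
- name: Bootstrap Python on every machine
  hosts: all
  gather_facts: false   # no Python on the hosts yet, so fact gathering would fail
  roles:
    - role: bootstrap/ansible-bootstrap
```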

Next up, tests. We’ll use molecule for the win. I spoke to you all about molecule in a previous article. It is basically a testing tool for Ansible code. It creates ephemeral infrastructure (either virtual machines or containers), tests your roles on it (not only the execution of the roles, but also their syntax and their idempotence), and then it destroys it. Since there are no CoreOS containers, and since VirtualBox virtual machines created through Vagrant are the target platform anyway, I’ll just use the Vagrant driver.

In order to test with molecule, I’m going to create a molecule.yml file, in which I’m going to define the Ansible files to use for the test, as well as the Vagrant machine’s specification and configuration.

First, I’ll specify which Ansible configuration to use, and which playbook to run:
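For reference, and assuming the Molecule V1 layout where these settings live under a top-level `ansible` key, that part of molecule.yml might look roughly like this. The file names are assumptions based on what we created earlier, and the Vagrant section is left out.

```yaml
---
# molecule.yml sketch (Molecule V1 layout assumed, Vagrant section omitted)
ansible:
  config_file: ansible.cfg   # reuse the configuration created earlier
  playbook: swarm.yml        # the playbook Molecule runs against the test machines
```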

With that in place, I just need to run `molecule test` in order to test that my infrastructure is created and configured correctly. This is actually an oversimplification of everything that can be done using molecule, but since I already wrote about it in a previous article, you can just head there and read about it if you’re really interested.

The smoketest target allows me to run a conformance test on all the already deployed infrastructure, to check for deltas and see if something’s wrong whenever I want, and the test target allows me to test the code on fresh, newly created infrastructure, and to check for Ansible-specific good practices. Remember, this uses Molecule V1, so if you try to run it using Molecule V2 it will probably not work.

Tests are set up thusly. Moving on.

Manage: Lead and follow

Next up, we need to setup the three Manager nodes. I’ll start by creating a swarm-leader role, under the configuration roles directory:

The configure file first checks if the cluster is already in Swarm mode. If it is, it doesn’t do anything else. If it isn’t, it creates the first Swarm node, thus creating the Swarm cluster, which the subsequent nodes will join. It also disables scheduling on the Leader, making sure that the Leader does not handle any workload and that it concentrates its resources on leading the cluster:
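Here is a hedged sketch of what such a configure.yml could contain, built only on the plain `docker` CLI. The `swarm_advertise_addr` variable is hypothetical, the task names are mine, and I’m assuming the node’s name in the Swarm matches `ansible_hostname`.

```yaml
---
# swarm-leader configure.yml sketch (variable and task names are assumptions)
- name: Check whether Swarm mode is already active
  command: docker info
  register: docker_info
  changed_when: false   # read-only query, it never changes the node

- name: Initialise the Swarm on the first Manager
  command: docker swarm init --advertise-addr {{ swarm_advertise_addr }}
  when: "'Swarm: active' not in docker_info.stdout"

- name: Keep workloads off the Leader
  command: docker node update --availability drain {{ ansible_hostname }}
  when:
    - "'Swarm: active' not in docker_info.stdout"
    - disable_leader_scheduling | bool
```

Guarding both commands with the same condition keeps a second run from reporting changes, which matters for the idempotence check.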

This last part is not actually necessary, especially for small clusters. Nevertheless, it is usually a good practice, since a Manager that competes with containers for CPU and memory may struggle to keep up with its Raft duties. The `disable_leader_scheduling` variable is defined in the role’s defaults, and you can override it if you want your Leader to handle workloads.

Fairly simple. Notice the `changed_when: false` parameter on the first command task. It is there because running `docker info` will not change the state of the cluster, and it is therefore not a real action, just a way of collecting information.

Next, for the smoketest, I’ll verify if the created Manager node is in fact a Leader (which it should be, since it was the first Manager node to be created), and whether its status is “Drain”, since the Leader node is not supposed to handle any workload:
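One way to write that smoketest, as a sketch rather than the original file, is to parse the JSON printed by `docker node inspect self`. I’m assuming the `ManagerStatus.Leader` and `Spec.Availability` fields here; the availability is stored in lowercase in the JSON, even though `docker node ls` displays it as “Drain”.

```yaml
---
# swarm-leader test.yml sketch (task names are assumptions)
- name: Read this node's Swarm description
  command: docker node inspect self
  register: node_info
  changed_when: false

- name: Verify that the node is a drained Leader
  assert:
    that:
      # the Leader flag is left out of the JSON when it is false
      - node.ManagerStatus.Leader | default(false)
      - node.Spec.Availability | lower == 'drain'
  vars:
    node: "{{ (node_info.stdout | from_json)[0] }}"
```

If you override `disable_leader_scheduling`, the second assertion would need to be relaxed accordingly.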

The play uses the host group we discussed earlier, and the proper tag to identify the action. And that’s it for the Manager Leader. We need some non-Leader Managers for that High Availability though!

So we’ll just repeat the previous process: we’ll create a swarm-manager role up next, with the same structure as the previous role (main.yml, configure.yml, test.yml).

I won’t show you the main.yml: it is basically the same one we saw before. The configure.yml file, on the other hand, checks if Swarm mode is activated on the node, the same way the Leader role does, but if it isn’t, it recovers the token needed to join the cluster as a Manager node from the Leader node, and joins the cluster with it. If Swarm mode is already activated, it does nothing:
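A minimal sketch of that logic, again using only the `docker` CLI, could look like this. The `swarm_leader_addr` variable is hypothetical, 2377 is the default Swarm management port, and I’m assuming the Leader is the first (and only) host of the `swarm-leader` group.

```yaml
---
# swarm-manager configure.yml sketch (variable and task names are assumptions)
- name: Check whether Swarm mode is already active
  command: docker info
  register: docker_info
  changed_when: false

- name: Fetch the Manager join token from the Leader
  command: docker swarm join-token --quiet manager
  register: manager_token
  delegate_to: "{{ groups['swarm-leader'][0] }}"   # runs on the Leader only
  changed_when: false
  when: "'Swarm: active' not in docker_info.stdout"

- name: Join the cluster as a Manager
  command: docker swarm join --token {{ manager_token.stdout }} {{ swarm_leader_addr }}:2377
  when: "'Swarm: active' not in docker_info.stdout"

- name: Keep workloads off this Manager
  command: docker node update --availability drain {{ ansible_hostname }}
  when:
    - "'Swarm: active' not in docker_info.stdout"
    - disable_manager_scheduling | bool
```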

Notice the `delegate_to` option on the token recovery task. This needs to be done because the token must be recovered from the Leader node and the Leader node only. Scheduling is also disabled on these nodes by default, for the same reason given above for the Leader node. This time, the `disable_manager_scheduling` variable controls it, and it is also defined in the role’s defaults. You can override this variable if you want your Managers to handle workloads.

The smoketest recovers the node’s information, and then it verifies that the node’s Manager status is `Reachable` rather than `Leader`, as it was for the Leader node. It also verifies that the nodes are “drained”, since we don’t want them to run containers.
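Such a check could be sketched as follows, assuming the `ManagerStatus.Reachability` and `Spec.Availability` fields of the `docker node inspect self` JSON output. The task names are mine.

```yaml
---
# swarm-manager test.yml sketch (task names are assumptions)
- name: Read this node's Swarm description
  command: docker node inspect self
  register: node_info
  changed_when: false

- name: Verify that the node is a drained, reachable, non-Leader Manager
  assert:
    that:
      - node.ManagerStatus.Reachability | lower == 'reachable'
      - not (node.ManagerStatus.Leader | default(false))
      - node.Spec.Availability | lower == 'drain'
  vars:
    node: "{{ (node_info.stdout | from_json)[0] }}"
```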

Notice the `!` sign on the hosts part of the play. This specifies that we want to run the role on every node in the swarm-manager group that isn’t in the swarm-leader group, thus preventing the Leader node from trying to join the cluster as a non-Leader Manager. Sweet!
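Put together, the two Manager plays in the playbook might look something like this sketch. The `configure/` role prefix, the play names and the tag names are guesses on my part; only the group names and the `:!` exclusion pattern come from the text.

```yaml
---
# Manager plays sketch (role paths, play names and tags are assumptions)
- name: Create the Swarm on the Leader
  hosts: swarm-leader
  roles:
    - role: configure/swarm-leader
  tags:
    - swarm-leader

- name: Join the remaining Managers to the Swarm
  hosts: "swarm-manager:!swarm-leader"   # every Manager except the Leader
  roles:
    - role: configure/swarm-manager
  tags:
    - swarm-manager
```

Quoting the pattern is not strictly required, but it keeps YAML from ever reading the `!` as something special.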

Once we finish all this, we should have everything we need Manager-wise. Time to get some Workers running! I’ll probably talk to you about that in the next article though.