AWS and Rancher: Building a Resilient Stack

In my prior posts, I’ve written about how to ensure highly resilient workloads using Docker, Rancher, and various open source tools. For this post, I will build on that prior knowledge and set up an AWS infrastructure for Rancher with some commonly used tools. If you check out the repository here, you should be able to follow along and set up the same infrastructure.

The final output of our AWS infrastructure will look like the following picture:

In case you missed the prior posts, they’re available on the Rancher blog and cover some reliability talking points. Let’s use those learnings to create a running stack.

Host VM Creation

The sections we will build are the three lower yellow sections:

Golden Image

First, we will need a way to create Docker hosts that use a reliable combination of OS and storage driver. We would also like to be able to swap these components out for different ones in the future.

So we build our base VM, or the “golden image” as it is more commonly known. As for tooling, Packer will be used to talk to the AWS API to create VM images (it supports various other cloud providers as well). Ansible will be used to describe the provisioning steps in a readable manner. The full source can be found here, if you want to jump ahead.

Since the previous chain of posts on reliability used Ubuntu 14.04, our example will provision a VM with Ubuntu 14.04 using AUFS3 for the Docker storage driver.

To start, we create a Packer configuration called ubuntu_1404_aufs3.json. In this case, my config uses source_ami_filter to look up the most recent 14.04 AMI in AWS us-east-1, which as of writing returns ami-af22d9b9.

It also creates a 40GB drive attached as /dev/sdb, which we will use to store Docker data. We use Docker 1.12.3, because it is supported by the latest Rancher compatibility matrix.
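As a rough sketch of what this looks like — note that the AMI name, instance type, and filter pattern below are illustrative assumptions, not the repository’s exact contents — the builder section of ubuntu_1404_aufs3.json could be:

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*",
          "root-device-type": "ebs"
        },
        "owners": ["099720109477"],
        "most_recent": true
      },
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "docker-ubuntu-1404-aufs3-{{timestamp}}",
      "launch_block_device_mappings": [
        {
          "device_name": "/dev/sdb",
          "volume_size": 40,
          "volume_type": "gp2",
          "delete_on_termination": true
        }
      ]
    }
  ]
}
```

The source_ami_filter block with most_recent set to true is what resolves to the latest 14.04 AMI at build time, while launch_block_device_mappings attaches the 40GB data drive.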

Great — the template passes packer validate, but if we actually ran it, Packer would just create a copy of the base AMI with a 40GB drive attached, which isn’t very helpful. To make it useful, we will also need to provision Docker on it. Packer has built-in hooks for various configuration management (CM) tools such as Ansible, Chef, and Puppet. In our case, we will use the Ansible provisioner.

Prior to running the build, we will need to grab the Docker installation role. From the root directory containing ubuntu_1404_aufs3.json, run ansible-galaxy install angstwad.docker_ubuntu -p to download a pre-configured Docker installation role. The popular angstwad.docker_ubuntu role exposes many options for installing Docker on Ubuntu and follows the official Docker installation tutorial closely.
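A minimal playbook wrapping that role might look like the following — the filename playbook.yml and the wiring shown here are assumptions for illustration:

```yaml
# playbook.yml (assumed filename) -- applies the downloaded Galaxy role
# to the instance Packer launches.
- hosts: all
  become: true
  roles:
    - angstwad.docker_ubuntu
```

Packer’s Ansible provisioner would then reference it from the "provisioners" section of the template with an entry such as { "type": "ansible", "playbook_file": "playbook.yml" }.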

Finally, we run packer build ubuntu_1404_aufs3.json and await our new base image. The end result will be your base Docker image going forward.

AWS Infrastructure Creation

To start creating the infrastructure components, please check out the following repository for a Rancher architecture template on AWS.

Networking Layer

Next up: most AWS services require a VPC in order to provision services without errors. To satisfy this, we will create a separate VPC with public subnets. The following provides a straightforward way to set up a standard template. Check out the networking module here.

In main.tf, the entry file for our infrastructure, we reference our network configuration module (defined alongside modules like the database layer) and pass parameters into it:
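As a sketch, a module reference in main.tf takes the following shape — the module path and variable names below are assumptions, not necessarily those in the repository:

```hcl
# main.tf -- minimal sketch of wiring up the networking module.
# "./network", "vpc_cidr", and "region" are assumed names.
module "network" {
  source   = "./network"
  vpc_cidr = "10.0.0.0/16"
  region   = "us-east-1"
}
```

Each layer of the stack gets its own module like this, so the networking, database, and server layers can be iterated on independently.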

For this walkthrough, we use AWS Certificate Manager (ACM) to manage the SSL certificate for our Rancher HA setup. You can look up how to request a free SSL certificate in the ACM docs. The process of requesting a cert from ACM involves manual steps to verify the domain name, so we don’t automate this section. Once provisioned, referencing the SSL certificate is as simple as adding the following data resource; you can view the file on GitHub.
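The lookup uses Terraform’s standard aws_acm_certificate data source; the resource label and domain below are illustrative (the domain matches the example used later in this post):

```hcl
# Look up the issued ACM certificate by domain so the ELB listener
# can reference its ARN. "rancher" and the domain are assumed names.
data "aws_acm_certificate" "rancher" {
  domain   = "rancher.domain.com"
  statuses = ["ISSUED"]
}
```

The certificate’s ARN is then available to the load balancer configuration as data.aws_acm_certificate.rancher.arn.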

We provision our server node resources using the ./files/userdata.template file. Terraform fills in its variables to render a cloud-init config for the instance. The cloud-init config writes a file called start-rancher.sh and then executes it on instance start.
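The rendered userdata follows the usual cloud-init write_files/runcmd pattern. The sketch below is an assumption of what the template renders to — the path and the simplified single-node docker run command are illustrative (the actual HA setup also passes external database flags to rancher/server):

```yaml
#cloud-config
# Sketch of the rendered userdata.template; exact contents are assumptions.
write_files:
  - path: /opt/rancher/start-rancher.sh
    permissions: "0755"
    content: |
      #!/bin/bash
      # Simplified single-node form; the HA template also wires in the
      # external database connection parameters here.
      docker run -d --restart=unless-stopped -p 8080:8080 rancher/server:stable
runcmd:
  - [ /opt/rancher/start-rancher.sh ]
```

Keeping the startup logic in a written script (rather than inline runcmd entries) makes it easy to re-run or debug on the instance after boot.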

Now you can point your DNS at the Rancher ELB we created. Navigate to the ELB console, where you should see the newly created ELB. Grab its DNS name and, at your domain name provider, add a CNAME record pointing to it.

For example, in this post I set up Rancher at rancher.domain.com and then access the admin panel at https://rancher.domain.com.

Rancher Node Setup

At this point, we have already set up the Rancher server, and we can add custom hosts or use the Rancher-provided host drivers. If we want to try more automation, here is one way to automate autoscaled clusters of Rancher nodes on AWS.

From the Rancher UI, we follow the documentation for adding custom hosts. We will need to grab a few variables to pass into our cluster setup template.

After pulling those variables, we can run the node creation step. Since this is a separate process from setting up HA, the creation of the Rancher nodes is initially commented out in the file.
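An autoscaled node cluster can be expressed as a launch configuration plus an autoscaling group. The sketch below is an assumption of that shape — resource and variable names are illustrative, and the node userdata is expected to run the registration command shown in the Rancher UI’s “Add Custom Host” screen:

```hcl
# nodes.tf -- sketch; names and variables here are assumptions.
resource "aws_launch_configuration" "rancher_node" {
  name_prefix     = "rancher-node-"
  image_id        = "${var.node_ami_id}"           # the golden image built earlier
  instance_type   = "t2.medium"
  security_groups = ["${var.node_security_group_id}"]
  user_data       = "${data.template_file.node_userdata.rendered}"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "rancher_nodes" {
  launch_configuration = "${aws_launch_configuration.rancher_node.name}"
  vpc_zone_identifier  = ["${var.subnet_ids}"]
  min_size             = 1
  max_size             = 3
}
```

With this in place, scaling the node pool is a one-line change to min_size/max_size rather than a manual host-adding session in the UI.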

After a few moments, you should see your Rancher host show up in your Rancher UI.

Summary

That was a lot of steps, but with this template we can now build each Terraform component separately and iterate on the infrastructure layers, much like how Docker images are built up in layers.

The nice thing about all these various components is replaceability. If you don’t like the choice of OS for the Docker Host, then you can change up the Packer configurations and update the AMI ID in Terraform. If you don’t like the networking layer, then take a peek at the Terraform script to update it. This setup is just a starter template to get Rancher up and your projects started.

By no means is this the best way to stand up Rancher, but the layout of the Terraform code should allow for continuous improvement as your project takes off.

Additional Improvements

The VPC shown here places everything in public subnets (for simplicity), but if you want to secure the network traffic between the database and servers, you’ll need to update the networking layer (this would require a rebuild).

We could also look into moving the Rancher nodes into a separate Terraform project instead of commenting them out.

We should also look at how to back up Terraform state in case the local state folder is lost; a bit more setup for an S3 backend would help anyone who plans to use this in production.
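On recent Terraform versions, remote state in S3 is a small configuration block — the bucket name and key below are assumptions you would replace with your own:

```hcl
# Store state in S3 instead of the local folder.
# Bucket name and key are placeholders.
terraform {
  backend "s3" {
    bucket = "my-rancher-terraform-state"
    key    = "rancher/terraform.tfstate"
    region = "us-east-1"
  }
}
```

After adding this block, running terraform init migrates the existing local state into the bucket.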

EFS is also a candidate for inclusion in the scripts, to add distributed file system support across our various nodes.

Collection of Reference Architectures

There are many reference architectures created by community members and Rancher contributors. They make good further reading after you’ve tested this template, and you can draw on their structures to improve this infrastructure.

For advanced networking variants, there is also a CloudFormation reference here.

Nick Ma is an Infrastructure Engineer who blogs about Rancher and Open Source. You can visit Nick’s blog, CodeSheppard.com, to catch up on practical guides for keeping your services sane and reliable with open-source solutions.
