Create Custom Azure Kubernetes Clusters with ACS Engine

If you’re looking to run Kubernetes in Azure, then the Azure Kubernetes Service (AKS) is what you’ll generally be looking at. It offers a semi-managed solution with PaaS-based management nodes and is by far the simplest Kubernetes service to use in Azure. If AKS meets all your needs, there is no need to read any further, but for some use cases it may not. AKS is a new service that has not gone GA quite yet, and it is still missing some features, such as the ability to run Windows containers or support for certain versions of Kubernetes. Or maybe you don’t want to run Kubernetes at all and want Swarm or Mesos instead. In these scenarios you could look at using ACS, the old version of Azure’s container service, but this is going to go end of life sometime in the next year. This is where ACS Engine comes in handy.

ACS Engine is very different from ACS. It is the tool that is used to create both ACS and AKS, and it has been open sourced by Microsoft. ACS Engine is an application (supported on Windows, Linux and Mac) that you feed your desired cluster configuration to in the form of a JSON file. It then generates Azure Resource Manager (ARM) templates and ancillary files (certificates, configs etc.), which you can deploy to create a Kubernetes cluster in Azure that exactly matches your specification. To be very clear, what you are implementing here is an entirely unmanaged, IaaS-based solution: you will need to maintain all the servers yourself. What ACS Engine provides is a way to deploy the whole cluster with a single ARM deployment. If you need a custom cluster design, then this is the way to go.

So how does it work? For the rest of this article, we are going to go through an example where we want to deploy a Kubernetes cluster that supports both Windows and Linux worker nodes, something that is not possible in AKS today. There are many other scenarios for using ACS Engine that we won’t cover; you can see details on these here.

Installing ACS Engine

There’s no installer for ACS Engine; you just need to download the latest release from here for the client OS of your choice. Once downloaded, extract it to a location of your choice. To make it easier to execute, you can add that location to your path.

Creating your ACS Engine Template

To be able to create our cluster, we first need to create a JSON template file that provides all the information ACS Engine needs to generate the cluster ARM templates. There are lots of example templates here, as well as in many other places around the web; we are going to use this template as our starting point.

The first thing we need to do is fill in some of the missing data in the template:

Windows Username and Password

SSH Public Key for the Linux nodes – If you don’t have one already, you can follow this article to create one.

Azure Service Principal ID and Secret – This is a service account created in Azure AD that will be used to provide an identity to the cluster, for things like accessing Azure Container Registry. If you’re not familiar with the steps to create this, you can follow this article.
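If you prefer the command line for the last two items, the two commands involved can be sketched as below. This is a minimal sketch and not the steps from the linked articles: the key file name, key comment and service principal name are placeholder assumptions, and the script only prints the commands so you can review them before running them yourself.

```python
import shlex
from typing import List


def ssh_keygen_command(key_path: str, comment: str) -> List[str]:
    """Build an ssh-keygen call that creates a 4096-bit RSA key pair.

    The public key is written to key_path + ".pub"; its contents go into the template.
    """
    return ["ssh-keygen", "-t", "rsa", "-b", "4096", "-f", key_path, "-C", comment]


def service_principal_command(display_name: str) -> List[str]:
    """Build the Azure CLI call that creates a service principal.

    The JSON it prints includes "appId" (the ID) and "password" (the secret).
    """
    return ["az", "ad", "sp", "create-for-rbac", "--name", display_name]


# Placeholder names, assumed purely for illustration.
for command in (
    ssh_keygen_command("acs_rsa", "k8s-linux-nodes"),
    service_principal_command("acs-engine-demo"),
):
    print(shlex.join(command))
```

Keep the secret that the second command returns somewhere safe, as the Azure CLI only displays it once.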

I’m also going to amend it to specify the version of Kubernetes I want to use and to make sure that my VMs are using managed disks. The final version of the template looks like this:

I’m not specifying the release of Linux or Windows I want to use in this template, so it will take the defaults. If I wanted a specific image on the machines, I could add an ImageReference property for the Linux image, or ImageVersion for the Windows image. The full list of parameters you can use in this file is here.
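As a rough illustration of the shape such a file takes, here is a sketch that builds a mixed Linux and Windows cluster definition in Python and prints it as JSON. It is based on my understanding of the vlabs schema used by the ACS Engine examples and is not the exact template used in this article. Every value (pool names, VM size, node counts, Kubernetes release, credentials) is a placeholder assumption, so check the property names against the parameter list before relying on it.

```python
import json


def build_api_model(dns_prefix, k8s_release, windows_user, windows_password,
                    ssh_public_key, sp_client_id, sp_secret):
    """Return a cluster definition with one Linux and one Windows agent pool."""

    def pool(name, os_type=None):
        profile = {
            "name": name,
            "count": 2,                          # placeholder node count
            "vmSize": "Standard_D2_v2",          # placeholder VM size
            "availabilityProfile": "AvailabilitySet",
            "storageProfile": "ManagedDisks",    # use managed disks for the VMs
        }
        if os_type:
            profile["osType"] = os_type          # Linux is assumed when omitted
        return profile

    return {
        "apiVersion": "vlabs",
        "properties": {
            "orchestratorProfile": {
                "orchestratorType": "Kubernetes",
                "orchestratorRelease": k8s_release,
            },
            "masterProfile": {
                "count": 1,
                "dnsPrefix": dns_prefix,
                "vmSize": "Standard_D2_v2",
                "storageProfile": "ManagedDisks",
            },
            "agentPoolProfiles": [pool("linuxpool1"), pool("windowspool2", "Windows")],
            "windowsProfile": {
                "adminUsername": windows_user,
                "adminPassword": windows_password,
            },
            "linuxProfile": {
                "adminUsername": "azureuser",
                "ssh": {"publicKeys": [{"keyData": ssh_public_key}]},
            },
            "servicePrincipalProfile": {"clientId": sp_client_id, "secret": sp_secret},
        },
    }


# All arguments below are placeholders, not real credentials.
model = build_api_model("mixedcluster", "1.9", "azureuser", "<windows-password>",
                        "ssh-rsa <public-key>", "<sp-app-id>", "<sp-secret>")
print(json.dumps(model, indent=2))
```

Building the file from code like this is optional; editing the JSON by hand works just as well, and the result is what gets passed to ACS Engine either way.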

Compiling the Template

Now that we have a template defined, we can use ACS Engine to compile it. When we do this, ACS Engine reads the template and generates an ARM template to deploy your cluster. It also creates other supporting files such as SSH keys, kubectl credentials etc. To compile, you need to run acs-engine with the generate command and point it at your file.

acs-engine generate c:\ClusterFiles\MixedCluster.json

Once you run this, ACS Engine will create a folder called _output in the folder where your JSON file sits, and under it a folder with the same name as the master DNS prefix you entered in the JSON file. Inside that folder you will see your ARM template and parameter file, along with keys and other generated data.
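To make that layout concrete, here is a small sketch that works out where the main generated files should land. The file names (azuredeploy.json, azuredeploy.parameters.json and kubeconfig.<region>.json) are what I would expect ACS Engine to produce, but treat them as assumptions and check your own output folder; the output root, DNS prefix and region are placeholders.

```python
from pathlib import PureWindowsPath


def generated_files(output_root: str, dns_prefix: str, region: str) -> dict:
    """Return the expected paths of the main files produced by the generate step."""
    cluster_dir = PureWindowsPath(output_root) / dns_prefix
    return {
        "template": cluster_dir / "azuredeploy.json",
        "parameters": cluster_dir / "azuredeploy.parameters.json",
        "kubeconfig": cluster_dir / "kubeconfig" / f"kubeconfig.{region}.json",
    }


# Placeholder values: adjust the root, DNS prefix and region to your own cluster.
files = generated_files(r"c:\ClusterFiles\_output", "mixedcluster", "westeurope")
for label, path in files.items():
    print(f"{label}: {path}")
```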

Create the Resource Group

Before you deploy the template, you need to create the resource group you will deploy to. You can do this in the portal, or using PowerShell or the CLI, but one thing you need to remember is to grant the service principal you created earlier Contributor rights on that resource group. You must do this before deploying the cluster. If you don’t, then things will deploy OK, but when you come to create resources in Kubernetes, things like Load Balancers or Persistent Volume Claims will not be created properly.
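With the Azure CLI, these two steps might look like the sketch below. The resource group name, region, subscription ID and service principal ID are placeholder assumptions, and the script only prints the commands and does not run them.

```python
import shlex
from typing import List


def create_group_command(resource_group: str, location: str) -> List[str]:
    """Build the Azure CLI call that creates the resource group."""
    return ["az", "group", "create", "--name", resource_group, "--location", location]


def grant_contributor_command(sp_client_id: str, subscription_id: str,
                              resource_group: str) -> List[str]:
    """Build the call that gives the service principal Contributor on the group only."""
    scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    return ["az", "role", "assignment", "create",
            "--assignee", sp_client_id,
            "--role", "Contributor",
            "--scope", scope]


# Placeholder identifiers, assumed purely for illustration.
commands = [
    create_group_command("mixedcluster-rg", "westeurope"),
    grant_contributor_command("<sp-app-id>", "<subscription-id>", "mixedcluster-rg"),
]
for command in commands:
    print(shlex.join(command))
```

Scoping the role assignment to the resource group keeps the service principal from having rights over the rest of the subscription.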

Deploying the Template

Assuming your template compiled as expected you can now deploy the cluster. This is the same as deploying any ARM template to Azure:

This should submit your deployment with no errors; you will then need to wait around 20-30 minutes for the cluster to deploy.
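If you are using the Azure CLI for this step, the deployment command can be sketched as follows. The resource group, deployment name and file paths are placeholder assumptions. Recent CLI versions use az deployment group create, while older versions used az group deployment create with the same arguments.

```python
import shlex
from typing import List


def deploy_command(resource_group: str, deployment_name: str,
                   template_file: str, parameters_file: str) -> List[str]:
    """Build the Azure CLI call that deploys the generated ARM template."""
    return ["az", "deployment", "group", "create",
            "--name", deployment_name,
            "--resource-group", resource_group,
            "--template-file", template_file,
            "--parameters", f"@{parameters_file}"]  # "@" tells the CLI to read a file


# Placeholder names and paths, assumed purely for illustration.
command = deploy_command(
    "mixedcluster-rg",
    "mixedcluster-deploy",
    "_output/mixedcluster/azuredeploy.json",
    "_output/mixedcluster/azuredeploy.parameters.json",
)
print(shlex.join(command))
```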

Connecting to the Cluster

Once your deployment completes, you can connect to the cluster using kubectl and start working with Kubernetes. First, you need to install kubectl; downloads for each supported OS can be found here:

Once you have kubectl downloaded and installed, you need to provide a kubeconfig file to connect. You can download this by SSHing into the master node, but there is an easier way. When you compiled your template with ACS Engine, it generated these files for you; they can be found in _output\<instance name>\kubeconfig. There is a file for each Azure region; locate the one for the region you deployed to.

To get kubectl to use that file, you can point your KUBECONFIG environment variable at it (the file itself, not the folder it sits in). Once you do that and restart your command line, you should be able to run “kubectl get nodes” and get a list of the nodes in the cluster. Once that works, you are ready to go.
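If you would rather not change the environment variable globally, the same idea can be sketched in a script that sets KUBECONFIG for a single kubectl call. The kubeconfig.<region>.json file name pattern is my assumption about what ACS Engine generates, and the folder and region are placeholders.

```python
import os
import subprocess
from pathlib import Path


def kubeconfig_for_region(kubeconfig_dir: str, region: str) -> Path:
    """Return the generated kubeconfig file for one Azure region."""
    return Path(kubeconfig_dir) / f"kubeconfig.{region}.json"


def env_with_kubeconfig(config_file: Path, base_env=None) -> dict:
    """Copy the environment and point KUBECONFIG at a single config file."""
    env = dict(os.environ if base_env is None else base_env)
    env["KUBECONFIG"] = str(config_file)
    return env


def get_nodes(config_file: Path) -> None:
    """Run 'kubectl get nodes' against the cluster described by config_file."""
    subprocess.run(["kubectl", "get", "nodes"],
                   env=env_with_kubeconfig(config_file), check=True)


# Placeholder folder and region; call get_nodes(config) once kubectl is installed.
config = kubeconfig_for_region("_output/mixedcluster/kubeconfig", "westeurope")
print(config)
```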

Issues with Logs, Exec and Port-Forward on Windows Nodes

I have found an issue where, by default, getting logs or trying to run Exec or Port-Forward against a Windows node fails. This is because port 10250 is not open on the Windows firewall. To fix this, you will need to temporarily give your Windows node a public IP and then RDP to it or connect with a remote PowerShell session. Once you are on the machine, launch PowerShell and run this command: