Docker containers are treated as ephemeral, specifically when they are managed in a Kubernetes cluster. Starting and restarting in the cluster is done automatically and works as a breeze. Things get more complicated as soon as you decide to keep state in your cluster. This is typically done by attaching volumes into a cluster (writing to a host mounted volume is an absolute no-no, except for experimental purposes).

On AWS you would typically use dynamic PersistantVolumes bound to EBS or EFS. Once you attach a volume its AWS volume ID gets tracked by Kubernetes.

In the output below you can see that the volume ID is aws://eu-central-1c/vol-a559b271b5fca7dfa:

Once that volume has been created you loose the capability to control its encryption status or the ability to change its KMS key.

Why would you want to do such a thing in the first place? Say for example, you created a volume unencrypted, written a lot of application state and now want to encrypt the volume. Or you have encrypted the volumes but now decide to use KMS or CMK and therefore need to change the KMS key of that volume. An simple way would be to create a second volume in Kubernetes and then copy the files over. However, when you do not have access to the applications that are using these volumes, this might not be that easy, specifically if you have hundreds of volumes.

In a pure AWS world, a process exists for this:

detach the volume (by stopping the instance),

create a snapshot,

copy snapshot using the correct encryption method,

create a volume out of snapshot in the correct parameters (specifically AZ) and

attach it to the instance.

For volumes that are attached to nodes (not masters), you can stop the instances. But before you do that, suspend the corresponding ASG (Auto Scaling Group) launch event. Otherwise the ASG will start a new instance and probably mount the EBS volume there.

If you terminate the instances (as you would probably should in an immutable world), you need to make sure that the ASG will start the new ones in the same AZ. I had a case with 2 nodes in 3 AZs and sure enough the round robin ASG would start the new instance in the previously unused AZ. The problem is that EBS volumes are AZ-bound and in that case the volume couldn’t be attached to that new instance since they were in 2 different AZs.

However, this will change the volume ID that is tracked by Kubernetes. Here our beloved kubectl patch will come handy:

You can use the command below to show all nodes that are acting as master on your cluster. This is particularly useful when dealing with kops and some versions of canal networking that (accidentally) manipulate the status of the nodes.

If you have a setup where masters are clearly separated from nodes, no user workloads should be scheduled on that masters (because for example, the security groups will prevent application communication between masters and nodes). In this case, you may want to make sure to check that the NoSchedule effect is applied as a taint:

If you are following the principles above, you will retrovision your bastion host (set autoscaling to 1) every time you need it you will run into an issue: ssh will not connect to the server since you kept the name but the server’s identity has changed. This is a security measure that prevents an intruder to redirect traffic to a new server while keeping the name. You can safely delete the old server’s identity from ssh by using:

If however you want to keep access to your config files separate, you might use a script that is exchanging your .kube/config symbolic link. I organize my clusters with numerically (based on a AWS account number), so I rename each cluster config file to say 0123 and put it in my .kube directory. I then use kubeconf 0123 to activate that config. Here is the kubeconf script (located in /usr/local/bin):

Yes, you should use GitHub for your public and private repos (IMHO), but if you just need a small office solution with a few machines and / or users and already have a Diskstation (RAID config and backup recommended) why not install a git server locally.

Install the git server as described in the Synology guide. Create users in DSM that match your workstation users and are assigned to the group Users. In git server DSM manager assign those users to git. Then use the following steps to create repos on the server (guide assumes diskstation is your dns name of the Synology):

Kops is the most popular solution to install Kubernetes on AWS in a highly-available way. Debian is the preferred Linux distro for kops, which is somewhat annoying if you see that CoreOS is the preferred container Linux.

Moreover, the Debian AMI for kops is custom build, not only the OS itself, but also the kernel. AMIs are marked public, so you can easily reuse them. As soon as you want to encrypt your images, you will need access to the underlying snapshot that is not public at the moment.

You may instantiate an EC2 instance, encrypt the snapshot or you can use kube-deploy to make yourself your own image the same way that kops does.

Clone the kube-deploy repo, configure your aws credential and run the following steps in the kube-deploy folder:

Some pods you might want to run on your master nodes, too. This may be because they are exporting master node metrics or even to save resources: say you want many instances of a specific pod. In that case (and given you are on Kubernetes >= 1.7) you can use tolerations to override NoSchedule taints. Add this to your pod’s spec:

tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master

You may even want your pods to run only on master nodes. Then add this node selector key to your pod spec:

nodeSelector:
node-role.kubernetes.io/master: ""

But be aware that masters may not be able to communicate to nodes, due to your setup (security groups etc).

Posted inDevOps|Taggedkubernetes|Comments Off on Kubernetes: Make Pods Run on Your Master Nodes