How to stabilize Calico's IP-in-IP tunnels in virtual environments

When you work with bleeding-edge technology you can expect the unexpected

As most of us know, software is never without bugs, and given the breadth of
technologies involved, most of us won’t be able to fix these bugs ourselves.
Instead we can develop and deploy workarounds while we wait for specialists to release a fix.

In this post I’d like to share how we resolved a connectivity issue between pods in our Kubernetes cluster, caused by a tunnelling issue in Calico.

Public Cloud

At Fuga Cloud we run a public cloud based on the free and open-source cloud computing platform OpenStack. We initially built this public cloud for our users to set up and manage their own infrastructure, but there has also been internal company demand for a similar service. We eat our own dog food.

Currently we’re working on a continuous deployment pipeline to run OpenStack in containers. For fast iterations we deploy to virtual hardware on our own public cloud. The containers are orchestrated by Kubernetes and inter-container connectivity is handled by Calico. Because we run on virtual hardware we use Calico’s IP-in-IP tunnelling.

IP in IP is an IP tunnelling protocol that encapsulates one IP packet in another IP packet. (source: wikipedia.org)

Problem

The problem we faced using Calico’s IP-in-IP tunnels in a virtual environment was that Kubernetes pods sometimes couldn’t connect to one another during the initialisation phase. Somehow the IP-in-IP tunnels between pods weren’t properly initialised, causing the pods to get stuck in a crash loop. During troubleshooting, over many deployment runs, we discovered that sending ICMP packets between pods on different nodes within the Kubernetes cluster resolved the IP-in-IP network issues we were having.

The success rate of our continuous deployments went from 60% to 100%.

Workaround

Our current workaround is to deploy a pod on each Kubernetes node which sends a single ICMP packet to every pod in the cluster. To deploy these pods we used a few core Kubernetes features, narrowing the workaround down to a single configuration file of no more than 30 lines.

Breaking it down

Kubernetes configuration can be written in manifest files in YAML or JSON format. The first three fields in the following excerpt are required in every Kubernetes manifest: apiVersion, kind and metadata. They declare which API version the manifest complies with and that we want to deploy a DaemonSet resource in the kube-system namespace.
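The original manifest isn’t reproduced here, but its opening fields would look roughly like this. The resource name calico-jumpstart is an illustrative placeholder, and the apiVersion is an assumption (DaemonSets have moved between API groups across Kubernetes releases):

```yaml
apiVersion: apps/v1        # assumed; older clusters used e.g. extensions/v1beta1
kind: DaemonSet
metadata:
  name: calico-jumpstart   # illustrative name, not the original
  namespace: kube-system
```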

A DaemonSet ensures that all (or some) Nodes run a copy of a Pod (source: kubernetes.io)

The image we want to use for our container is busybox. This image contains all the basic Unix tools we need to execute a shell script and ping other pods. The busybox image is pre-installed with Kubernetes, which frees us from building and registering our own container image.

The busybox container runs a single process, configured in the command field. This process loops through all the running pods in the Kubernetes cluster and sends each one an ICMP packet to jumpstart the IP-in-IP tunnel configured by Calico.
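A minimal sketch of the container section, assuming a simple ping loop; the container name and the exact ping flags are illustrative, not taken from the original manifest:

```yaml
      containers:
      - name: jumpstart          # illustrative name
        image: busybox
        command:
        - /bin/sh
        - -c
        # Ping every pod IP once; the pod list comes from kubectl,
        # which is mounted into the container from the host.
        - |
          for ip in $(kubectl get pods --all-namespaces \
              -o go-template='{{range .items}}{{if .status.podIP}}{{.status.podIP}} {{end}}{{end}}'); do
            ping -c 1 -w 1 "$ip"
          done
          sleep 3600
```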

To know which pods to send ICMP packets to, we need to query the Kubernetes API with the Kubernetes client kubectl. We could install this client in our container, but because it is pre-installed on all our Kubernetes nodes, we simply mount the host directory containing the binary into our container. An additional benefit is that the Kubernetes client will always match the Kubernetes API version.
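The mount can be sketched as a hostPath volume along these lines; the directory /usr/local/bin and the volume name host-bin are assumptions and must match wherever kubectl actually lives on the node:

```yaml
        volumeMounts:
        - name: host-bin
          mountPath: /usr/local/bin   # assumed; must match the hostPath below
          readOnly: true
      volumes:
      - name: host-bin
        hostPath:
          path: /usr/local/bin        # assumed location of kubectl on the host
```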

By default the Kubernetes client kubectl returns output in a human-readable format, but it also supports machine-readable formats like YAML and JSON. Another option is the built-in templating system, with which you can do insane things like build shell scripts. The following excerpt loops through the pods and selects those which have an IP address and are not in the kube-system namespace.
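The original excerpt isn’t included here, but a Go template along these lines implements that filter, emitting one ping command per matching pod. This is a sketch assuming the standard .status.podIP and .metadata.namespace fields; the trailing pipe into /bin/sh (which executes the generated script) is optional and requires a live cluster:

```shell
kubectl get pods --all-namespaces -o go-template='{{range .items}}{{if and .status.podIP (ne .metadata.namespace "kube-system")}}ping -c 1 -w 1 {{.status.podIP}}{{"\n"}}{{end}}{{end}}' | /bin/sh
```

Pods without an IP yet make .status.podIP empty, so the `and` condition skips them, and the `ne` comparison drops everything in kube-system.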