Kubernetes, Ingress controllers and Traefik

When running your application services on top of an orchestration tool like Kubernetes, or Mesos with Marathon, there are some common needs you’ll have to satisfy. Your application will usually contain two types of services: those that should be visible only from inside the cluster, and those that you want to expose to the external world, outside your cluster and maybe to the internet (e.g. frontends).

This article will focus on how to approach this on Kubernetes.

When creating a new service, you can choose among the different service types that Kubernetes makes available in order to achieve this.

ClusterIP: This is the default. Choosing this value means that you want this service to be reachable only from inside of the cluster.

ExternalName: It serves as a way to return an alias to an external service residing outside the cluster.

NodePort: Expose the service on a port on each node of the cluster.

LoadBalancer: On top of having a cluster-internal IP and exposing the service on a NodePort, this also asks the cloud provider for a load balancer that forwards requests to the service’s <NodeIP>:NodePort on each node. If the cloud provider does not support this feature, the field is ignored.
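As a sketch, a minimal Service manifest of type LoadBalancer might look like the following (the service name, labels, and ports here are illustrative, not from any particular deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend          # hypothetical service name
spec:
  type: LoadBalancer      # ask the cloud provider for an external load balancer
  selector:
    app: frontend         # matches the pods backing this service
  ports:
    - port: 80            # port exposed by the load balancer
      targetPort: 8080    # port the pods actually listen on
```

On a cloud that supports it, Kubernetes will provision a load balancer and record its address in the service’s status; elsewhere, the service still behaves like a NodePort service.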

So, if your cloud provider does not support LoadBalancer (e.g. you run an on-premises private cloud), and you need something more sophisticated than exposing a port on every node of the cluster, you used to have to build your own custom solution. Fortunately, this is no longer true.


An Ingress is a collection of rules that allow inbound connections to reach the cluster services. It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, offer name-based virtual hosting, and more. Users request Ingress by POSTing the Ingress resource to the API server.
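As an illustration, an Ingress that routes a hostname to an in-cluster Service could look like this (hostname, resource names, and ports are made up for the example; the apiVersion depends on your cluster version):

```yaml
apiVersion: extensions/v1beta1   # networking.k8s.io/v1 on newer clusters
kind: Ingress
metadata:
  name: frontend-ingress         # hypothetical Ingress name
spec:
  rules:
    - host: frontend.example.com # requests for this hostname...
      http:
        paths:
          - path: /
            backend:
              serviceName: frontend  # ...are routed to this Service
              servicePort: 80
```

Note that the Ingress resource by itself does nothing; it only declares the desired routing rules.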

In order for the Ingress resource to work, the cluster must have an Ingress controller running. The Ingress controller is responsible for fulfilling the Ingress dynamically by watching the API server’s /ingresses endpoint.

This is handy! Now you could go even further: isolate, at the infrastructure level, where your Ingress controller runs, and think of it as an “edge router” that enforces the firewall policy for your cluster. The picture for a highly available Kubernetes cluster would look something like this:

We’ll show how to use Traefik for this purpose. Traefik is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It supports several backends, including Mesos/Marathon and Kubernetes, and manages its configuration automatically and dynamically.

We’ll deploy a Kubernetes cluster similar to the picture above and run Traefik as a DaemonSet.

Our edge-router will be just another Kubernetes node with some restrictions.

We don’t want any other pods to be scheduled onto this node, so we set --register-schedulable=false when running the kubelet. We also give the node a convenient label: --node-labels=role=edge-router (node labels are key=value pairs, matching the nodeSelector below).

Kubernetes will run DaemonSets on every node of the cluster even if they are non-schedulable. We only want this DaemonSet to run on the edge-router node so we use “nodeSelector” to match the label we previously added.

nodeSelector:
  role: edge-router
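Putting it together, a minimal sketch of the Traefik DaemonSet with that nodeSelector might look like this (resource names and the exact Traefik arguments are illustrative; --kubernetes enables Traefik’s Kubernetes backend in the 1.x series):

```yaml
apiVersion: extensions/v1beta1     # apps/v1 on newer clusters
kind: DaemonSet
metadata:
  name: traefik-ingress-controller
spec:
  template:
    metadata:
      labels:
        app: traefik-ingress-controller
    spec:
      nodeSelector:
        role: edge-router          # only schedule onto edge-router node(s)
      containers:
        - name: traefik
          image: traefik
          args:
            - --kubernetes         # watch the API server for Ingress rules
          ports:
            - containerPort: 80
              hostPort: 80         # bind port 80 directly on the node
```

Binding hostPort 80 on the edge-router node is what lets external traffic hit Traefik directly, without an extra NodePort indirection.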

Notice that with this approach, if you want to add a new edge router to the cluster, all you need to do is spin up a new node with that label, and a new Traefik pod from the DaemonSet will automatically be scheduled onto that machine. Nice!

Here is a video demo of all this in action using two different clouds (DigitalOcean and AWS), deploying two Kubernetes clusters from scratch:

Recently I’ve seen a lot of users on the Kubernetes Slack having issues communicating with the Ingress controller. This is often due to a known problem.