If you are running a cross-datacenter or hybrid cloud infrastructure,
then your containers, services, and endpoints need to be able to
communicate with each other, preferably in the most straightforward and
fastest way possible. This post showcases some of the options available
to you, and how they compare.

For running Kubernetes, your options vary slightly. AWS offers its EKS service, Digital Ocean has an early access program and a quick launch for CoreOS, and Exoscale has CoreOS templates, so for the near
future, you still need to undertake a reasonable amount of setup and configuration yourself. For an introduction to container orchestration with Kubernetes, Docker Swarm, and Mesos with Marathon, we recommend you read our previous blog post, which tells you all you need to know.

The Container Network Interface (CNI)

Before diving into the options, let’s first look at the specification that
underpins a lot of them, created out of a need to consolidate the
growing number of conflicting approaches that appeared as containers grew in
popularity. The CNI is a
Cloud Native Computing Foundation project that
defines how other projects should write their container network
interfaces, alongside reference code and CLI tools. The projects included in
this post that conform to CNI, and thus provide a more standard
experience, are Calico, Weave Net, and Cilium; Kubernetes and Mesos
provide native support.
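To give a flavour of the standard, here is what a network definition looks like for the CNI reference bridge plugin; CNI-conformant runtimes conventionally read files like this from /etc/cni/net.d/. The network name, bridge, and subnet below are purely illustrative:

```
# Drop a network definition where the container runtime can find it
# (this sketch uses the reference "bridge" and "host-local" plugins)
cat > /etc/cni/net.d/10-mynet.conf <<EOF
{
  "cniVersion": "0.3.1",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.22.0.0/16",
    "routes": [ { "dst": "0.0.0.0/0" } ]
  }
}
EOF
```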

Container Networking options

Docker overlay network

For those of you using Docker in simpler setups, or with Swarm as a container
orchestrator, the inbuilt overlay
network could be enough for
your needs. The driver creates two networks: an overlay network that
handles control and data traffic for services, and a bridge network that
connects the individual Docker daemons. In addition to the default networks
created by Swarm, you can create your own, and connect multiple Docker
instances to multiple networks.
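As a quick sketch of what that looks like in practice (the network name, subnet, and service here are illustrative), you create the network on a swarm manager and attach services to it:

```
# Create a user-defined overlay network on a swarm manager
docker network create --driver overlay --subnet 10.0.9.0/24 my-overlay

# Services attached to the network can reach each other by service name
docker service create --name web --network my-overlay --replicas 2 nginx
```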

Kubefed

As the name may imply,
Kubefed
stands for ‘Kubernetes federation’ and is the inbuilt option for hybrid
cloud Kubernetes orchestration. The Kubernetes project considers it alpha
software, with no clear timeline for when it will be production ready.
Perhaps that’s why there are so many third-party solutions available.

If you do want to try kubefed, you have to download a tarball and work
through a fairly involved series of steps, starting with setting up a host
for the ‘control plane’: the instance responsible for routing requests
around the cluster. Discovery happens via DNS provided by AWS, GCE, or
any other host that supports CoreDNS services,
which includes Digital Ocean and Exoscale.
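Once the tooling is in place, initialising a federation boils down to something like the following sketch; the federation name, contexts, and DNS zone are placeholders, so check the kubefed documentation for the exact flags your version supports:

```
# Set up the federation control plane on a host cluster, using
# CoreDNS for cross-cluster service discovery
kubefed init myfed \
  --host-cluster-context=host-cluster \
  --dns-provider="coredns" \
  --dns-zone-name="example.com." \
  --dns-provider-config="$HOME/coredns-provider.conf"

# Join another cluster to the federation
kubefed join other-cluster --host-cluster-context=host-cluster
```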

Project Calico

With one of the broadest support bases,
Calico works with most orchestrators,
such as Docker, Kubernetes, Mesos, and OpenStack, as well as rkt and bare
metal Linux boxes. It also has extensive support for cloud hosting providers
that expose any of the standard APIs used by Docker, Kubernetes, or
OpenStack. On top of all this, Calico takes a different approach to virtual
networking, using packet routing most of the time and only switching to an
overlay network when crossing availability zones between regions or
providers. In theory, this reduces network noise, but naturally, it
depends on your setup.

One caveat is that there is currently some discrepancy between supported
platforms and Calico versions, so check the project
documentation first.
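On Kubernetes, for example, installation is typically a matter of applying the manifest from the documentation and verifying the agents, roughly like this (the manifest file name depends on the versions you pick):

```
# Apply the Calico manifest that matches your Kubernetes version
kubectl apply -f calico.yaml

# Confirm the per-node agents are running and have peered
kubectl get pods -n kube-system -l k8s-app=calico-node
calicoctl node status
```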

Cilium

As one of the other more popular choices, Cilium
supports Docker, Kubernetes, and Mesos, with its Kubernetes support
offering secure interfaces over HTTP, gRPC and, interestingly,
Kafka. At its core, Cilium uses a relatively new Linux
technology called the ‘Berkeley Packet
Filter‘ (BPF),
a kernel-level interface that, among other things, is useful for building
high-speed virtual networks. Aside from the speed improvements, BPF also
allows Cilium users to make configuration changes without having to update
application code or container configuration.

Cilium supports overlay networks with the VXLAN and Geneve encapsulation
formats by default, but supports any others you care to add. As a
first-class Linux citizen, it also supports native Linux routing via the
OS routing table, which is more complex, but plays well with existing
setups.
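That HTTP awareness means policies can reference methods and paths rather than just ports. A hedged sketch of a Cilium network policy on Kubernetes, where all labels, names, and paths are illustrative:

```
# Allow only GET /public to pods labelled app=service
kubectl apply -f - <<EOF
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-public
spec:
  endpointSelector:
    matchLabels:
      app: service
  ingress:
  - toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/public"
EOF
```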

Flannel

Aimed primarily at hybrid cloud Kubernetes setups, but with Docker support if
you’re willing to experiment,
flannel is a small agent binary
that runs on each instance in a cluster. Flannel creates an overlay
network by hooking into etcd (more on that later in the post) to access
and store network configurations, forwarding packets via a variety of
methods including UDP, VXLAN, and some cloud hosting specific routing
options. Flannel is part of the CoreOS project, so if you are running
Kubernetes on CoreOS, you can use its Tectonic tool to create
a hybrid cloud cluster that communicates via flannel.
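A minimal manual setup, assuming an etcd endpoint is already reachable and using illustrative addresses, looks roughly like this:

```
# Store the network configuration where flannel expects it (etcd v2 API)
etcdctl set /coreos.com/network/config \
  '{"Network": "10.5.0.0/16", "Backend": {"Type": "vxlan"}}'

# Start the agent on each instance, pointing it at etcd
flanneld --etcd-endpoints=http://127.0.0.1:2379
```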

Weave Net

Part of a suite of tools and services aimed at cloud native
infrastructures, Weave Net is one of
the first container-focused virtual networks I ever used, and it was one of
the few solutions available in the early days of Docker. Weave Net’s point of
difference is that, unlike other options, it doesn’t need a central
database or store, reducing complexity and overhead. Instead, each node
runs a small DNS server, allowing containers to find each other across the
network by name and port. It’s available as a plugin for Docker, Kubernetes,
and Mesos, as well as AWS products.
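Getting a basic Weave network running across two Docker hosts is pleasantly short; in this sketch the second host's peer address is a placeholder:

```
# On the first host: install and launch Weave Net
curl -L git.io/weave -o /usr/local/bin/weave
chmod +x /usr/local/bin/weave
weave launch

# On the second host: launch and peer with the first
weave launch <ip-of-first-host>

# Route new containers through Weave so they can find each other by name
eval $(weave env)
docker run -d --name db postgres
```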

Service Discovery options

With many of the overlay networks, you need an additional central source
of truth that tracks and manages services across the network. These stores
can add a lot of complexity and noise to networks, and all require separate
installation on each instance, so experiment with their settings to get
your configuration as optimal as you can.

Consul

Part of the HashiCorp suite of tools, Consul
provides a key-value store mainly used for configuration, coordination,
and leader election in distributed applications, but which you can also
use for any other purpose you need. Service discovery happens via HTTP
or DNS, and with built-in health checking, the Consul network routes traffic
to the instances you deem healthier than others when needed to maintain a
consistent application. AWS has its own template available, but for Exoscale and Digital Ocean, you can install it yourself.
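A hedged example of registering a service with a health check, using illustrative names, ports, and paths:

```
# Drop a service definition into Consul's configuration directory
cat > /etc/consul.d/web.json <<EOF
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}
EOF
consul reload

# Discover healthy instances through Consul's DNS interface
dig @127.0.0.1 -p 8600 web.service.consul
```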

Apache Zookeeper

A long-standing project, Zookeeper
operates via a form of shared hierarchical file system made up of
‘znodes’; instances and applications in the network read and write
configuration and status information to this structure via an API. It
provides services for naming instances, configuration, cluster
membership and management, and locking mechanisms to prevent race
conditions in, and help recover the data of, the distributed systems it
manages. Zookeeper is best integrated into application stacks that use Java
or C, and into the broader Apache application landscape (e.g. Solr, Hadoop),
but unofficial bindings exist for other languages too. Zookeeper is part of
the AWS EMR service (designed for data processing), and again, for Digital
Ocean and Exoscale, you can install it yourself.
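To get a feel for the znode model, here is a short session with the CLI that ships with Zookeeper; the paths and values are illustrative:

```
# Connect to a running Zookeeper instance
zkCli.sh -server 127.0.0.1:2181

# Create a hierarchy of znodes, then read it back
create /myapp ""
create /myapp/config "some-configuration"
get /myapp/config
ls /myapp
```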

etcd

etcd is part of the CoreOS project, but it’s also used by Kubernetes
as a primary data store, so you’re probably using
it without realizing. With
support for Docker, rkt, and Container Linux containers, as well as
FreeBSD and AWS EC2 instances, etcd provides a distributed key-value
store behind a comprehensive API, and it’s renowned for its speed.
It shares common traits with the rest of the CoreOS tools, following common
Linux-esque design patterns and using standardized tools such as gRPC to
interact with the service, which provide equally standard language
bindings and JSON outputs.
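As a small taste of the API via the bundled etcdctl client (keys and values are illustrative, and the v3 API is assumed):

```
# Write and read keys under a common prefix
ETCDCTL_API=3 etcdctl put /services/web/instance1 "10.0.9.12:8080"
ETCDCTL_API=3 etcdctl get /services/web --prefix

# Watch the prefix to react to instances joining or leaving
ETCDCTL_API=3 etcdctl watch /services/web --prefix
```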

Synapse

One of the lesser known and used options, developed in-house at Airbnb,
Synapse shows that if you have the
resources, creating your own service discovery system isn’t an impossible
task. Airbnb’s motivation for creating Synapse was a pattern familiar from
the other options presented in this post: developers have to integrate the
discovery service into their applications, which isn’t always possible with
some older enterprise software. Instead, Synapse leverages routing
components likely already present in your infrastructure, such as HAProxy
or NGINX: you map the ports your services use to Synapse and configure your
application to use it instead. Synapse also recognizes that you might
already use other services, and offers ‘watchers’ for common options such
as AWS EC2, Docker, and Zookeeper.
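A hedged sketch of what a Synapse configuration can look like, watching a service registered in Zookeeper and exposing it locally through HAProxy; every name, path, and port here is illustrative, so verify the exact keys against the Synapse README:

```
cat > synapse.conf.yaml <<EOF
services:
  web:
    discovery:
      method: zookeeper
      path: /services/web
      hosts:
        - zk1.example.com:2181
    haproxy:
      port: 3213
      server_options: check inter 2s rise 3 fall 2
haproxy:
  reload_command: service haproxy reload
  config_file_path: /etc/haproxy/haproxy.cfg
  do_writes: true
  do_reloads: true
EOF
synapse --config synapse.conf.yaml
```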

What does Exoscale's hybrid cloud use?

At Exoscale, our preferred combination of tools is Calico and etcd, as
it offers the widest support base for our customers and the most
familiar experience for existing Linux users. As always with
recommendations, the final decision is up to you: balance the options
that offer the performance you need against the existing skills of those
working on your application. All the options we presented in this post
are open source, with active communities and plenty of tutorials and
case studies available, so whichever option(s) you choose, you’re in
good hands.