Containers are transforming the way people think about virtual machines (LXD) and apps (Docker). They give us much better performance and much better density for virtualisation in LXD, and with Docker, they enable new ways to move applications between dev, test and production. These two aspects of containers – the whole machine container and the process container, are perfectly complementary. You can launch Docker process containers inside LXD machine containers very easily. LXD feels like KVM only faster, Docker feels like the core unit of a PAAS.

The density numbers are pretty staggering. It’s *normal* to run hundreds of containers on a laptop.

And that is what creates one of the real frustrations of the container generation, which is a shortage of easily accessible IP addresses.

It seems weird that in this era of virtual everything that a number is hard to come by. The restrictions are real, however, because AWS restricts artificially the number of IP addresses you can bind to an interface on your VM. You have to buy a bigger VM to get more IP addresses, even if you don’t need extra compute. Also, IPv6 is nowehre to be seen on the clouds, so addresses are more scarce than they need to be in the first place.

So the key problem is that you want to find a way to get tens or hundreds of IP addresses allocated to each VM.

Most workarounds to date have involved “overlay networking”. You make a database in the cloud to track which IP address is attached to which container on each host VM. You then create tunnels between all the hosts so that everything can talk to everything. This works, kinda. It results in a mess of tunnels and much more complex routing than you would otherwise need. It also ruins performance for things like multicast and broadcast, because those are now exploding off through a myriad twisty tunnels, all looking the same.

The Fan is Canonical’s answer to the container networking challenge.

We recognised that container networking is unusual, and quite unlike true software-defined networking, in that the number of containers you want on each host is probably roughly the same. You want to run a couple hundred containers on each VM. You also don’t (in the docker case) want to live migrate them around, you just kill them and start them again elsewhere. Essentially, what you need is an address multiplier – anywhere you have one interface, it would be handy to have 250 of them instead.

So we came up with the “fan”. It’s called that because you can picture it as a fan behind each of your existing IP addresses, with another 250 IP addresses available. Anywhere you have an IP you can make a fan, and every fan gives you 250x the IP addresses. More than that, you can run multiple fans, so each IP address could stand in front of thousands of container IP addresses.

We use standard IPv4 addresses, just like overlays. What we do that’s new is allocate those addresses mathematically, with an algorithmic projection from your existing subnet / network range to the expanded range. That results in a very flat address structure – you get exactly the same number of overlay addresses for each IP address on your network, perfect for a dense container setup.

Because we’re mapping addresses algorithmically, we avoid any need for a database of overlay addresses per host. We can calculate instantly, with no database lookup, the host address for any given container address.

More importantly, we can route to these addresses much more simply, with a single route to the “fan” network on each host, instead of the maze of twisty network tunnels you might have seen with other overlays.

You can expand any network range with any other network range. The main idea, though, is that people will expand a class B range in their VPC with a class A range. Who has a class A range lying about? You do! It turns out that there are a couple of class A networks that are allocated and which publish no routes on the Internet.

We also plan to submit an IETF RFC for the fan, for address expansion. It turns out that “Class E” networking was reserved but never defined, and we’d like to think of that as a new “Expansion” class. There are several class A network addresses reserved for Class E, which won’t work on the Internet itself. While you can use the fan with unused class A addresses (and there are several good candidates for use!) it would be much nicer to do this as part of a standard.

The fan is available on Ubuntu on AWS and soon on other clouds, for your testing and container experiments! Feedback is most welcome while we refine the user experience.

This will map 250 addresses on 241.0.0.0/8 to your 172.16.0.0/16 hosts.

Docker, LXD and Juju integration is just as easy. For docker, edit /etc/default/docker.io, adding:

DOCKER_OPTS=”-d -b fan-10-3-4 –mtu=1480 –iptables=false”

You must then restart docker.io:

sudo service docker.io restart

At this point, a Docker instance started via, e.g.,

docker run -it ubuntu:latest

will be run within the specified fan overlay network.

Enjoy!

This entry was posted
on Monday, June 22nd, 2015 at 10:40 am and is filed under cloud.
You can follow any responses to this entry through the
RSS 2.0 feed.
You can leave a response, or trackback from your own site.