Monthly Archives: January 2015

I have been working with container technology since September 2014, sorting out how it is useful in the context of OpenStack. This led to my involvement in Kolla, a project to containerize OpenStack, and Magnum, a project to provide containers as a service. Containers are super useful as an upgrade tool for OpenStack, and that is the main topic of this blog post.

Kolla began life as a project with dependencies on Docker and Kubernetes. I wasn’t always certain the Kubernetes dependency was necessary to provide container deployments in OpenStack, but I went with it. Over time, we found Kubernetes has a lot to offer OpenStack deployments. But it lacks a few features, which makes it unsuitable for deploying “super-privileged containers”.

A super privileged container is a container where one or more of the following are true:

Kubernetes could be modified to allow super-privileged containers, but until that day comes, it won’t be suitable for running them. There is currently no way to express these things in Kubernetes pod files, because they carry runtime and privilege considerations: essentially, they assume the operator trusts the application running in super-privileged mode with the possibility of rooting their entire datacenter. I suspect the Kubernetes maintainers have been unwilling to make these options available because of this concern.

I have spent several weeks researching how to upgrade the compute node in nova-networking mode, which consists of nova-network, nova-compute, and nova-libvirt processes. I started by borrowing the Kolla containers for nova-network and nova-compute and cloned them into a new compute-upgrade repo.

Most of the hard work in this project was building the containers, and the cp command got me halfway to victory 🙂 Next, I sorted out a run command for each of the containers, then merged the three run commands into a script called start-compute.

First, a few directories must be shared for nova-libvirt:

/sys: To allow libvirt to communicate with systemd in the host process

/sys/fs/cgroup: To allow libvirt to share cgroup changes with the host process

/var/lib/libvirt: To allow libvirt and nova to share persistent data

/var/lib/nova: To allow libvirt and nova to share persistent data

Second, libvirt must be able to reparent processes to the init (pid=1) systemd process during an upgrade. If it can’t do that, the libvirt qemu processes will have no parent once libvirt has been killed for the upgrade. The answer lies in a brand-new Docker feature that allows a container to share the host PID namespace. To gain this super-privilege, the --pid=host flag must be used.

Third, nova-network, nova-libvirt, and nova-compute must share the host network namespace. To obtain access to this super-privilege, the docker --net=host flag must be used.

Finally, some non-privileged environment variables must be passed to the container using the -e flag. A combination of these flags makes up the launch command.
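As a rough illustration of how the flags described above fit together for nova-libvirt, here is a minimal sketch. It is not the start-compute script itself: the image name, container name, and the environment variable are hypothetical placeholders, and the sketch assumes each shared directory is mounted at the same path inside the container. Only the flags named in the text are shown, so a real deployment may need further options. The function prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical placeholders: not values taken from the original post.
IMAGE="${IMAGE:-example/nova-libvirt}"
NAME="${NAME:-nova-libvirt}"

# Print (do not run) a docker launch command built from the flags in the text.
libvirt_run_cmd() {
    # --pid=host : share the host PID namespace (reparenting to pid 1)
    # --net=host : share the host network namespace
    # -v         : bind-mount the four shared host directories
    # -e         : pass a non-privileged environment variable (name is made up)
    echo docker run -d --name "$NAME" \
        --pid=host \
        --net=host \
        -v /sys:/sys \
        -v /sys/fs/cgroup:/sys/fs/cgroup \
        -v /var/lib/libvirt:/var/lib/libvirt \
        -v /var/lib/nova:/var/lib/nova \
        -e EXAMPLE_SETTING=value \
        "$IMAGE"
}

libvirt_run_cmd
```

Once the printed command looks right, it can be run by hand or piped to a shell, for example `libvirt_run_cmd | sh`.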

My testbed is a two-node Fedora 21 cluster. One node runs devstack in nova-network mode. The other node simulates a compute node by running the containers produced in this repository, with minimal other operating system services running. Note that the ebtables module must be loaded with modprobe in the host OS of the compute node, and libvirt must be disabled there.
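The host preparation on the compute node might look something like the sketch below. It assumes the host libvirt daemon runs as the systemd unit libvirtd, which is an assumption about the testbed rather than something stated above. The function only prints the commands, since they need root and change the host.

```shell
#!/bin/sh
# Sketch: print the host preparation steps for the compute node.
# Assumption: the host libvirt daemon is the systemd unit "libvirtd".
host_prep_cmds() {
    cat <<'EOF'
modprobe ebtables
systemctl stop libvirtd
systemctl disable libvirtd
EOF
}

host_prep_cmds
```

To apply the steps, the output can be piped to a root shell, for example `host_prep_cmds | sudo sh`.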

OK, so you just showed stopping and starting a container? Where is the atomic part? Any container of OpenStack compute can be atomically upgraded as follows:

docker pull (to obtain new image)

docker stop

docker start
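The steps above can be sketched as a small helper that prints the upgrade sequence for one container. One deliberate difference from the list: `docker start` resumes a stopped container from the image it was originally created with, so this sketch assumes the container is removed and recreated with `docker run` so that the newly pulled image is actually used. The container name, image name, and the default RUN_FLAGS value are hypothetical stand-ins for the real launch flags.

```shell
#!/bin/sh
# Sketch: print the upgrade steps for one compute container.
# RUN_FLAGS stands in for the full set of launch flags; the default is
# only a placeholder.
DEFAULT_FLAGS="--pid=host --net=host"
RUN_FLAGS="${RUN_FLAGS:-$DEFAULT_FLAGS}"

# $1 = container name, $2 = image name (both hypothetical)
upgrade_cmds() {
    name="$1"
    image="$2"
    echo "docker pull $image"   # obtain the new image
    echo "docker stop $name"    # stop the running container
    echo "docker rm $name"      # remove it so the name can be reused
    echo "docker run -d --name $name $RUN_FLAGS $image"   # launch from the new image
}

upgrade_cmds nova-compute example/nova-compute
```

Pulling first keeps the window between stop and run short, because the download happens while the old container is still serving.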

From the perspective of the compute infrastructure, it looks like an atomic upgrade. No messy upgrades of hundreds of RPM or DEB packages. Just replace a running image with a new image.

It is highly likely I will re-integrate this work into Kolla, since Kolla is the home for R&D related to launching OpenStack within containers. Unfortunately, until Kubernetes grows the required features, it is unsuitable as a deployment system for OpenStack compute nodes.