Earlier this month, I spoke at ContainerDays, part of the excellent DevOpsDays series of conferences -- this one in lovely Portland, Oregon.

I gave a live demo of Kubernetes running directly on bare metal. I was running it on an 11-node Ubuntu Orange Box -- but I used the exact same tools Canonical's world class consulting team uses to deploy Kubernetes onto racks of physical machines.

You see, the ability to run Kubernetes on bare metal, behind your firewall is essential to the yin-yang duality of Cloud Native computing. Sometimes, what you need is actually a Native Cloud.

Deploying Kubernetes into virtual machines in the cloud is rather easy, straightforward, with dozens of tools now that can handle that.

But there's only one tool today, that can deploy the exact same Kubernetes to AWS, Azure, GCE, as well as VMware, OpenStack, and bare metal machines. That tools is conjure-up, which acts as a command line front end to several essential Ubuntu tools: MAAS, LXD, and Juju.

I don't know if the presentation was recorded, but I'm happy to share with you my slides for download, and embedded here below. There are a few screenshots within that help convey the demo.

For a long time LXD has supported multiple storage drivers. Users could choose between zfs, btrfs, lvm, or plain directory storage pools but they could only ever use a single storage pool. A frequent feature request was to support not just a single storage pool but multiple storage pools. This way users would for example be able to maintain a zfs storage pool backed by an SSD to be used by very I/O intensive containers and another simple directory based storage pool for other containers. Luckily, this is now possible since LXD gained its own storage management API a few versions back.

Creating storage pools

A new LXD installation comes without any storage pool defined. If you run lxd init LXD will offer to create a storage pool for you. The storage pool created by lxd init will be the default storage pool on which containers are created.

Creating further storage pools

Our client tool makes it really simple to create additional storage pools. In order to create and administer new storage pools you can use the lxc storage command. So if you wanted to create an additional btrfs storage pool on a block device /dev/sdb you would simply use lxc storage create my-btrfs btrfs source=/dev/sdb. But let’s take a look:

Creating containers on the default storage pool

If you started from a fresh install of LXD and created a storage pool via lxd init LXD will use this pool as the default storage pool. That means if you’re doing a lxc launch images:ubuntu/xenial xen1 LXD will create a storage volume for the container’s root filesystem on this storage pool. In our examples we’ve been using my-first-zfs-pool as our default storage pool:

Creating containers on a specific storage pool

But you can also tell lxc launch and lxc init to create a container on a specific storage pool by simply passing the -s argument. For example, if you wanted to create a new container on the my-btrfs storage pool you would do lxc launch images:ubuntu/xenial xen-on-my-btrfs -s my-btrfs:

Creating custom storage volumes

If you need additional space for one of your containers to for example store additional data the new storage API will let you create storage volumes that can be attached to a container. This is as simple as doing lxc storage volume create my-btrfs my-custom-volume:

Attaching custom storage volumes to containers

Of course this feature is only helpful because the storage API let’s you attach those storage volume to containers. To attach a storage volume to a container you can use lxc storage volume attach my-btrfs my-custom-volume xen1 data /opt/my/data:

Sharing custom storage volumes between containers

By default LXD will make an attached storage volume writable by the container it is attached to. This means it will change the ownership of the storage volume to the container’s id mapping. But Storage volumes can also be attached to multiple containers at the same time. This is great for sharing data among multiple containers. However, this comes with a few restrictions. In order for a storage volume to be attached to multiple containers they must all share the same id mapping. Let’s create an additional container xen-isolated that has an isolated id mapping. This means its id mapping will be unique in this LXD instance such that no other container does have the same id mapping. Attaching the same storage volume my-custom-volume to this container will now fail:

But let’s make xen-isolated have the same mapping as xen1 and let’s also rename it to xen2 to reflect that change. Now we can attach my-custom-volume to both xen1 and xen2 without a problem:

Summary

The storage API is a very powerful addition to LXD. It provides a set of essential features that are helpful in dealing with a variety of problems when using containers at scale. This short introducion hopefully gave you an impression on what you can do with it. There will be more to come in the future.

Introduction

As you may know, LXD uses unprivileged containers by default.
The difference between an unprivileged container and a privileged one is whether the root user in the container is the “real” root user (uid 0 at the kernel level).

The way unprivileged containers are created is by taking a set of normal UIDs and GIDs from the host, usually at least 65536 of each (to be POSIX compliant) and mapping those into the container.

The most common example and what most LXD users will end up with by default is a map of 65536 UIDs and GIDs, with a host base id of 100000. This means that root in the container (uid 0) will be mapped to the host uid 100000 and uid 65535 in the container will be mapped to uid 165535 on the host. UID/GID 65536 and higher in the container aren’t mapped and will return an error if you attempt to use them.

From a security point of view, that means that anything which is not owned by the users and groups mapped into the container will be inaccessible. Any such resource will show up as being owned by uid/gid “-1” (rendered as 65534 or nobody/nogroup in userspace). It also means that should there be a way to escape the container, even root in the container would find itself with just as much privileges on the host as a nobody user.

LXD does offer a number of options related to unprivileged configuration:

Increasing the size of the default uid/gid map

Setting up per-container maps

Punching holes into the map to expose host users and groups

Increasing the size of the default map

As mentioned above, in most cases, LXD will have a default map that’s made of 65536 uids/gids.

In most cases you won’t have to change that. There are however a few cases where you may have to:

You need access to uid/gid higher than 65535.
This is most common when using network authentication inside of your containers.

You want to use per-container maps.
In which case you’ll need 65536 available uid/gid per container.

You want to punch some holes in your container’s map and need access to host uids/gids.

The default map is usually controlled by the “shadow” set of utilities and files. On systems where that’s the case, the “/etc/subuid” and “/etc/subgid” files are used to configure those maps.

On systems that do not have a recent enough version of the “shadow” package. LXD will assume that it doesn’t have to share uid/gid ranges with anything else and will therefore assume control of a billion uids and gids, starting at the host uid/gid 100000.

But the common case, is a system with a recent version of shadow.
An example of what the configuration may look like is:

The maps for “lxd” and “root” should always be kept in sync. LXD itself is restricted by the “root” allocation. The “lxd” entry is used to track what needs to be removed if LXD is uninstalled.

Now if you want to increase the size of the map available to LXD. Simply edit both of the files and bump the last value from 65536 to whatever size you need. I tend to bump it to a billion just so I don’t ever have to think about it again:

As you can see, the configured map is logged at LXD startup and can be used to confirm that the reconfiguration worked as expected.

You’ll then need to restart your containers to have them start using your newly expanded map.

Per container maps

Provided that you have a sufficient amount of uid/gid allocated to LXD, you can configure your containers to use their own, non-overlapping allocation of uids and gids.

This can be useful for two reasons:

You are running software which alters kernel resource ulimits.
Those user-specific limits are tied to a kernel uid and will cross container boundaries leading to hard to debug issues where one container can perform an action but all others are then unable to do the same.

You want to know that should there be a way for someone in one of your containers to somehow get access to the host that they still won’t be able to access or interact with any of the other containers.

The main downsides to using this feature are:

It’s somewhat wasteful with using 65536 uids and gids per container.
That being said, you’d still be able to run over 60000 isolated containers before running out of system uids and gids.

It’s effectively impossible to share storage between two isolated containers as everything written by one will be seen as -1 by the other. There is ongoing work around virtual filesystems in the kernel that will eventually let us get rid of that limitation.

The restart step is needed to have LXD remap the entire filesystem of the container to its new map.
Note that this step will take a varying amount of time depending on the number of files in the container and the speed of your storage.

As can be seen above, after restart, the container is shown to have its own map of 65536 uids/gids.

If you want LXD to allocate more than the default 65536 uids/gids to an isolated container, you can bump the size of the allocation with:

Conclusion

User namespaces, the kernel feature that makes those uid/gid mappings possible is a very powerful tool which finally made containers on Linux safe by design. It is however not the easiest thing to wrap your head around and all of that uid/gid map math can quickly become a major issue.

In LXD we’ve tried to expose just enough of those underlying features to be useful to our users while doing the actual mapping math internally. This makes things like the direct user/group mapping above significantly easier than it otherwise would be.

Going forward, we’re very interested in some of the work around uid/gid remapping at the filesystem level, this would let us decouple the on-disk user/group map from that used for processes, making it possible to share data between differently mapped containers and alter the various maps without needing to also remap the entire filesystem.

USB devices in containers

It can be pretty useful to pass USB devices to a container. Be that some measurement equipment in a lab or maybe more commonly, an Android phone or some IoT device that you need to interact with.

Similar to what I wrote recently about GPUs, LXD supports passing USB devices into containers. Again, similarly to the GPU case, what’s actually passed into the container is a Unix character device, in this case, a /dev/bus/usb/ device node.

This restricts USB passthrough to those devices and software which use libusb to interact with them. For devices which use a kernel driver, the module should be installed and loaded on the host, and the resulting character or block device be passed to the container directly.

Note that for this to work, you’ll need LXD 2.5 or higher.

Example (Android debugging)

As an example which quite a lot of people should be able to relate to, lets run a LXD container with the Android debugging tools installed, accessing a USB connected phone.

This would for example allow you to have your app’s build system and CI run inside a container and interact with one or multiple devices connected over USB.

First, plug your phone over USB, make sure it’s unlocked and you have USB debugging enabled:

LXD USB devices support hotplug by default. So unplugging the device and plugging it back on the host will have it removed and re-added to the container.

The “productid” property isn’t required, you can set only the “vendorid” so that any device from that vendor will be automatically attached to the container. This can be very convenient when interacting with a number of similar devices or devices which change productid depending on what mode they’re in.

Conclusion

We are surrounded by a variety of odd USB devices, a good number of which come with possibly dodgy software, requiring a specific version of a specific Linux distribution to work. It’s sometimes hard to accommodate those requirements while keeping a clean and safe environment.

LXD USB device passthrough helps a lot in such cases, so long as the USB device uses a libusb based workflow and doesn’t require a specific kernel driver.

If you want to add a device which does use a kernel driver, locate the /dev node it creates, check if it’s a character or block device and pass that to LXD as a unix-char or unix-block type device.

GPU inside a container

LXD supports GPU passthrough but this is implemented in a very different way than what you would expect from a virtual machine. With containers, rather than passing a raw PCI device and have the container deal with it (which it can’t), we instead have the host setup with all needed drivers and only pass the resulting device nodes to the container.

This post focuses on NVidia and the CUDA toolkit specifically, but LXD’s passthrough feature should work with all other GPUs too. NVidia is just what I happen to have around.

The test system used below is a virtual machine with two NVidia GT 730 cards attached to it. Those are very cheap, low performance GPUs, that have the advantage of existing in low-profile PCI cards that fit fine in one of my servers and don’t require extra power.
For production CUDA workloads, you’ll want something much better than this.

The LXD demo server

Rather than use some javascript simulation of LXD and its client tool, we give our visitors a real root shell using a LXD container with nesting enabled. This environment is using all of LXD’s resource limits as well as a very strict firewall to prevent abuses and offer everyone a great experience.

This is done using lxd-demo-server which can be found at: https://github.com/lxc/lxd-demo-server
The lxd-demo-server is a daemon that offers a public REST API for use from a web browser.
It supports:

Creating containers from an existing container or from a LXD image

Choose what command to execute in the containers on connection

Lets you choose specific profiles to apply to the containers

An API to record user feedback

An API to fetch usage statistics for reporting

A number of resource restrictions:

CPU

Disk quota (if using btrfs or zfs as the LXD storage backend)

Processes

Memory

Number of sessions per IP

Time limit for the session

Total number of concurrent sessions

Requiring the user to read and agree to terms of service

Recording all sessions in a sqlite3 database

A maintenance mode

All of it is configured through a simple yaml configuration file.

Setting up your own

The LXD demo server is now available as a snap package and interacts with the snap version of LXD. To install it on your own system, all you need to do is:

Install the LXD snap

Then configure LXD

ubuntu@djanet:~$ sudo lxd init
Name of the storage backend to use (dir or zfs) [default=zfs]:
Create a new ZFS pool (yes/no) [default=yes]?
Name of the new ZFS pool [default=lxd]:
Would you like to use an existing block device (yes/no) [default=no]?
Size in GB of the new loop device (1GB minimum) [default=43]:
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
What IPv6 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
LXD has been successfully configured.

And finally install lxd-demo-server itself

At that point, you can hit http://127.0.0.1:8080 and will be greeted with this:

To change the configuration, use:

ubuntu@djanet:~$ sudo lxd-demo-server.configure

And that’s it, you have your own instance of the demo server.

Security

As mentioned at the beginning, the demo server comes with a number of options to prevent users from using all the available resources themselves and bringing the whole thing down.

Those should be tweaked for your particular needs and should also update the total number of concurrent sessions so that you don’t end up over-committing on resources.

On the network side of things, the demo server itself doesn’t do any kind of firewalling or similar network restrictions. If you plan on offering sessions to anyone online, you should make sure that the network which LXD is using is severely restricted and that the host this is running on is also placed in a very restricted part of your network.

Containers handed to strangers should never be using “security.privileged” as that’d be a straight route to getting root privileges on the host. You should also stay away from bind-mounting any part of the host’s filesystem into those containers.

I would also very strongly recommend setting up very frequent security updates on your host and kernel live patching or at least automatic reboot when a new kernel is installed. This should avoid a new kernel security issue from being immediately exploited in your environment.

Conclusion

The LXD demo server was initially written as a quick hack to expose a LXD instance to the Internet so we could let people try LXD online and also offer the upstream team a reliable environment we could have people attempt to reproduce their bugs into.

It’s since grown a bit with new features contributed by users and with improvements we’ve made to the original experience on our website.

We’ve now served over 36000 sessions to over 26000 unique visitors. This has been a great tool for people to try and experience LXD and I hope it will be similarly useful to other projects.

Yesterday, I delivered a talk to a lively audience at ContainerWorld in Santa Clara, California.

If I measured "the most interesting slides" by counting "the number of people who took a picture of the slide", then by far "the most interesting slides" are slides 8-11, which pose an answer the question:

"Should I run my PaaS on top of my IaaS, or my IaaS on top of my PaaS"?

In the Ubuntu world, that answer is super easy -- however you like! At Canonical, we're happy to support:

Introduction

This is finally it! The last blog post in this series of 12 that started almost a year ago.

If you followed the series from the beginning, you should have been using LXD for quite a bit of time now and be pretty familiar with its day to day operation and capabilities.

But what if something goes wrong? What can you do to track down the problem yourself? And if you can’t, what information should you record so that upstream can track down the problem?

And what if you want to fix issues yourself or help improve LXD by implementing the features you need? How do you build, test and contribute to the LXD code base?

Debugging LXD & filing bug reports

LXD log files

/var/log/lxd/lxd.log

This is the main LXD log file. To avoid filling up your disk very quickly, only log messages marked as INFO, WARNING or ERROR are recorded there by default. You can change that behavior by passing “–debug” to the LXD daemon.

/var/log/lxd/CONTAINER/lxc.conf

Whenever you start a container, this file is updated with the configuration that’s passed to LXC.
This shows exactly how the container will be configured, including all its devices, bind-mounts, …

/var/log/lxd/CONTAINER/forkexec.log

This file will contain errors coming from LXC when failing to execute a command.
It’s extremely rare for anything to end up in there as LXD usually handles errors much before that.

/var/log/lxd/CONTAINER/forkstart.log

This file will contain errors coming from LXC when starting the container.
It’s extremely rare for anything to end up in there as LXD usually handles errors much before that.

CRIU logs (for live migration)

If you are using CRIU for container live migration or live snapshotting there are additional log files recorded every time a CRIU dump is generated or a dump is restored.

Those logs can also be found in /var/log/lxd/CONTAINER/ and are timestamped so that you can find whichever matches your most recent attempt. They will contain a detailed record of everything that’s dumped and restored by CRIU and are far better for understanding a failure than the typical migration/snapshot error message.

LXD debug messages

As mentioned above, you can switch the daemon to doing debug logging with the –debug option.
An alternative to that is to connect to the daemon’s event interface which will show you all log entries, regardless of the configured log level (even works remotely).

The format from “lxc monitor” is a bit different from what you’d get in a log file where each entry is condense into a single line, but more importantly you see all those “level: dbug” entries

Where to report bugs

LXD bugs

The best place to report LXD bugs is upstream at https://github.com/lxc/lxd/issues.
Make sure to fill in everything in the bug reporting template as that information saves us a lot of back and forth to reproduce your environment.

Ubuntu bugs

If you find a problem with the Ubuntu package itself, failing to install, upgrade or remove. Or run into issues with the LXD init scripts. The best place to report such bugs is on Launchpad.

On an Ubuntu system, you can do so with: ubuntu-bug lxd
This will automatically include a number of log files and package information for us to look at.

CRIU bugs

Bugs that are related to CRIU which you can spot by the usually pretty visible CRIU error output should be reported on Launchpad with: ubuntu-bug criu

Do note that the use of CRIU through LXD is considered to be a beta feature and unless you are willing to pay for support through a support contract with Canonical, it may take a while before we get to look at your bug report.

Contributing to LXD

LXD is written in Go and hosted on Github.
We welcome external contributions of any size. There is no CLA or similar legal agreement to sign to contribute to LXD, just the usual Developer Certificate of Ownership (Signed-off-by: line).

We have a number of potential features listed on our issue tracker that can make good starting points for new contributors. It’s usually best to first file an issue before starting to work on code, just so everyone knows that you’re doing that work and so we can give some early feedback.

Building LXD from source

You’ll want to fork the upstream repository on Github and then push your changes to your branch. We recommend rebasing on upstream LXD daily as we do tend to merge changes pretty regularly.

Running the testsuite

LXD maintains two sets of tests. Unit tests and integration tests. You can run all of them with:

sudo -E make check

To run the unit tests only, use:

sudo -E go test ./...

To run the integration tests, use:

cd test
sudo -E ./main.sh

That latter one supports quite a number of environment variables to test various storage backends, disable network tests, use a ramdisk or just tweak log output. Some of those are:

LXD_BACKEND: One of “btrfs”, “dir”, “lvm” or “zfs” (defaults to “dir”)
Lets your run the whole testsuite with any of the LXD storage drivers.

LXD_CONCURRENT: “true” or “false” (defaults to “false”)
This enables a few extra concurrency tests.

LXD_DEBUG: “true” or “false” (defaults to “false”)
This will log all shell commands and run all LXD commands in debug mode.

LXD_INSPECT: “true” or “false” (defaults to “false”)
This will cause the testsuite to hang on failure so you can inspect the environment.

LXD_LOGS: A directory to dump all LXD log files into (defaults to “”)
The “logs” directory of all spawned LXD daemons will be copied over to this path.

LXD_OFFLINE: “true” or “false” (defaults to “false”)
Disables any test which relies on outside network connectivity.

LXD_TEST_IMAGE: path to a LXD image in the unified format (defaults to “”)
Lets you use a custom test image rather than the default minimal busybox image.

LXD_TMPFS: “true” or “false” (defaults to “false”)
Runs the whole testsuite within a “tmpfs” mount, this can use quite a bit of memory but makes the testsuite significantly faster.

LXD_VERBOSE: “true” or “false” (defaults to “false”)
A less extreme version of LXD_DEBUG. Shell commands are still logged but –debug isn’t passed to the LXC commands and the LXD daemon only runs with –verbose.

The testsuite will alert you to any missing dependency before it actually runs. A test run on a reasonably fast machine can be done under 10 minutes.

Sending your branch

Before sending a pull request, you’ll want to confirm that:

Your branch has been rebased on the upstream branch

All your commits messages include the “Signed-off-by: First Last <email>” line

Once that’s all done, open a pull request on Github. Our Jenkins will validate that the commits are all signed-off, a test build on MacOS and Windows will automatically be performed and if things look good, we’ll trigger a full Jenkins test run that will test your branch on all storage backends, 32bit and 64bit and all the Go versions we care about.

This typically takes less than an hour to happen, assuming one of us is around to trigger Jenkins.

Once all the tests are done and we’re happy with the code itself, your branch will be merged into master and your code will be in the next LXD feature release. If the changes are suitable for the LXD stable-2.0 branch, we’ll backport them for you.

Conclusion

I hope this series of blog post has been helpful in understanding what LXD is and what it can do!

This series’ scope was limited to the LTS version of LXD (2.0.x) but we also do monthly feature releases for those who want the latest features. You can find a few other blog posts covering such features listed in the original LXD 2.0 series post.

Use Ubuntu on Windows (“bash”)

For this, you need to use Windows 10 and have the Windows subsystem for Linux enabled.
With that done, start an Ubuntu shell by launching “bash”. And you’re done.
The LXD client is installed by default in the Ubuntu 16.04 image.

Interact with the remote server

Regardless of which method you picked, you’ve now got access to the “lxc” command and can add your remote server.

Using the native build does have a few restrictions to do with Windows terminal escape codes, breaking things like the arrow keys and password hiding. The Ubuntu on Windows way uses the Linux version of the LXD client and so doesn’t suffer from those limitations.

MacOS client

Even though we do have MacOS CI through Travis, they don’t host artifacts for us and so don’t have prebuilt binaries for people to download.

Conclusion

The LXD client can be built on all the main operating systems and on just about every architecture, this makes it very easy for anyone to interact with existing LXD servers, whether they’re themselves using a Linux machine or not.

Thanks to our pretty strict backward compatibility rules, the version of the client doesn’t really matter. Older clients can talk to newer servers and newer clients can talk to older servers. Obviously in both cases some features will not be available, but normal container worflow operations will work fine.

What’s Ubuntu Core?

Ubuntu Core is a version of Ubuntu that’s fully transactional and entirely based on snap packages.

Most of the system is read-only. All installed applications come from snap packages and all updates are done using transactions. Meaning that should anything go wrong at any point during a package or system update, the system will be able to revert to the previous state and report the failure.

The current release of Ubuntu Core is called series 16 and was released in November 2016.

Note that on Ubuntu Core systems, only snap packages using confinement can be installed (no “classic” snaps) and that a good number of snaps will not fully work in this environment or will require some manual intervention (creating user and groups, …). Ubuntu Core gets improved on a weekly basis as new releases of snapd and the “core” snap are put out.

Requirements

As far as LXD is concerned, Ubuntu Core is just another Linux distribution. That being said, snapd does require unprivileged FUSE mounts and AppArmor namespacing and stacking, so you will need the following:

An up to date Ubuntu system using the official Ubuntu kernel

An up to date version of LXD

Creating an Ubuntu Core container

The Ubuntu Core images are currently published on the community image server.
You can launch a new container with:

The container will take a few seconds to start, first executing a first stage loader that determines what read-only image to use and setup the writable layers. You don’t want to interrupt the container in that stage and “lxc exec” will likely just fail as pretty much nothing is available at that point.

Updating the container

If you’ve been tracking the development of Ubuntu Core, you’ll know that those versions above are pretty old. That’s because the disk images that are used as the source for the Ubuntu Core LXD images are only refreshed every few months. Ubuntu Core systems will automatically update once a day and then automatically reboot to boot onto the new version (and revert if this fails).

Then hit your container over HTTP and you’ll get to your newly deployed Nextcloud instance.

If you feel like testing the latest LXD straight from git, you can do so with:

stgraber@dakara:~$ lxc config set ubuntu-core security.nesting true
stgraber@dakara:~$ lxc exec ubuntu-core bash
root@ubuntu-core:~# snap install lxd --edge
lxd (edge) git-c6006fb from 'canonical' installed
root@ubuntu-core:~# lxd init
Name of the storage backend to use (dir or zfs) [default=dir]:
We detected that you are running inside an unprivileged container.
This means that unless you manually configured your host otherwise,
you will not have enough uid and gid to allocate to your containers.
LXD can re-use your container's own allocation to avoid the problem.
Doing so makes your nested containers slightly less safe as they could
in theory attack their parent container and gain more privileges than
they otherwise would.
Would you like to have your containers share their parent's allocation (yes/no) [default=yes]?
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
What IPv6 address should be used (CIDR subnet notation, “auto” or “none”) [default=auto]?
LXD has been successfully configured.

Conclusion

If you ever wanted to try Ubuntu Core, this is a great way to do it. It’s also a great tool for snap authors to make sure their snap is fully self-contained and will work in all environments.

Ubuntu Core is a great fit for environments where you want to ensure that your system is always up to date and is entirely reproducible. This does come with a number of constraints that may or may not work for you.

And lastly, a word of warning. Those images are considered as good enough for testing, but aren’t officially supported at this point. We are working towards getting fully supported Ubuntu Core LXD images on the official Ubuntu image server in the near future.

Introduction

So far all my blog posts about LXD have been assuming an Ubuntu host with LXD installed from packages, as a snap or from source.

But LXD is perfectly happy to run on any Linux distribution which has the LXC library available (version 2.0.0 or higher), a recent kernel (3.13 or higher) and some standard system utilities available (rsync, dnsmasq, netcat, various filesystem tools, …).

In fact, you can find packages in the following Linux distributions (let me know if I missed one):

Requirements

If you want to use ZFS with LXD, then the “contrib” repository must be enabled and the “zfsutils-linux” package installed on the system

Installing snapd and LXD

Getting the latest stable LXD onto an up to date Debian testing system is just a matter of running:

apt install snapd
snap install lxd

If you never used snapd before, you’ll have to either logout and log back in to update your PATH, or just update your existing one with:

. /etc/profile.d/apps-bin-path.sh

And now it’s time to configure LXD with:

root@debian:~# lxd init
Name of the storage backend to use (dir or zfs) [default=dir]:
Create a new ZFS pool (yes/no) [default=yes]?
Name of the new ZFS pool [default=lxd]:
Would you like to use an existing block device (yes/no) [default=no]?
Size in GB of the new loop device (1GB minimum) [default=15]:
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 subnet should be used (CIDR notation, “auto” or “none”) [default=auto]?
What IPv6 subnet should be used (CIDR notation, “auto” or “none”) [default=auto]?
LXD has been successfully configured.

Conclusion

The availability of snapd on other Linux distributions makes it a great way to get the latest LXD running on your distribution of choice.

There are still a number of problems with the LXD snap which may or may not be a blocker for your own use. The main ones at this point are:

All containers are shutdown and restarted on upgrades

No support for bash completion

If you want non-root users to have access to the LXD daemon. Simply make sure that a “lxd” group exists on your system and add whoever you want to manage LXD into that group, then restart the LXD daemon.

Introduction

For those who haven’t heard of Kubernetes before, it’s defined by the upstream project as:

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience of running production workloads at Google, combined with best-of-breed ideas and practices from the community.

It is important to note the “applications” part in there. Kubernetes deploys a set of single application containers and connects them together. Those containers will typically run a single process and so are very different from the full system containers that LXD itself provides.

This blog post will be very similar to one I published last year on running OpenStack inside a LXD container. Similarly to the OpenStack deployment, we’ll be using conjure-up to setup a number of LXD containers and eventually run the Docker containers that are used by Kubernetes.

Requirements

This post assumes you’ve got a working LXD setup, providing containers with network access and that you have at least 10GB of space for the containers to use and at least 4GB of RAM.

Outside of configuring LXD itself, you will also need to bump some kernel limits with the following commands:

Setting up the container

Similarly to OpenStack, the conjure-up deployed version of Kubernetes expects a lot more privileges and resource access than LXD would typically provide. As a result, we have to create a privileged container, with nesting enabled and with AppArmor disabled.

This means that not very much of LXD’s security features will still be in effect on this container. Depending on how you feel about this, you may choose to run this on a different machine.

Note that all of this however remains better than instructions that would have you install everything directly on your host machine. If only by making it very easy to remove it all in the end.

And the last setup step is to configure LXD networking inside the container.
Answer with the default for all questions, except for:

Use the “dir” storage backend (“zfs” doesn’t work in a nested container)

Do NOT configure IPv6 networking (conjure-up/juju don’t play well with it)

lxc exec kubernetes -- lxd init

And that’s it for the container configuration itself, now we can deploy Kubernetes!

Deploying Kubernetes with conjure-up

As mentioned earlier, we’ll be using conjure-up to deploy Kubernetes.
This is a nice, user friendly, tool that interfaces with Juju to deploy complex services.

Start it with:

lxc exec kubernetes -- sudo -u ubuntu -i conjure-up

Select “Kubernetes Core”

Then select “localhost” as the deployment target (uses LXD)

And hit “Deploy all remaining applications”

This will now deploy Kubernetes. The whole process can take well over an hour depending on what kind of machine you’re running this on. You’ll see all services getting a container allocated, then getting deployed and finally interconnected.

Once the deployment is done, a few post-install steps will appear. This will import some initial images, setup SSH authentication, configure networking and finally giving you the IP address of the dashboard.

Interact with your new Kubernetes

We can ask juju to deploy a new kubernetes workload, in this case 5 instances of “microbot”:

Introduction

The LXD and AppArmor teams have been working to support loading AppArmor policies inside LXD containers for a while. This support which finally landed in the latest Ubuntu kernels now makes it possible to install snap packages.

Snap packages are a new way of distributing software, directly from the upstream and with a number of security features wrapped around them so that these packages can’t interfere with each other or cause harm to your system.

Requirements

There are a lot of moving pieces to get all of this working. The initial enablement was done on Ubuntu 16.10 with Ubuntu 16.10 containers, but all the needed bits are now progressively being pushed as updates to Ubuntu 16.04 LTS.

Now lets clear the LXD that came pre-installed with the container so we can replace it by the snap.

lxc exec lxd -- apt remove --purge lxd lxd-client -y

Because we already have a stable LXD on the host, we’ll make things a bit more interesting by installing the latest build from git master rather than the latest stable release:

lxc exec lxd -- snap install lxd --edge

The rest is business as usual for a LXD user:

stgraber@castiana:~$ lxc exec lxd bash
root@lxd:~# lxd init
Name of the storage backend to use (dir or zfs) [default=dir]:
We detected that you are running inside an unprivileged container.
This means that unless you manually configured your host otherwise,
you will not have enough uid and gid to allocate to your containers.
LXD can re-use your container's own allocation to avoid the problem.
Doing so makes your nested containers slightly less safe as they could
in theory attack their parent container and gain more privileges than
they otherwise would.
Would you like to have your containers share their parent's allocation (yes/no) [default=yes]?
Would you like LXD to be available over the network (yes/no) [default=no]?
Would you like stale cached images to be updated automatically (yes/no) [default=yes]?
Would you like to create a new network bridge (yes/no) [default=yes]?
What should the new bridge be called [default=lxdbr0]?
What IPv4 subnet should be used (CIDR notation, “auto” or “none”) [default=auto]?
What IPv6 subnet should be used (CIDR notation, “auto” or “none”) [default=auto]?
LXD has been successfully configured.
root@lxd:~# lxd.lxc launch images:archlinux arch
If this is your first time using LXD, you should also run: sudo lxd init
To start your first container, try: lxc launch ubuntu:16.04
Creating arch
Starting arch
root@lxd:~# lxd.lxc list
+------+---------+----------------------+-----------------------------------------------+------------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+---------+----------------------+-----------------------------------------------+------------+-----------+
| arch | RUNNING | 10.106.137.64 (eth0) | fd42:2fcd:964b:eba8:216:3eff:fe8f:49ab (eth0) | PERSISTENT | 0 |
+------+---------+----------------------+-----------------------------------------------+------------+-----------+

And that’s it, you now have the latest LXD build installed inside a LXD container and running an archlinux container for you. That LXD build will update very frequently as we publish new builds to the edge channel several times a day.

Conclusion

It’s great to have snaps now install properly inside LXD containers. Production users can now setup hundreds of different containers, network them the way they want, setup their storage and resource limits through LXD and then install snap packages inside them to get the latest upstream releases of the software they want to run.

That’s not to say that everything is perfect yet. This is all built on some really recent kernel work, using unprivileged FUSE filesystem mounts and unprivileged AppArmor profile stacking and namespacing. There very likely still are some issues that need to get resolved in order to get most snaps to work identically to when they’re installed directly on the host.

If you notice discrepancies between a snap running directly on the host and a snap running inside a LXD container, you’ll want to look at the “dmesg” output, looking for any DENIED entry in there which would indicate AppArmor rejecting some request from the snap.

This typically indicates either a bug in AppArmor itself or in the way the AppArmor profiles are generated by snapd. If you find one of those issues, you can report it in #snappy on irc.freenode.net or file a bug at https://launchpad.net/snappy/+filebug so it can be investigated.

Introduction

When LXD 2.0 shipped with Ubuntu 16.04, LXD networking was pretty simple. You could either use that “lxdbr0” bridge that “lxd init” would have you configure, provide your own or just use an existing physical interface for your containers.

While this certainly worked, it was a bit confusing because most of that bridge configuration happened outside of LXD in the Ubuntu packaging. Those scripts could only support a single bridge and none of this was exposed over the API, making remote configuration a bit of a pain.

That was all until LXD 2.3 when LXD finally grew its own network management API and command line tools to match. This post is an attempt at an overview of those new capabilities.

Basic networking

Right out of the box, LXD 2.3 comes with no network defined at all. “lxd init” will offer to set one up for you and attach it to all new containers by default, but let’s do it by hand to see what’s going on under the hood.

To create a new network with a random IPv4 and IPv6 subnet and NAT enabled, just run:

Similarly, if you want to prevent your container from ever changing its MAC address or forwarding traffic for any other MAC address (such as nesting), you can enable port security with:

root@yak:~# lxc config device set c1 eth0 security.mac_filtering true

DNS

LXD runs a DNS server on the bridge. On top of letting you set the DNS domain for the bridge (“dns.domain” network property), it also supports 3 different operating modes (“dns.mode”):

“managed” will have one DNS record per container, matching its name and known IP addresses. The container cannot alter this record through DHCP.

“dynamic” allows the containers to self-register in the DNS through DHCP. So whatever hostname the container sends during the DHCP negotiation ends up in DNS.

“none” is for a simple recursive DNS server without any kind of local DNS records.

The default mode is “managed” and is typically the safest and most convenient as it provides DNS records for containers but doesn’t let them spoof each other’s records by sending fake hostnames over DHCP.

Using tunnels

On top of all that, LXD also supports connecting to other hosts using GRE or VXLAN tunnels.

A LXD network can have any number of tunnels attached to it, making it easy to create networks spanning multiple hosts. This is mostly useful for development, test and demo uses, with production environment usually preferring VLANs for that kind of segmentation.

So say, you want a basic “testbr0” network running with IPv4 and IPv6 on host “edfu” and want to spawn containers using it on host “djanet”. The easiest way to do that is by using a multicast VXLAN tunnel. This type of tunnels only works when both hosts are on the same physical segment.

This defines a “testbr0” bridge on host “edfu” and sets up a multicast VXLAN tunnel on it for other hosts to join it. In this setup, “edfu” will be the one acting as a router for that network, providing DHCP, DNS, … the other hosts will just be forwarding traffic over the tunnel.

The tunnel id is required here to avoid conflicting with the already configured multicast vxlan tunnel.

And that’s how you make cross-host networking easily with recent LXD!

Conclusion

LXD now makes it very easy to define anything from a simple single-host network to a very complex cross-host network for thousands of containers. It also makes it very simple to define a new network just for a few containers or add a second device to a container, connecting it to a separate private network.

Introduction

First of all, sorry for the delay. It took quite a long time before I finally managed to get all of this going. My first attempts were using devstack which ran into a number of issues that had to be resolved. Yet even after all that, I still wasn’t be able to get networking going properly.

I finally gave up on devstack and tried “conjure-up” to deploy a full Ubuntu OpenStack using Juju in a pretty user friendly way. And it finally worked!

So below is how to run a full OpenStack, using LXD containers instead of VMs and running all of this inside a LXD container (nesting!).

Requirements

This post assumes you’ve got a working LXD setup, providing containers with network access and that you have a pretty beefy CPU, around 50GB of space for the container to use and at least 16GB of RAM.

Setting up the container

OpenStack is made of a lof of different components, doing a lot of different things. Some require some additional privileges so to make our live easier, we’ll use a privileged container.

We’ll configure that container to support nesting, pre-load all the required kernel modules and allow it access to /dev/mem (as is apparently needed).

Please note that this means that most of the security benefit of LXD containers are effectively disabled for that container. However the containers that will be spawned by OpenStack itself will be unprivileged and use all the normal LXD security features.

And the last setup step is to configure LXD networking inside the container.
Answer with the default for all questions, except for:

Use the “dir” storage backend (“zfs” doesn’t work in a nested container)

Do NOT configure IPv6 networking (conjure-up/juju don’t play well with it)

lxc exec openstack -- lxd init

And that’s it for the container configuration itself, now we can deploy OpenStack!

Deploying OpenStack with conjure-up

As mentioned earlier, we’ll be using conjure-up to deploy OpenStack.
This is a nice, user friendly, tool that interfaces with Juju to deploy complex services.

Start it with:

lxc exec openstack -- sudo -u ubuntu -i conjure-up

Select “OpenStack with NovaLXD”

Then select “localhost” as the deployment target (uses LXD)

And hit “Deploy all remaining applications”

This will now deploy OpenStack. The whole process can take well over an hour depending on what kind of machine you’re running this on. You’ll see all services getting a container allocated, then getting deployed and finally interconnected.

Once the deployment is done, a few post-install steps will appear. This will import some initial images, setup SSH authentication, configure networking and finally giving you the IP address of the dashboard.

Access the dashboard and spawn a container

The dashboard runs inside a container, so you can’t just hit it from your web browser.
The easiest way around this is to setup a NAT rule with:

Where “<ip>” is the dashboard IP address conjure-up gave you at the end of the installation.

You can now grab the IP address of the “openstack” container (from “lxc info openstack”) and point your web browser to: http://<container ip>/horizon

This can take a few minutes to load the first time around. Once the login screen is loaded, enter the default login and password (admin/openstack) and you’ll be greeted by the OpenStack dashboard!

You can now head to the “Project” tab on the left and the “Instances” page. To start a new instance using nova-lxd, click on “Launch instance”, select what image you want, network, … and your instance will get spawned.

Once it’s running, you can assign it a floating IP which will let you reach your instance from within your “openstack” container.

Conclusion

OpenStack is a pretty complex piece of software, it’s also not something you really want to run at home or on a single server. But it’s certainly interesting to be able to do it anyway, keeping everything contained to a single container on your machine.

Conjure-Up is a great tool to deploy such complex software, using Juju behind the scene to drive the deployment, using LXD containers for every individual service and finally for the instances themselves.

It’s also one of the very few cases where multiple level of container nesting actually makes sense!

What are snaps?

Snaps were introduced a little while back as a cross-distro package format allowing upstreams to easily generate and distribute packages of their application in a very consistent way, with support for transactional upgrade and rollback as well as confinement through AppArmor and Seccomp profiles.

It’s a packaging format that’s designed to be upstream friendly. Snaps effectively shift the packaging and maintenance burden from the Linux distribution to the upstream, making the upstream responsible for updating their packages and taking action when a security issue affects any of the code in their package.

The upside being that upstream is now in complete control of what’s in the package and can distribute a build of the software that matches their test environment and do so within minutes of the upstream release.

Why distribute LXD as a snap?

We’ve always cared about making LXD available to everyone. It’s available for a number of Linux distribution already with a few more actively working on packaging it.

For Ubuntu, we have it in the archive itself, push frequent stable updates, maintain official backports in the archive and also maintain a number of PPAs to make our releases available to all Ubuntu users.

Doing all that is a lot of work and it makes tracking down bugs that much harder as we have to care about a whole lot of different setups and combination of package versions.

Over the next few months, we hope to move away from PPAs and some of our backports in favor of using our snap package. This will allow a much shorter turnaround time for new releases and give us more control on the runtime environment of LXD, making our lives easier when dealing with bugs.

How to get the LXD snap?

Those instructions have only been tested on fully up to date Ubuntu 16.04 LTS or Ubuntu 16.10 with snapd installed. Please use a system that doesn’t already have LXD containers as the LXD snap will not be able to take over existing containers.

Make sure you don’t have a packaged version of LXD installed on your system.

sudo apt remove --purge lxd lxd-client

Create the “lxd” group and add yourself to it.

sudo groupadd --system lxd
sudo usermod -G lxd -a <username>

Install LXD itself

sudo snap install lxd

This will get the current version of LXD from the “stable” channel.
If your user wasn’t already part of the “lxd” group, you may now need to run:

newgrp lxd

Once installed, you can set it up and spawn your first container with:

Those are all pretty major usability issues and will likely be showstoppers for a lot of people.
We’re actively working with the Snappy team to get those issues addressed as soon as possible and will keep maintaining all our existing packages until such time as those are resolved.

At which point you can fire up your minecraft client, point it at 10.178.150.74 on port 25565 and play with your all new minecraft server!

When you want to get rid of it, just run:

stgraber@dakara:~$ juju destroy-service minecraft

Wait a few seconds and everything will be gone.

Deploying a more complex web application

Juju’s main focus is on modeling complex services and deploying them in a scallable way.

To better show that, lets deploy a Juju “bundle”. This bundle is a basic web service, made of a website, an API endpoint, a database, a static web server and a reverse proxy.

So that’s going to expand to 4, inter-connected LXD containers.

stgraber@dakara:~$ juju deploy cs:~charmers/bundle/web-infrastructure-in-a-box
added charm cs:~hp-discover/trusty/node-app-1
service api deployed (charm cs:~hp-discover/trusty/node-app-1 with the series "trusty" defined by the bundle)
annotations set for service api
added charm cs:trusty/mongodb-3
service mongodb deployed (charm cs:trusty/mongodb-3 with the series "trusty" defined by the bundle)
annotations set for service mongodb
added charm cs:~hp-discover/trusty/nginx-4
service nginx deployed (charm cs:~hp-discover/trusty/nginx-4 with the series "trusty" defined by the bundle)
annotations set for service nginx
added charm cs:~hp-discover/trusty/nginx-proxy-3
service nginx-proxy deployed (charm cs:~hp-discover/trusty/nginx-proxy-3 with the series "trusty" defined by the bundle)
annotations set for service nginx-proxy
added charm cs:~hp-discover/trusty/website-3
service website deployed (charm cs:~hp-discover/trusty/website-3 with the series "trusty" defined by the bundle)
annotations set for service website
related mongodb:database and api:mongodb
related website:nginx-engine and nginx:web-engine
related api:website and nginx-proxy:website
related nginx-proxy:website and website:website
added api/0 unit to new machine
added mongodb/0 unit to new machine
added nginx/0 unit to new machine
added nginx-proxy/0 unit to new machine
deployment of bundle "cs:~charmers/bundle/web-infrastructure-in-a-box-10" completed

Introduction

One of the very exciting feature of LXD 2.0, albeit experimental, is the support for container checkpoint and restore.

Simply put, checkpoint/restore means that the running container state can be serialized down to disk and then restored, either on the same host as a stateful snapshot of the container or on another host which equates to live migration.

Requirements

To have access to container live migration and stateful snapshots, you need the following:

Run LXD directly on the host. It’s not possible to use those features with container nesting.

For migration, the target machine must at least implement the instruction set of the source, the target kernel must at least offer the same syscalls as the source and any kernel filesystem which was mounted on the source must also be mountable on the target.

All the needed dependencies are provided by Ubuntu 16.04 LTS, in which case, all you need to do is install CRIU itself:

Limitations

As I said before, checkpoint/restore of containers is still pretty new and we’re still very much working on this feature, fixing issues as we are made aware of them. We do need more people trying this feature and sending us feedback, I would however not recommend using this in production just yet.

We expect a basic Ubuntu container with a few services to work properly with CRIU in Ubuntu 16.04. However more complex containers, using device passthrough, complex network services or special storage configurations are likely to fail.

Whenever possible, CRIU will fail at dump time, rather than at restore time. In such cases, the source container will keep running, the snapshot or migration will simply fail and a log file will be generated for debugging.

In rare cases, CRIU fails to restore the container, in which case the source container will still be around but will be stopped and will have to be manually restarted.

Sending bug reports

We’re tracking bugs related to checkpoint/restore against the CRIU Ubuntu package on Launchpad. Most of the work to fix those bugs will then happen upstream either on CRIU itself or the Linux kernel, but it’s easier for us to track things this way.

The next post in the LXD series is currently blocked on a pending kernel fix, so I figured I’d do an out of series post on how to use the LXD API directly.

Setting up the LXD daemon

The LXD REST API can be accessed over either a local Unix socket or over HTTPs. The protocol in both case is identical, the only difference being that the Unix socket is plain text, relying on the filesystem for authentication.

To enable remote connections to you LXD daemon, run:

lxc config set core.https_address "[::]:8443"

This will have it bind all addresses on port 8443.

To setup a trust relationship with a new client, a password is required, you can set one with:

lxc config set core.trust_password <some random password>

Local or remote

curl over unix socket

As mentioned above, the Unix socket doesn’t need authentication, so with a recent version of curl, you can just do:

curl over the network (and client authentication)

The REST API is authenticated by the use of client certificates. LXD generates one when you first use the command line client, so we’ll be using that one, but you could generate your own with openssl if you wanted to.

Walking through the API

To keep the commands short, all my examples will be using the local Unix socket, you can add the arguments shown above to get this to work over the HTTPs connection.

Note that in an untrusted environment (so anything but localhost), you should also pass LXD the expected server certificate so that you can confirm that you’re talking to the right machine and aren’t the target of a man in the middle attack.

Everything except the config section is read-only and so doesn’t need to be sent back when updating, so say we want to unset that trust password and have LXD stop listening over https, we can do that with:

Operations

For anything that could take more than a second, LXD will use a background operation. That’s to make it easier for the client to do multiple requests in parallel and to limit the number of connections to the server.

Basic container life-cycle

Going through absolutely everything above would make this blog post enormous, so lets just focus on the most basic things, creating a container, starting it, dealing with files a bit, creating a snapshot and deleting the whole thing.

If you’re doing this by hand as I am right now, there’s no way you can actually access that operation and wait for it to finish as it’s very very quick and data about past operations disappears 5 seconds after they’re done.

Conclusion

The LXD API has been designed to be simple yet powerful, it can easily be used through even the most simple client but also supports advanced features to allow more complex clients to be very efficient.

Our REST API is stable which means that any change we make to it will be fully backward compatible with the API as it was in LXD 2.0. We will only be doing additions to it, no removal or change of behavior for the existing end points.

Support for new features can be detected by the client by looking at the “api_extensions” list from GET /1.0. We currently do not advertise any but will no doubt make use of this very soon.