Ponyhof

Dysfunctional Programming

Exec in VM

Almost everyone these days relies on continuous integration. And it seems, once
you got accustomed to it, you never want to work without it again.
Unfortunately, most CI systems lack cross-architecture capabilities. As a
systems engineer with lots of C projects, I was desperately looking for a
solution to run my tests on little-endian, big-endian, 32bit, and 64bit
machines. So far, without any luck. Hence, I patched together qemu, docker,
fedora, and some bash scripts to get a tool that allows me to execute scripts
from the command-line in a VM ad-hoc.

My ultimate goal is to type vmrun make as replacement for make,
and it spawns a virtual machine, mounts the current directory into the machine,
executes make inside of it, returning the exit-code to my shell. Of
course, it could be extended to support selecting the target architecture
and/or OS image to us. So eventually, it might look something like:

As a developer, I would love having this at hand. I can easily compile and
run projects in foreign architectures, without the requirement of setting up
non-volatile VMs, moving data in and out of the machine, and also getting
automation and scripting support.

Containers already allow this kind of setup. Using docker or systemd-nspawn
you can get something similar already:

This, however, has one major drawback: This can only run native binaries. If
you want to run code in a foreign architecture, you need a kernel for that
architecture as well. There are options like qemu-user, though they cannot
provide perfect compatibility. They only get you so far.

Hence, you need some machine emulator. So how about we execute the image inside
of qemu, rather than in a container? Sounds easier than it is:

Needs to Boot: Unlike in a container, the virtual machine needs to boot a
kernel, user-space, and prepare the execution environment. This means, we
cannot simply specify a script or binary to execute by qemu. We must actually
boot the image and instruct the image to execute a given binary.

One way to get this to work on Fedora is to craft a special .service file
and pull it in after boot is done. Make the service file execute your binary
and then poweroff the machine when done, or on failure.

No Exit-Code Propagation: The qemu emulator does not propagate the
exit-code of the code executed in the virtual machine. Hence, we need a
side-channel to detect whether the script executed successfully. This is
easily done by hooking up a separate serial-line and making your OS write
success into it, once everything succeeded.

Maybe someone wants to hook up a qemu extension to propagate Exit-Codes?

No Bind Mounts: The biggest issue is, we cannot simply bind-mount the
directory of the caller into the virtual machine. This is particularly bad,
because there is no simple alternative solution. The closest possible
solution I am aware of is to share the directory via NFS or 9pfs.

Maybe someone can figure out a way to do this. All my attempts failed. While
I successfully shared the directory, either performance suffered, or random
features failed, which were expected by some development tools (e.g.,
file-locks or mmap failed). I am not saying the tools are broken, but just
that I couldn’t make it work. Help welcome!

(Also be aware that you suddenly run into UID and permission issues. The
entire qemu machine runs as an unprivileged user, so it will only be able to
access/write files as that user. But inside of the VM, you are free to use
sudo and friends. There is no way to propagate this to the outside. This
might be fine, but it is a source of confusion.)

No Image Hubs: While docker gave us image stores for free (e.g.,
Docker Hub, Quay.io, etc.), there is nothing like it for virtual machine
images. Companies seem unwilling to provide the world with free terrabytes of
storage.

Solution: Use docker.

While docker stores images in a format unsuitable to qemu, we can still use
its storage. I simply took my XFS-qcow2 image-file and threw it into a docker
container. While at it, I threw in a qemu binary with all its dependencies as
well. This combined image can now be pushed to docker repositories and be
hosted on Docker Hub and friends. As a consumer, you simply fetch the docker
image and execute the qemu-binary inside of it, including its embedded OS
image.

I went forth and threw together all the bits and pieces. But, sadly, I cannot
provide you the vmrun tool as I described it above. I simply ran into too
many issues around sharing a directory. However, I did end up with something
close:

This command executes $PWD/myscript.sh inside of a fedora armv7hl image,
hosted by an x86_64 qemu. For reproducability, I tagged the image at the time
of this blog-post as
cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-20180110-1. If you
want the latest image, use
cherrypick/cherryimages-fedora-vmrun:ci-x86_64-to-armv7hl-latest. Other tags
exist as well. Just check out the repository, if interested. The Dockerfile
sources as of the time of this post can be found on
github.

Unlike the vmrun tool I described above, this shares the input script
read-only. Furthermore, it shares the input as FAT16 volume (qemu can create
this on-the-fly via the vvfat driver), so its size is quite limited, and
filesystem attributes are mostly discarded. In the end, its only use is to push
a script into the machine to execute (alternatively, you can push an entire
directory into the machine, but the entrypoint must be named main).

For my personal use, I now added a script that fetches a git-repository, runs
the embedded tests, and returns.
In combination with this docker-qemu-image, I can easily run my CI on
foreign architectures. Maybe some day I will pick this up again and get a
proper vmrun tool (or maybe someone else does?). Until then, I will stick
to the reduced version, as it serves my needs. Sadly, there are still too many
variables that cannot be auto-detected (How many memory to give to the VM?
Which devices to forward? Which CPU features to enable?), and too many hacks
required (String-conversions required between different command-lines… Getting
a distribution to boot fast in these containers… Sharing data correctly into
and out of the VM…).

In the end, I think it is just too much a hassle to turn into a project I can
maintain and support. The tools I needed do not provide proper APIs, but
require me to lump together command-lines, PID-files, and magic configurations.
Maybe some day we will get there? Until then, lets make use of qemu-user and
avoid system integration tests…