The MirageOS Blog on building functional operating systems

[ *Due to continuing development, some of the details in this blog post are now out-of-date. It is archived here.* ]

On all hosts running Xen, there is a critical service called xenstore.
Xenstore is used to allow untrusted user VMs to communicate with trusted system VMs, so that:

virtual disk and network connections can be established;

performance statistics and OS version information can be shared;

VMs can be remotely power-cycled, suspended, resumed, snapshotted and migrated.

If the xenstore service fails then at best the host cannot be controlled (i.e. no VM start or shutdown)
and at worst VM isolation is compromised since an untrusted VM will be able to gain unauthorised access to disks or networks.
This blog post examines how to disaggregate xenstore from the monolithic domain 0, and run it as an independent stub domain.

Recently in the Xen community, Daniel De Graaf and Alex Zeffertt have added support for
xenstore stub domains, where the xenstore service is run directly as an OS kernel in its own
isolated VM. In the world of Xen, a running VM is a "domain" and a "stub" implies a single-purpose
OS image rather than a general-purpose machine.
Previously, if something bad happened in "domain 0" (the privileged general-purpose OS where xenstore traditionally runs),
such as an out-of-memory event or a performance problem, then the critical xenstore process might become unusable
or fail altogether. If xenstore is instead run as a "stub domain", it is immune to such problems in
domain 0. In fact, it will even allow us to reboot domain 0 in future (along with all other privileged
domains) without incurring any VM downtime during the reset!

The new code in xen-unstable.hg lays the necessary groundwork
(Xen and domain 0 kernel changes) and ports the original C xenstored to run as a stub domain.

Meanwhile, thanks to Vincent Hanquez and Thomas Gazagnaire, we also have an
OCaml implementation of xenstore which, as well as offering
memory safety, also supports a high-performance transaction engine, necessary for surviving a stressful
"VM bootstorm" event on a large server in the cloud. Vincent and Thomas' code is Linux/POSIX only.

Ideally we would have the best of both worlds:

a fast, memory-safe xenstored written in OCaml,

running directly as a Xen stub domain, i.e. as a specialised kernel image without Linux or POSIX.

We can now do both, using Mirage! If you're saying, "that sounds great! How do I do that?" then read on...

Step 1: remove dependency on POSIX/Linux

If you read through the existing OCaml xenstored code, it becomes obvious that the main uses of POSIX APIs are for communication
with clients, both over Unix sockets and over a special Xen inter-domain shared memory interface. It was a fairly
painless process to extract the required socket-like IO signature and turn the bulk of the server into
a functor. The IO signature ended up looking approximately like:
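
As a rough illustration of the shape such a signature and functor might take, here is a minimal sketch. It is not the code from the xenstored tree: every name in it (`IO`, `Server`, `Mem_io`, `echo_once`) is hypothetical, and the concurrency type is left abstract as `'a t` so that the example compiles with the standard library alone. In an Lwt build, `'a t` would be `'a Lwt.t`.

```ocaml
(* Hypothetical sketch: NOT the actual signature from the xenstored sources. *)
module type IO = sig
  type 'a t                        (* concurrency monad; 'a Lwt.t in an Lwt build *)
  val return : 'a -> 'a t
  val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t

  type channel                     (* one connected client *)
  val read : channel -> bytes -> int -> int -> int t
  val write : channel -> bytes -> int -> int -> unit t
  val destroy : channel -> unit t
end

(* The server logic only ever talks to clients through [Io]. *)
module Server (Io : IO) = struct
  open Io

  (* Read up to 16 bytes from a client and write them straight back. *)
  let echo_once ch =
    let buf = Bytes.create 16 in
    read ch buf 0 16 >>= fun n ->
    write ch buf 0 n >>= fun () ->
    return n
end

(* A trivial in-memory implementation using the identity monad. *)
module Mem_io = struct
  type 'a t = 'a
  let return x = x
  let ( >>= ) x f = f x

  type channel = { input : Buffer.t; output : Buffer.t }

  (* Consume up to [len] bytes from the front of the input buffer. *)
  let read ch buf ofs len =
    let n = min len (Buffer.length ch.input) in
    Buffer.blit ch.input 0 buf ofs n;
    let rest = Buffer.sub ch.input n (Buffer.length ch.input - n) in
    Buffer.clear ch.input;
    Buffer.add_string ch.input rest;
    n

  let write ch buf ofs len = Buffer.add_subbytes ch.output buf ofs len
  let destroy _ = ()
end
```

Applying `Server` to a Unix-socket module or to a shared-memory module would then give two builds of the same server logic, which is the point of turning it into a functor.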

For now the dependency on Lwt is explicit but in future I'll probably make it more abstract so we
can use Core Async too.

Step 2: add a Mirage Xen IO implementation

In a stub domain, all communication with other domains is via shared memory pages and "event channels".
Mirage already contains extensive support for using these primitives, and uses them to create fast
network and block virtual device drivers. To extend the code to cover the Xenstore stub domain case,
only a few tweaks were needed to add the "server" side of a xenstore ring communication, in addition
to the "client" side which was already present.
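
To make the "client" and "server" sides of a ring concrete, here is a toy model of the data structure: a fixed-size buffer with free-running producer and consumer indices, one ring per direction. This is a sketch under simplifying assumptions, not the Mirage implementation. All names are hypothetical, and it ignores the memory barriers and event-channel notifications that a real shared-memory ring needs. The size of 1024 is chosen to match `XENSTORE_RING_SIZE` in Xen's public `xs_wire.h` header.

```ocaml
(* Toy model of a xenstore-style ring; hypothetical names, no barriers. *)
let ring_size = 1024

type ring = {
  data : bytes;
  mutable prod : int;   (* total bytes ever produced *)
  mutable cons : int;   (* total bytes ever consumed *)
}

let create () = { data = Bytes.create ring_size; prod = 0; cons = 0 }

(* Producer: copy as much of [s] as currently fits; return bytes written. *)
let write ring s =
  let free = ring_size - (ring.prod - ring.cons) in
  let n = min free (String.length s) in
  for i = 0 to n - 1 do
    Bytes.set ring.data ((ring.prod + i) mod ring_size) s.[i]
  done;
  ring.prod <- ring.prod + n;
  n

(* Consumer: take everything currently available. *)
let read ring =
  let avail = ring.prod - ring.cons in
  let s =
    String.init avail (fun i ->
        Bytes.get ring.data ((ring.cons + i) mod ring_size))
  in
  ring.cons <- ring.cons + avail;
  s

(* One page holds both directions: requests and responses. *)
type interface = { req : ring; rsp : ring }

(* The client produces on [req] and consumes from [rsp];
   the server side does the opposite. *)
let server_step iface handle =
  let request = read iface.req in
  if request <> "" then ignore (write iface.rsp (handle request))
```

In this model the server side is simply the mirror image of the client: it consumes where the client produces and produces where the client consumes.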

In Xen, domains share memory by a system of explicit "grants", where a client (called "frontend")
tells the hypervisor to allow a server (called "backend") access to specific memory pages. Mirage
already had code to create such grants; all that was missing was a few simple functions to receive
grants from other domains.
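
The two halves of the grant mechanism can be pictured with a small in-memory model. This is only a sketch of the idea, not the Mirage or Xen API: the function names `grant_access` and `map_grant`, the table layout, and the integer domain IDs are all assumptions made for illustration, and a `Hashtbl` stands in for the hypervisor.

```ocaml
(* Toy model of Xen grant tables. Hypothetical names; not the Mirage API. *)
type domid = int
type gref = int

type entry = { peer : domid; page : bytes }

(* One table for the whole "host", keyed by (granting domain, grant ref). *)
let grants : (domid * gref, entry) Hashtbl.t = Hashtbl.create 16
let next_ref = ref 0

(* Frontend side: allow [peer] to access [page]. Returns the grant
   reference that would be handed to the backend out of band. *)
let grant_access ~owner ~peer page =
  let r = !next_ref in
  incr next_ref;
  Hashtbl.replace grants (owner, r) { peer; page };
  r

(* Backend side: map a grant received from [owner]. Succeeds only if
   the grant names [caller] as the permitted peer. *)
let map_grant ~caller ~owner r =
  match Hashtbl.find_opt grants (owner, r) with
  | Some e when e.peer = caller -> Some e.page
  | _ -> None
```

Here `grant_access` corresponds to the code that already existed, and `map_grant` to the missing "receive" side: the backend gets the same page the frontend granted, and any other domain gets nothing.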

The Mirage "main" module necessary for a stub domain looks pretty similar to the normal Unix
userspace case except that it:

arranges to log messages via the VM console (rather than a file or the network, since a disk or network device cannot be created without a working xenstore, and it's important not to introduce a bootstrap problem here)
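
One way to picture the logging arrangement is a logger whose output sink is chosen by the main module. The sketch below is hypothetical and is not taken from the xenstored sources: `sink`, `log` and `console_sink` are illustrative names, and the "console" is just an in-memory list standing in for the real VM console.

```ocaml
(* Hypothetical sketch: a logger whose sink is picked by the main module. *)

(* Default sink, as a Unix userspace build might use. *)
let sink : (string -> unit) ref = ref prerr_endline

(* printf-style logging that forwards each formatted line to the sink. *)
let log fmt = Printf.ksprintf (fun s -> !sink s) fmt

(* Stand-in for the VM console: collect lines in memory, newest first. *)
let console_lines : string list ref = ref []
let console_sink line = console_lines := line :: !console_lines

(* The stub-domain "main" selects the console before anything is logged,
   so logging never depends on a disk or network device. *)
let () =
  sink := console_sink;
  log "xenstored starting, %d connections" 0
```

Because the server code only calls `log`, it does not need to know which sink is in use.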

The Makefile looks like a regular Makefile, invoking ocamlbuild. The whole lot is built with
OASIS with a small extension added by Anil to set a few options
required for building Xen kernels rather than regular binaries.

Running a stub Xenstored is a little tricky because it depends on the latest and
greatest Xen and Linux PVops kernel. In the future it'll become much easier (and probably
the default) but for now you need the following:

xen-4.2 with XSM (Xen Security Modules) turned on

an XSM/FLASK policy which allows the stubdom to call "domctl getdomaininfo". For the moment it's safe to skip this step, with the caveat that xenstored will leak connections when domains die.

a Xen-4.2-compatible toolstack (either the bundled xl/libxl or xapi with some patches)