
This article is part of a series documenting how MirageOS applications run under
Xen. This article is about "events"; i.e. how
can an app wait for input to arrive and tell someone that output is available?

Background: Xen, domains, I/O, etc.

A running virtual machine under Xen is known as a domain.
A domain has a number of virtual CPUs (vCPUs) which run until the Xen scheduler
decides to pre-empt them, or until they ask to block via a hypercall (a
system call to the hypervisor). A typical domain has no direct hardware access
and instead performs I/O by talking to other privileged driver domains (often
domain 0) via Xen-specific disk and network protocols. These protocols use two
primitives:

granting another domain access to your memory (which then
may be shared or copied); and

sending and receiving events to and from another domain via
event channels.

This article focuses on how events work; a future article will describe how
shared memory works.

What is an event channel?

An event channel is a logical connection between (domain_1, port_1) and
(domain_2, port_2) where port_1 and port_2 are integers, like TCP port numbers
or Unix file descriptors. An event sent from one domain will cause the other
domain to unblock (if it hasn't been "masked"). To understand how event
channels are used, it's worth comparing I/O under Unix to I/O under Xen:

When a Unix process starts, it runs in a context with environment variables,
pre-connected file descriptors and command-line arguments. When a Xen domain
starts, it runs in a context with a
start info page,
pre-bound event channels and pre-shared memory for console and xenstore.

A Unix process which wants to perform network I/O will normally connect sockets
(additional file descriptors) to network resources, and the kernel will take
care of talking protocols like TCP/IP. A Xen domain
which wants to perform network I/O will share memory with, and bind event
channels to, network driver domains, and then exchange raw
Ethernet frames. The Xen domain will contain its own TCP/IP stack
(such as mirage-tcpip).

When a Unix process wants to read or write data via a file descriptor
it can use select(2) to wait until data
(or space) is available, and then use
read(2) or
write(2), passing pointers to data buffers
as arguments. When a Xen domain wants to wait for data (or space) it will block
until an event arrives, and then send an event to signal that data has been
produced or consumed. Note that neither blocking nor sending takes buffers as
arguments since under Xen, data (or metadata) is placed into shared memory
beforehand. The events are simply a way to say, "look at the shared buffers
again".

How do event channels work?

Every domain maps a special
shared info
page which contains bitmaps representing the state of each event channel. This
per-channel state consists of:

evtchn_pending: which means "an unprocessed event has been received, you should
check your shared memory buffers (or whatever else is associated with this
channel)"; and

evtchn_mask: which means "I'm not interested in events on this channel atm,
don't bother interrupting me until I clear the mask".

Every vCPU has a
vcpu_info
record in the shared info page, which stores two relevant domain-global (not
per event channel) bits:

evtchn_upcall_pending: which means "at least one of the event channels has received an event"; and

evtchn_upcall_mask: which means "don't deliver any event upcalls to this vCPU until I clear the mask".

Note that all MirageOS guests are single vCPU and therefore we can simplify things
by relying on the (single) per-vCPU evtchn_upcall_mask rather than the fine-grained
evtchn_mask (normally a multi-vCPU guest would use the evtchn_upcall_mask to
control reentrant execution and the evtchn_mask to coalesce event wakeups).

Note that the shared info page is shared between the domain and the hypervisor
without any locks, so an architecture-specific protocol must be used to access
it safely (usually via C macros with names like test_and_set_bit).

How does MirageOS handle Xen events?

MirageOS applications running on Xen are linked with
a small C library
derived from
mini-os. This library
takes care of initial boot: mapping the shared info page and initialising the
event channel state. Once the domain state is set up, the OCaml runtime is
initialised and the
OCaml OS.Main.run callback
is evaluated repeatedly until it returns false, signifying exit.

The Activations module keeps a counter and a condition variable per event channel,
using the condition variable to wake any threads which are already blocked and the
counter to prevent a thread from blocking just after an event has been received.

If there is no "work to do", then control passes to
mirage-platform/xen/runtime/kernel/main.c:caml_block_domain
which sets a timer and calls SCHEDOP_block. When Xen wakes up the domain, control
passes first to a global
hypervisor callback
which is where an OS would normally inspect the event channel bitmaps and call
channel-specific interrupt handlers.
In MirageOS's case all we do is clear the vCPU's evtchn_upcall_pending flag and
return, safe in the knowledge that the SCHEDOP_block call will now return, and
the main OCaml loop will be executed again.

Summary

Now that you understand how events work under Xen and how MirageOS uses them,
what else do you need to know?
Future articles in this series will answer the following questions: