The most time-consuming part of operating system development is obtaining
enough drivers to enable the OS to run real
applications which interact with the real world. NetBSD's rump kernels allow reducing
that time to almost zero, for example for developing special-purpose operating
systems for the cloud and embedded IoT devices. This article describes
an experiment in creating an OS by using a rump kernel for drivers.
It attempts to avoid going into full detail on the principles
of rump kernels,
which are available for interested readers from
rumpkernel.org.

A cyclic trend in operating systems is moving things in and out of the
kernel for better performance. Currently, the pendulum is swinging
in the direction of userspace being the locus of high performance.
The anykernel
architecture of NetBSD ensures that the same kernel drivers work in a
monolithic kernel, userspace and beyond. One of those driver stacks is
networking. In this article we assume that
the NetBSD networking stack is run outside of the monolithic kernel in
a rump kernel and survey
the open source interface layer options.

Yesterday I wrote a serious, user-oriented post about running applications directly on the Xen
hypervisor. Today I compensate for the seriousness by writing a
why-so-serious, happy-buddha type kernel hacker post. This post is
about using NetBSD kernel PCI drivers in
rump kernels on Xen, with device access courtesy of Xen PCI passthrough.

There are a number of motivations for running applications directly on
top of the Xen hypervisor without resorting to a full general-purpose OS.
For example, one might want to maximally isolate applications with minimal
overhead. Leaving the OS out of the picture decreases overhead, since
for example the inter-application protection offered normally by virtual
memory is already handled once by the Xen hypervisor.
However, at the same time problems arise: applications expect and use
many services normally provided by the OS, for example files, sockets,
event notification and so forth. We were able to set up a production
quality environment for running applications as Xen DomUs in a few
weeks by reusing hundreds of thousands of lines of unmodified driver and
infrastructure code from NetBSD. While the amount of driver code may
sound like a lot for running single applications, keep in mind that it
involves for example file systems, the TCP/IP stack, stdio, system calls
and so forth -- the innocent-looking open() alone accepts over
20 flags which must be properly handled. The remainder of this post
looks at the effort in more detail.

Google Code-In (GCi) is a project like Google Summer Of Code (GSoC),
but for younger students. While GSoC is aimed at university students,
i.e. for people usually of age 19 or older, GCi wants to recruit
pupils for Open Source projects.

When applying for participation, every project had to create a large number of
potentially small tasks for students. A task was meant to be about two hours of work for
an experienced developer, and feasible for a person 13 to 18 years
old. Google selected ten participating organisations (this time, NetBSD
was the only BSD participating) to insert their tasks into Google Melange (the
platform which is used for managing GCi and GSoC).

Then, the students registered at Google Melange, chose a project they wanted to
work on, and claimed tasks to do. There were many chats in the NetBSD code
channel for students coming in and asking questions about their tasks.

After GCi was over, every organisation had to choose their two favourite
students who did the best work. For NetBSD, the choice was difficult, as there
were more than two students doing great work, but in the end we chose Mingzhe
Wang and Matthew Bauer.
These two "grand prize winners" were given a trip to Mountain View to visit the
Google headquarters and meet with other GCi prize winners.

There were 89 finished tasks, ranging from research tasks (document how other
projects manage their documentation), creating howtos, trying out software on
NetBSD, writing code (ATF tests and Markdown converters and more), writing
manpages and documentation, fixing bugs and converting documentation from the
website to the wiki.

Overall, it was a nice experience for NetBSD. On the one hand, some real work
was done (for many of the tasks, integration is still pending). On the other hand, it
was a stressful time for the NetBSD mentors supervising the students and helping
them with their tasks. In particular, we had to learn many lessons (you will find
them on the wiki page for GCi 2012), and next time we will do much better.
We will try to apply again next year, but to be chosen again we will need a
large batch of new possible tasks.

So if you think you have a task which doesn't require great prior knowledge, and
is solvable within two hours by an experienced developer, but also by a 13-18
year old within finite time, feel free to contact us with an outline, or write
it directly to the wiki page for Code-In
in the NetBSD wiki.

I am one of five students chosen to participate in Google Summer Of Code (GSoC) for NetBSD this year. My project is to write a binary upgrade tool for NetBSD, optionally with a “live update” functionality.

Why an upgrade tool? – Yes, updating currently is easy. You download the set tarballs from a mirror, unpack the kernel, reboot, unpack the rest, reboot, and done. But this is an exhausting procedure, and you have to know that there are actually updates, and what they affect.

Ever since I realized that the
anykernel
was the best way to construct a modern general purpose operating system
kernel, I have been performing experiments by running unmodified
NetBSD kernel drivers in rump kernels in various environments
(nb. here driver does not mean a hardware device driver, but
any driver like a file system driver or TCP driver).
These experiments have included userspaces of various platforms,
binary kernel modules on Linux
and
others, and
compiling kernel drivers to javascript
and running them natively in a web browser. I have also claimed that
the anykernel allows harnessing drivers from a general purpose OS
onto more specialized embedded computing devices which are becoming the
new norm. This is an attractive possibility because while writing drivers
is easy, making them handle all the abnormal conditions of the real world
is a time-consuming process. Since the above-mentioned experiments
were done on POSIX platforms (yes, even the javascript one), the
experiments did not fully support the claim. The most interesting,
decidedly non-POSIX platform I could think of for experimentation was
the Linux kernel. Even though it had been several years since I last
worked in the Linux kernel, my hypothesis was that it would be easy
and fast to get unmodified NetBSD kernel drivers running in the Linux kernel as rump kernels.

A rump kernel runs on top of the rump kernel hypervisor. The hypervisor
provides high level interfaces to host features, such as memory allocation
and thread creation. In this case, the Linux kernel is the host.
In principle, there are three steps in getting a rump kernel to run in
a given environment. In reality, I prefer a more iterative approach,
but the development can be divided into three steps all the same.

figure out how to compile and run the rump kernel plus hypervisor
in the target environment

implement I/O related hypercalls for whatever I/O you plan to do

Getting basic functionality up and running was a relatively
straightforward process. The only issue that required some thinking was
an application binary interface (ABI) mismatch. I was testing on x86, where the Linux kernel ABI uses -mregparm=3,
which means that function arguments are passed in registers where
possible. NetBSD always passes arguments on the stack. When two ABIs
collide, the code may run, but since function arguments passed
between the two ABIs turn into garbage, eventually an error
will be hit, perhaps in the form of accessing invalid memory.
The C code was easy enough to "fix" by applying the appropriate compiler
flags. In addition to C code, a rump kernel uses a handful of assembly
routines from NetBSD, mostly pertaining to optimizations (e.g. ffs()),
but also to access the atomic memory operations of the platform.
After assembly routines had been handled, it was possible
to load a Linux kernel module which bootstraps a
rump kernel in the Linux kernel and does some
file system operations on the fictional kernfs file system.
A screenshot of the resulting dmesg output is shown below.
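
As an illustration of the ABI issue, the sketch below marks the calling convention on individual functions. Note that this is a different mechanism from the one used in the experiment, where the C code was handled with compiler flags; the per-function regparm attribute of GCC and Clang on 32-bit x86 is used here only because it fits in one file. The function names are hypothetical, and on other architectures the macros expand to nothing so the code still compiles.

```c
/*
 * On 32-bit x86, GCC and Clang accept a per-function regparm attribute.
 * Elsewhere the macros expand to nothing so the sketch still compiles.
 */
#if defined(__i386__) && (defined(__GNUC__) || defined(__clang__))
#define ABI_STACK __attribute__((regparm(0)))	/* arguments on the stack */
#define ABI_REGS  __attribute__((regparm(3)))	/* first three in registers */
#else
#define ABI_STACK
#define ABI_REGS
#endif

/* Stands in for a driver routine using the stack-based convention. */
ABI_STACK int
driver_add3(int a, int b, int c)
{
	return a + b + c;
}

/*
 * Stands in for host code using the register-based convention.
 * Because driver_add3() is declared with the convention it was built
 * with, the compiler emits matching argument passing at this call site.
 */
ABI_REGS int
host_call_driver(int a, int b, int c)
{
	return driver_add3(a, b, c);
}
```

If the declaration seen by the caller did not match how the callee was actually compiled, the call would still link, and the callee would read its arguments from the wrong place, which is the garbage described above.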

It is one thing to execute a computation and an entirely different
thing to perform I/O. To test I/O capabilities, I ran a rump kernel
providing a TCP/IP driver inside the Linux kernel. For a networking
stack to be able to do anything sensible, the interface layer needs
to be able to shuffle packets. The quickest way to implement
the hypercalls for packet shuffling was to use the same method
as a userspace virtual TCP/IP stack might use: read/write packets using
the tap device.
Some might say that doing this from inside the kernel is cheating, but
given that the alternative was to copypaste the tuntap driver and
edit it slightly, I call my approach constructive laziness.
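
For readers unfamiliar with the tap device, this is roughly how a userspace program on Linux attaches to one. It is a sketch of the userspace method mentioned above, not the in-kernel code from the experiment; the helper names tap_prepare and tap_open are hypothetical, and opening the device normally requires privileges.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/*
 * Fill in an ifreq asking for a tap (Ethernet-level) interface without
 * the extra packet information header.
 */
static void
tap_prepare(struct ifreq *ifr, const char *name)
{
	memset(ifr, 0, sizeof(*ifr));
	ifr->ifr_flags = IFF_TAP | IFF_NO_PI;
	/* the memset above guarantees NUL termination */
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/* Returns a file descriptor for the tap device, or -1 on failure. */
static int
tap_open(const char *name)
{
	struct ifreq ifr;
	int fd;

	fd = open("/dev/net/tun", O_RDWR);
	if (fd == -1)
		return -1;
	tap_prepare(&ifr, name);
	if (ioctl(fd, TUNSETIFF, &ifr) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Once tap_open succeeds, each read() on the descriptor returns one Ethernet frame sent to the interface and each write() injects one, which is all the packet shuffling an interface layer needs.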

The demo itself opens a TCP socket to port 80 on
vger.kernel.org (IP address 0x43b484d1 if you want to be really precise),
does an HTTP GET for "/" and displays the last 500 bytes of the result.
TCP/IP is handled by the rump kernel, not by the Linux kernel.
Think of it as the Linux kernel having two alternative TCP/IP stacks.
Again, a screenshot of the resulting dmesg is shown below. Note that
unlike in the first screenshot, there is no printout for the root file
system because the configuration used here does not include any file
system support. Yes, you can ping 10.0.2.17.
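
The hexadecimal address above can be turned into the familiar dotted form with a few shifts. The sketch assumes the constant is an address in network byte order as it reads when loaded as an integer on little-endian x86, so the least significant byte is the first octet; the helper name addr_to_dotted is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit value as a dotted quad, taking the least significant
 * byte as the first octet. This is an assumption about how the constant
 * in the text was written down, not something the text states.
 */
static void
addr_to_dotted(uint32_t v, char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
	    (unsigned)(v & 0xff), (unsigned)((v >> 8) & 0xff),
	    (unsigned)((v >> 16) & 0xff), (unsigned)((v >> 24) & 0xff));
}
```

Under that assumption, addr_to_dotted(0x43b484d1, buf, sizeof(buf)) yields 209.132.180.67.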

As hypothesized, a rump kernel hypervisor for the Linux kernel
was easy and straightforward to implement. Furthermore, it could be done
without making any changes to the existing hypercall interface thereby
reinforcing the belief that unmodified NetBSD kernel drivers can run
on top of almost any embedded firmware just by implementing a light
hypervisor layer.

There were no challenges in the experiment, only annoyances.
As Linux does not support rump kernels, I had to revert to
the archaic full OS approach to kernel development. The drawbacks of
the full OS approach include for example suffering multi-second
reboot cycles during iterative development. The other tangential issue
that I spent a disproportionately large amount of time with was thinking
about how releasing this code would affect existing NetBSD code due
to GPL involvement. My conclusion was that this does not matter since
all code used by the current demo is open source anyway, and if someone
wants to use my code in a product, it is their problem, not mine.

For people interested in examining the implementation, I put the
source code for the hypervisor along with the test code in a git repo
here.
The repository also contains the demos linked from
this article. The NetBSD kernel drivers I used are available from ftp.netbsd.org or by
getting buildrump.sh
and running ./buildrump.sh checkout.

There are also some concerns about building a kernel/img on your own.
Building NetBSD:
build.sh is one of the best features of NetBSD. You can cross compile from almost any other unix-like system with very little difficulty.

Some years ago I wrote about the possibility to load and use
standard NetBSD kernel modules in rump kernels on i386 and amd64.
With the recent developments in buildrump.sh and the improved
ability to host rump kernels on non-NetBSD platforms, I decided to try
loading a binary NetBSD kernel module into a rump kernel compiled for
and running on Linux. The hypothesis was that the NetBSD kernel modules
should just work since both the NetBSD kernel and Linux processes use
the ELF object format and the same calling convention, and all platform details are abstracted by
the rump kernel hypercall layer. Sure enough, after two small fixes to
the hypervisor I could mount and access an FFS file system on Linux by
using ffs.kmod as the driver.

The unique anykernel capability of NetBSD allows the creation of
rump kernels, which are
partially paravirtualized kernels running on top of a high-level
hypervisor. This technology e.g. enables running the
same file system driver in the monolithic kernel or as a
microkernel style server in userspace. POSIX-compatible
systems have been more or less supported as rump kernel hypervisors for
the past 5 years. A long-time goal has been to extend hypervisor
support further, for example to embedded systems. This would make the
solid driver base of NetBSD available to such systems for only the cost of
implementing the hypervisor.

To see how far things can go, last week I started toying with the
idea of using a javascript engine as a rump kernel hypervisor. I was
planning to compile the NetBSD kernel sources into javascript and
manually implement the hypervisor. After some
searching for a C->javascript compiler, I found
emscripten, which translates C into
javascript via LLVM bitcode. Not only is the compiler itself extremely
mature, but there is also extensive support for the POSIX API. This meant that
I could not only compile the kernel drivers to javascript with emscripten, I
could also compile the existing POSIX hypervisor and have it work.

The approach of compiling kernel drivers into javascript allows
them to be directly accessed from existing javascript code. Yes,
I did add a sys/arch/javascript into the kernel source
tree. This contrasts with the approach taken by
another similar experiment,
where an x86 Linux is run inside an x86 machine emulator running
in a javascript engine.

I have thrown together a small proof-of-concept demo of how to build a
web service with the capability to access file system images using
kernel file system drivers compiled to javascript. I compiled a rump
kernel with support for the FFS, tmpfs and kernfs file systems. This
rump kernel backend is tied to a lightweight web page which passes
requests from forms to the rump kernel and displays results. When the
javascript is run, it downloads an FFS image (rump.data),
bootstraps a rump kernel, and mounts the FFS image r/o at /ffs.
The status can be further manipulated with interactive commands.

The demo is available
here.
I've tested it to work with Firefox and tested it to not work
with Internet Explorer. YMMV with other browsers. Note,
the javascript and the FFS image together are close to 5.5MB in
size, so the page may load for a few moments over a slow link --
javascript is not exactly compact and
whitespace removal was the only size reduction technique I used.
If you're interested in comparing the generated javascript with the C
sources, you can also look at the
unoptimized version (14MB).