Debugging Kernel Modules with User Mode Linux

Programming in kernel space has always been left to the gurus. Few people have the courage, knowledge and patience to work in the realm of interrupts, devices and the always painful kernel panic.

When you write programs in user space, the worst thing that can
happen to your program is a core dump. Your program did something
very wrong, so the operating system stopped it and handed you all of
its memory and state information in the form of a core file. Core
files can then be used to debug your program and fix the problem.

When you program in the kernel, there is no operating system
to step in and safely stop your code from running and tell you that
you have a problem. The Linux kernel is pretty nice to its own
code. Sometimes it can survive a fault, if you are doing something
wrong that is relatively benign (these recoverable faults are
typically called oopses). But, there is nothing to stop your code
from overwriting or accessing memory locations anywhere in the
kernel's address space. Also, if your module hangs, the kernel hangs
(technically, your current kernel thread hangs, but the result is
usually the same).

These problems may sound benign to the naïve, but they
are serious issues. If the kernel panics, you rarely know exactly
what caused the panic. The typical solution is to put printks
everywhere and hope that you stumble across the problem before the
messages are lost to the reboot. All of this is assuming that you
do not corrupt your filesystem. I have lost an entire filesystem
before due to a poorly timed panic (a badly initialized pointer
was overwriting some of ext2's internal structures).

The first thing you learn when kernel programming is to keep
all your code on NFS. Files remain safe on another machine. But,
that does not save you the time of having e2fsck run every time you
panic. Plus, you still can lose your filesystem, even if your
source code is safe on another machine.

So, with all of these issues, it is not surprising how few
have entered the realm of kernel programming. Now, all that can
change.

Virtual Machines and UML

Back in the mainframe days, when timesharing machines were
the norm, the idea of a virtual machine was born. A virtual machine
is an encapsulated computer completely at your disposal. A program
on a virtual machine has no real access to the physical hardware.
All hardware access is controlled by the virtual machine monitor or
emulator.

VMware (www.vmware.com) has a very powerful virtual machine that
allows you to run any x86-based operating system under Windows NT,
2000, XP or Linux. SoftPC (an 8086 emulator allowing you to run
Windows and DOS programs) has been available on Motorola 68k-based
computers (such as the Macintosh) since 1988.

True virtual machines are sometimes too expensive for the
learner's budget. (VMware Workstation for Linux costs $299 US from
their web site.) Thankfully, there is now a free alternative for
those who want to run only Linux: User-Mode Linux (UML).

User-Mode Linux (user-mode-linux.sourceforge.net) is not a complete
virtual machine. It does not emulate different hardware or give you
the ability to run other operating systems. But, it does allow you
to run a kernel in user space. This gives you several benefits when
it comes to development: the host filesystem is safe from
corruption, changes to the virtual filesystem can be undone (so
corruption there is easy to recover from), you can run multiple
virtual machines on one physical machine (this is useful for
testing intermachine communication, such as network messages,
without having to use multiple machines) and it is very easy to run
the kernel in a debugger.

Setting up UML

Running UML is easy. You can download one of the binary
packages (kernel binaries, plus a couple of tools), or you can
download the kernel patch. You also need to download a filesystem
image for the virtual machine to boot from. I'd recommend playing
with the binaries first, then building a custom kernel to suit your
needs. The HOWTO covers all of these topics and more.

One useful benefit of UML is Copy-on-Write files. These files
allow you to modify a virtual filesystem, without modifying the
base filesystem. All writes or modifications to a filesystem are
stored in these files, typically ending with the extension .cow.

So, when you are working, and a panic corrupts the virtual
filesystem, all you do is remove the .cow file (which will be
recreated), and your corrupted filesystem is restored to its
pristine version. (There are also tools to incorporate the changes
in a .cow file back into the original filesystem, if you want to
keep your changes.)
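The copy-on-write idea can be illustrated with a small user-space
model. This is only a sketch of the concept, not UML's actual .cow
file format or code: the block counts, the struct and every cow_*
name below are hypothetical. Writes land in an overlay, reads prefer
the overlay, and discarding the overlay (the analogue of deleting the
.cow file) brings back the pristine base.

```c
#include <assert.h>
#include <string.h>

#define COW_BLOCKS     8    /* hypothetical size of the toy "disk" */
#define COW_BLOCK_SIZE 16

struct cow_disk {
    const unsigned char *base;  /* pristine image, never written through */
    unsigned char overlay[COW_BLOCKS * COW_BLOCK_SIZE]; /* plays the .cow file */
    unsigned char dirty[COW_BLOCKS];  /* 1 if the block lives in the overlay */
};

/* Attach an empty overlay to a base image of COW_BLOCKS blocks. */
void cow_open(struct cow_disk *d, const unsigned char *base)
{
    assert(d != NULL && base != NULL);
    d->base = base;
    memset(d->dirty, 0, sizeof d->dirty);
}

/* Write one whole block; only the overlay is modified. */
int cow_write(struct cow_disk *d, unsigned blk, const unsigned char *data)
{
    if (blk >= COW_BLOCKS)
        return -1;
    memcpy(d->overlay + blk * COW_BLOCK_SIZE, data, COW_BLOCK_SIZE);
    d->dirty[blk] = 1;
    return 0;
}

/* Read one block: the overlay copy if it exists, otherwise the base. */
const unsigned char *cow_read(const struct cow_disk *d, unsigned blk)
{
    if (blk >= COW_BLOCKS)
        return NULL;
    return d->dirty[blk] ? d->overlay + blk * COW_BLOCK_SIZE
                         : d->base + blk * COW_BLOCK_SIZE;
}

/* Throw away every change, like removing the .cow file. */
void cow_discard(struct cow_disk *d)
{
    memset(d->dirty, 0, sizeof d->dirty);
}

/* Fold the changes into a writable copy of the base, like a merge tool. */
void cow_merge(struct cow_disk *d, unsigned char *writable_base)
{
    unsigned blk;

    for (blk = 0; blk < COW_BLOCKS; blk++) {
        if (d->dirty[blk])
            memcpy(writable_base + blk * COW_BLOCK_SIZE,
                   d->overlay + blk * COW_BLOCK_SIZE, COW_BLOCK_SIZE);
    }
    cow_discard(d);
}
```

In this model a "corrupted" block is just a dirty overlay block, so
calling cow_discard() makes every read fall through to the untouched
base again.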

Debugging Modules

Once you have UML up and running, it's time to play. I've
written a very simple kernel module for testing. It uses four
devices, /dev/gentest[0-3]. The module treats each device a little
differently. Device 1 is a sink (just like /dev/null). Device 2
stores a string for later retrieval. You can read the status of the
module from device 3, and device 0 could be any of the other three
devices, depending on how it is configured. (You can change the
configuration with ioctl calls.) The kernel module is available
from www.frascone.com/kHacking/gentest-0.1.tar.gz.
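To make the four behaviors concrete, here is a rough user-space
model of how a module like this might dispatch on the device number.
It is not the gentest source and contains no kernel code: the gt_*
names, the buffer size, the status text and the choice of "sink" as
device 0's starting configuration are all assumptions made for
illustration, and gt_set_mode() merely stands in for the ioctl.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define GT_BUF_SIZE 64  /* hypothetical capacity of the stored string */

/* Behaviors, numbered to match devices 1-3 in the description. */
enum gt_mode { GT_SINK = 1, GT_STORE = 2, GT_STATUS = 3 };

struct gt_state {
    enum gt_mode dev0_mode;    /* what device 0 currently acts as */
    char stored[GT_BUF_SIZE];  /* data kept by the "store" behavior */
    size_t stored_len;
    unsigned long writes;      /* simple counter for the status text */
};

void gt_init(struct gt_state *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
    s->dev0_mode = GT_SINK;    /* assumed default, not from the module */
}

/* Stand-in for the ioctl that reconfigures device 0. */
int gt_set_mode(struct gt_state *s, enum gt_mode mode)
{
    if (mode < GT_SINK || mode > GT_STATUS)
        return -1;
    s->dev0_mode = mode;
    return 0;
}

static enum gt_mode gt_resolve(const struct gt_state *s, unsigned minor)
{
    return minor == 0 ? s->dev0_mode : (enum gt_mode)minor;
}

/* Returns the number of bytes accepted, or -1 on error. */
long gt_write(struct gt_state *s, unsigned minor, const char *buf, size_t len)
{
    if (minor > 3)
        return -1;
    switch (gt_resolve(s, minor)) {
    case GT_SINK:              /* accept and discard, like /dev/null */
        s->writes++;
        return (long)len;
    case GT_STORE:             /* keep (a truncated copy of) the data */
        if (len > GT_BUF_SIZE)
            len = GT_BUF_SIZE;
        memcpy(s->stored, buf, len);
        s->stored_len = len;
        s->writes++;
        return (long)len;
    case GT_STATUS:            /* status is read-only in this sketch */
        return -1;
    }
    return -1;
}

/* Returns the number of bytes placed in buf, or -1 on error. */
long gt_read(const struct gt_state *s, unsigned minor, char *buf, size_t len)
{
    int n;

    if (minor > 3)
        return -1;
    switch (gt_resolve(s, minor)) {
    case GT_SINK:              /* always end-of-file */
        return 0;
    case GT_STORE:
        if (len > s->stored_len)
            len = s->stored_len;
        memcpy(buf, s->stored, len);
        return (long)len;
    case GT_STATUS:
        n = snprintf(buf, len, "dev0_mode=%d stored_len=%lu writes=%lu\n",
                     (int)s->dev0_mode, (unsigned long)s->stored_len,
                     s->writes);
        if (n < 0)
            return -1;
        if ((size_t)n >= len)  /* output was truncated to fit buf */
            return len ? (long)(len - 1) : 0;
        return n;
    }
    return -1;
}
```

In a real module the same switch would sit behind the driver's read
and write entry points, keyed on the minor number of the opened
device.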