System Panics, Part 1: Preparing for the Worst

03/21/2002

This is Part 1 in a two-part series on system panics. In this column, Michael Lucas
talks about how to prepare a FreeBSD system in case of a panic. In the next column, he'll
talk about what to do when the worst happens.

I've built my reputation on reliability, a process made infinitely easier by FreeBSD.
That's why I felt so shocked when a client called and said, "My server just went down for
the second time in a day."

This client runs an ISP and relies heavily on FreeBSD for his mail and Web services.
His 2.2.8-stable boxes have uptimes approaching a year--they'd be longer, but we had to
rearrange the power cables in the server room one night. This system ran 4-stable and had
been in production for several months.

Instead of a login prompt, the console displayed a message much like this
one:

If you're an inexperienced sysadmin, this can turn your blood cold. Unix in general, and
FreeBSD in particular, generally gives friendly messages that describe what's wrong and
give you a place to start looking, or in the worst case, a term to type into your favorite
search engine. The only word that looks even vaguely familiar in this message is "syncer".
Most people don't know what the syncer is. Most of those who recognize it know better than
to try to fix it. The "mysterious panic" is among the worst situations you can have in
FreeBSD.

The first time this happened to me, several years ago, I scrambled for a piece of paper
and a pen. Eventually I found an old envelope and a broken stub of pencil, and crawled
between the server rack and the rough brick wall. In one hand, I balanced the 6-inch,
black-and-white monitor I dragged back there with me. With the other hand, I held the old
envelope up against the wall. Apparently I had a third hand to copy the panic message to
the envelope, because it somehow got there. Finally, scraped and cramped, I slithered back
out of the rack and victoriously typed the whole mess into an email. Surely the crack
FreeBSD developers would be able to look at this garbage and tell me exactly what had
happened.

After this struggle, the immediate response was quite frustrating. "Can you send a
backtrace?"

I've seen many, many messages to a FreeBSD mailing list reporting problems like this.
They always get the same response I got. Most of these people are never heard from again,
and I understand exactly how they feel. When you've been dealing with a server that
crashes, or (worse) keeps crashing, the last thing you want to do is reconfigure it.

There's a simple way around this problem, however: Set up your server to handle a panic
before the panic happens. Set it up when you install the server. That way, you'll
automatically get a backtrace if it ever crashes. This might seem like a novel idea, and it
certainly isn't emphasized in the FreeBSD documentation, but it makes sense. Be ready for
disaster. If it never happens, well, you don't have anything to complain about. If you get
a panic, you're ready. You can present the FreeBSD folks with a decent, full debugging
dump.

The problem with the panic message on my envelope is that it only gives a tiny scrap of
the story. It's like describing your stolen car as "red, with a scratch on the fender."
If you don't give the car's make, model, and VIN or license plate, you cannot expect
the police to make much headway. Similarly, without more information from your crashing
kernel, the FreeBSD developers can't catch the criminal code.

The standard FreeBSD kernel install removes all the debugging information from the
kernel before installing it. This debugging information includes "symbols," which provide a
map between the machine code and the source code, and a complete list of source code line
numbers, so the developer can learn exactly where a problem occurred. Such a map can be
larger than the actual program, and nobody wants to run a kernel that's three times larger
than it has to be! Without this information, however, the developer is stuck trying to map
a kernel core to the source code by hand. It's somewhat like trying to assemble a
million-piece puzzle without a box, a picture, or even knowing that you have all the
pieces. This is an ugly job. It's even uglier when you consider that the developer who
needs to do the work is a volunteer.

To prepare for a kernel panic, you need the system source code installed. You also need at
least one swap partition that is at least 1MB larger than your physical memory, and
preferably twice as large as your RAM. If you have 512MB of RAM, for example, you need a
swap partition that is 513MB or larger, with 1024MB being preferable. (On a server, you
should certainly have multiple swap partitions on multiple drives!) If you don't have that,
you have to either add another hard drive with an adequate swap partition or reinstall.
While having a /var partition with at least that much disk space free is
helpful, it isn't necessary.
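
The sizing rule above is simple arithmetic, and a small shell function can make it concrete. This is only a sketch: the name check_dump_space is hypothetical, and both arguments are assumed to be sizes in MB that you have already looked up yourself.

```shell
#!/bin/sh
# Classify a swap partition against the sizing rule in the text.
# check_dump_space is a hypothetical helper; both arguments are in MB.
check_dump_space() {
    ram_mb=$1
    swap_mb=$2
    if [ "$swap_mb" -ge $(( $ram_mb * 2 )) ]; then
        echo "preferred"      # at least twice the size of RAM
    elif [ "$swap_mb" -ge $(( $ram_mb + 1 )) ]; then
        echo "adequate"       # at least RAM plus 1MB
    else
        echo "too small"      # a full memory dump will not fit
    fi
}

check_dump_space 512 1024     # the preferred case from the text
check_dump_space 512 513      # the bare minimum from the text
check_dump_space 512 512      # no headroom at all
```

On a running system you could probably feed it real numbers, for example from sysctl -n hw.physmem (reported in bytes) and swapinfo -k (reported in kilobytes) after converting both to MB; check those against the man pages for your release.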

The kernel crash-capturing process works somewhat like this. If a properly configured
system crashes, it will save a core dump of the system memory. You can't save it to a file,
because the crashed kernel doesn't know about files; it only knows about partitions.
The simplest place to write this dump is the swap partition. The dump is placed as close to
the end of the swap partition as possible. Once the crashing system saves the core to swap,
it reboots the computer.

During the reboot, /etc/rc enables the swap partition. It then (probably)
runs fsck on the crashed disks. It has to enable swapping before running
fsck, because fsck might need to use swap space. Let's hope you
have enough swap space that fsck can get everything it needs without
overwriting the dump file lurking in your swap partition. Once the system has a place where
it can save a core dump, it checks the swap partition for a dump. Upon finding a core,
savecore copies it from swap to the proper file, clears the dump from swap,
and lets the reboot proceed. You now have a kernel core file and can use that to get a
backtrace.

The examples given here are for FreeBSD 4-stable. If you're running 3-stable, paths will
be slightly different. If you're running -current, you should have done all of this long
ago.

Your first step to make this work is to build a debugging kernel. I'm assuming that you
know how to build a custom kernel. If you don't, please see the FreeBSD Handbook for
details. All you need to do is add these lines to your kernel configuration.

options DDB
makeoptions DEBUG=-g

The DDB option installs the DDB kernel debugger. (This isn't strictly
necessary, but it can be helpful and doesn't take that much room.) The makeoptions
line tells the system to build a debugging kernel.

When you're configuring your system, you need to decide how you want the system to
behave after a panic. Do you want the computer to reboot, or do you want it to stay at the
panic screen until you manually trigger a reboot? If the system is at a remote location,
you almost certainly want the computer to reboot automatically. If you're at the console,
debugging kernel changes, or if you've discovered a filesystem bug, you almost certainly
want the system to wait for you to tell it to reboot.

If you want the computer to reboot automatically, include the kernel option
DDB_UNATTENDED. Otherwise, the system will wait for you to tell it to reboot.
(Here's a little-known BSD trick for you: You can specify more than one option on a
line.)

options DDB, DDB_UNATTENDED

Once you have the kernel set up the way you want, do the usual dance to configure and
install it. When this finishes, you'll find a file in the kernel compile directory called
kernel.debug. This is your kernel with symbols. Save it somewhere. A frequent cause of
failure at debugging time is losing the debugging kernel and then trying to debug a crashed
kernel with a different kernel.debug. This won't work; the symbols must come from the
same build as the kernel that crashed. I generally
copy kernel.debug to /var/crash/kernel.debug.date, so I can tell when
a particular debug kernel was built. This lets me date-match the current kernel to a
debugging kernel, and it also tells me when a kernel.debug is old enough that I can
delete it.
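
A minimal sketch of that date-stamped copy, assuming a POSIX shell. The function name archive_debug_kernel and the MYKERNEL path in the usage comment are hypothetical; substitute your own kernel's compile directory.

```shell
#!/bin/sh
# Copy a debugging kernel into an archive directory with a date suffix,
# then print the name of the copy. archive_debug_kernel is a hypothetical
# helper, not a FreeBSD tool.
archive_debug_kernel() {
    src=$1                        # path to the freshly built kernel.debug
    dest_dir=${2:-/var/crash}     # archive directory, /var/crash by default
    dest="$dest_dir/kernel.debug.$(date +%Y%m%d)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example use after a kernel build (the path is an assumption):
#   archive_debug_kernel /usr/src/sys/compile/MYKERNEL/kernel.debug
```

Using a year-month-day suffix means a plain ls of /var/crash sorts the saved debugging kernels in build order.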

Now set the proper options in /etc/rc.conf. First, tell the system where
to write the core dump. This is called the dumpdev. FreeBSD uses the swap
partition as the dump device; that's why it has to be slightly larger than your physical
memory. (You can use a UFS partition, but after the crash, it won't be a usable UFS
partition anymore!) You can get the device name from /etc/fstab. Look for a
line with a FSType entry of "swap"; the first entry in that line is the physical
device name. On my laptop, the swap entry in /etc/fstab looks
like this:

/dev/ad0s4b none swap sw 0 0

My swap partition is /dev/ad0s4b, so I specify this as the dump device in
/etc/rc.conf.

dumpdev="/dev/ad0s4b"
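
If you'd rather not pick the line out by eye, the lookup can be sketched with awk. The function name swap_devices is hypothetical, and the sample file below only imitates an /etc/fstab so that the example can run anywhere.

```shell
#!/bin/sh
# Print the device (first field) of every fstab entry whose FSType
# (third field) is "swap", skipping comment lines.
# swap_devices is a hypothetical helper; it reads /etc/fstab by default.
swap_devices() {
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}

# Try it on a throwaway file shaped like the fstab described above.
sample=$(mktemp /tmp/fstab.XXXXXX)
cat > "$sample" <<'EOF'
# Device      Mountpoint  FStype  Options  Dump  Pass#
/dev/ad0s4b   none        swap    sw       0     0
/dev/ad0s4a   /           ufs     rw       1     1
EOF
swap_devices "$sample"    # prints /dev/ad0s4b
rm -f "$sample"
```

On a machine with several swap partitions this prints one device per line, and you would still have to choose which one to name as the dump device.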

The next step is to tell your system where to save the dump after the reboot. The
default is /var/crash, but you can change this with rc.conf's
dumpdir setting.

As you become more experienced in saving panics, you might find that you need to adjust
the core-saving behavior. Read savecore(8), and set any appropriate options in
savecore_flags in /etc/rc.conf. One popular flag is -z, which
compresses the core file and can save some disk space. savecore(8) is now smart enough to
automatically eliminate unused memory from the dump, which can save a lot of room.
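
Put together, the crash-related settings might look something like the fragment below. The device name is the one from my laptop, so yours will differ, and I'm assuming here that savecore_flags is set in /etc/rc.conf alongside the other two variables.

```shell
# /etc/rc.conf (fragment)
dumpdev="/dev/ad0s4b"     # swap partition that receives the dump (example device)
dumpdir="/var/crash"      # where savecore writes the core after reboot (the default)
savecore_flags="-z"       # compress the saved core; assumed to belong in rc.conf
```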

If you're in front of your computer the next time it crashes, you'll see the panic
message. If the system is set to reboot automatically, numbers will start to flow by,
counting the number of MBs of memory being dumped to disk. Finally, the computer will
reboot. After fsck runs, you can watch savecore copy the bad
memory dump to disk.

If your system doesn't reboot automatically, you'll need to enter two commands after the
panic, at the debugger prompt. Typing panic will sync the disks, and
continue will start the reboot process.

You should now have a core dump file in /var/crash. Next time, we'll
discuss what to do with this.