4.2. Debugging by Printing

The most common debugging technique
is monitoring, which in applications programming is done by calling
printf at
suitable
points. When you are debugging kernel code, you can accomplish the
same goal with printk.

4.2.1. printk

We used the printk
function in earlier chapters with the simplifying assumption that it
works like printf. Now it's
time to introduce some of the differences.

One of the differences is that
printk lets you classify messages according to
their severity by associating different
loglevels,
or priorities,
with the messages. You usually indicate the loglevel with a macro.
For example, KERN_INFO, which we saw prepended to
some of the earlier print statements, is one of the possible
loglevels of the message. The loglevel macro expands
to a string, which is concatenated to the
message text at compile time; that's why there is no
comma between the priority and the format string in the following
examples. Here are two examples of printk
commands, a debug message and a critical message:

A
printk statement with no specified priority
defaults to DEFAULT_MESSAGE_LOGLEVEL, specified in
kernel/printk.c as an integer. In the 2.6.10
kernel, DEFAULT_MESSAGE_LOGLEVEL is
KERN_WARNING, but that has been known to change in
the past.

Based
on the loglevel, the kernel may print the message to the current
console, be it a text-mode terminal, a serial port, or a parallel
printer. If the priority is less than the integer variable
console_loglevel, the message is delivered to the
console one line at a time (nothing is sent unless a trailing newline
is provided). If both klogd and
syslogd are running on the system, kernel
messages are appended to /var/log/messages (or
otherwise treated depending on your syslogd
configuration), independent of console_loglevel.
If klogd is not running, the message
won't reach user space unless you read
/proc/kmsg (which is often most easily done with
the dmesg
command). When using klogd, you should remember
that it doesn't save consecutive identical lines; it
only saves the first such line and, at a later time, the number of
repetitions it received.

The
variable console_loglevel is initialized to
DEFAULT_CONSOLE_LOGLEVEL and can be modified
through the sys_syslog system call. One way to
change it is by specifying the -c switch when
invoking klogd, as specified in the
klogd manpage. Note that to change the current
value, you must first kill klogd and then
restart it with the -c option. Alternatively,
you can write a program to change the console loglevel.
You'll find a version of such a program in
misc-progs/setlevel.c in the source files
provided on O'Reilly's FTP site.
The new level is specified as an integer value between 1 and 8,
inclusive. If it is set to 1, only messages of
level 0 (KERN_EMERG) reach the console; if it is
set to 8, all messages, including debugging ones,
are displayed.

It is also possible to
read and modify the console loglevel using the text file
/proc/sys/kernel/printk. The file hosts four
integer values: the current loglevel, the default level for messages
that lack an explicit loglevel, the minimum allowed loglevel, and the
boot-time default loglevel. Writing a single value to this file
changes the current loglevel to that value; thus, for example, you
can cause all kernel messages to appear at the console by simply
entering:

# echo 8 > /proc/sys/kernel/printk

It should now be apparent why the hello.c sample
had the KERN_ALERT; markers; they are there to
make sure that the messages appear on the console.

4.2.2. Redirecting Console Messages

Linux allows for some flexibility in
console logging policies by letting you send messages to a specific
virtual console (if your console lives on the text screen). By
default, the "console" is the
current virtual terminal. To select a different virtual terminal to
receive messages, you can issue ioctl(TIOCLINUX)
on any console device. The following program,
setconsole,
can be used to choose which console receives kernel messages; it must
be run by the superuser and is available in the
misc-progs directory.

The following is the program in its entirety. You should invoke it
with a single argument specifying the number of the console that is
to receive messages.

setconsole uses the special
ioctl command TIOCLINUX,
which implements Linux-specific functions. To use
TIOCLINUX, you pass it an argument that is a
pointer to a byte array. The first byte of the array is a number that
specifies the requested subcommand, and the following bytes are
subcommand specific. In setconsole, subcommand
11 is used, and the next byte (stored in bytes[1])
identifies the virtual console. The complete description of
TIOCLINUX can be found in
drivers/char/tty_io.c, in the kernel sources.

4.2.3. How Messages Get Logged

The
printk function
writes messages into a circular buffer that is _
_LOG_BUF_LEN bytes long: a value from 4 KB to 1 MB chosen
while configuring the kernel. The function then wakes any process
that is waiting for messages, that is, any process that is sleeping
in the syslog system call or that is reading
/proc/kmsg. These two interfaces to the logging
engine are almost equivalent, but note that reading from
/proc/kmsg consumes the data from the log
buffer, whereas the syslog system call can
optionally return log data while leaving it for other processes as
well. In general, reading the /proc file is
easier and is the default behavior for klogd.
The dmesg command can be used to look at the
content of the buffer without flushing it; actually, the command
returns to stdout the whole content of the
buffer, whether or not it has already been read.

If you happen to read the kernel messages by hand, after stopping
klogd, you'll find that the
/proc file looks like a FIFO, in that the reader
blocks, waiting for more data. Obviously, you can't
read messages this way if klogd or another
process is already reading the same data, because
you'll contend for it.

If the circular buffer fills up, printk wraps
around and starts adding new data to the beginning of the buffer,
overwriting the oldest data. Therefore, the logging process loses the
oldest data. This problem is negligible compared with the advantages
of using such a circular buffer. For example, a circular buffer
allows the system to run even without a logging process, while
minimizing memory waste by overwriting old data should nobody read
it. Another feature of the Linux approach to messaging is that
printk can be invoked from anywhere, even from
an interrupt handler, with no limit on how much data can be printed.
The only disadvantage is the possibility of losing some data.

If the klogd process
is running, it retrieves kernel messages and dispatches them to
syslogd, which in turn checks
/etc/syslog.conf to find out how to deal with
them. syslogd differentiates between messages
according to a facility and a priority; allowable values for both the
facility and the priority are defined in
<sys/syslog.h>. Kernel messages are logged
by the LOG_KERN facility at a priority
corresponding to the one used in printk (for
example, LOG_ERR is used for
KERN_ERR messages). If klogd
isn't running, data remains in the circular buffer
until someone reads it or the buffer overflows.

If you want to avoid clobbering your system log with the monitoring
messages from your driver, you can either specify the
-f (file) option to klogd
to instruct it to save messages to a specific file, or customize
/etc/syslog.conf to suit your needs. Yet another
possibility is to take the brute-force approach: kill
klogd and verbosely print messages on an unused
virtual terminal,[1] or issue the command cat
/proc/kmsg from an unused xterm.

[1] For example, use setlevel
8; setconsole 10 to set up terminal 10 to display
messages.

4.2.4. Turning the Messages On and Off

During the early stages
of driver development, printk can help
considerably in debugging and testing new code. When you officially
release the driver, on the other hand, you should remove, or at least
disable, such print statements. Unfortunately,
you're likely to find that as soon as you think you
no longer need the messages and remove them, you implement a new
feature in the driver (or somebody finds a bug), and you want to turn
at least one of the messages back on. There are several ways to solve
both issues, to globally enable or disable your debug messages and to
turn individual messages on or off.

Here we show one way to code printk calls so you
can turn them on and off individually or globally; the technique
depends on defining a macro that resolves to a
printk (or printf ) call
when you want it to:

Each print statement can be enabled or disabled by removing or adding
a single letter to the macro's name.

All the messages can be disabled at once by changing the value of the
CFLAGS variable before compiling.

The same print statement can be used in kernel code and user-level
code, so that the driver and test programs can be managed in the same
way with regard to extra messages.

The following code fragment
implements these features and comes directly from the header
scull.h:

The
symbol PDEBUG is defined or undefined, depending
on whether SCULL_DEBUG is defined, and displays
information in whatever manner is appropriate to the environment
where the code is running: it uses the kernel call
printk when it's in the kernel
and the libc call fprintf
to the standard error when run in user space. The
PDEBUGG symbol, on the other hand, does nothing;
it can be used to easily "comment"
print statements without removing them entirely.

To simplify the process further, add the following lines to your
makefile:

The macros shown in this section depend on a gcc
extension to the ANSI C preprocessor that supports macros with a
variable number of arguments. This gcc
dependency shouldn't be a problem, because the
kernel proper depends heavily on gcc features
anyway. In addition, the
makefile depends on
GNU's version of make ; once
again, the kernel already depends on GNU make,
so this dependency is not a problem.

If
you're familiar with the C preprocessor, you can
expand on the given definitions to implement the concept of a
"debug level," defining different
levels and assigning an integer (or bit mask) value to each level to
determine how verbose it should be.

But every driver has its own features and monitoring needs. The art
of good programming is in choosing the best trade-off between
flexibility and efficiency, and we can't tell what
is the best for you. Remember that preprocessor conditionals (as well
as constant expressions in the code) are executed at compile time, so
you must recompile to turn messages on or off. A possible alternative
is to use C conditionals, which are executed at runtime and,
therefore, permit you to turn messaging on and off during program
execution. This is a nice feature, but it requires additional
processing every time the code is executed, which can affect
performance even when the messages are disabled. Sometimes this
performance hit is unacceptable.

The macros shown
in this section have proven themselves useful in a number of
situations, with the only disadvantage being the requirement to
recompile a module after any changes to its messages.

4.2.5. Rate Limiting

If you are not careful, you can find yourself generating
thousands of messages with printk, overwhelming
the console and, possibly, overflowing the system log file. When
using a slow console device (e.g., a serial port), an excessive
message rate can also slow down the system or just make it
unresponsive. It can be very hard to get a handle on what is wrong
with a system when the console is spewing out data nonstop.
Therefore, you should be very careful about what you print,
especially in production versions of drivers and especially once
initialization is complete. In general, production code should never
print anything during normal operation; printed output should be an
indication of an exceptional situation requiring attention.

On the other hand, you may want to emit a log message if a device you
are driving stops working. But you should be careful not to overdo
things. An unintelligent process that continues forever in the face
of failures can generate thousands of retries per second; if your
driver prints a "my device is
broken" message every time, it could create vast
amounts of output and possibly hog the CPU if the console device is
slow—no interrupts can be used to driver the console, even if
it is a serial port or a line printer.

In many cases, the best behavior is to set a flag saying,
"I have already complained about
this," and not print any further messages once the
flag gets set. In others, though, there are reasons to emit an
occasional "the device is still
broken" notice. The kernel has provided a function
that can be helpful in such cases:

int printk_ratelimit(void);

This function should be called before you consider printing a message
that could be repeated often. If the function returns a nonzero
value, go ahead and print your message, otherwise skip it. Thus,
typical calls look like this:

if (printk_ratelimit( ))
printk(KERN_NOTICE "The printer is still on fire\n");

printk_ratelimit works by tracking how many
messages are sent to the console. When the level of output exceeds a
threshold, printk_ratelimit starts returning
0 and causing messages to be dropped.

The behavior of printk_ratelimit can be
customized by modifying
/proc/sys/kernel/printk_ratelimit (the number of
seconds to wait before re-enabling messages) and are
/proc/sys/kernel/printk_ratelimit_burst (the
number of messages accepted before rate-limiting).

4.2.6. Printing Device Numbers

Occasionally, when printing a message from a driver, you will want to
print the device number associated
withp the hardware of interest. It is
not particularly hard to print the major and minor numbers, but, in
the interest of consistency, the kernel provides a couple of utility
macros (defined in <linux/kdev_t.h>) for
this purpose:

Both macros encode the device number into the given
buffer; the only difference is that
print_dev_t returns the number of characters
printed, while format_dev_t returns
buffer; therefore, it can be used as a parameter
to a printk call directly, although one must
remember that printk doesn't
flush until a trailing newline is provided. The buffer should be
large enough to hold a device number; given that 64-bit device
numbers are a distinct possibility in future kernel releases, the buffer should
probably be at least 20 bytes long.