Vanishing Features of the 2.6 Kernel

Many developers are eagerly awaiting the 2.6 Linux kernel. The feature
freeze has passed, with a code freeze planned for January and final release
slated for the second quarter of 2003. There is considerable excitement about
anticipated enhancements, especially regarding scalability and performance.

However, some developers may first notice what no longer works. Some
techniques and APIs have been removed, so existing device drivers and modular
plugins may break. Meanwhile, it will take a while to exploit the new
features and to find replacements for old ones.

Some deprecated techniques, such as task queues, have finally been
eliminated. Other facilities, including in-kernel Web acceleration, have been
supplanted by newer advances. Other changes, notably banishing the system call
table from the list of exported symbols available to modules, have flowed more
from philosophical and licensing issues than from technical considerations.

Export of the System Call Table

The Linux kernel has a monolithic architecture; it is one big program. All
parts of the kernel are visible to each other unless their scope has been
explicitly limited. Arguments are passed on the stack, as in any other
C program. At the same time, Linux makes extensive use of modules: facilities
that may be loaded and unloaded dynamically. (These are often, but not always,
device drivers.) Modules can see only explicitly exported symbols (functions,
variables, etc.). Unless the kernel or a previously loaded module includes
the statement EXPORT_SYMBOL(foobar);, the module cannot refer to foobar().

Extensive modularization does not render the kernel any less monolithic. The
critical difference between monolithic and microkernels stems from how
components communicate with each other. As long as the Linux kernel prefers
function calls to message passing, its basic structure will remain
monolithic.

The system call table is a vector containing the addresses of the functions
executed whenever a system call is made from user space. When a system call
is invoked, the kernel receives the number of the call, the number of
arguments, and the arguments themselves. It uses the call number as an offset
into the table and places the arguments in the registers; they're not passed on
the stack. Then it jumps to the appropriate address to execute the system
call.
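
The dispatch step can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code: the names
toy_call_table, toy_dispatch, TOY_NR_getpid and TOY_NR_add are hypothetical,
and ordinary C function arguments stand in for the registers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy model: every "system call" takes three long arguments. */
typedef long (*toy_syscall_fn)(long a, long b, long c);

enum { TOY_NR_getpid = 0, TOY_NR_add = 1, TOY_NR_CALLS = 2 };

static long toy_getpid(long a, long b, long c)
{
    (void)a; (void)b; (void)c;
    return 42;                      /* made-up process id */
}

static long toy_add(long a, long b, long c)
{
    (void)c;
    return a + b;
}

/* The table: one handler address per call number. */
static toy_syscall_fn toy_call_table[TOY_NR_CALLS] = {
    [TOY_NR_getpid] = toy_getpid,
    [TOY_NR_add]    = toy_add,
};

/* Use the call number as an index into the table and jump to the handler. */
static long toy_dispatch(long nr, long a, long b, long c)
{
    if (nr < 0 || nr >= TOY_NR_CALLS)
        return -ENOSYS;             /* unknown call number */
    assert(toy_call_table[nr] != NULL);
    return toy_call_table[nr](a, b, c);
}
```

In this model, toy_dispatch(TOY_NR_add, 2, 3, 0) looks up slot 1 and calls
toy_add, while an out-of-range number is rejected before the table is touched.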

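Exporting the system call table allows modules to substitute system calls
with replacements of their own devising. Replacing the basic kernel read()
system call takes only a few lines: the module saves the table's original
entry for read() and stores the address of its own function in that slot.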
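
The substitution can be sketched in userspace as follows. This is an
illustration under stated assumptions rather than a module listing:
toy_call_table stands in for the kernel's table, and toy_sys_read, my_read,
toy_hook_install and toy_hook_remove are hypothetical names. No locking is
attempted, so the sketch is as open to races as the technique it models.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long (*toy_read_fn)(int fd, char *buf, size_t count);

enum { TOY_NR_read = 0, TOY_NR_CALLS = 1 };

/* Stand-in for the kernel's own read(): fills the buffer with 'k'. */
static long toy_sys_read(int fd, char *buf, size_t count)
{
    (void)fd;
    memset(buf, 'k', count);
    return (long)count;
}

static toy_read_fn toy_call_table[TOY_NR_CALLS] = { toy_sys_read };

static toy_read_fn saved_read;      /* original entry, kept for restore */
static unsigned long hooked_calls;  /* calls that went through the hook */

/* Replacement handler: count the call, then defer to the original. */
static long my_read(int fd, char *buf, size_t count)
{
    hooked_calls++;
    return saved_read(fd, buf, count);
}

/* What a module's init routine would do: save the old entry, store ours. */
static void toy_hook_install(void)
{
    assert(saved_read == NULL);     /* refuse to hook twice */
    saved_read = toy_call_table[TOY_NR_read];
    toy_call_table[TOY_NR_read] = my_read;
}

/* What a module's cleanup routine would do: put the original back. */
static void toy_hook_remove(void)
{
    assert(saved_read != NULL);
    toy_call_table[TOY_NR_read] = saved_read;
    saved_read = NULL;
}
```

Between install and remove, every caller that goes through the table reaches
my_read first; the restore step matters because a stale pointer left in the
table after the module is unloaded would send callers into freed memory.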

On the practical side, it is easy to incur race conditions, especially on
multi-processor systems where the replacement happens while an application is
using the system call. Various locking techniques can offer some protection,
but the details are non-trivial. However, the abolition of this method is not
primarily due to practical difficulties.

Some system calls penetrate deep into the kernel's heart. Binary-only modules,
whose source is not available under a GPL-compatible license, have been free
to use this technique, because exported symbols have been visible to all
modules.

The rules governing binary modules and GPL violations have always been
fuzzy. Some argue that any such module is permissible as long as it restricts
itself to exported symbols. Others maintain it depends on whether or not the
module fiddles with core kernel facilities. The line between central and
peripheral matters has always been blurry.

To sharpen this delineation, the 2.4.10 version of modutils, which handles
loading and unloading of modules, introduced module licenses. In addition, the
EXPORT_SYMBOL_GPL macro, introduced in the 2.4.11 kernel, created
two classes of exported symbols. Only modules with an acceptable open-source
license can have access to symbols exported under the GPL. All previously
exported symbols were grandfathered in.
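
The effect of the two symbol classes can be modeled in a few lines of
userspace C. This is a sketch, not the real loader: toy_exports, toy_resolve
and toy_internal_helper are hypothetical, and the list of accepted license
strings is deliberately abbreviated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_symbol {
    const char *name;
    bool gpl_only;  /* true models EXPORT_SYMBOL_GPL, false EXPORT_SYMBOL */
};

/* Illustrative export list; the second symbol is hypothetical. */
static const struct toy_symbol toy_exports[] = {
    { "printk",              false },
    { "toy_internal_helper", true  },
};

/* Abbreviated assumption: only these two license strings count here. */
static bool toy_license_is_gpl_compatible(const char *license)
{
    return strcmp(license, "GPL") == 0 ||
           strcmp(license, "Dual BSD/GPL") == 0;
}

/* Resolve a symbol on behalf of a module that declares the given license. */
static const struct toy_symbol *toy_resolve(const char *name,
                                            const char *license)
{
    size_t i;

    assert(name != NULL && license != NULL);
    for (i = 0; i < sizeof toy_exports / sizeof toy_exports[0]; i++) {
        if (strcmp(toy_exports[i].name, name) != 0)
            continue;
        if (toy_exports[i].gpl_only &&
            !toy_license_is_gpl_compatible(license))
            return NULL;            /* exported, but not to this module */
        return &toy_exports[i];
    }
    return NULL;                    /* not exported at all */
}
```

A module declaring a proprietary license still resolves printk in this model
but gets nothing back for toy_internal_helper, which is the distinction the
macro draws.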

This led to some loud arguments. Perhaps if the macro had been called
EXPORT_SYMBOL_INTERNAL, it would have shown an intent of
differentiating between modules implementing central and peripheral kernel
facilities, rather than making a choice based on the kernel programmer's
licensing philosophy.

Choosing to use EXPORT_SYMBOL_GPL(sys_call_table) would have
satisfied many objections. Instead, the more draconian course was taken: all
export of the system call table was embargoed. Red Hat did this in the patched
2.4.18 kernel shipped with Red Hat Linux 8.0, and Linus Torvalds did the same
in the 2.5.41 development kernel. As a result, a module can no longer replace
a system call through the simple substitution described above. In its place
comes support for registering new system calls dynamically, a feature that
may continue to grow.

Most observers foresee a tightening of the limits on binary modules. This
may very well break some rather expensive commercial Linux products, but that
doesn't seem to bother most kernel developers. Reminding the purveyors of
binary modules that they continue to operate at the pleasure of the Linux
kernel developers and their open-source licenses is seen to be a necessary
(even enjoyable) task. It has probably always been true that the only way to
protect an investment in Linux drivers and other kernel facilities (as
opposed to applications) is to go open source, even if that is difficult for
commercial enterprises to absorb. Recent developments seem to re-emphasize
this.