This paper documents runtime (on the fly) kernel patching on a running system under Linux using direct access to kernel memory. The same algorithms may equally be applicable to other systems. Examples of kernel patching for use by an attacker is provided showing patching of kernel structures to remove a lkm's visibility to lsmod and even the addition of kernel code ala loadable kernel modules (lkm) to a running system without native lkm support in the kernel. Discussion of rebuilding the appropriate sections of the system symbol map (System.map) is provided and implemented.

Linux kernel programming skills of a rudimentary level are required for the purposes of this paper. For the sections on object code linking use ELF, introductory knowledge of the ELF specs while is assumed has the appropriate parts used in this paper explained in further detail to those things directly applicable. It is strongly suggested that the interested reader look at the ELF specs. This is especially true if full understanding of the implementation is to be achieved.

Linux provides a means to locate every symbol used in the kernel whether they are exported or not. This is available by System.map which is created at compile time by gathering the symbol information in the object files used to compile the kernel.

Thus we can see we can identify the symbol by its name. The address we can use to directly look at kernel memory via /dev/kmem.

Kernel symbols that are exported, that is, those symbols that are globally visible and maintained in the kernel and available to user-land via /proc/ksysm and get_kernel_syms (1) do not require the use of System.map for the reasons just stated. This however is only available when lkm support is compiled in the kernel.

Direct access to memory in Linux is provided by two device files. /dev/mem is a file mapping of linear memory. /dev/kmem is a file mapping of all virtual memory available, that is, linear memory plus swap space.

To use the device files as memory requires opening of the device and using it as normal. Random access requires seeking to the required position (memory address) in the file before reading or writing.

System.map is not required to run Linux. Neither is /proc/ksyms when lkm's aren't supported natively in the kernel. However, symbol information is vital for patching the kernel using kmem. A variety of approaches are available however to rebuild symbol information from a running kernel using kmem.

The simplest approach is to copy the first few bytes at a symbol's address and use this as a key to locate the symbol in kmem by string searching. This approach is ideal for locating symbols which represent code, ie functions. The machine code instructions used as the key however will possibly change between architectures and kernel versions. Thus a key lists for function symbols must be maintained for each arch and kernel version. Likewise, if the function uses compile time configurable code, this approach will fail unless we use a key for that specific configuration. In general however, this is highly successful, and a key list construction and searching may be easily automated to find a large number of symbols.

This approach requires that symbol information be available on the machine where the key lists are being built. This information can use System.map or /proc/ksyms, the later being the preferred source as its common on many systems that System.map is not current as its often not copied in kernel compilations.

The implementation of constructing the key lists is build-ksyms-keys.sh which uses a minimum of 15 bytes as the key length, but may increase if the symbol is not able to be identified uniquely with the current key length.

A problem is present with this method however. Its common for a large number of functions to reference other symbols which are not known. A solution to this problem requires that we identify which part of the key is random so we can safely ignore it. To do this, we can build a key list initially which assumes the entire key is to be used, then we update the key list on a new kernel. By looking at what changed in between kernels, we can use this as what to ignore. Also, this gives us a reasonable guess at how useful a key is and if a key is not at all stable and should thus be discarded.

Locating structures in memory may also be approached using a search key technique. For this we try to make known as much information in the structure as possible so we have a feasible key to work with. The key does not have to be a string however, ie, we can search for a structure knowing fields that are dispersed within it. A practical example that is implemented in the provided source code is locating a module's data structure.

The module's name 'name' is a pointer to a string. Thus to locate the module structure we can make two passes in memory. In the first pass we search for all matches to a string, that is the name of the module. We then use the address of these strings in our second pass as the value of 'name' in the module structure.

To make the searching practical, we try to fill in as much information that is known into the partially complete module struct which is out search key. The size of the module 'size' is available in a module listing and we can use this to make our key more useful.

The pages number indicates the size of the module in pages. Thus we can use this in our search for the module structure. The 'state' of the module may be assume to be MOD_RUNNING, that is, a module that is currently running (as opposed to one that needs deletion, or is being initialized). The 'ref' may be assumed NULL, that is, no modules refer to this one. Also 'cleanup' may be assumed not NULL.

This information gives us a key that is practical and will uniquely identify module structures in memory most of the time (but not all the time).

Another method of locating a symbol is that if it is a constant distance to a known symbol location, then the unknown symbol may be found as a function of the known symbol.

'task' is the symbol we are trying to locate. This symbol is not in /proc/ksyms, however 'kstat' is. Thus we can see the following.

ADDR(task) == ADDR(kstat) - NR_TASKS*sizeof(struct task_struct *)

It is possible however, that we have no information to use directly as a key to identify the structure, that is, the structure has from our position random or unknown data, nor can we use known symbols as reference points. If this happens, we may be able to locate the structure indirectly by finding code that references the structure we are looking for.

It is seen that sys_call_table is referenced in the code. If we are able to locate this particular code, we can extract the address of the sys_call_table. It is preferable to use as many instructions as possible to increase the efficiency of the key. It is pointless searching with keys so small they identify many possible matches.

By using the adjacent instructions to that which references the sys_call_table we give ourselves a good key to use. These instructions do not reference anything not known so they can be used. Likewise we don't use the __SMP__ dependent code.

By compiling this code in a dummy program and using objdump to view the machine code we can extract our search key. The search key can be thought of as a partially incomplete string.

The entry.S code does not change often and is seen in many kernels. Thus we do not have to use different keys for each kernel version. Naturally this is only applicable to this specific architecture, however the same principle may be used for others.

The kernel symbol information and direct access to kernel memory provides us with the ability to modify kernel structures. The mem devices provide us with direct memory access and the symbol information gives us addresses to use for location of kernel structures.

The first example is a simple application of how to modify a kernel structure. kroot is the supplied program that changes the uid of a process to become superuser. This is not very practical in real life except when the permissions of the mem devices are insecure - which does happen and has even appeared as kernel bugs in the past. It does serve as a good example.

The algorithm is as follows.

open /dev/kmem

seek to the 'task' symbol address.

read the task table into memory

locate the appropriate task we wish to modify

modify the task to reflect a superuser uid

seek to the specific task in the file

write the modified task

To locate the 'task' symbol address we use the methods given above.

A more practical example is also given. Patching a module kernel structure to remove the lkm from module listing's.

It can be seen that a lkm is not listed of the name is empty, size is 0 and the ref is NULL.

The implemtation of the lkm 'zapper' uses this information to patch the module struct's size and ref members and patches the name to be an empty string. The reverse can also be done provided the size and ref pointer is saved in the 'zapping' process.

Linux provides a system call table represented by the symbol sys_call_table. The table is an array of fixed size and each element of the array is a pointer to the system call, number of which is the index in the array. If no system call is implemented for that syscall number, the element is a NULL pointer. Typically many tens of elements in the syscall table are NULL representing the syscall slot available for use.

Provided we know the address of the new kernel code, we can patch the syscall table and fill an empty slot to point to our new code. To run the kernel code, we make a call to the new syscall from user-land using the int 0x80 mechanism that Linux employs and the new code is thus run.

/dev/kmem gives us direct access to kernel memory, so to insert new code simply means we overwrite part of memory. Importantly to leave the kernel in a stable state we cannot overwrite internal kernel structures that are allocated or being used.

Its possible to use the kmalloc pool in kernel space however we cant be certain if the memory is already being used by kmalloc. Likewise, even if we look at the allocation block headers to find free memory, the operations are not atomic so it leaves to possible racing with the kernel.

The linear layout of kernel memory looks like this.

[compile time kernel memory reserve]
[kmalloc pool]

The kmalloc pool limits are defined in the kernel by the symbols memory_start and memory_end.

This padding can provide us with an ideal positing for new kernel code, as it is totally safe to use as the kernel is not allowed to use this memory.

This is the ideal situation where we can use System.map or another technique to locate memory_start. It is preferable however if we can not to use a symbol who's address is not known at runtime without location.

From /usr/src/linux-2.0.35/arch/i386/mm/init.c

.
.
.
/*
* BAD_PAGE is the page that is used for page faults when linux
* is out-of-memory. Older versions of linux just did a
* do_exit(), but using this instead means there is less risk
* for a process dying in kernel mode, possibly leaving a inode
* unused etc..
*
* BAD_PAGETABLE is the accompanying page-table: it is initialized
* to point to BAD_PAGE entries.
.
.
.
From /usr/src/linux-2.0.35/arc/i386/kernel/head.S
.
.
.
.org 0x3000
ENTRY(empty_bad_page)
.org 0x4000
ENTRY(empty_bad_page_table)
.org 0x5000
ENTRY(empty_zero_page)
.
.
.

This all starts at 0x100000 for a compressed kernel image. Thus empty_bad_page is at 0x100300 and is 4096 bytes in size. Running out of memory is not a common occurrence and we can use the empty_bad_page for our own code without too many fears.

Patching new code into the kernel requires two things. It requires that the code be in kernel space memory and it requires a method of calling that code. The above text has provided has with methods to fulfil both these requirements. The algorithm is thus.

Inserting object code follows the same principles as inserting code directly, however as stated, kernel symbol references, require that the correct address be used by looking at the kernel symbol table. Binding symbol's to addresses is the process of linking, which so far we have been doing manually. For non trivial code manual linking isn't a viable solution, and linking must be added to the algorithms we are using.

ELF is the typical standard used to represent object code on Linux. The paper will thus only refer to linking using ELF objects.

An object code file is referred to as relocatable code when using ELF because that summarizes what it is. It is not fixed to any memory position. It is the responsibility of linking that makes an executable image out of a relocatable object and binds symbols to addresses.

Linking of code is done by relocating the code to a fixed positing. For the most part, the object code does not need to be changed heavily.

The string 's' being part of the relocatable text section in the object has no known absolute position in memory at compile time. Likewise, printk, is an externally defined symbol and its address is also not known at compile time.

Relocation sections in the ELF object are used for describing what needs to be modified (relocated) in the object. In the above case, relocation entries would be made for printk's reference and the string's reference.

String Table

String table sections hold null-terminated character sequences, commonly called strings. The object file uses these strings to represent symbol and section names. One references a string as an index into the string table section. The first byte, which is index zero, is defined to hold a null character. Likewise, a string tables last byte is defined to hold a null character, ensuring null termination for all strings. A string whose index is zero specifies either no name or a null name, depending on the context. An empty string table section is permitted; its section headers sh_size member would contain zero. Non-zero indexes are invalid for an empty string table."

...

Sections

An object file's section header table lets one locate all the file's sections. The section header table is an array of Elf32_Shdr structures as described below. A section header table index is a subscript into this array. The ELF headers e_shoff member gives the byte offset from the beginning of the file to the section header table; e_shnum tells how many entries the section header table contains; e_shentsize gives the size in bytes of each entry."

Symbol Table

An object file's symbol table holds information needed to locate and relocate a program's symbolic definitions and references. A symbol table index is a subscript into this array. Index 0 both designates the first entry in the table and serves as the undefined symbol index. The contents of the initial entry are specified later in this section."

"A relocation section references two other sections: a symbol table and a section to modify. The section headers sh_info and sh_link members, described in ``Sections'' above, specify these relationships. Relocation entries for different object files have slightly different interpretations for the r_offset member.

In relocatable files, r_offset holds a section offset. That is, the relocation section itself describes how to modify another section in the file; relocation offsets designate a storage unit within the second section."

These selected paragraphs and sections from the ELF specifications and header files give us a good high level concept of how a relocatable ELF file can be linked to produce an image capable of being executed.

The process of linking the lkm is as follows.

Identify the file as being in relocatable ELF format

Load each relevant section into memory

For each PROGBITS section set the section address in memory

For each REL (relocation) section, carry out the relocation

Assemble the executable image by copying the sections into their respective positions in memory

An extra step must also be carried out if we know where the initialization and cleanup code of a lkm is.

Identify the file as being in relocatable ELF format

Load each relevant section into memory

For each PROGBITS section set the section address in memory

For each REL (relocation) section, carry out the relocation

Assemble the executable image by copying the sections into their respective positions in memory

Locate and print the address of 'init_module' and 'cleanup_module'

The relocation step may be expanded into the following algorithm.

Evaluate the target section of the relocation entry

Evaluate the symbol table section of the relocation entry

Evaluate the location in the section that the relocation is to apply

Evaluate the address of the symbol that is used in the relocation

Apply the relocation

The actual relocation is best presented by looking at the source. For more information on the relocation types refer to the ELF specifications. Note that we ignore the global offset table completely and any relocation types of its nature.

Evaluating the address of the symbol used in the relocation can also be expanded. If the symbol is local, that is STB_LOCAL, then the symbol is in the current objects symbol table - note also, that the symbol is only visible to the current object. If the symbol is STB_GLOBAL, then the symbol is external to the object and the symbol's address is known in this instance to be a kernel symbol, thus we can use System.map or the methods presented earlier in determining the symbol's address.

In this instance, probably the best source for more detailed information on link editing is the source provided.

In practice, real life lkm's will not fit into the kmalloc pool padding and likewise may be greater than a single page if we use the empty_bad_page. The solution to this is to bootstrap load the lkm by inserting a bootstrap loader into the padding which allocates memory using the normal kernel methods of kmalloc, copying the actual lkm to this allocated memory and executing this. The bootstrap loader can be made sufficiently small and our lkm's can be of any size.

In practice, the only code we are actually required to insert into the kernel to enable bootstrapping is an allocation module that will return a memory address to an allocated block of memory that can fit the real module we want to load and run.

The algorithms and implementations discuss provide ample demonstration that it is possible to use direct access to memory to provide things which would appear only naively possible through native support in the kernel.

The supplied programs can provide practical instances for an attacker to use in his or her array, and show that access to kernel memory is a very useful concept not only in idea but in practice.