
Loading arbitrary kernel modules dynamically has always been a gray area between usability-oriented and security-oriented Linux developers and users. In this post I will present the options available today in the Linux kernel and in grsecurity, the most popular kernel hardening patch. They should give you some idea of how these projects deal with the threat posed by the Linux kernel’s LKMs (Loadable Kernel Modules).

Threat
Allowing dynamic LKM loading introduces two main threats:

Malicious LKMs. These are more or less rootkits or similar malware that an adversary loads for various purposes, most commonly to hide specific activities from user space.

Vulnerable LKM loading. Imagine that you have a 0day exploit for a specific network driver that is not loaded by default. If you can trigger its dynamic loading, you can then exploit it and compromise the system. This is what this vector is about.

Linux kernel and KSPP
The KSPP (Kernel Self-Protection Project) of the Linux kernel tried to address this issue by restricting access to kernel modules. Below is the exact description of this restriction from the Linux kernel’s documentation.

Restricting access to kernel modules
The kernel should never allow an unprivileged user the ability to load specific
kernel modules, since that would provide a facility to unexpectedly extend the
available attack surface. (The on-demand loading of modules via their predefined
subsystems, e.g. MODULE_ALIAS_*, is considered “expected” here, though additional
consideration should be given even to these.) For example, loading a filesystem
module via an unprivileged socket API is nonsense: only the root or physically
local user should trigger filesystem module loading. (And even this can be up
for debate in some scenarios.)
To protect against even privileged users, systems may need to either disable
module loading entirely (e.g. monolithic kernel builds or modules_disabled
sysctl), or provide signed modules (e.g. CONFIG_MODULE_SIG_FORCE, or dm-crypt
with LoadPin), to keep from having root load arbitrary kernel code via the
module loader interface.

The most restrictive option is the modules_disabled sysctl variable, which is available by default in the Linux kernel. It can be set dynamically, as shown here.

sysctl -w kernel.modules_disabled=1

Or permanently, as part of the runtime kernel configuration, as shown here.

echo 'kernel.modules_disabled=1' >> /etc/sysctl.d/99-custom.conf

In both cases, the result is the same: the value changes from its default of “0” to “1”. You can find the exact definition of this variable in kernel/sysctl.c.

If we look into kernel/module.c we will see that when modules_disabled has a non-zero value, neither loading (may_init_module()) nor unloading (the delete_module() system call) of any LKM is allowed. Below you can see the module initialization code that requires both the CAP_SYS_MODULE POSIX capability and modules_disabled to be zero.
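As a rough illustration of that check, here is a minimal user-space sketch. The struct loader_ctx type and the may_init_module_model() name are assumptions made for this example; they only stand in for the kernel's capable(CAP_SYS_MODULE) call and its modules_disabled variable.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for kernel state; these names are not kernel API. */
struct loader_ctx {
    bool has_cap_sys_module; /* models capable(CAP_SYS_MODULE) */
    int  modules_disabled;   /* models the kernel.modules_disabled sysctl */
};

/* Returns 0 when module loading may proceed, -EPERM otherwise. */
int may_init_module_model(const struct loader_ctx *ctx)
{
    /* Both conditions must hold: the capability is present AND the
     * sysctl is still zero. Either failure denies the request. */
    if (!ctx->has_cap_sys_module || ctx->modules_disabled)
        return -EPERM;
    return 0;
}
```

Note that once modules_disabled is non-zero, even a caller holding the capability is refused.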

Looking in kernel/kmod.c we can see another check. Before the module loading request is passed to call_modprobe(), the __request_module() function verifies that modprobe_path is set. If it is empty the request is dropped, so a module is only ever auto-loaded through the configured user-space helper (the /sbin/modprobe command by default) rather than directly via an API or socket.
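The modprobe_path test can be modelled in a few lines of user-space C. The function name request_reaches_helper() is hypothetical and the NULL test is an extra safety net for this sketch; only the idea of checking the first character of the path comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if an auto-load request would be handed to the helper. */
bool request_reaches_helper(const char *modprobe_path)
{
    /* An empty path means no helper is configured, so nothing is loaded. */
    if (modprobe_path == NULL || modprobe_path[0] == '\0')
        return false;
    return true;
}
```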

These are the features that the Linux kernel has had for years to protect against this threat. The downside, though, is that completely disabling loading and unloading of LKMs can break some legitimate operations, such as system upgrades, reboots on systems that load modules after boot, automation that configures software RAID devices after boot, etc.

To deal with the above, on 22 May 2017 the KSPP team proposed a patch to __request_module() (still to be added to the kernel) which follows a different approach.

What you see here is that at a very early stage of kernel module loading, security_kernel_module_request() is invoked with the module to be loaded as well as the allow_cap variable, which can be set to either “0” or “1”. If its value is positive, the security subsystem will trust the caller to load modules with specific predefined (hardcoded) aliases, which allows auto-loading of those aliases. This was done to close a design flaw of the Linux kernel: although loading modules in general requires the CAP_SYS_MODULE capability (which is already checked, as shown earlier), network modules only required the CAP_NET_ADMIN capability, which completely bypassed the previously described controls. The modified __request_module() ensures that only the specific modules allowed by the security subsystem can be auto-loaded. However, it is also crucial to note that, to date, the only security subsystem that uses the security_kernel_module_request() hook is SELinux.
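To make the allow_cap idea more concrete, here is a hypothetical policy function written in user-space C. It is not the code of the proposed patch: the autoload_permitted() name, its parameters, and the choice of the "netdev-" alias prefix as the single trusted alias are all assumptions made for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed alias prefix, for illustration only. */
#define TRUSTED_ALIAS_PREFIX "netdev-"

/* Decides whether an auto-load request for module_name is allowed. */
bool autoload_permitted(const char *module_name, int allow_cap,
                        bool has_cap_sys_module)
{
    /* Fully privileged callers may request any module. */
    if (has_cap_sys_module)
        return true;

    /* A positive allow_cap is honoured only for the hardcoded alias. */
    if (allow_cap > 0)
        return strncmp(module_name, TRUSTED_ALIAS_PREFIX,
                       strlen(TRUSTED_ALIAS_PREFIX)) == 0;

    return false;
}
```

Under this model a caller without CAP_SYS_MODULE can still trigger loading of a netdev-prefixed alias, but not of an arbitrary module such as a filesystem driver.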

Before we move on to grsecurity, it is important to note that on 7 November 2010 Dan Rosenberg proposed a replacement for modules_disabled called modules_restrict, which was a copy of grsecurity’s logic. It had three values: 0 (disabled), 1 (only root can load/unload LKMs), and 2 (no one can load/unload, the same as modules_disabled). You can see the check that it was adding to __request_module() below.

However, this was never added to the upstream kernel, so there is no need to dive further into the details behind it. Just as an overview, here is the proposed sysctl documentation for modules_restrict.

modules_restrict:
A value indicating if module loading is restricted in an
otherwise modular kernel. This value defaults to off (0),
but can be set to (1) or (2). If set to (1), modules cannot
be auto-loaded by non-root users, for example by creating a
socket using a packet family that is compiled as a module and
not already loaded. If set to (2), modules can neither be
loaded nor unloaded, and the value can no longer be changed.
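The three-valued semantics described in that documentation can be sketched as follows. This is a user-space model, not Dan Rosenberg's patch: the enum names and the autoload_check() function are made up for the example.

```c
#include <errno.h>

/* Assumed symbolic names for the three documented values. */
enum modules_restrict_mode {
    RESTRICT_OFF       = 0, /* no restriction */
    RESTRICT_ROOT_ONLY = 1, /* non-root users cannot auto-load modules */
    RESTRICT_ALL       = 2  /* nobody can load or unload modules */
};

/* Returns 0 if the auto-load request is allowed, -EPERM otherwise. */
int autoload_check(int modules_restrict, unsigned int caller_uid)
{
    if (modules_restrict >= RESTRICT_ALL)
        return -EPERM;
    if (modules_restrict == RESTRICT_ROOT_ONLY && caller_uid != 0)
        return -EPERM;
    return 0;
}
```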

grsecurity
Unfortunately, grsecurity stable patches are no longer publicly available. For this reason, in this article I will be using the grsecurity patch for kernel releases 3.1 to 4.9.24. For LKM loading hardening, grsecurity offers a kernel configuration option known as MODHARDEN (Harden Module Auto-loading). If we go back to ___request_module() in kernel/kmod.c we will see how this feature works.

The check in this case is relatively simple: it verifies that the caller’s UID is the same as the static global UID of the root user. This ensures that only users with UID=0 can load kernel modules, which completely eliminates the cases of unprivileged users exploiting flaws that allow them to request kernel module loading. To overcome the network kernel modules issue, grsecurity followed a different approach: it maintains the capability check (which is currently used by a very limited number of security subsystems) but redirects all loading to the ___request_module() function to ensure that only root can load them.
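The effect of this design can be modelled in user space as shown below. This is a sketch of the behaviour described above, not grsecurity's code: ROOT_UID stands in for the kernel's global root UID, and both function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

#define ROOT_UID 0u /* stand-in for the kernel's global root UID */

/* Models the root-only gate applied to every auto-load request. */
int modharden_request(unsigned int caller_uid)
{
    return caller_uid == ROOT_UID ? 0 : -EPERM;
}

/* Models the network path: the capability check is kept, but the
 * request is still funnelled through the root-only gate afterwards. */
int modharden_net_request(bool has_cap_net_admin, unsigned int caller_uid)
{
    if (!has_cap_net_admin)
        return -EPERM;
    return modharden_request(caller_uid);
}
```

In this model, holding CAP_NET_ADMIN alone is no longer enough: a non-root caller with that capability is still refused.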

Furthermore, grsecurity identified that a similar security design flaw also exists in filesystem module loading (still to be identified and fixed in the upstream kernel), and fixed it in a similar manner. Below is the grsecurity version of the get_fs_type() function in fs/filesystems.c, which ensures that filesystem modules are loaded only by the root user.

This Linux kernel design flaw allows loading of non-filesystem kernel modules via mount. How grsecurity detects this is quite clever and can be found in the simplify_symbols() function of kernel/module.c. It ensures that the arguments of the module are copied to the kernel side, and then checks the module’s loading information in the symbol table to verify that the loaded module is trying to register a filesystem rather than being an arbitrary kernel module.

To help detect malicious users trying to exploit this Linux kernel design flaw, grsecurity also has an alerting mechanism that immediately logs any attempt to load a kernel module via “mount” when that module is not an actual filesystem module.

As Mathias Krause pointed out, the ‘msg_namelen’ member of the ‘msghdr’ structure remains uninitialized, resulting in a kernel information leak. Below is how this structure is defined in the include/linux/socket.h header file.

This was a very nice vulnerability reported by Andrew Honig of Google. The bug is triggered when a user specifies an invalid IOAPIC_REG_SELECT value, which is reachable via the read KVM I/O device operation, as you can see below.

It calculates and initializes the value of ‘redir_index’ from the user-controlled ‘ioapic->ioregsel’ variable and then uses it as an index into the ‘ioapic->redirtbl[]’ array. If this value is not smaller than IOAPIC_NUM_PINS, it will result in an invalid memory access. Here is how IOAPIC_NUM_PINS is defined in the virt/kvm/ioapic.h header file.

#define IOAPIC_NUM_PINS KVM_IOAPIC_NUM_PINS

It is defined this way because the value is architecture specific. For IA64 it is defined in include/uapi/asm/kvm.h as 48, and for x86 in arch/x86/include/uapi/asm/kvm.h as 24. As you might have noticed, there is an ASSERT() call that makes this check but, of course, it only takes effect in debug builds.
The fix was to replace that ASSERT() call with a range check like this.
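A user-space sketch of such a range check is given below. The struct ioapic_model type and read_redir_entry() are illustrative names. The way the index is derived from ioregsel (subtract 0x10, shift right by one) and the all-ones value returned for an out-of-range index are assumptions based on my recollection of the KVM code, so treat them as such.

```c
#include <stdint.h>

#define IOAPIC_NUM_PINS 24 /* x86 value quoted in the text */

/* Simplified model of the emulated IOAPIC state. */
struct ioapic_model {
    uint32_t ioregsel;                  /* guest-controlled selector */
    uint64_t redirtbl[IOAPIC_NUM_PINS]; /* redirection table entries */
};

/* Reads the selected redirection entry without trusting the selector. */
uint64_t read_redir_entry(const struct ioapic_model *ioapic)
{
    /* Assumed index derivation; unsigned wrap-around yields a huge
     * index for selectors below 0x10, which the check also rejects. */
    uint32_t redir_index = (ioapic->ioregsel - 0x10) >> 1;

    if (redir_index < IOAPIC_NUM_PINS)
        return ioapic->redirtbl[redir_index];

    /* Out of range: return a harmless constant instead of reading
     * past the end of the table. */
    return ~0ULL;
}
```

Unlike an ASSERT(), this check is compiled into every build, so an invalid selector can never index outside the table.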

This is a really nice vulnerability killed by Andy Honig. It is particularly interesting because it allows host kernel memory corruption through GPA (Guest Physical Address) manipulation by the guest. If we have a look at arch/x86/kvm/x86.c we can see the following code.

So by utilizing the ‘MSR_KVM_SYSTEM_TIME’ kvmclock MSR, a user can set ‘vcpu->arch.time_page’ through the gfn_to_page() call, which uses the user-derived ‘data’ information. As Andy Honig mentioned in his commit, the arbitrary write occurs when kmap atomic obtains a pointer to the time structure page and a memcpy() is performed to it, starting at the user-controlled offset. The fix was to add a check that verifies that the provided value does not exceed the structure’s boundaries.
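One way to enforce such a boundary is to require that the in-page offset is aligned to the size of the structure being copied, which guarantees that the copy ends inside the page. The sketch below models this in user space. The 4096-byte page, the 32-byte structure size, the use of bit 0 as an enable flag, and the function name time_offset_is_safe() are all assumptions of this example rather than details taken from the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096u /* assumed page size */
#define TIME_INFO_SIZE  32u   /* assumed size of the time structure */

/* Returns true if a TIME_INFO_SIZE-byte copy starting at the offset
 * encoded in data stays within a single page. */
bool time_offset_is_safe(uint64_t data)
{
    /* Keep only the offset within the page and clear bit 0, which this
     * model treats as an enable flag rather than part of the offset. */
    uint32_t offset = (uint32_t)(data & (PAGE_SIZE_MODEL - 1)) & ~1u;

    /* A start aligned to the structure size cannot straddle the page
     * end, because the page size is a multiple of that size. */
    return (offset & (TIME_INFO_SIZE - 1)) == 0;
}
```

For example, an offset of 0xFF0 is rejected, since a 32-byte copy from there would run 16 bytes past the end of the page.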

Recently Lars-Peter Clausen committed a change to the Linux kernel that fixes a format string vulnerability in the EXT3 filesystem code. The susceptible code resides in fs/ext3/super.c, but to better understand it we first need to have a look at how ext3_msg() is defined.

So, it should be called passing the following three mandatory arguments:
– Pointer to the super-block structure
– Prefix string
– Format string
And of course, any variables to be printed. As Lars-Peter Clausen noticed, there were two calls where no prefix was passed. As a result, the format string was taken as the prefix and the first variable to be printed was processed as the format string. Here are these two cases:
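The argument shift is easier to see with a user-space analogue of such a logging helper. The msg_model() function below is hypothetical and only mimics the prefix-then-format calling convention described above; it writes into a caller-supplied buffer instead of the kernel log.

```c
#include <stdarg.h>
#include <stdio.h>

/* Writes "<prefix><formatted message>" into out and returns the number
 * of characters that the full message needs (snprintf-style). */
int msg_model(char *out, size_t out_len, const char *prefix,
              const char *fmt, ...)
{
    va_list args;
    int n, m;

    /* The prefix is copied verbatim and never interpreted as a format. */
    n = snprintf(out, out_len, "%s", prefix);
    if (n < 0 || (size_t)n >= out_len)
        return n;

    /* Only the third argument is treated as a format string. */
    va_start(args, fmt);
    m = vsnprintf(out + n, out_len - (size_t)n, fmt, args);
    va_end(args);

    return m < 0 ? m : n + m;
}
```

A correct call looks like msg_model(buf, sizeof buf, "<3>", "bad device %s", name). If the prefix is omitted, as in msg_model(buf, sizeof buf, "bad device %s", name), the message text lands in the prefix slot and the caller-supplied name is parsed as the format string, which is the pattern behind the bug.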

If the equivalent /dev/ttyUSB device file is in use while the device is disconnected, then any call to chase_port() (used to chase the port, close and flush it) will lead to a NULL pointer dereference, since there is no longer a ‘tty’ associated with it. The fix was to add a simple check for this case.
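The shape of such a check can be sketched in user space as follows. The tty_model and port_model structures and the chase_port_model() function are simplified, hypothetical stand-ins for the driver's real data structures.

```c
#include <stddef.h>

/* Simplified stand-ins for the driver's tty and port objects. */
struct tty_model {
    int flushed;
};

struct port_model {
    struct tty_model *tty; /* NULL once the device is disconnected */
};

/* Returns 0 if the port was flushed, -1 if there was no tty to act on. */
int chase_port_model(struct port_model *port)
{
    struct tty_model *tty = port->tty;

    /* Bail out early instead of dereferencing a missing tty. */
    if (tty == NULL)
        return -1;

    tty->flushed = 1;
    return 0;
}
```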

As Dave Chinner pointed out, if we try to walk a filesystem and the extent map has a corrupted block number (an out-of-range address), the call to xfs_perag_get() above will trigger a NULL pointer dereference.