Friday, September 14, 2012

MoVP 1.5 KBeast Rootkit, Detecting Hidden Modules, and sysfs

In this post I will analyze the KBeast rootkit using Volatility’s new Linux features. This will include finding hidden modules, network connections, opened
files, and hooked system calls.

If you would like to follow along or recreate the steps taken, please see
the LinuxForensicsWiki
for instructions on how to do so.

Obtaining the Samples
To have a sample to test against I installed the KBeast rootkit in my Debian
virtual machine that was running the 2.6.26-2-686 32-bit kernel.

KBeast

KBeast is a kernel mode rootkit that loads as a
kernel module. It also has a userland component that provides remote access to
the computer. This userland backdoor is hidden from other userland applications
by the kernel module. KBeast also hides
files, directories, and processes that start with a user-defined prefix.
Keylogging is optionally provided as well.

KBeast gains its control over a computer by hooking
the system call table and by hooking the operations structures used to
implement the netstat interface to
userland.

We will now go through each piece of functionality
the rootkit offers, how it accomplishes it, and how we can detect it with
Volatility.

Hiding the Kernel Module

Effect on Forensics

Rootkits hide themselves from the module list because any
unknown module would be very noticeable to IT security staff as well as to
integrity verifiers that operate in userland. The inability to locate hidden modules can give investigators a false
sense of security and lead them to trust output from tools on a live machine
that they should not trust.

How KBeast Accomplishes it

To hide its kernel module component, KBeast uses the
same technique that many other modules do: it unlinks itself from the
linked list of loaded kernel modules. This list is exported through /proc/modules, and the lsmod binary reads this file to list the
loaded modules of a system. As a result, the module is still
active in memory but is not detectable with lsmod
or with kernel tools that simply walk the linked list.

How Volatility Detects This

Volatility leverages sysfs to find modules that are removed from the modules list but
still active. sysfs is a kernel to
userland interface, similar to /proc,
that exports a wide range of kernel information and statistics. One of these
types of data is the loaded modules and their associated information such as
parameters, sections, and reference counts. On a running system, this
information is exported through the /sys/module directory.

Inside this directory, there is one directory per kernel
module, and each directory has the same name that the module has in lsmod. The per-module sub-directories contain more
sub-directories that hold the parameters, sections, and other module data. The
following shows reading the parameters passed to LiME to obtain the memory
capture for this blog post (the original command was insmod lime.ko "path=kbeast.this format=lime"):

# cat /sys/module/lime/parameters/path

kbeast.this

# cat /sys/module/lime/parameters/format

lime

The linux_check_modules plugin finds hidden modules by walking the linked list of modules as well
as enumerating all the directories under /sys/module.
These two lists are then compared, and any entries that are found only in sysfs are reported as hidden kernel modules. We have yet
to find a rootkit that hides from sysfs
at all, so this method has worked well across a number of malware samples. The following shows this plugin against KBeast:

The sysfs enumeration
code works by finding the module_kset variable,
of type kset, that holds all
information for /sys/module. The
plugin then walks each member of the kset’s
entry list, which is of type kobject. Each of these kobject structures represents a module
and its subdirectory immediately under /sys/module.
The names of these directories are then gathered to be compared with the module
list names.
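The comparison step can be illustrated with a minimal Python sketch. This is not the plugin's code: the real plugin reads both lists out of kernel structures in the memory image, while the sketch starts from two plain lists of names. The function name and the sample module names (including hidden_mod) are hypothetical.

```python
def find_hidden_modules(module_list_names, sysfs_names):
    """Return names seen under /sys/module that are absent from the module list."""
    return sorted(set(sysfs_names) - set(module_list_names))


# Hypothetical sample data: names gathered from the linked list of modules
# and names gathered from the sysfs kset for /sys/module.
module_list = ["lime", "ext3", "loop"]
sysfs_entries = ["lime", "ext3", "loop", "hidden_mod"]

for name in find_hidden_modules(module_list, sysfs_entries):
    print("hidden module:", name)
```

A set difference is enough here because only names that appear in sysfs and not in the module list are of interest.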

Hooking System Call Table

Effect on Forensics

System calls are the main mechanism for userland code to
trigger event handling by the kernel. Reading and writing files, sending
network data, spawning and exiting processes, etc. are all done through system
calls. The system call table is an array
of function pointers, in which each pointer corresponds to a system call
handler (e.g. sys_read handles the read system call).

Rootkits often target this table due to the power it
gives them over the control flow of the running kernel. KBeast hooks a number of entries in this
table in order to hide files, processes, and more.

How KBeast Accomplishes it

During the initialization of its kernel module,
KBeast hooks the unlink, rmdir, unlinkat,
rename, open, kill, read, write, getdents, and delete_module system calls with its own handlers. These handlers
ensure that files and processes that start with the user-supplied prefix are
hidden and that they cannot be tampered with.

The overwritten kill
system call handler also acts as the mechanism that the rootkit provides
for userland processes to elevate privileges. All a userland process has
to do is send a signal with the backdoor signal value and the process will be
elevated. If you read our post yesterday, you
know that the Average Coder rootkit used a mechanism that allowed us to detect
elevated processes. Unfortunately, KBeast does not use this mechanism and
instead uses the proper interfaces provided by the kernel, namely prepare_creds and
commit_creds. This mechanism does not
produce any inconsistencies, so we cannot immediately find processes elevated
by KBeast.

How Volatility Detects This

Volatility detects all of these hooks by enumerating
and verifying each entry in the system call table. This is implemented in the linux_check_syscall plugin, which, for
every member of the system call table, either prints out the symbol name or, if
it is hooked, prints out the hook address. Since there are anywhere from 300 to
400+ system calls on a normal Linux system, it is advisable to redirect the plugin
output to a file and then grep for bad entries as shown here:

We can see in
the first output what some clean entries look like and that the system call table
index is reported along with the symbol name and address. For hooked entries,
we instead see HOOKED in place of a symbol name because the hooked function
points to an unknown address (in this case inside the rootkit’s module).
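The idea behind this check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's implementation: the table is modeled as a list of addresses, the known kernel symbols as a dictionary, and every address and the function name check_syscall_table are hypothetical.

```python
def check_syscall_table(table, known_symbols):
    """Label each table entry with its symbol name, or HOOKED if the address is unknown."""
    results = []
    for index, address in enumerate(table):
        label = known_symbols.get(address, "HOOKED")
        results.append((index, label, address))
    return results


# Hypothetical addresses for a 32-bit kernel; a real check would take the
# symbol addresses from the profile and the table from the memory image.
known_symbols = {
    0xC0132A10: "sys_restart_syscall",
    0xC012B4F0: "sys_exit",
    0xC0102E60: "sys_fork",
}
syscall_table = [0xC0132A10, 0xC012B4F0, 0xC0102E60, 0xF8A1C2D0]

for index, label, address in check_syscall_table(syscall_table, known_symbols):
    print(f"{index:<5} {label:<24} {address:#010x}")
```

An entry is flagged as soon as its address does not resolve to a known kernel symbol, which is why a handler that lives inside a loaded module stands out.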

The plugin only
prints the index of the system call entries and not a name because the system
call table varies widely across distributions and kernel versions, and
determining the name of each one requires the debug build of the kernel
(vmlinux). This may be incorporated into
future versions of the plugins, but will require additions to the current code
base, and in many cases the debug build is not made available by the
distribution package maintainers.

Hiding Network Connections

Effect on Forensics

The ability to hide network connections from
userland frustrates not only host investigators, but also network forensics
teams who wish to tie traffic back to a specific computer. The ease with which kernel modules can hide
information from userland makes a strong case for all incident response to be
based on offline memory captures and not on the output from tools running on
the live system.

How KBeast Accomplishes it

To hide network connections from netstat and the userland interfaces it
uses, KBeast hooks the show member of the tcp4_seq_afinfo sequence operation
structure. This structure is of type tcp_seq_afinfo and has members of type file_operations and of type seq_operations. Please refer to yesterday’s blog
post to learn about file_operations structures. Sequence operation structures provide a generic mechanism to
display information inside of the /proc filesystem.
This structure has the members start,
show, next, and stop, and the wrapping code provides handling of partial seeks,
buffered reads, and other complicated logic so that it only has to be
implemented once throughout the entire kernel.

Sequence operations structures are often targeted by
malware because they directly affect what is populated in /proc. By overwriting the show
member of such a structure, a rootkit can easily filter out entries it does
not want to appear in userland. KBeast
effectively hides its backdoored network connection by hooking the show member of the TCP4 structure and filtering its output. This technique is also used by many other
rootkits.
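To make the effect of such a hook concrete, here is a small userland model in Python. It is not KBeast's code and it does not run in the kernel: it only mimics a show handler that drops /proc/net/tcp entries for one port. The function names and sample lines are hypothetical, and matching on the local port alone is an assumption made for the illustration; the port value 13377 is the one recovered later in this post.

```python
HIDDEN_PORT = 13377  # backdoor port seen later in this analysis


def local_port(entry_line):
    """Extract the local port from one /proc/net/tcp entry (hex 'ADDR:PORT' field)."""
    local_address = entry_line.split()[1]
    return int(local_address.split(":")[1], 16)


def filtered_show(entry_lines, hidden_port=HIDDEN_PORT):
    """Model of a hooked show handler: emit every entry except the hidden port."""
    return [line for line in entry_lines if local_port(line) != hidden_port]


# Hypothetical entries in the /proc/net/tcp layout (header line omitted).
entries = [
    "   0: 00000000:3441 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 5000",
    "   1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 4200",
]

for line in filtered_show(entries):
    print(line)
```

Because netstat builds its listing from these /proc entries, anything dropped at this point never reaches userland, while the underlying kernel socket structures stay intact for memory analysis.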

How Volatility Detects This

To detect KBeast’s overwriting of network sequence
operation structures, the linux_check_afinfo plugin walks the file_operations and
sequence_operations structures of all
UDP and TCP protocol structures, including tcp6_seq_afinfo,
tcp4_seq_afinfo, udplite6_seq_afinfo,
udp6_seq_afinfo, udplite4_seq_afinfo, and udp4_seq_afinfo, and verifies each
member. This effectively detects any tampering with the interesting members of
these structures. The following output shows this plugin against the VM infected
with KBeast:

This plugin
reports and verifies that the show member
is indeed hooked and that the system is compromised.

Analyzing the Userland Backdoor

Effect on Forensics

The kernel module provides cover for the attacker by
hiding any processes or files that start with a user-defined prefix and any network
connection on a specified port. By default, the prefix is set to “_h4x_”, but
the rootkit’s README recommends changing it to something that is not so simple
to grep for. For this demo, I just left it as the default. The port number to hide is also a compile-time configuration option
chosen by the user.

The userland backdoor consists of a simple
application that listens on the hidden network port, requires a password, and
then spawns a bash shell with the privileges of root if the password is
correct.

How KBeast Accomplishes it

As stated in the section on hooking the system call
table, these userland activities are hidden by hooking the system call table
and the sequence operations structure of TCP. Once connected to the backdoor,
the attacker can perform a wide range of attacks and post-compromise activity.
We will now focus on recovering this activity.

How Volatility Detects This

Fortunately for Volatility’s users, particularly
those with a baseline of the system they are analyzing or a copy of ps output from the infected system, finding
the hidden process is trivial. The output of linux_pslist can simply be compared with that of the baseline or ps. Since KBeast hides processes by
hooking the system call table, the process list is untouched and the hidden
process will be in Volatility’s output but not the others. In the case of my
infected image the _h4x_bd process
has a PID of 2777:

Since we know the PID is 2777, we can then
investigate the rest of the application’s activities using Volatility. First,
we want to determine if any processes have the backdoor as a parent process. We
can use the linux_pstree plugin to
determine this and it will show us what programs were executed by the backdoor:

# python vol.py --profile=LinuxDebianx86 -f kbeast.lime linux_pstree

Volatile Systems Volatility Framework 2.2_rc1

Name        Pid   Uid

<snip>

._h4x_bd    2777  0

..bash      3053  0

...sleep    3077  0

<snip>

This plugin lists the parent/child relationship
between processes by adding a ‘.’ for each depth in the hierarchy. The displayed
portions of the output show us that the backdoor is active with a spawned bash
shell and that this shell ran the sleep
command. We can then use the linux_psaux plugin to display the
command line arguments of each of these processes and their start time:

In this output
we can see that bash was run in
interactive mode and that sleep was
passed a parameter of 100. In a real
incident response situation, this can reveal what parameters were passed to a
wide range of tools used during post-compromise activity.

Now that we know
a connection was active to the backdoor at the time of the compromise, we want
to recover the network connections associated with it. We can use the linux_netstat plugin with the backdoor’s
PID to accomplish this:

This shows us that the backdoor is listening on port
13377 and that there is an active
connection from 192.168.110.140 on port 41745.
We also see a previous connection in the CLOSE_WAIT
state on port 41744. As we will see in a
future blog post on recovering network data, we could attempt to recover the
packets associated with these connections by using the linux_sk_buff_cache and linux_pkt_queues
plugins. Having the IP address and port
pairs also allows us to focus network forensics investigations on only the
streams associated with the communication channels of the malware.

At this point, we have found the processes and
network activity associated with the backdoor, all of which would be hidden
from us on a live system, and are able to dig deep into the workings of the
process. Now our goal is to discover the hidden directory that the backdoor is
placed in, as the keylogging file is stored in the same directory. We can use linux_proc_maps for this:

And by looking
at the mapping starting at 0x8048000,
we see that our backdoor binary is loaded at that address and that its full
path is /usr/_h4x_/_h4x_bd. Since the
directory name has the hidden prefix, this directory would not show on a live
machine, and we would have to analyze a disk image to find it. Timelining
would be a good method to narrow down the results quickly.

We can partially
recover the backdoor binary by using the linux_dump_map command:

This invocation
focuses on PID 2777 (the network backdoor) and tells the plugin to write the
mapping to the h4xbd file. This will
only partially recover the file, though, because the binary is not loaded directly
from disk into the process’s memory; instead, its sections are spread
throughout the address space. We can verify this with the file and readelf commands:

Note that the file command sees it as an ELF file, but readelf is unable to process the file. To
recover the file intact, we need to acquire it from the page cache using the linux_find_file plugin. This is because
the page cache holds all the physical pages backing a file in memory without any
modifications.

This shows us that the symbol table is intact and
that 131 symbols were present (thanks to the malware author for not stripping
his bins ;). In fact, if we hash the recovered binary from memory and the
backdoor binary on the infected VM, the hashes will match exactly.
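That hash comparison can be scripted with a short Python sketch. The choice of SHA-256 is an assumption, since the post does not say which hash was used, and the function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """Return True when both files have identical SHA-256 digests."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling same_file with the path of the binary recovered from memory and the path of a copy taken from the infected VM's disk returns True when the two are byte-for-byte identical.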

As a final step, we will quickly perform binary
analysis of the binary recovered from memory. Since the password, hidden port,
secret signal number, etc. are all compile-time options, they will be different
per instance of the sample, but can be recovered with simple reverse
engineering. To start this process, we find symbols from the binary that may be
interesting by using nm and
filtering for functions (code).

# nm h4xbd | grep -wi "t"

08048b70 t __do_global_ctors_aux

08048770 t __do_global_dtors_aux

08048b6a T __i686.get_pc_thunk.bx

08048b00 T __libc_csu_fini

08048b10 T __libc_csu_init

08048b9c T _fini

08048584 T _init

08048740 T _start

08048906 T bindshell

0804881d T enterpass

080487f4 T error_ret

080487d0 t frame_dummy

08048ace T main
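When a listing like this is long, the triage can be scripted. The Python sketch below parses nm-style lines and keeps text (code) symbols that do not look like compiler or C runtime helpers. The heuristic (skip names that start with an underscore, plus frame_dummy) is an assumption for illustration, and the function names are hypothetical.

```python
RUNTIME_HELPERS = {"frame_dummy"}


def text_symbols(nm_output):
    """Parse nm output into (address, type, name) tuples for text symbols only."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] in ("t", "T"):
            symbols.append((int(parts[0], 16), parts[1], parts[2]))
    return symbols


def interesting_functions(nm_output):
    """Drop names that look like compiler or libc runtime helpers."""
    return [
        name
        for _, _, name in text_symbols(nm_output)
        if not name.startswith("_") and name not in RUNTIME_HELPERS
    ]


# A few lines copied from the listing above.
sample = """\
08048b10 T __libc_csu_init
08048740 T _start
08048906 T bindshell
0804881d T enterpass
080487f4 T error_ret
080487d0 t frame_dummy
08048ace T main
"""

print(interesting_functions(sample))
```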

From this output, the functions bindshell and enterpass look
interesting. If we load the binary into gdb
and disassemble enterpass, we notice a few things:

What becomes
immediately apparent is that we have a call to strncmp at 0x080488a8,
which is likely where the password check is performed, and that we see other
hardcoded strings in the address range of 0x8048eXX.
At address 0x0804889a, we can see one
of these strings being placed on the stack as a parameter to the string comparison
call. If we investigate these addresses, we see that the password (“h4x3d”) is
stored in cleartext and that the other strings in the same memory region
contain the backdoor’s login banner, debug information, the hidden directory “/usr/_h4x_”, and other interesting
information.

(gdb) x/s 0x8048ee6

0x8048ee6:"h4x3d"

(gdb) x/30s 0x8048e00

0x8048e7d:""

0x8048e7e:""

0x8048e7f:""

0x8048e80:"ERROR! Error occured on your system!"

0x8048ea5:""

0x8048ea6:""

0x8048ea7:""

0x8048ea8:"Password [displayed to screen]: "

0x8048ec9:"<< Welcome To The Server >>\n"

0x8048ee6:"h4x3d"

0x8048eec:"Wrong!\n"

0x8048ef4:"socket"

0x8048efb:"bind"

0x8048f00:"listen"

0x8048f07:"Daemon running with PID = %i\n"

0x8048f25:"/usr/_h4x_"

0x8048f30:"/bin/bash"

If we analyze
the bindshell function, we
find more configuration information about the particular KBeast instance:

At this point we
have done a fairly thorough job of analyzing the rootkit and can perform very
effective analysis against it. If
needed, we could even write Volatility plugins to automatically recover the
configuration parameters directly from memory.

Conclusion

We have thoroughly
investigated the KBeast rootkit, including its internals, artifacts left on a
system, and interactions with the attackers who place it on a system. This includes hooking the system call table,
overwriting network operation structures, and allowing “stealth” access to the
compromised computer over the network.

In next week’s
Linux posts, we will analyze another rootkit, Jynx, which requires more plugins to analyze, and we will have a
blog post on analyzing network information with Volatility. If you have any questions or comments, please
use the comment section of the blog or find me on Twitter (@attrc).