Introduction

Libemu is a library for x86 emulation and shellcode detection. It can be used in IDS/IPS/honeypot systems to emulate x86 shellcode, which can then be processed further to detect malicious behavior. It can also be used together with Wireshark to pull shellcode off the wire for analysis, to analyze shellcode inside malicious .rtf/.pdf documents, and so on. It has many use cases and is used in numerous open-source projects such as dionaea, thug, peepdf, and pyew, where it plays an integral part in shellcode analysis. Libemu can detect and execute shellcode by using GetPC heuristics, as we will see later in the article.

The very first thing we can do is download Libemu via Git with the following command:

# git clone git://git.carnivore.it/libemu.git

If we would like to know how much code has been written for this project, we can simply run sloccount, which reports the number of lines for each subdirectory and a total of 43,742 lines of ANSI C code and 15 lines of Python code. If we would rather look at graphs, we can visit the project's Ohloh web page, where it's evident that about 50k lines of code have been written.

The installation instructions can be found at [1], which is why we won’t describe them in this article. We can also install Pylibemu, so that we can interact with Libemu directly from Python.

Creating the Shellcode

Let’s create a simple test case with Metasploit to see how Libemu works. First, we have to create the shellcode with msfpayload, a command-line tool built specifically to generate and output various kinds of shellcode. Let’s first list all Linux payloads by grepping the msfpayload output for the “linux” keyword.

# msfpayload -l 2>&1 | grep linux
linux/armle/adduser Create a new user with UID 0
linux/armle/exec Execute an arbitrary command
linux/armle/shell/bind_tcp Listen for a connection, dup2 socket in r12, then execve
linux/armle/shell/reverse_tcp Connect back to the attacker, dup2 socket in r12, then execve
linux/armle/shell_bind_tcp Connect to target and spawn a command shell
linux/armle/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/mipsbe/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/mipsle/shell_bind_tcp Listen for a connection and spawn a command shell
linux/mipsle/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/ppc/shell_bind_tcp Listen for a connection and spawn a command shell
linux/ppc/shell_find_port Spawn a shell on an established connection
linux/ppc/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/ppc64/shell_bind_tcp Listen for a connection and spawn a command shell
linux/ppc64/shell_find_port Spawn a shell on an established connection
linux/ppc64/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/x86/exec Execute an arbitrary command
linux/x86/shell/bind_tcp Listen for a connection, Spawn a command shell (staged)
linux/x86/shell/reverse_tcp Connect back to the attacker, Spawn a command shell (staged)
linux/x86/shell_bind_tcp Listen for a connection and spawn a command shell
linux/x86/shell_bind_tcp_random_port
linux/x86/shell_find_port Spawn a shell on an established connection
linux/x86/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/x86/adduser Create a new user with UID 0
linux/x86/chmod Runs chmod on specified file with specified mode
linux/x86/exec Execute an arbitrary command
linux/x86/meterpreter/bind_ipv6_tcp Listen for a connection over IPv6, Staged meterpreter server
linux/x86/meterpreter/bind_nonx_tcp Listen for a connection, Staged meterpreter server
linux/x86/meterpreter/bind_tcp Listen for a connection, Staged meterpreter server
linux/x86/meterpreter/find_tag Use an established connection, Staged meterpreter server
linux/x86/meterpreter/reverse_ipv6_tcp Connect back to attacker over IPv6, Staged meterpreter server
linux/x86/meterpreter/reverse_nonx_tcp Connect back to the attacker, Staged meterpreter server
linux/x86/meterpreter/reverse_tcp Connect back to the attacker, Staged meterpreter server
linux/x86/metsvc_bind_tcp Stub payload for interacting with a Meterpreter Service
linux/x86/metsvc_reverse_tcp Stub payload for interacting with a Meterpreter Service
linux/x86/read_file Read up to 4096 bytes from the local file system and write it back out to the specified file descriptor
linux/x86/shell/bind_ipv6_tcp Listen for a connection over IPv6, Spawn a command shell (staged)
linux/x86/shell/bind_nonx_tcp Listen for a connection, Spawn a command shell (staged)
linux/x86/shell/bind_tcp Listen for a connection, Spawn a command shell (staged)
linux/x86/shell/find_tag Use an established connection, Spawn a command shell (staged)
linux/x86/shell/reverse_ipv6_tcp Connect back to attacker over IPv6, Spawn a command shell (staged)
linux/x86/shell/reverse_nonx_tcp Connect back to the attacker, Spawn a command shell (staged)
linux/x86/shell/reverse_tcp Connect back to the attacker, Spawn a command shell (staged)
linux/x86/shell_bind_ipv6_tcp Listen for a connection over IPv6 and spawn a command shell
linux/x86/shell_bind_tcp Listen for a connection and spawn a command shell
linux/x86/shell_bind_tcp_random_port
linux/x86/shell_find_port Spawn a shell on an established connection
linux/x86/shell_find_tag Spawn a shell on an established connection (proxy/nat safe)
linux/x86/shell_reverse_tcp Connect back to attacker and spawn a command shell
linux/x86/shell_reverse_tcp2 Connect back to attacker and spawn a command shell

For our testing, we’ll use the linux/x86/shell/reverse_tcp payload to generate a Linux ELF executable, as presented below. The msfpayload command is used to create the binary, and the file command is used to check whether the resulting binary really is an ELF executable.

We can also connect to the target once the connection has been established and execute a command. In the output below we’ve executed the pwd command, which returned /root as the current directory. This means the shell was started from the /root directory, which is correct, since that is where we copied the malicious executable.

In this case, we were able to simply use msfpayload to get the shellcode we wanted, but most of the time we have to extract the shellcode from whatever medium it’s being transported in, be it a .rtf/.pdf document, network traffic, etc.

Analyzing the Shellcode

Previously, we created the shellcode, which we’ll now analyze with Libemu. For the analysis, we can use the sctest program that comes with Libemu. The shellcode needs to be passed to sctest on stdin, but we need to pass other parameters as well: -vvv enables verbose output, -S reads the shellcode from stdin, -s sets the maximum number of steps to run, and -G saves the call graph in dot format. In the output below, you can see that sctest was able to decode quite a large part of the shellcode.

The sctest program emulated each instruction in the shellcode, starting with the general purpose registers set to zero; each of the emulated instructions has also been highlighted so it can be seen easily. The “xor ebx, ebx” and “mul ebx” instructions don’t change the values in the general purpose registers, since they are already zero, but certain flags are set when they execute. The “inc ebx” instruction increases the value in ebx by 1, which we can confirm by looking at the value of ebx afterwards.

From the instructions above, we can get a fairly good idea of what the shellcode does. The code shown above creates a socket by calling the socket() function and then connects to the host 192.168.1.2 on port 443 by using the connect() function. In order to convert the dot version of the graph to png, we have to execute the following command:

# dot shell.dot -T png -o shell.png

The callgraph is presented on the picture below.

If we use the ndisasm command to disassemble shell.bin, we’ll get basically the same instructions as presented above. Let’s look at the instructions up to the first system call, which is made with the “int 0x80” instruction. The kernel reads the system call number from the EAX register, whose low byte AL is set to 0x66 here (notice the “mov al,0x66” instruction a couple of lines before). On 32-bit x86 Linux, 0x66 is the socketcall system call, which dispatches to socket() because ebx holds 1 at that point. The instructions first zero out the ebx/eax registers and then push ebx (value 0) onto the stack. Next, ebx is increased to 1 and pushed, and finally the constant 0x2 is pushed. Since the last value pushed sits on top of the stack, the arguments read in order are 2, 1 and 0: the first parameter to socket (the domain) is 0x2, the second (the type) is 1, and the third (the protocol) is 0. Notice how Libemu has simplified the analysis for us: it automatically figured out that the socket system call is being made, and it also presented it in a clear, easy-to-understand graph.
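The register and stack effects of these first instructions can be traced by hand. The following is a minimal Python sketch, not Libemu code and not a real x86 emulator: it models only the handful of instructions named in the text. It assumes that system call 0x66 is Linux's socketcall multiplexer, where ebx selects the sub-call (1 for socket) and the arguments are read from the stack starting at its top; the function and table names are made up for illustration.

```python
# Toy trace of the first stager instructions; names here are illustrative only.
SYS_SOCKETCALL = 0x66                      # 32-bit x86 Linux syscall number 102
SOCKETCALL_SUBCALLS = {1: "socket", 2: "bind", 3: "connect"}


def trace_socket_setup():
    regs = {"eax": 0, "ebx": 0, "edx": 0}  # sctest starts with zeroed registers
    stack = []                             # last element is the top of the stack

    regs["ebx"] ^= regs["ebx"]             # xor ebx, ebx -> ebx = 0
    product = regs["eax"] * regs["ebx"]    # mul ebx -> edx:eax = eax * ebx
    regs["eax"] = product & 0xFFFFFFFF
    regs["edx"] = (product >> 32) & 0xFFFFFFFF
    stack.append(regs["ebx"])              # push ebx (0)
    regs["ebx"] += 1                       # inc ebx -> 1
    stack.append(regs["ebx"])              # push ebx (1)
    stack.append(0x2)                      # push 0x2
    regs["eax"] = (regs["eax"] & 0xFFFFFF00) | 0x66   # mov al, 0x66

    # int 0x80: eax holds the syscall number; for socketcall, ebx selects the
    # sub-call and the arguments are read from the top of the stack downwards.
    args = tuple(reversed(stack))
    return regs, SOCKETCALL_SUBCALLS[regs["ebx"]], args


regs, name, args = trace_socket_setup()
print(f"eax={regs['eax']:#x} -> {name}{args}")   # eax=0x66 -> socket(2, 1, 0)
```

Reading the stack from the top down gives socket(2, 1, 0), i.e. an AF_INET, SOCK_STREAM socket with the default protocol on Linux.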

We mentioned that Libemu can detect and execute shellcode by using GetPC heuristics, so let’s now take a look at what that really means. GetPC stands for Get Program Counter and refers to a sequence of instructions that can determine its own location in the process’s address space [2]. This is often used in shellcode decryption routines, where an encrypted version of the shellcode is delivered to the target together with the decryption routine. The decryption routine must first determine its current address in memory in order to locate and decrypt the encrypted shellcode. Shellcode is encrypted for different reasons, but most commonly to avoid null characters or to evade anti-virus detection. There are three methods that can be used to determine the current instruction pointer address; they are presented below (and summarized after [2]):

Call GetPC: we can obtain the current program counter by issuing the call and pop assembly instructions. First we issue a short call to the pop instruction; the call pushes its return address (the address of the instruction that follows the call) onto the stack. The pop instruction then takes that return address from the stack and stores it in an arbitrary general purpose register. In [2] we can see a couple of variations of such code, because sometimes we want to avoid NULL characters or have to fit within a limited shellcode length.

FSTENV GetPC: the fstenv instruction stores the floating point operating environment into memory; this environment includes the address of the previously executed floating point instruction. In order to get the current memory address, we must first execute a floating point instruction such as fldz and, immediately after that, the fstenv instruction. The floating point environment stored in memory will then contain the address of the fldz instruction, which we can use in our shellcode.

SEH GetPC: this method is used on Windows operating systems and relies on Structured Exception Handling (SEH), the mechanism Windows uses to handle an exception when one occurs. The SEH chain is stored on the stack, and once an exception has been triggered, the operating system passes the exception information (including the address of the instruction that triggered it) to each exception handler in turn until one of them is able to handle it. To use this to figure out the current program counter, we must first set up an exception handler on the stack and soon afterwards trigger an exception, at which point our exception handler is called. In the exception handler, we can read the program counter of the instruction that triggered the exception and use it to further penetrate the system. Using the SEH method is quite difficult on newer versions of Windows (Windows Vista and newer), because Microsoft added additional checks that verify whether the SEH chain has been corrupted before passing control to it.
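To make the first two methods concrete, here is a small Python sketch built around their commonly seen byte encodings. It is only an illustration: the load address of 0x1000 and the helper names are assumptions, and the code merely computes which address a call would push rather than executing anything.

```python
import struct

# Commonly seen encodings of the two sequences (illustrative, not exhaustive).
CALL_POP = bytes.fromhex("e800000000" "58")            # call $+5 ; pop eax
FLDZ_FNSTENV = bytes.fromhex("d9ee" "d97424f4" "5b")   # fldz ; fnstenv [esp-0xc] ; pop ebx


def call_pushed_address(code, offset, base):
    """Return the address a `call rel32` at `offset` pushes onto the stack,
    which is the value a following `pop` loads into a register."""
    if code[offset] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    return base + offset + 5          # the call instruction is 5 bytes long


def call_target(code, offset, base):
    """Return the address the `call rel32` at `offset` transfers control to."""
    if code[offset] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    (rel32,) = struct.unpack_from("<i", code, offset + 1)   # signed displacement
    return base + offset + 5 + rel32


base = 0x1000                          # assumed load address
print(hex(call_pushed_address(CALL_POP, 0, base)))   # 0x1005
print(hex(call_target(CALL_POP, 0, base)))           # 0x1005, the pop itself
```

With a zero displacement the call lands directly on the pop, so the popped register ends up holding the address of the pop instruction. In the second sequence, storing the environment at esp-0xc places its instruction-pointer field on top of the stack, so the pop retrieves the address of fldz.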

Python Libemu

Pylibemu is a Python wrapper around the Libemu library and can be installed by issuing the following commands.

There are many ways of telling Python where to look for libraries, but the most useful of them is to create a file inside the /etc/ld.so.conf.d/ directory and set its contents to “/opt/libemu/lib/”. Then we have to run the ldconfig command, which configures the paths searched when a dynamic shared library is needed. ldconfig thus ensures that “/opt/libemu/lib/” is also checked when searching for a shared library on the system, which allows Python to find the library instead of raising an error. In the output below, we can see how we instructed the dynamic loader to search the /opt/libemu/lib/ directory, and we’ve also verified that Python can now import Pylibemu without any problems.

Now we can use the following Python code to iterate over all Emulator methods and print their names as well as their descriptions. When class methods are not documented properly, this is a quick way to determine which methods are available and what each of them does. Alternatively, we can use the “help(pylibemu)” command in the Python interpreter to get practically the same information.
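A minimal sketch of such a listing is shown here, using only the standard inspect module. The DemoEmulator class is a hypothetical stand-in so that the example runs without Pylibemu installed; with Pylibemu available, pylibemu.Emulator could be passed to describe_methods instead.

```python
import inspect


def describe_methods(cls):
    """Return (name, first docstring line) for each public callable of cls."""
    described = []
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue                       # skip private and dunder members
        doc = inspect.getdoc(member) or "(no description)"
        described.append((name, doc.splitlines()[0]))
    return described


class DemoEmulator:
    """Hypothetical stand-in for pylibemu.Emulator."""

    def prepare(self, shellcode, offset):
        """Prepare the execution environment."""

    def test(self):
        pass


for name, summary in describe_methods(DemoEmulator):
    print(f"{name}: {summary}")
```

inspect.getmembers returns members sorted by name, so the listing comes out in alphabetical order.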

The Emulator class contains a lot of cpu_* functions, which are used for setting register values, getting the values from registers, setting/getting the eflags, stepping through the instructions, etc. There are also a lot of memory_* functions used to manipulate memory. Other interesting functions are the following:

prepare: this method prepares the execution environment and accepts two parameters: the binary shellcode itself and the offset of the GetPC code within it. The offset of the GetPC instructions is determined by calling the shellcode_getpc_test method.

env_w32_hook_check: this method checks whether a hooked Win32 API is at the current EIP and returns True if it is, otherwise it returns False.

shellcode_getpc_test: this method tries to identify GetPC (get program counter) code within the shellcode. If GetPC instructions are identified, the offset to the start of those instructions is returned; otherwise a value of -1 is returned.

test: this method is used to test and emulate the shellcode and must always be called after the prepare method.

run: this method

The shellcode_getpc_test function can be found in the pylibemu.pyx source code at [3]. The pyx extension signifies Cython code, a Python-like language that is compiled to C and makes it easy to call C libraries from Python. The shellcode_getpc_test function looks as presented below. The function creates a new buffer and copies the shellcode into it. Then it calls the emu_shellcode_test function defined in the Libemu library (libemu/src/emu_shellcode.c); shellcode_getpc_test is therefore just a wrapper around the actual function.

In order to detect whether the shellcode contains GetPC instructions, we can use a program like the one presented below. The code first stores the shellcode in the buf variable, then initializes the Libemu emulator and calls the shellcode_getpc_test function to detect the GetPC instructions.
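Such a check can be sketched in Python as follows. Only the Emulator class and the shellcode_getpc_test method are taken from the description above; the find_getpc_offset helper and the FakeEmulator stand-in are hypothetical and exist so that the example also runs on a machine without Libemu and Pylibemu installed.

```python
def find_getpc_offset(shellcode, emulator=None):
    """Return the GetPC offset reported by the emulator, or None if absent."""
    if emulator is None:
        import pylibemu                    # lazy import; needs Libemu + Pylibemu
        emulator = pylibemu.Emulator()
    offset = emulator.shellcode_getpc_test(shellcode)
    return offset if offset >= 0 else None  # -1 means no GetPC code was found


class FakeEmulator:
    """Hypothetical stand-in that mimics only the -1 return convention."""

    def shellcode_getpc_test(self, shellcode):
        return shellcode.find(b"\xe8")     # crude: position of first call opcode


buf = b"\x90\x90\xe8\x00\x00\x00\x00\x58"  # nop ; nop ; call $+5 ; pop eax
print(find_getpc_offset(buf, FakeEmulator()))          # 2
print(find_getpc_offset(b"\x90\x90", FakeEmulator()))  # None
```

Passing no emulator makes the helper fall back to a real pylibemu.Emulator instance, whose result reflects Libemu's actual heuristics rather than the crude byte search used by the stand-in.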

Let’s now take a closer look at the emu_shellcode_test function to determine how the function finds the GetPC instructions. In the code snippet above, we’ve seen that the function accepts three parameters: the first parameter is the emu structure, the second parameter is the shellcode buffer itself, and the third parameter is the length of the shellcode.

The first part of the function reserves some space for local variables on the stack.

After variable initialization, the following code runs. It calls the emu_getpc_check function with two parameters: the first is the shellcode buffer, while the second is the offset from the start of the buffer. The offset runs from 0 to the shellcode size, so the function is called once for every byte in the shellcode.

In the emu_getpc_check function (defined in the emu_getpc.c source file), the shellcode byte at the given offset is compared to multiple values in a switch statement. There are two case statements, comparing the byte to 0xe8 (the call instruction) and 0xd9 (the fnstenv instruction). These two cases correspond to the previously described Call GetPC and FSTENV GetPC methods and are used to determine whether the shellcode contains GetPC instructions.
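The per-offset scan described above can be approximated with a few lines of Python. This is a simplified sketch of the first filtering step only, not the Libemu implementation: the function name and the opcode table are made up for illustration, and the deeper per-case checks are omitted.

```python
# First bytes that would trigger a deeper GetPC check (per the text above).
GETPC_OPCODES = {0xE8: "call", 0xD9: "fnstenv"}


def getpc_candidates(shellcode):
    """Yield (offset, kind) for every byte that matches a candidate opcode."""
    for offset, byte in enumerate(shellcode):
        kind = GETPC_OPCODES.get(byte)
        if kind is not None:
            yield offset, kind


# nop ; fldz ; fnstenv [esp-0xc] ; pop ebx
sample = bytes.fromhex("90" "d9ee" "d97424f4" "5b")
print(list(getpc_candidates(sample)))   # [(1, 'fnstenv'), (3, 'fnstenv')]
```

Note that offset 1 is flagged even though it is the fldz instruction, which also begins with 0xd9. A single-byte match is only a cheap filter, which is why each case needs further checks before a GetPC sequence is reported.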

The first case statement, for the call instruction, is presented in the picture below (I’ve removed all the comments and empty lines from the code to make it fit into the picture). The emu_memory_write_block function writes a block to a memory location; more specifically, it reserves a block of a certain size (the size variable) and writes the whole shellcode (the data variable) to memory location 0x1000. The emu_cpu_eip_set function sets the value of the EIP register; more specifically, it sets EIP to the address 0x1000+offset, which is where the candidate call instruction is located. The emu_cpu_reg32_get function gets the 32-bit value stored in a register; more specifically, it gets the value stored in the ESP register and saves it into the espcopy local variable. Afterwards, the instructions at EIP are parsed sequentially while checking whether the value of the ESP register, which points to the top of the stack, equals the old value (the one from before the call instruction). If it does, a pop instruction has been executed and has set ESP back to its old value, so we return the value 1, indicating the possibility of GetPC instructions.

The emu_cpu_parse function is used to parse the instruction at EIP by determining the length of the instruction, the number of operands, the values of each operand, etc.
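The ESP-tracking idea behind the call case can be illustrated with a toy model in Python. This is a heavily simplified sketch and not the Libemu code: it understands only call rel32, pop r32 and nop, and the initial stack pointer and step limit are arbitrary assumptions. Only the load address 0x1000 comes from the text.

```python
import struct

BASE = 0x1000        # load address used in the text
STACK_TOP = 0x2000   # arbitrary initial ESP for this sketch (assumption)


def call_getpc_check(code, offset, max_steps=64):
    """Return 1 if the call at `offset` is followed by a pop that restores ESP."""
    if code[offset] != 0xE8:
        return 0
    esp_copy = esp = STACK_TOP                # remember ESP before the call
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    esp -= 4                                  # call pushes the return address
    eip = BASE + offset + 5 + rel32           # and jumps to its target
    for _ in range(max_steps):
        index = eip - BASE
        if not 0 <= index < len(code):
            return 0                          # execution left the buffer
        opcode = code[index]
        if 0x58 <= opcode <= 0x5F:            # pop r32 releases 4 bytes
            esp += 4
        elif opcode != 0x90:                  # anything but nop is unsupported
            return 0
        eip += 1                              # both modelled opcodes are 1 byte
        if esp == esp_copy:
            return 1                          # stack is back at its old level
    return 0


print(call_getpc_check(bytes.fromhex("e800000000" "58"), 0))     # 1
print(call_getpc_check(bytes.fromhex("e800000000" "9090"), 0))   # 0
```

A real emulator decodes and executes every instruction, so it also follows pushes, jumps and arithmetic on ESP; the toy returns 0 as soon as it meets anything it cannot model.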

The fnstenv case is presented in the picture below, where we again reserve the memory and set EIP to 0x1000, which places it at the beginning of the shellcode. We don’t set it to the current fnstenv instruction, because we have to check whether any FPU instruction has been executed beforehand; therefore, we execute the instructions from the beginning of the shellcode up to the fnstenv instruction, checking for any FPU instructions. If we detect that an FPU instruction was executed before the fnstenv instruction, we return the value 1.

This was the whole function used for checking whether GetPC instructions are executed inside the shellcode. We’ve seen that the emu_getpc_check function is executed for every offset in the shellcode, starting from 0 and continuing to the end of the shellcode. The function looks for the call and fnstenv instructions and runs the appropriate checks to determine whether GetPC instructions are present.

Conclusion

In this article we’ve created shellcode with the Metasploit framework and analyzed it with Libemu, which can detect system calls and present them in a graph by using dot. Libemu supports reading x86 instructions and emulating them to detect shellcode. It also uses GetPC heuristics to check whether the shellcode uses such instructions to get the current program counter, which malware normally does in decryption routines.

Finally, we’ve also taken a detailed look at the GetPC detection functions, where each byte of the shellcode is checked against the values 0xe8 and 0xd9, which correspond to the call and fnstenv instructions. Both cases then perform further checks in order to determine whether GetPC instructions are contained in the shellcode or not.

When analyzing the Libemu library, we’ve determined that it checks for two of the three methods used to determine the current program counter; it doesn’t detect the SEH GetPC method, which is used especially on older versions of Windows. In my opinion, writing such checks would take somewhat more work, but shouldn’t be difficult. In the end, including those checks in the library is not that relevant, since SEH overwriting GetPC instructions can only be used in shellcode written for Windows XP SP3 or older operating systems.

If you would like to get an understanding of how emulation works in computer systems, you can look at the Libemu source code, which emulates x86 shellcode. An emulator is basically a program that runs on one platform but allows instructions for a different platform to be executed. When writing an emulator, we must implement in the program itself every component that is usually provided by the hardware as well as the software: the central processing unit, the general purpose registers, the program counter, the stack pointer, the eflags register, the memory, etc. Writing an emulator is not a simple task, which is why we must appreciate the Libemu project: a lot of work has gone into writing code to emulate x86 instructions. In the end, it’s certainly an interesting project that deserves to be studied, because a lot can be learnt from it; not to mention that many open-source projects use it to detect x86 shellcode.

Dejan Lukan is a security researcher for InfoSec Institute and a penetration tester from Slovenia. He is very interested in finding new bugs in real-world software products with source code analysis, fuzzing and reverse engineering. He also has a great passion for developing his own simple scripts for security-related problems and learning about new hacking techniques. He knows a great deal about programming languages, as he can write in a couple of dozen of them. His passions also include antivirus bypassing techniques, malware research and operating systems, mainly Linux, Windows and BSD. He also has his own blog available here: http://www.proteansec.com/.
