Introduction

Hey everyone! Today, we’re going to keep moving forward with our shellcode analysis work. Like last time, instead of writing our own assembly code, we’re going to analyze someone else’s work.

Requirements

Take at least 3 shellcode samples created using msfvenom for x86 Linux

Use GDB, ndisasm, and/or libemu to dissect the functionality of the shellcode

Present your analysis

As we said before, this isn’t going to be the complete assignment 5. Instead, this article discusses the second of the three shellcode samples generated using msfvenom for x86 Linux. I’ll be creating another article to discuss the final payload. With that in mind though, we can definitely meet the other two goals.

Our shellcode

So to get started, I had to choose which shellcode I was going to analyze. After looking through msfvenom’s payloads, I found one which seemed like it’d be interesting: the linux/x86/read_file payload by hal.

I picked this one because I wanted to focus on topics which we didn’t explicitly cover in the SLAE course, such as reading a file’s contents from assembly.

If we take a look at what options we have, we see there are a fair number of them:

Looking at these, we do have a number of advanced options which we won’t be working with. Instead, we’ll focus on FD (the file descriptor to write to) and PATH (the path of the file whose contents we’ll dump).

In our case, we’ll stick to FD 1 (stdout) for our file descriptor. This will simplify the setup required to analyze the shellcode. We do need to set the path though. We’ll dump /etc/passwd, as that’s often a good starting point when we are trying to get access to a machine.

Let’s dig into our shellcode though!

Libemu

We’ll start off by analyzing the payload in libemu, which provides the sctest binary for analyzing what a payload does. We’re not going to cover the options again, as we did that in the first shellcode analysis article. The basics are that we’re enabling verbose mode, reading the payload to analyze from stdin, and iterating through up to 10000 steps. If we’re lucky, this will give us pseudocode to start our analysis with.

Sadly, like last time, we don’t get any pseudocode. If we take a look at the execution graph, hopefully we’ll get a bit more. Realistically, I expect we’ll need to drop into ndisasm and walk through the assembly at a lower level to really analyze this.

There we go. With a reasonably solid background in assembly, that should be easier for us to understand.

First Function

This payload works a bit differently from the shell_find_tag payload we worked with in our last analysis: we don’t immediately begin our first function. Let’s dig into what we mean by looking at a subset of the instructions:

If you’ve done much shellcoding before, you should recognize this JMP, CALL, POP sequence, which we use to find where an item (like the path to our file) is located in memory without relying on hard-coded addresses.

First, we take a short jump from 00000000 to 00000038. There, we call 0x2, which brings us back up to the mov eax,0x5 instruction and pushes 0000003D, the address immediately after the call, onto the stack as the return address. We can see this in GDB:

You can see here the assembly before we execute the call. We then use stepi to step into the call, which pushes the address of the next instruction onto the stack as our return address. Examining the stack verifies that this is in fact what happened: the return address points at what GDB decodes as a das instruction. That “instruction” is really the first byte of our path string, since the byte for / (0x2f) is also the opcode for das.

Now that we have this on the stack and we’re back at 00000002, we move 0x5 into EAX. This is the syscall number for our first function, open (SYS_OPEN).
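The address arithmetic behind that return value is easy to check by hand. The short Python sketch below is only an illustration: the helper names are made up for this article, and it assumes the call at 0x38 uses the common 5-byte near-call encoding (one opcode byte plus a 32-bit displacement), which matches the gap between 0x38 and 0x3D that we saw above.

```python
# Illustrative helpers only; the names are hypothetical and not part of any tool.

CALL_REL32_LENGTH = 5  # assumed encoding: 0xE8 opcode + 4-byte displacement


def call_return_address(call_address: int, length: int = CALL_REL32_LENGTH) -> int:
    """Return the address that a CALL located at call_address pushes onto the stack."""
    return call_address + length


def first_byte_as_opcode(path: str) -> int:
    """Return the first byte of the path string, which a disassembler will try to decode."""
    return path.encode("ascii")[0]


if __name__ == "__main__":
    # The call sits at 0x38, so the pushed return address is 0x3d.
    print(hex(call_return_address(0x38)))
    # '/' is byte 0x2f, which is also the one-byte opcode for DAS on 32-bit x86.
    print(hex(first_byte_as_opcode("/etc/passwd")))
```

This is also why the disassembler shows a das at the return address: it is decoding the leading / of our path as if it were code.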

int open(const char *pathname, int flags);

We then pop the return address into EBX, giving us a pointer to the PATH value we passed to msfvenom.

We then XOR ECX with itself so that it’s 0x0, which is the value of the O_RDONLY flag. Finally, we trigger interrupt 0x80 to make the system call.

This leaves our registers looking like so after each instruction:

| Address | Instruction | EAX | EBX | ECX | EDX | EDI |
|---|---|---|---|---|---|---|
| 00000002 | mov eax,0x5 | 0x5 | Unknown | Unknown | Unknown | Unknown |
| 00000007 | pop ebx | 0x5 | 0x8048091 | Unknown | Unknown | Unknown |
| 00000008 | xor ecx,ecx | 0x5 | 0x8048091 | 0x0 | Unknown | Unknown |
| 0000000A | int 0x80 | 0x3 | 0x8048091 | 0x0 | Unknown | Unknown |

You’ll notice that after we trigger the interrupt, the value in EAX changed from 0x5 to 0x3, which is the return value of our call to open. We can read the open manpage for more information about the return value:

The return value of open() is a file descriptor, a small, nonnegative
integer that is used in subsequent system calls (read(2), write(2),
lseek(2), fcntl(2), etc.) to refer to the open file. The file
descriptor returned by a successful call will be the lowest-numbered
file descriptor not currently open for the process.

So we successfully called open! This is a good start.
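For readers more comfortable with a high-level language, here is a rough Python equivalent of that call. It’s a sketch rather than a translation of the shellcode: os.open goes through the C library instead of int 0x80, Python may add flags of its own (such as close-on-exec), and the helper name open_readonly is made up for this example. It does illustrate the two points above: O_RDONLY is 0 on Linux, and the kernel hands back the lowest free descriptor, which is presumably why we got 3 with stdin, stdout, and stderr already occupying 0, 1, and 2.

```python
import os
import tempfile


def open_readonly(path: str) -> int:
    """Open path read-only and return the raw file descriptor (hypothetical helper)."""
    return os.open(path, os.O_RDONLY)


if __name__ == "__main__":
    # Use a throwaway file so the sketch runs anywhere; the shellcode used /etc/passwd.
    with tempfile.NamedTemporaryFile() as tmp:
        fd = open_readonly(tmp.name)
        print("O_RDONLY =", os.O_RDONLY)
        print("descriptor =", fd)
        os.close(fd)
```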

Second Function

Now that we have our file descriptor, we can start the second function:

We first move the opened file descriptor from EAX into EBX, as it’ll be the first argument to our second function. We then move 0x3 into EAX. Here, 0x3 is the syscall number for read (SYS_READ); it only coincidentally matches the value of our file descriptor.

ssize_t read(int fd, void *buf, size_t count);

So we already have int fd covered by putting the file descriptor in EBX. We then copy the stack pointer, ESP, into EDI and subsequently copy it into ECX, giving us a pointer to a buffer on the stack. And finally, we move 0x1000 (4096) into EDX as our size_t count and trigger our interrupt.

| Address | Instruction | EAX | EBX | ECX | EDX | EDI |
|---|---|---|---|---|---|---|
| 00000002 | mov eax,0x5 | 0x5 | Unknown | Unknown | Unknown | Unknown |
| 00000007 | pop ebx | 0x5 | 0x8048091 | Unknown | Unknown | Unknown |
| 00000008 | xor ecx,ecx | 0x5 | 0x8048091 | 0x0 | Unknown | Unknown |
| 0000000A | int 0x80 | 0x3 | 0x8048091 | 0x0 | Unknown | Unknown |
| 0000000C | mov ebx,eax | 0x3 | 0x3 | 0x0 | Unknown | Unknown |
| 0000000E | mov eax,0x3 | 0x3 | 0x3 | 0x0 | Unknown | Unknown |
| 00000013 | mov edi,esp | 0x3 | 0x3 | 0x0 | Unknown | 0xffffd190 |
| 00000015 | mov ecx,edi | 0x3 | 0x3 | 0xffffd190 | Unknown | 0xffffd190 |
| 00000017 | mov edx,0x1000 | 0x3 | 0x3 | 0xffffd190 | 0x1000 | 0xffffd190 |
| 0000001C | int 0x80 | 0xd3a | 0x3 | 0xffffd190 | 0x1000 | 0xffffd190 |

With our function called, EAX now holds the return value of read, which is the number of bytes that were read. In our case, that’s 0xd3a hex, or 3386 decimal. Since this is less than the 0x1000 bytes we asked for, the whole file fit in a single read.
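The gap between the 0x1000 bytes requested and the 0xd3a bytes returned is normal: read returns at most the requested count, and fewer when the file is shorter. The sketch below illustrates this in Python with a made-up helper, read_up_to, and a temporary file sized to match our 3386-byte result. The file contents are filler, not the real /etc/passwd.

```python
import os
import tempfile

BUFFER_SIZE = 0x1000  # the count the shellcode places in EDX (4096 bytes)


def read_up_to(path: str, count: int = BUFFER_SIZE) -> bytes:
    """Issue a single read of at most count bytes from path (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, count)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 3386)  # filler the same size as the file in our GDB session
        tmp.flush()
        data = read_up_to(tmp.name)
        # len(data) plays the role of the value returned in EAX.
        print(len(data), hex(len(data)))
```

The flip side is worth keeping in mind: a file larger than 4096 bytes would be cut off at the buffer size, because the payload only issues one read.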

Third Function

This one is nice and short. We move the number of bytes we read from EAX into EDX, move 0x4 (the sys_write system call) into EAX, and then move 0x1 into EBX, which is the FD option we passed to msfvenom. In our case, that’s stdout. ECX still points to the buffer we filled during the read call, so we can trigger the write function to write it to stdout.

| Address | Instruction | EAX | EBX | ECX | EDX | EDI |
|---|---|---|---|---|---|---|
| 00000002 | mov eax,0x5 | 0x5 | Unknown | Unknown | Unknown | Unknown |
| 00000007 | pop ebx | 0x5 | 0x8048091 | Unknown | Unknown | Unknown |
| 00000008 | xor ecx,ecx | 0x5 | 0x8048091 | 0x0 | Unknown | Unknown |
| 0000000A | int 0x80 | 0x3 | 0x8048091 | 0x0 | Unknown | Unknown |
| 0000000C | mov ebx,eax | 0x3 | 0x3 | 0x0 | Unknown | Unknown |
| 0000000E | mov eax,0x3 | 0x3 | 0x3 | 0x0 | Unknown | Unknown |
| 00000013 | mov edi,esp | 0x3 | 0x3 | 0x0 | Unknown | 0xffffd190 |
| 00000015 | mov ecx,edi | 0x3 | 0x3 | 0xffffd190 | Unknown | 0xffffd190 |
| 00000017 | mov edx,0x1000 | 0x3 | 0x3 | 0xffffd190 | 0x1000 | 0xffffd190 |
| 0000001C | int 0x80 | 0xd3a | 0x3 | 0xffffd190 | 0x1000 | 0xffffd190 |
| 0000001E | mov edx,eax | 0xd3a | 0x3 | 0xffffd190 | 0xd3a | 0xffffd190 |
| 00000020 | mov eax,0x4 | 0x4 | 0x3 | 0xffffd190 | 0xd3a | 0xffffd190 |
| 00000025 | mov ebx,0x1 | 0x4 | 0x1 | 0xffffd190 | 0xd3a | 0xffffd190 |
| 0000002A | int 0x80 | 0xd3a | 0x1 | 0xffffd190 | 0xd3a | 0xffffd190 |

The call returns the number of bytes written to the file descriptor in EAX. In our case that’s 0xd3a, which is the full file.
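Putting the three calls together, the part of the payload we walked through boils down to open, read, write. Here is a minimal Python sketch of that flow. The function name dump_file and its parameters are my own labels for illustration; the defaults mirror the FD value of 1 and the 0x1000 count we saw in the disassembly, and the sample line written to the temporary file is made-up filler.

```python
import os
import tempfile


def dump_file(path: str, out_fd: int = 1, count: int = 0x1000) -> int:
    """Copy up to count bytes of path to out_fd; return the number of bytes written."""
    fd = os.open(path, os.O_RDONLY)   # open: syscall 0x5 in the shellcode
    try:
        data = os.read(fd, count)     # read: syscall 0x3
    finally:
        # Closing is added here for tidiness; it is not among the
        # instructions walked through above.
        os.close(fd)
    return os.write(out_fd, data)     # write: syscall 0x4


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"root:x:0:0:root:/root:/bin/bash\n")  # sample line, not real output
        tmp.flush()
        written = dump_file(tmp.name)  # the contents appear on stdout
        print("bytes written:", written)
```

Calling dump_file("/etc/passwd") would print that file to stdout much as the payload does, subject to the same 4096-byte cap.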

Author Kevin Kirsche

Kevin is a Principal Security Architect with Verizon. He holds the OSCP, OSWP, OSCE, and SLAE certifications. He is interested in learning more about building exploits and advanced penetration testing concepts.