Introduction to Return Oriented Programming (ROP)

ROP is an exploitation technique in which the attacker uses control of the stack to indirectly execute hand-picked instructions, or short groups of machine instructions, that sit immediately before a return instruction in subroutines of the existing program code. Because every executed instruction comes from executable memory that already belongs to the original program, there is no need for direct code injection, and most defenses that try to prevent execution of instructions from user-controlled memory are circumvented. In short, ROP reuses code that is already in the program to perform the exploit, evading memory protection mechanisms. I’ll assume the reader knows how a basic stack overflow exploit works.

Brief revision of classic buffer overflow

Consider the following program

    #include <unistd.h>
    #include <stdio.h>

    void vuln() {
        char buffer[10];
        read(0, buffer, 100);   /* reads up to 100 bytes into a 10-byte buffer */
        puts(buffer);
    }

    int main() {
        vuln();
    }

This program is vulnerable to a classic buffer overflow attack. In vuln() the buffer holds only 10 bytes, yet read() is allowed to write up to 100 bytes into it. Writing more data than the buffer can hold overflows it and overwrites whatever lies next to it on the stack.

When vuln() is called, the stack might look somewhat like this:

    ADDRESS      DATA
    0xbfff0000   XX XX XX XX   <- buffer
    0xbfff0004   XX XX XX XX
    0xbfff0008   XX XX XX XX
    0xbfff000c   XX XX XX XX
    ........
    0xbfff0020   YY YY YY YY   <- saved EBP
    0xbfff0024   ZZ ZZ ZZ ZZ   <- return address

When the buffer is filled with input of just the right size, it is possible to overwrite the saved return address. This gives the attacker control of EIP and lets them execute arbitrary code.
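To make the overwrite concrete, here is a minimal sketch that models the frame from the diagram as a byte array. The offsets (saved EBP at 0x20, return address at 0x24) come from the illustrative diagram above; the saved EBP value, the original return address and the 0xdeadbeef target are made-up values, and the helper names are mine, not part of any real tool.

```python
import struct

# Offsets taken from the illustrative diagram: the buffer starts at 0x00,
# the saved EBP sits at 0x20 and the return address at 0x24.
FRAME_SIZE = 0x28
SAVED_EBP_OFFSET = 0x20
RET_OFFSET = 0x24


def make_frame(saved_ebp, return_address):
    """Build a simulated stack frame as a mutable byte array."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<I", frame, SAVED_EBP_OFFSET, saved_ebp)
    struct.pack_into("<I", frame, RET_OFFSET, return_address)
    return frame


def unchecked_read(frame, data, limit=100):
    """Copy up to `limit` bytes to the start of the buffer, like read(0, buffer, 100).

    The copy is clamped to the simulated frame so the model itself stays in bounds.
    """
    count = min(len(data), limit, len(frame))
    frame[:count] = data[:count]
    return count


def saved_return_address(frame):
    """Read the 32-bit little-endian return address stored in the frame."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


# Hypothetical saved EBP and return address, for illustration only.
frame = make_frame(saved_ebp=0xBFFF0040, return_address=0x08048456)

# Filler up to the return address slot, then the value we want in EIP.
payload = b"A" * RET_OFFSET + struct.pack("<I", 0xDEADBEEF)
unchecked_read(frame, payload)

print(hex(saved_return_address(frame)))  # prints 0xdeadbeef
```

The filler length equals the distance from the start of the buffer to the return address slot, which is why "just the right size" matters.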

On modern systems this attack is prevented by the following mitigations:

ASLR

Stack Canaries

NX/DEP

NX/DEP

DEP stands for Data Execution Prevention (NX, for No-eXecute, is the name of the CPU feature commonly used to enforce it). This technique marks regions of memory as non-executable. Usually the stack and heap are marked non-executable, which prevents the attacker from executing code that resides in those regions.

ASLR

ASLR stands for Address Space Layout Randomization. This technique randomizes the addresses at which shared libraries, the stack, and the heap are mapped. This prevents the attacker from predicting where to point EIP, since the attacker does not know the address of the malicious payload.

Stack Canaries

With this technique the compiler places a randomized guard value after the stack frame’s local variables and before the saved return address. The guard is checked before the function returns; if it does not match the original value, the program exits. It can be visualized as:

    ADDRESS      DATA
    0xbfff0000   XX XX XX XX   <- buffer
    0xbfff0004   XX XX XX XX
    0xbfff0008   XX XX XX XX
    0xbfff000c   CC CC CC CC   <- stack canary
    ........
    0xbfff0020   YY YY YY YY   <- saved EBP
    0xbfff0024   ZZ ZZ ZZ ZZ   <- return address

If an attacker tries to overwrite the return address, the stack canary is inevitably overwritten as well. Because the canary is checked before the function returns, the exploitation attempt is detected and stopped.
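The check can be sketched with the same kind of simulated frame. This is only a model of the idea, not how a compiler implements it: the offsets come from the diagram above, the function names are hypothetical, and the canary is generated with a leading null byte, which mirrors a common practice but is an assumption here.

```python
import os
import struct

# Offsets taken from the illustrative diagram above.
CANARY_OFFSET = 0x0C
RET_OFFSET = 0x24
FRAME_SIZE = 0x28


def new_frame(canary, return_address):
    """Simulated frame with a canary between the buffer and the return address."""
    frame = bytearray(FRAME_SIZE)
    frame[CANARY_OFFSET:CANARY_OFFSET + 4] = canary
    struct.pack_into("<I", frame, RET_OFFSET, return_address)
    return frame


def checked_return(frame, expected_canary):
    """Return the saved return address, or abort if the canary was modified."""
    current = bytes(frame[CANARY_OFFSET:CANARY_OFFSET + 4])
    if current != expected_canary:
        raise RuntimeError("stack smashing detected")
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]


# Random canary with a null first byte (assumption, see the note above).
canary = b"\x00" + os.urandom(3)

# Hypothetical return address, for illustration only.
frame = new_frame(canary, return_address=0x08048456)
print(hex(checked_return(frame, canary)))  # canary intact: normal return

# A linear overflow that reaches the return address also tramples the canary.
payload = b"A" * RET_OFFSET + struct.pack("<I", 0xDEADBEEF)
frame[:len(payload)] = payload

try:
    checked_return(frame, canary)
except RuntimeError as error:
    print("aborted:", error)
```

Because the overflow is a contiguous write starting at the buffer, there is no way to reach the return address without passing over the canary first.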

Return Oriented Programming

ROP is a complex technique that allows us to bypass DEP and ASLR. Unfortunately (or fortunately), it cannot bypass stack canary protection on its own; however, if there is an additional memory leak, it may be possible to learn the canary value and exploit the program anyway. ROP reuses executable pieces of code within the binary or its shared libraries. These pieces are often called ‘ROP gadgets’. We’ll have a look at a special case of ROP called Return2PLT. Note that only the base address of libc is randomized; the offset of a particular function from that base address always stays the same. So if we can get around the randomization of the shared library’s base address, vulnerable programs can be exploited successfully even when ASLR is turned on.
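The point about constant offsets is just arithmetic, and a small sketch shows why a single leaked address is enough. Everything numeric here is hypothetical: the offsets of system and puts, the range of base addresses, and the simplified way the base is randomized are placeholders, not values from a real libc.

```python
import random

PAGE_SIZE = 0x1000

# Hypothetical offsets of two functions from the library base.
OFFSETS = {"system": 0x0003ADA0, "puts": 0x0005FCA0}


def random_base(rng):
    """Pick a page-aligned base address (a simplified stand-in for ASLR)."""
    return 0xB7000000 + rng.randrange(0, 0x800) * PAGE_SIZE


def resolve(base, name):
    """Absolute address of a function: randomized base plus constant offset."""
    return base + OFFSETS[name]


def base_from_leak(leaked_address, name):
    """Recover the base from one leaked function address."""
    return leaked_address - OFFSETS[name]


rng = random.Random()
base = random_base(rng)                   # unknown to the attacker
leaked_puts = resolve(base, "puts")       # what a memory leak would reveal

recovered_base = base_from_leak(leaked_puts, "puts")
system_addr = resolve(recovered_base, "system")

print(hex(recovered_base), hex(system_addr))
```

The base changes on every run, but the difference between the two function addresses never does, which is what makes the leak useful.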

Let’s consider this vulnerable code

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <stdlib.h>

    void grant() {
        system("/bin/sh");
    }

    void exploitable() {
        char buffer[16];
        scanf("%s", buffer);    /* no bounds check on the input length */
        if (strcmp(buffer, "pwned") == 0) grant();
        else puts("Nice try\n");
    }

    int main() {
        exploitable();
        return 0;
    }

Since stack canaries cannot be bypassed with this method, the program should be compiled with the flag that tells the compiler to turn off the stack protector:

$ gcc hack_me_2.c -o hack_me_2 -fno-stack-protector -m32

Reading the program’s memory mapping, we can see that its stack is read/write only and not executable.
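On Linux, one place to read such a mapping is /proc/<pid>/maps, where each line carries a permission field like rw-p or r-xp. The sketch below parses lines in that format and checks the execute bit. The two sample lines are invented for illustration (including the path and addresses) and are not the real mapping of hack_me_2.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps style line into its main fields."""
    fields = line.split()
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        # The pathname column is optional, so it may be missing.
        "name": fields[5] if len(fields) > 5 else "",
    }


def is_executable(region):
    """True when the region's permission string contains the x bit."""
    return "x" in region["perms"]


# Made-up sample lines in the maps format, for illustration only.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 1234 /tmp/hack_me_2
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
"""

regions = [parse_maps_line(line) for line in SAMPLE_MAPS.splitlines()]
for region in regions:
    state = "executable" if is_executable(region) else "not executable"
    print(region["name"], region["perms"], state)
```

A stack region marked rw-p is what rules out the classic approach of jumping to shellcode placed in the buffer.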

Let’s control the EIP

Since scanf does not perform bounds checking, we can control EIP by overwriting the function’s return address so that it points to some known location. I will try to point it to grant(). We can obtain the address of grant() using objdump.

In exploitable() we call grant() with the call instruction, which does two things: it pushes the address of the next instruction, 0x0804851b, onto the stack, and it changes EIP to 0x080484cb, which is where grant() is located.

push %ebp

mov %esp,%ebp

This is the function prologue. It sets up the stack frame for the current function: it saves the previous frame’s base pointer by pushing it, then sets the base pointer to the current stack pointer ($ebp = $esp). Now grant() can use its own stack frame to store variables and whatnot. After that, it allocates space on the stack for local variables by subtracting from esp (since the stack grows down), and finally it pushes the address 0x080485e8 onto the stack before calling system(). That address is a pointer to the string that will be passed as the argument to system(). It looks somewhat like:

system(*0x80485e8)

This is the function calling convention on 32-bit x86 (cdecl): arguments are pushed onto the stack before the call. After system() returns, the stack frame is torn down using leave, which does the opposite of the function prologue, that is:

esp = ebp

pop ebp

Finally, RET pops the value at the top of the stack, which is the function’s saved return address, into EIP.
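The whole call, prologue, leave, ret sequence can be traced with a tiny model of the registers and stack. This is a sketch under stated assumptions: the Machine class and its method names are mine, the initial esp and ebp values and the size of the locals are made up, and only the two code addresses (0x080484cb and 0x0804851b) are the ones quoted above.

```python
class Machine:
    """Toy model of esp, ebp, eip and a word-addressed stack."""

    def __init__(self, esp, ebp):
        self.esp = esp
        self.ebp = ebp
        self.eip = 0
        self.mem = {}  # address -> 32-bit word

    def push(self, value):
        self.esp -= 4               # the stack grows down
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, return_address):
        self.push(return_address)   # save where to come back to
        self.eip = target           # jump to the callee

    def prologue(self, locals_size):
        self.push(self.ebp)         # push %ebp
        self.ebp = self.esp         # mov %esp,%ebp
        self.esp -= locals_size     # room for local variables

    def leave(self):
        self.esp = self.ebp         # esp = ebp
        self.ebp = self.pop()       # pop ebp

    def ret(self):
        self.eip = self.pop()       # pop the saved return address into EIP


# Normal call and return. Register values and locals size are hypothetical.
m = Machine(esp=0xBFFF0100, ebp=0xBFFF0120)
m.call(target=0x080484CB, return_address=0x0804851B)
m.prologue(locals_size=8)
m.leave()
m.ret()
print(hex(m.eip), hex(m.esp), hex(m.ebp))  # prints 0x804851b 0xbfff0100 0xbfff0120

# Same sequence, but the saved return address is overwritten before ret.
m2 = Machine(esp=0xBFFF0100, ebp=0xBFFF0120)
m2.call(target=0x080484CB, return_address=0x0804851B)
m2.prologue(locals_size=8)
m2.mem[m2.ebp + 4] = 0x41414141   # the slot just above the saved EBP
m2.leave()
m2.ret()
print(hex(m2.eip))  # prints 0x41414141
```

The second run shows why the saved return address is the interesting target: RET trusts whatever word happens to be at the top of the stack.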

Constructing our own stack frame

We’ve seen how the stack behaves when a function is called, which means:

We can construct our own stack frames

We can control the parameters of the function we jump to

We can decide where that function returns to

If we control the stack between these two, we can control the parameters of the function it returns to as well

Repeating this lets us chain multiple functions

From objdump we see that the address of “/bin/sh” is 0x080485E0:

    $ objdump -s -j .rodata hack_me_3

    hack_me_3:     file format elf32-i386

    Contents of section .rodata:
     80485d8 03000000 01000200 2f62696e 2f736800  ......../bin/sh.
     80485e8 636f7773 61792074 72792061 6761696e  cowsay try again
     80485f8 00257300 70776e65 64004e69 63652074  .%s.pwned.Nice t
     8048608 72790a00                             ry..
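The address can be double-checked from the dump itself: the section starts at 0x80485d8 and the string begins 8 bytes in. The sketch below redoes that arithmetic on the hex bytes copied from the output above; the helper name find_string is mine.

```python
# Start address and raw bytes copied from the objdump output above.
RODATA_BASE = 0x080485D8
RODATA_HEX = (
    "03000000" "01000200" "2f62696e" "2f736800"
    "636f7773" "61792074" "72792061" "6761696e"
    "00257300" "70776e65" "64004e69" "63652074"
    "72790a00"
)


def find_string(data, base, needle):
    """Return the address of a null-terminated string inside a section dump."""
    index = data.find(needle + b"\x00")
    if index < 0:
        raise ValueError("string not found in section")
    return base + index


rodata = bytes.fromhex(RODATA_HEX)
binsh_addr = find_string(rodata, RODATA_BASE, b"/bin/sh")
print(hex(binsh_addr))  # prints 0x80485e0
```

The same lookup works for any other string in the section, for example the "pwned" literal used by strcmp.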

We’ll return to system() by modifying the return address of exploitable() and constructing a “fake” stack frame for system(). The stack will look like:

    ADDRESS      DATA
    ........
    // exploitable() stack
    0xbfff0004   80 48 4d 90   <- return address of exploitable(), now pointing to system()
    // our frame
    0xbfff0008   41 41 41 41   <- saved return address seen by system()
    0xbfff000c   08 04 85 E0   <- argument: address of "/bin/sh"
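The layout in the diagram can be sketched as a sequence of little-endian 32-bit words following the filler. This is only an illustration of the ordering, not a working exploit: the padding length and the address of system() are placeholders I made up, since both depend on the actual binary, while 0x080485E0 and 0x41414141 are the values from the text. Because the input goes through scanf("%s"), which stops at whitespace, the sketch also checks the bytes for whitespace characters.

```python
import struct

# Bytes that scanf("%s") treats as whitespace and therefore stops on.
WHITESPACE = b" \t\n\v\f\r"


def p32(value):
    """Pack a 32-bit value in little-endian byte order, as x86 stores it."""
    return struct.pack("<I", value)


def build_payload(padding_len, system_addr, fake_return, arg_addr):
    """Filler, then: new return address, system()'s return address, its argument."""
    return b"A" * padding_len + p32(system_addr) + p32(fake_return) + p32(arg_addr)


def contains_whitespace(data):
    """True if any byte would end the scanf("%s") conversion early."""
    return any(byte in WHITESPACE for byte in data)


PADDING_LEN = 28            # hypothetical distance from buffer to return address
SYSTEM_ADDR = 0x08048390    # hypothetical address of system()
FAKE_RETURN = 0x41414141    # dummy return address from the diagram
BINSH_ADDR = 0x080485E0     # address of "/bin/sh" from the .rodata dump

payload = build_payload(PADDING_LEN, SYSTEM_ADDR, FAKE_RETURN, BINSH_ADDR)
print(payload.hex())
print("whitespace in payload:", contains_whitespace(payload))
```

Note that each word appears byte-reversed in the payload, so 0x080485E0 is written as e0 85 04 08.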

So, when exploitable() returns, execution goes to system(), which sees 41414141 as its return address and “/bin/sh” as its argument, and spawns a shell. When system() returns, it pops 41414141 into EIP and, as expected, segfaults. If that were a valid address instead, we could keep chaining calls this way, as long as they do not need parameters. So, to exploit it: