Edit 2014-01-22: See also this blog post, which describes
a fork of the Taminoo project, this time using Pin.

Introduction

Last summer, my friends Ahmed Bougacha, Pierre Collet and I worked on a
personal project called Taminoo. Basically, Taminoo is a path-constraint solver built on Valgrind and
Z3. At first, we didn't plan to release it because it was just a PoC, but we hope it will
give our readers some ideas! In this blog entry I will try to explain how we built Taminoo :).

Concolic execution

Concolic execution is a technique that combines symbolic and concrete execution to solve path constraints.
For example, when you fuzz a target binary, you want to maximize code coverage. Imagine
the following code:

The user controls both the "a" and "mod" arguments. But if you dumb-fuzz those inputs, you will rarely
satisfy the constraints needed to reach the "MOD_VULN" code path, and you will miss a potentially exploitable bug.

With concolic execution, we first run the program with a = 0 and mod = 0. When we hit the line "if (a == 1234)",
we save the constant value "1234" somewhere and re-run the target with a = 1234 and mod = 0. We do the
same with the switch cases. At the end of the process, you get a tree of all your constraints. If you want
to maximize code coverage, you have to try to take every possible code path.
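
The run, record, re-run loop described above can be sketched in a few lines of Python. This is only a toy model, not Taminoo's code: the target function, the MOD_SAFE name, the numeric values of the modes, and the `trace` list standing in for real instrumentation are all assumptions made for illustration.

```python
# Hypothetical mode values; only the name MOD_VULN comes from the text above.
MOD_SAFE, MOD_VULN = 1, 2

def target(a, mod, trace):
    """Toy target. Each comparison against a constant is logged in `trace`,
    which stands in for what an instrumentation tool would observe."""
    trace.append(("a", 1234))
    if a != 1234:
        return "rejected"
    trace.append(("mod", MOD_SAFE))
    if mod == MOD_SAFE:
        return "safe path"
    trace.append(("mod", MOD_VULN))
    if mod == MOD_VULN:
        return "vulnerable path"
    return "default path"

def explore(initial):
    """Re-run the target, replacing one input at a time with each constant
    it was compared against, until no new input combination appears."""
    worklist = [initial]
    seen = set()
    reached = {}  # path name -> first (a, mod) pair that reached it
    while worklist:
        inputs = worklist.pop()
        key = (inputs["a"], inputs["mod"])
        if key in seen:
            continue
        seen.add(key)
        trace = []
        result = target(inputs["a"], inputs["mod"], trace)
        reached.setdefault(result, key)
        for name, constant in trace:
            candidate = dict(inputs)
            candidate[name] = constant
            worklist.append(candidate)
    return reached

paths = explore({"a": 0, "mod": 0})
```

Starting from a = 0 and mod = 0, the sketch ends up with one input pair per path, including the one that reaches the vulnerable branch. A real tool has to recover the compared constants from the binary itself, which is where the instrumentation comes in.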

If you want to go further into the concolic execution world, you can read the dedicated page
on Wikipedia.

Our approach

First of all, Valgrind is a Dynamic Binary Instrumentation framework that makes it easy to build dynamic analysis tools.
It also uses a RISC-like intermediate language called VEX. We used Valgrind to taint the memory controlled by the
user and to track the data flow. The second part is based on Z3, a high-performance theorem prover developed at
Microsoft Research. Z3 is used to solve all the constraints extracted from our VEX output.

As we said earlier, Taminoo's goal is to find a way to get the "Good boy" message.
Taminoo's process is made of two parts:

Taint the user inputs with Valgrind and VEX

Solve the constraints with Z3

If we wanted to build a complete PoC, we would have tainted every user input: environment variables,
syscalls, etc. But as you can see in the previous code, the user input is data read from a file called
serial.txt, so we just need to taint the memory where the read syscall stores the bytes it reads.

Catching system calls is very straightforward with Valgrind: its API lets us run pre- and post-syscall operations.
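
To make the idea of pre and post operations concrete, here is a small Python model of it. It does not use Valgrind's API (which is in C); the function names, the syscall number and the buffer address are hypothetical, and the only point is to show that the taint is applied after the read returns, once the number of bytes actually read is known.

```python
tainted = set()  # addresses of the bytes the user controls

SYS_READ = 0  # hypothetical syscall number, used only in this model

def pre_syscall(number, args):
    """Called before the syscall runs; nothing to do for read in this model."""

def post_syscall(number, args, result):
    """Called after the syscall returns; taint the bytes read() just wrote."""
    if number == SYS_READ and result > 0:
        fd, buf, count = args
        # Only `result` bytes were written, which may be fewer than `count`.
        tainted.update(range(buf, buf + result))

def emulate_syscall(number, args, result):
    """Stand-in for the framework invoking both hooks around a syscall."""
    pre_syscall(number, args)
    # ... the real syscall would execute here ...
    post_syscall(number, args, result)

# read(4, 0x1000, 64) that returned 8 bytes
emulate_syscall(SYS_READ, (4, 0x1000, 64), result=8)
```

A real tool would also check that the file descriptor belongs to the file it cares about before tainting anything.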

Then, to propagate the taints correctly, we instrument each instruction of the binary. If it is
a LOAD, PUT or STORE instruction, we spread the taints. For example, imagine we have three variables
a, b and c, where a is tainted. After b = a and c = b, b and c are also tainted
because they can be controlled via a.
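
The a, b, c example can be written down as a tiny Python sketch. It tracks taint on variable names rather than on memory addresses and registers, which is a simplification chosen for illustration; the `assign` helper is hypothetical.

```python
tainted = {"a"}  # a holds user-controlled data

def assign(dst, src):
    """Model `dst = src`: dst is tainted if and only if src is tainted."""
    if src in tainted:
        tainted.add(dst)
    else:
        tainted.discard(dst)

assign("b", "a")  # b is now tainted
assign("c", "b")  # c is now tainted, through b
assign("d", "e")  # e is not tainted, so d stays clean
```

Note that overwriting a tainted variable with clean data removes its taint, which is why the helper has a `discard` branch.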

What does this output mean? It is really straightforward, actually! Each action is assigned an ID and depends
on previous action IDs. First, we read the input string and taint it in memory. Then we can see action
#6: #6:32 = Xor32(#5:32,0x55). That means this action depends on action #5, and #5 depends on #1, which
is a simple read on file descriptor 4 at offset 0. After that, we can see that action #13 compares
action #12 with the constant value 0x30, and it also tells us the comparison is false.

To make this clearer, here is the constraint chain:

The first constraint can be written like this:

CmpEQ8(Xor32(Read(4,0),0x55),0x30)
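
As a sanity check, this constraint is small enough to solve without any solver. The Python sketch below evaluates it for every possible value of the byte at offset 0 and keeps the ones that make it true. The semantics given to Xor32 and CmpEQ8 (32-bit XOR, equality on the low 8 bits) are our reading of the operator names, and the brute-force search is only a stand-in for Z3.

```python
def xor32(x, k):
    """32-bit XOR."""
    return (x ^ k) & 0xFFFFFFFF

def cmp_eq8(x, k):
    """Equality on the low 8 bits (assumed meaning of CmpEQ8)."""
    return (x & 0xFF) == (k & 0xFF)

def first_constraint(byte0):
    """CmpEQ8(Xor32(Read(4,0),0x55),0x30), with Read(4,0) replaced by byte0."""
    return cmp_eq8(xor32(byte0, 0x55), 0x30)

# Try every value the first byte of the file can take.
solutions = [b for b in range(256) if first_constraint(b)]
```

Since XOR is its own inverse, the only solution is 0x30 ^ 0x55 = 0x65, the character "e".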

Now it's time to introduce Z3. To do the concolic execution, we have a Python script which runs Valgrind and parses
the VEX output. Then it converts all the constraints into Z3 expressions. It solves the first equation,
then re-runs Valgrind to solve the second equation, and so on until the target displays the "Good boy" message.
You can find the dump of all constraints here.
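
The overall loop of that script can be sketched as follows. Everything here is a stand-in: `run_target` replaces an instrumented Valgrind run, `solve` replaces the Z3 query, and the SECRET constants and the "Bad boy" message are hypothetical values chosen so that the example runs on its own.

```python
SECRET = [0x30, 0x31, 0x32]  # hypothetical constants the toy target compares against

def run_target(serial):
    """Stand-in for one instrumented run. Returns (message, failed), where
    failed is (offset, constant) for the first comparison that was false."""
    for offset, constant in enumerate(SECRET):
        byte = serial[offset] if offset < len(serial) else 0
        if (byte ^ 0x55) != constant:
            return "Bad boy", (offset, constant)  # hypothetical failure message
    return "Good boy", None

def solve(constant):
    """Stand-in for the solver: find a byte x such that (x ^ 0x55) == constant."""
    for x in range(256):
        if (x ^ 0x55) == constant:
            return x
    raise ValueError("unsatisfiable constraint")

def concolic_loop(length):
    """Run, solve the first failed constraint, patch the input, run again."""
    serial = bytearray(length)  # start with an all-zero input
    runs = 0
    while True:
        runs += 1
        message, failed = run_target(serial)
        if failed is None:
            return bytes(serial), runs
        offset, constant = failed
        serial[offset] = solve(constant)

serial, runs = concolic_loop(len(SECRET))
```

Each run only reveals the first constraint that fails, so the loop needs one run per constraint plus a final run that confirms the input.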

Below is an example of the first equation, CmpEQ8(Xor32(Read(4,0),0x55),0x30), translated into z3py:

As you can see, concolic execution can be really fun and efficient. We are really sorry not to
release the sources, but the code really sucks and it wasn't a serious project. If you want to
see a real project using concolic execution, please read the Fuzzgrind
paper, which inspired us a lot.

However, we will share our sources with Axel "0vercl0k" Souchet, who wants to play with them.
We hope he will at least clean up the sources and release a forked project.

PS: Just for fun, here are some pictures of the Taminoo mascot :)
IMG1IMG2
