Basic Introduction To Reverse Engineering

This is aimed at those who struggle with the application challenges. There is a lot of explanation of code, massively over the top for a simple crackme, but it's an old article I wrote a while ago. Maybe someone will find it useful to start them off.

This is an article I wrote about 2.5 years ago. The original is posted over at osix.net, a great site for programming challenges, as opposed to security challenges. I've updated it slightly to remove some confusing parts.

I'm posting this here because it seems many people are blindly dashing their way through the Application challenges without really understanding them. Rather than provide any solutions to HBH Challenges, I'm going to use another crackme instead. The crackme I'll be using was written by Cruehead and provides an excellent introduction to reversing, which should help many people get started.

If you need to find the exe file then go here : http://rapidshare.com/files/46136191/Cruehead_Crackmes.rar.html (9kb)
I've included all of Cruehead's crackmes so you can try the other 2 after the tutorial.

---------------------
Section 1 - The Tools
---------------------

Obviously, you don't need anything illegal to do this as it can be easily done using legitimate, free software. The only tool you'll need is a debugger - I'll be using OllyDbg (and I'd recommend it to anyone else starting out). It's available for free download. Additionally, I always find it's helpful to have a pad and pencil to make notes. Many of these routines can get quite complex and, unless you're experienced, you'll need to write things down to remember them.

Ok, so you should have downloaded the crackme and have OllyDbg installed. First thing to do is close this tutorial and have a play around. See what you can find and get a feel for the program. The very least this will do is teach you how to use basic OllyDbg functions. No cheating now ;-)

Done? Well, maybe you surprised yourself and found things you thought you'd never find? Maybe you found nothing and reckon you just wasted 30 minutes? Either way, I'll go through the process I used to reverse this and hopefully it will teach you a few things.

Okay, so run the crackme and let's have a look around. Well, there's not much to see but we can find a 'Register' box. Enter a username and a random serial into the box. You'll get a message saying 'No luck there mate' (incidentally, if you do happen to guess your serial and get the 'Congratulations' message, I recommend that you buy a lottery ticket today). So we know what we need to do; we need to find the serial - at this point we don't know if it's a hard-coded number or if it's generated from the username, but that's part of the fun!

Okay, so open Olly and select Crackme1.exe. You'll then be presented with the workings of the application, starting about here :

Now, we know that the Crackme is taking whatever we typed and checking it against the correct serial. We therefore need Olly to intercept any calls this crackme makes where it could be reading what we typed from the username and serial boxes. There are a few ways Windows does this - it's beyond the scope of this article to teach you the depths - but I will tell you that one of them is the call 'GetDlgItemTextA'.

So, what we need to do is make sure that if the Crackme makes this call, Olly intercepts it and breaks for us so that we can follow what is being done with the information. That's easy enough. If you press Ctrl-N (or right click and select 'Search for' followed by 'name (label) in current module') you are presented with a list of calls made by the crackme. You can then right click on GetDlgItemTextA and select 'set breakpoint on every reference'.

We're ready to go. Press F9 and Olly will run the crackme, presenting you with its user interface. Go to the registration box and enter a name and any serial. I'm using "FaTaLPrIdE" and "123456". Press the register button and Olly should break here :

Now, this is the first reference to the call 'GetDlgItemTextA' so we know our name or serial is going to be read in. If you read the top of your Olly window, it should say [CPU - main thread, module Crackme1]. This is important because when this says Kernel or User32, we know we can keep stepping as we are not inside the crackme's code - we are only interested in the Crackme.

Press F8 to step over the program and try to get a feel for what is going on. There are 2 ways of exploring. If you leave breakpoints as they are, pressing F8 will break at the jump table here :
004014D0 JMP DWORD PTR DS:[<&USER32.GetDlgItemTextA>]
You then need to keep stepping through the User32 code, which becomes rather long-winded. An easier way is to (once you have broken) press ALT-B to bring up the breakpoint window and remove the breakpoint at 4014D0 before stepping. This allows you to step through the program code without going through USER32/Kernel etc.

This is where the fun begins. Olly even helps show us we're in the right place by showing that our entered username and serial are pushed to the stack before calls are made and a compare is made shortly afterwards.

For now, press Ctrl-N, select 'GetDlgItemTextA' and press 'remove all breakpoints'. Then select the line 00401223 and press F2 to put a new breakpoint here. What this means is that you can now come back here whenever you run the program without stepping through all the previous steps we have taken. You don't want to search for this again if you press a wrong button somewhere!

So, we probably know how we could get the congrats message - a flick of the Z bit at 00401241 or a simple patch of the JE at 00401243 should do it. But that doesn't teach us much; we want to know exactly what this crackme is doing in order to test our username and serial. Our job is to trace the calls at 0040122D and 00401238 to find out exactly what is going on here.

You should still be at 00401223. We want to investigate the first call so press F8 until you highlight the following row:
0040122D . E8 4C010000 CALL Crackme1.0040137E

Now press F7. The difference between F7 and F8 is that F8 steps over calls and F7 steps into them. In other words, if a call is of no interest to you, you can press F8 to step over it and carry on. If you think that it might contain some vital information, press F7 to step into it and you can look at it in detail.

Okay, so we see at 0040137E that our username is loaded into ESI ready for processing. The first character of our username (F in my case) is then moved into AL before being tested to see if it is 0. Then the interesting stuff starts - at 00401389 the F is compared with 41. A strange comparison you might think?

Open up a browser window and go to www.asciitable.com and you'll get a better understanding. The processor deals with character values in hex, i.e. next to my F in Olly is the number 46. If you look at the ASCII table you will see that 46 is the hexadecimal representation of 'F' and 41 is the representation of 'A'. What the line at 00401389 is doing, then, is taking the first letter of our username and comparing it with A. The result of this comparison affects what happens at the jump on the next line (0040138B): if the first letter of our name is less than A (see the ASCII table) it jumps elsewhere. My F is above A though so we continue to 0040138D.

Here a similar operation is performed. A quick look at our ASCII values shows us that our character is now being compared with Z - this time a jump is taken if the value is above Z. Obviously, my F is fine and we continue.
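If you don't have the ASCII table to hand, the two comparisons can be tried out in a few lines of Python. This is only a sketch of the checks as described above, not code lifted from the crackme, and the function name classify is made up for the example.

```python
# Illustrative only: mirrors the "compare with 41" / "compare with 5A" checks
# described in the text. classify() is a hypothetical helper name.
def classify(ch):
    """Say which way one character would go through the two comparisons."""
    value = ord(ch)          # the byte value Olly shows next to the character
    if value < 0x41:         # below 'A': the jump at 0040138B is taken
        return "below A"
    if value > 0x5A:         # above 'Z': the jump at 0040138F is taken
        return "above Z"
    return "A-Z"             # neither jump is taken

for ch in "Fa":
    print(ch, hex(ord(ch)), classify(ch))
```

Running it on the first two letters of my username shows F (46) passing both checks and a (61) tripping the second one.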

At 00401399 ESI is incremented before a jump is taken back to 00401383. If you remember, our username is stored in ESI so this has essentially just moved us to the next letter of our username and gone back to the beginning of this routine. My second letter is 'a' so let's see how this is dealt with.

Well, stepping through it passes the comparison with \'A\' as 61 is indeed greater than 41(A). When we get to the comparison with Z though, it fails and the jump is taken at 0040138F to 00401394. This is because, as the table shows, a(61) is greater than Z(5A).

So what's happening here? Our character is in AL and gets 20 (hex) subtracted from it. What's this for? Check out the ASCII table.... you will see that my 'a' is 20 (hex) values higher than 'A' i.e. a-20=A; this subroutine has just capitalised my character! It then jumps back to the routine, increments ESI to the next letter and continues.

Step through the rest of the routine and you'll notice that your entire username is processed to make sure it's uppercase. That's all this bit is doing. My username is now FATALPRIDE.

A couple of points to note though are that if you only used uppercase letters anyway, this routine is redundant and you won't even see the SUB AL,20 part. Also, if you have non-alphabetic characters above Z in there, they'll be taken down 20 (hex) values too, as they are treated the same way as lowercase letters. Characters below A take the earlier jump at 0040138B instead.
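The whole uppercasing pass can be modelled in Python as a sanity check. Treat this as a sketch of the behaviour described in this article rather than a translation of the disassembly: the function name is invented, and because the branch for characters below A is not traced here, the sketch simply raises an error for them.

```python
# A model of the first loop as this article describes it (not disassembled code).
def uppercase_pass(name):
    out = []
    for ch in name:
        value = ord(ch)
        if value < 0x41:
            # The text only says the crackme "jumps elsewhere" for these;
            # raising here is this sketch's own choice, not the crackme's behaviour.
            raise ValueError("character below 'A': %r" % ch)
        if value > 0x5A:
            value -= 0x20        # SUB AL,20 (hex 20 = decimal 32)
        out.append(chr(value))
    return "".join(out)

print(uppercase_pass("FaTaLPrIdE"))   # FATALPRIDE
```

Feeding it something like "{" shows the side effect mentioned above: it comes out as "[", 20 (hex) lower.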

Once the last letter of your username has been processed, the TEST AL,AL finds the 0 byte that ends the string and the application jumps out of this loop to 0040139C where your newly capitalised name is popped from the stack to ESI.

Then comes this line:
0040139D |. E8 20000000 CALL Crackme1.004013C2

Press F7 to trace this call - this is the second routine. Setting a breakpoint here may be useful too!

So what's happening here? Well, firstly EDI and EBX are XOR'd with themselves - you've passed enough challenges to know that this always returns a 0 result, hence this is just a way of clearing both EDI and EBX.
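If the XOR-with-itself trick is new to you, it is easy to convince yourself with Python's ^ operator. The helper name below is made up for illustration.

```python
# XOR of any value with itself is 0, which is why XOR reg,reg clears a register.
def xor_with_itself(value):
    return value ^ value

for value in (0x0, 0x46, 0x2DC, 0xFFFFFFFF):
    print(hex(value), "->", xor_with_itself(value))
```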

Then a similar thing happens to what happened in the above routine - the only difference being that the first letter of our capitalised username is moved to BL rather than AL. It's then tested in case it's 0 before landing at 004013CC.

If you've read Trope's articles, you'll know that BL (where our character is stored) is just the lowest byte of EBX. Hence ADD EDI,EBX is taking the value of that character and adding it to EDI - obviously, we just zeroed EDI so for the first letter, it's added to 0. We then increment to the next letter of our username and the process is repeated, although notice that the loop does not include the XOR instructions each time. This basically has the effect of adding all the values of our username together and storing the total in EDI. For my username (FATALPRIDE) the total in EDI comes to 02DC.
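The summing loop is short enough to mirror directly in Python. This is a sketch under the assumption that the routine does nothing beyond what is described above; the function and variable names are mine, chosen to match the registers.

```python
# Adds up the byte values of the (already uppercased) username, as the
# second routine is described in the text.
def sum_name(upper_name):
    edi = 0                          # XOR EDI,EDI
    for ch in upper_name:
        ebx = ord(ch)                # character byte in BL, rest of EBX is 0
        edi = (edi + ebx) & 0xFFFFFFFF   # ADD EDI,EBX (kept to 32 bits)
    return edi

print(hex(sum_name("FATALPRIDE")))   # 0x2dc
```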

When we step over this, it takes us back to the end of the first routine, to where the second routine was called from. We land here :
004013A2 |. 81F7 78560000 XOR EDI,5678
004013A8 |. 8BC7 MOV EAX,EDI

Okay, so here we have another XOR statement - this time the contents of EDI are XOR'd with '5678'. We know that EDI contains our summed username so in my case, this equation is :

02DC XOR 5678 - the result is stored in EDI again (54A4 in my case) before the next statement moves it to EAX. We then jump back to the initial code we looked at in section 2.
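You can check that XOR on paper, or let Python do it. The constant name and function name below are just labels I've picked for the sketch.

```python
NAME_KEY = 0x5678    # the constant from XOR EDI,5678

def finish_name(name_sum):
    """Apply the final XOR to the summed username."""
    return name_sum ^ NAME_KEY

print(format(finish_name(0x02DC), "04X"))   # 54A4
```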

The difference is that we have now completed the call at 0040122D and we're now at 00401232 waiting to continue. Congratulations, you've just traced your first call and now you understand exactly how this application processes a username! Now see if you can follow the same procedure for the second call below! Trace into it with F7 and see what you can find...... set a breakpoint first so that if you mess up you can try again or pick this guide up where you left off!

Firstly we see EAX is pushed to the stack (we know that this contains our summed username XOR\'d with 5678 from the previous call) and then our entered serial (123456) is pushed to the stack too. We can then use F7 to trace our second call. We land here :

So you should be at the beginning of the loop at 004013E2. Let's try and work out what's going on here. Firstly, 0A (10) is moved into AL and then the first character of our serial (1 in my case) is moved into BL before being tested for 0 in the usual way. Note though that EBX contains 31 rather than 1 i.e. the hexadecimal representation of the character 1.

After this, 30 is subtracted from our number i.e. 31-30 in my case. Then EAX and EDI are multiplied and our processed character added to the result. This is then stored in EDI.

In other words, EDI holds (31-30) + (10x0) = 1 ; after one iteration on my serial. The process is then repeated but this time, remember that EDI is no longer 0 so when EDI is multiplied by EAX, we get a different result. i.e.

(32-30) + (10 x 1) = 0C ; where the 1 is the value left in EDI by the previous iteration

Continue this through the rest of your serial and we get a final result (1e240 in my case). Actually, what this has done is convert our serial from a string of decimal digits into a number, which Olly shows us in hex!
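The multiply-by-ten-and-add loop is the classic way of turning a string of digits into a number, and it can be sketched in Python as follows. The function name is my own, and the sketch assumes the serial contains only the digits 0-9.

```python
# Models the serial loop described above: subtract 30 (hex) from each
# character, multiply the running total by 0A and add the digit.
def parse_serial(serial):
    edi = 0
    for ch in serial:
        digit = ord(ch) - 0x30                   # '1' is 31 hex, so 31-30 = 1
        edi = (edi * 10 + digit) & 0xFFFFFFFF    # kept to 32 bits like EDI
    return edi

print(hex(parse_serial("12")))       # 0xc
print(hex(parse_serial("123456")))   # 0x1e240
```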

So we jump out of the loop and land at 004013F5. This is interesting - remember in the last call where the username was uppercased, summed and XOR'd with 5678h? Well here we've just converted the serial to a number and now we're XORing it with 1234h (result is 1f074 in my case)!
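Again, the XOR is quick to verify in Python. The names here are illustrative only.

```python
SERIAL_KEY = 0x1234    # the constant the converted serial is XOR'd with

def finish_serial(serial_number):
    """Apply the final XOR to the serial once it has been converted to a number."""
    return serial_number ^ SERIAL_KEY

print(format(finish_serial(123456), "X"))   # 1F074
```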

Simple really! The result is then moved from EDI to EBX and we jump back to our initial piece of code again!

The first line is a quick stack cleanup which then leaves our processed username value (54A4 in my case) on the top of the stack. This is then popped to EAX.

Then comes the critical comparison :
00401241 . 3BC3 CMP EAX,EBX

EAX (the result of our username being processed) and EBX are compared - the two values should look familiar as they are the results of our two calls i.e. in my case they are 54A4 and 1f074.

The next jump statement is the critical one - if the two values in EAX and EBX are equal, we jump to the call statement at the bottom of the above code extract.... this is our success box! (Hence the reason I said we could patch this jump to jump if not equal rather than if equal). If EAX and EBX are not equal, we don't jump and we are taken down the 'No luck there mate' routine - this is where I go on this occasion as 123456 is not my correct serial.
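Putting both calls and the comparison together gives a compact model of the whole check. This is a sketch built only from the steps traced in this article, with made-up function names; characters below A are not handled specially because that branch was not followed here.

```python
NAME_KEY = 0x5678      # XOR applied to the summed username
SERIAL_KEY = 0x1234    # XOR applied to the converted serial

def name_value(name):
    total = 0
    for ch in name:
        value = ord(ch)
        if value > 0x5A:          # above 'Z': SUB AL,20
            value -= 0x20
        total += value            # ADD EDI,EBX
    return (total ^ NAME_KEY) & 0xFFFFFFFF

def serial_value(serial):
    total = 0
    for ch in serial:
        total = total * 10 + (ord(ch) - 0x30)
    return (total ^ SERIAL_KEY) & 0xFFFFFFFF

def is_registered(name, serial):
    return name_value(name) == serial_value(serial)   # CMP EAX,EBX

print(hex(name_value("FaTaLPrIdE")), hex(serial_value("123456")))  # 0x54a4 0x1f074
print(is_registered("FaTaLPrIdE", "123456"))                       # False
```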

So, we have found that the crucial operation is a comparison of our processed username and our processed serial. Specifically, our processed serial must give the same result as our processed username in order to be valid. So how do we achieve this?

Well, this is where knowledge of the XOR function brings us through. We know that :
if A XOR B = C
then C XOR B = A.
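If you'd rather see that property in action than take it on trust, here is a short Python check using the serial values from earlier. The helper name is invented for the example.

```python
def xor_round_trip(a, b):
    """XOR a with b, then XOR the result with b again."""
    c = a ^ b
    return c ^ b

a, b = 0x1E240, 0x1234
print(hex(a ^ b))                  # 0x1f074
print(hex(xor_round_trip(a, b)))   # 0x1e240
```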

So how is this useful?
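Well, looking at the way the serial is processed, our entered serial (once converted to a number) XOR'd with 1234 must equal our processed username (in my case 54A4). Using the above reasoning then, our serial is our processed username XOR'd with 1234 i.e. (for me) 54A4 XOR 1234 = 4690 in hex, which is 18064 in decimal. Since the crackme reads the serial as decimal digits, 18064 is what goes in the serial box.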
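The same reasoning can be turned into a tiny key generator. This is a sketch based purely on the routine as traced in this article, so treat it as a way of checking your own working rather than a definitive keygen; the function name is mine, and names containing characters below A are rejected because that branch was never followed here.

```python
NAME_KEY = 0x5678      # XOR applied to the summed username
SERIAL_KEY = 0x1234    # XOR applied to the converted serial

def keygen(name):
    """Return the decimal serial that should match the given username."""
    total = 0
    for ch in name:
        value = ord(ch)
        if value < 0x41:
            # Not traced in the article, so this sketch refuses to guess.
            raise ValueError("character below 'A' in name: %r" % ch)
        if value > 0x5A:
            value -= 0x20             # the uppercasing step
        total += value                # sum of the character values
    processed_name = total ^ NAME_KEY
    return str(processed_name ^ SERIAL_KEY)   # undo the serial's XOR

print(keygen("FaTaLPrIdE"))   # 18064
```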

If you like this, just pop a comment below and let me know. Similarly, if you have a criticism or improvement, I'd like to hear it too. Please don't tell me it was too simple though as that was the point of the article - to explain as much as I could for those who have never used a debugger before.

Deadata on July 31 2007 - 17:00:42: Wow, excellent article, written in proper english too.

spyware on July 31 2007 - 17:22:52: I'll be saving this on my harddisk. Thanks Fatal_Pride, I can use this very well.

FaTaL_PrIdE on July 31 2007 - 17:38:56: Thanks for reading guys. Glad it was helpful

LanceUppercut on July 31 2007 - 18:55:43: Fantastic Tutorial...definitely a great read

FaTaL_PrIdE on August 09 2007 - 09:19:59: Thanks LanceUppercut
Would the person who rated it poor please leave a comment. I'd be interested to hear of any suggestions for improvement. Thanks.

sp00ky on August 09 2007 - 19:10:14: Awesome! This is just the kind of article I've been looking for FOREVER, but up until now I'd been unable to find one that made any kind of sense to me. Excellent job! =)

a-hack on August 09 2007 - 20:01:11: Sorry I accidentaly rated it poor because on another site I frequent the rating system starts with poor and awesome is the last rating on the bottom. lol I fixed it now.

I'll be using Ollydbg (and I'd recommend it to anyone else starting out).

IDA Pro also isn't bad.

exidous on October 18 2007 - 05:53:02: This was a AWSOME article.... I was able to follow it ( And understand most of it ) ** I am still confused on some of it tho, But i think that that will change the more i play with it! ** BTW: Thanks for explaining the jumps, I did not know what those were for before... I do have a couple of noob questions for ya tho.. Pm me if you are willing to answer them... Thanks, Exidous