by Exaelia on Mon Sep 17, 2012 3:32 pm ([msg=69427]see Sorry If This Is A Stupid Question...[/msg])

If I put this under the wrong topic, I'm really sorry. I saw a topic that said 'how to reverse-engineer.' I don't really understand what that is. I'm a newbie, so I know absolutely nothing. Well, there are new people who went to college... Uh.. never mind. But I have no idea about anything.

by centip3de on Mon Sep 17, 2012 6:54 pm ([msg=69429]see Re: Sorry If This Is A Stupid Question...[/msg])

Exaelia wrote:If I put this under the wrong topic, I'm really sorry. I saw a topic that said 'how to reverse-engineer.' I don't really understand what that is. I'm a newbie, so I know absolutely nothing. Well, there are new people who went to college... Uh.. never mind. But I have no idea about anything.

Alright, I'll only answer this because I love Reverse Engineering... and you seem like a pretty cool guy/gal. Anyway, seeing as you are saying you know nothing, I'll start from nothing.

1. Programming languages:

So, we have these really cool things called programming languages that are needed to get almost anything technological working: phones, computers (web browsers like Firefox, Chrome, IE, Safari, and Opera; text editors like Microsoft Word, Notepad, Gedit, Vim, and Emacs; operating systems like Windows, Mac OS X, and Linux; and more), computers in cars, rocket ships, calculators, etc. These programming languages vary in how difficult they are to understand and to learn, what they look like, and how fast they run (yes, I know, I'm leaving something out). Now, there are some programming languages that look almost identical to English, except with bad grammar and some funky symbols mixed in (all forms of BASIC). There are also some languages which, to the untrained eye, look like someone took a shit made out of symbols, numbers, and letters (Perl, C/C++, BrainFuck, ASM). However, these are the two extremes; there are much easier ones in between that get the best of both worlds. Here's an example of BASIC, and of ASM, so you can see the difference between the two extremes (once again, to the untrained eye):

Alright, so far we've covered that (almost) everything technological has to be programmed in order for it to work, and that these programming languages are behind the scenes, carefully making sure that if you hit that button, it's going to do what it is supposed to do. We've also covered that these languages vary from one another greatly, though they can all do pretty much the same thing. So, how exactly do these programming languages make things work?

2. Abstraction layers:

Well, my dear, it's actually not that difficult to understand. However, you must first be able to understand something really fancy called an "abstraction layer". Basically, abstraction is making one thing easier by making a "layer" (so to speak) that takes whatever you put in and turns it into the really hard stuff. For instance, instead of having to manually turn your radio on by connecting the wires together and holding them each time you want to listen to music, you just push a button that does that for you. This, in and of itself, is an abstraction layer. The button is abstracting the difficult task of holding the wires together, thus making it easier for you to turn a radio on and keep it on. Got it? Good.
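To put the radio analogy into code, here is a minimal sketch. The Radio class and its method names are hypothetical, made up purely to mirror the example above: the "button" is a simple public method, and the wire-holding is hidden behind it.

```java
// Hypothetical Radio class, invented only to mirror the radio analogy.
class Radio {
    private boolean wiresConnected = false;

    // The "hard" low-level step that the button hides from the listener.
    private void connectWires() {
        wiresConnected = true;
    }

    // The abstraction layer: one simple action that does the hard part for you.
    public void pressPowerButton() {
        connectWires();
    }

    public boolean isPlaying() {
        return wiresConnected;
    }

    public static void main(String[] args) {
        Radio radio = new Radio();
        radio.pressPowerButton(); // no wire-holding required
        System.out.println("Radio playing? " + radio.isPlaying());
    }
}
```

Whoever uses the radio only ever touches pressPowerButton(); how the wires actually get connected could change completely without them noticing.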

Alrighty, so, in programming we have many, many, many abstraction layers. At the very top of all of these programming abstraction layers, it's actually pretty easy to use and very human friendly, because all of the hard stuff is "abstracted" (catching on yet?). But at the bottom of all of the abstraction layers (i.e. where there are no abstraction layers), we find binary. Yes, the cliche 0's and 1's that actually make a computer tick. One abstraction level up from binary is machine code. This is a language that all machines understand (though each kind of machine has a different version of it). It groups the binary into instructions, making it easier to program in, but it's still pretty tough. One layer up from machine code is assembly code, otherwise known as ASM. It abstracts the machine code, making it EVEN easier to program (though it's still a pain in the ass). Now, above ASM are all the more well-known languages, such as C, that make programming a much, much easier task (and nowadays, even more languages are abstracted above C). Got it? Cool.
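To make the bottom few layers a bit more concrete, here is a small sketch that prints one and the same instruction at three levels. It assumes 32-bit x86, where "mov eax, 1" (put the number 1 into the register called eax) is encoded as the bytes B8 01 00 00 00; the class and method names are made up for this example.

```java
// Sketch: one instruction viewed as binary, as machine code bytes, and as assembly.
class LayerDemo {
    // Bytes of the 32-bit x86 instruction "mov eax, 1":
    // opcode 0xB8, followed by the value 1 stored as four bytes, lowest byte first.
    static final int[] MACHINE_CODE = {0xB8, 0x01, 0x00, 0x00, 0x00};

    // Render one byte as eight 0's and 1's.
    static String toBinary(int b) {
        String bits = Integer.toBinaryString(b & 0xFF);
        return "00000000".substring(bits.length()) + bits;
    }

    // Render one byte as two hexadecimal digits.
    static String toHex(int b) {
        return String.format("%02X", b & 0xFF);
    }

    public static void main(String[] args) {
        StringBuilder binary = new StringBuilder();
        StringBuilder hex = new StringBuilder();
        for (int b : MACHINE_CODE) {
            binary.append(toBinary(b)).append(' ');
            hex.append(toHex(b)).append(' ');
        }
        System.out.println("Binary:       " + binary.toString().trim());
        System.out.println("Machine code: " + hex.toString().trim());
        System.out.println("Assembly:     mov eax, 1");
    }
}
```

All three lines describe exactly the same thing; each one is just easier for a human to read than the one before it.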

So, now that we get the abstraction layers in programming (somewhat), it's much easier for me to transition into how these all work together. We already know that at the bottom of the pile is binary (1's and 0's), and binary is what makes a computer tick. Essentially, any program you're running right now is actually running in binary at the very bottom level of it all. But how do we get from the top layer down? Well, between each layer there is some translating down (imagine trying to talk to someone who speaks a different language; you would need a translator, right? In our case, each layer is translated into the less abstracted (i.e. harder to program in) code below it). Imagine the abstraction layer stack we made is collapsing, i.e. BASIC (the easy language) is translating into C (the slightly more difficult language), which is translating into assembly (the even harder language), which is translating into machine code (the bitch of a language), which is translating into binary (the 1's and 0's). This process is done by a special program called a compiler. It is the thing that translates all of this mess for us into assembly code (pretty close to the bottom of the layers); then a program called an assembler turns that assembly code into machine code, which the computer can now use, thus running our program. (Complicated, right?)
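To show what that last translation step looks like, here is a toy assembler. It is only a sketch: it knows just three 32-bit x86 instructions (nop, ret, and mov eax with a number), and the class and method names are made up. A real assembler handles hundreds of instructions, labels, and a lot more.

```java
import java.util.ArrayList;
import java.util.List;

// Toy assembler sketch: turns a line of assembly text into machine code bytes.
class ToyAssembler {
    static List<Integer> assemble(String line) {
        List<Integer> bytes = new ArrayList<>();
        String text = line.trim().toLowerCase();
        if (text.equals("nop")) {
            bytes.add(0x90); // "do nothing"
        } else if (text.equals("ret")) {
            bytes.add(0xC3); // "return to whoever called us"
        } else if (text.startsWith("mov eax,")) {
            int value = Integer.parseInt(text.substring("mov eax,".length()).trim());
            bytes.add(0xB8); // opcode for "mov eax, <number>"
            for (int i = 0; i < 4; i++) {
                bytes.add((value >> (8 * i)) & 0xFF); // the number, lowest byte first
            }
        } else {
            throw new IllegalArgumentException("Unsupported instruction: " + line);
        }
        return bytes;
    }

    public static void main(String[] args) {
        String[] program = {"mov eax, 1", "ret"};
        for (String line : program) {
            StringBuilder hex = new StringBuilder();
            for (int b : assemble(line)) {
                hex.append(String.format("%02X ", b));
            }
            System.out.println(line + "  ->  " + hex.toString().trim());
        }
    }
}
```

The point is just that assembling is mostly a lookup: each line of assembly has a fixed byte pattern that the processor understands.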

3. Reverse Engineering.

Now that we understand how each language is translated down the abstraction layers (or stack), we can begin to realize that if we want to see the code of a program that has already been compiled (translated down the abstraction layers), and we don't have the source code (the original copy of the code before it was translated down), we're going to need another special program. This is where OllyDbg and IDA Pro step in. These programs break apart the program we've made, even while it's running, and show us the raw data of it. However, instead of showing us the binary (because fuck binary), they make it as user friendly as possible and show us the assembly code. Anything past this and it would become a real language; there are programs that can sort of do this, but since we don't know what language the program was originally written in, it usually doesn't work. By using these programs, we can change the assembly code and reassemble the program (using an assembler, the program that turns assembly code into machine code), thus changing how the program works.
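To give a feel for what these tools do, here is a toy disassembler: the reverse of assembling. It is a sketch only. It recognizes three 32-bit x86 instructions, the names are made up, and it is in no way how OllyDbg or IDA Pro actually work internally. The main method also shows the "change the code" idea by patching a single byte.

```java
// Toy disassembler sketch: turns machine code bytes back into assembly text.
class ToyDisassembler {
    static String disassemble(int[] code) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < code.length) {
            int opcode = code[i] & 0xFF;
            if (opcode == 0x90) {
                out.append("nop\n");
                i += 1;
            } else if (opcode == 0xC3) {
                out.append("ret\n");
                i += 1;
            } else if (opcode == 0xB8 && i + 4 < code.length) {
                // The four bytes after the opcode are the number, lowest byte first.
                int value = (code[i + 1] & 0xFF)
                        | ((code[i + 2] & 0xFF) << 8)
                        | ((code[i + 3] & 0xFF) << 16)
                        | ((code[i + 4] & 0xFF) << 24);
                out.append("mov eax, ").append(value).append("\n");
                i += 5;
            } else {
                // Anything this toy doesn't know is shown as a raw data byte.
                out.append(String.format("db 0x%02X\n", opcode));
                i += 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A tiny compiled "function" that hands back 0: mov eax, 0 / ret
        int[] program = {0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3};
        System.out.print("Before patch:\n" + disassemble(program));

        program[1] = 0x01; // change one byte so it hands back 1 instead
        System.out.print("After patch:\n" + disassemble(program));
    }
}
```

One changed byte is enough to make the function answer 1 instead of 0, which is the basic idea behind patching a program you don't have the source for.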

What is the use of this? By reverse engineering, people can crack programs, thus getting them for free, or, on a more legal note, can see why a program is crashing. They can see the assembly code as the program operates, find the bug, and squash its little head. Or they can reverse engineer viruses and other malware and see how they work, so that we can patch all of the security holes and stop them from attacking our systems again.

Shit that's a long post... Hope you understand it!

Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning. -Rick Cook

by WallShadow on Mon Sep 17, 2012 8:08 pm ([msg=69433]see Re: Sorry If This Is A Stupid Question...[/msg])

To add on to centipede's wonderful and detailed post, here is an example:

This is the Java source code of a program that prints 'Hello world!!!'. The computer can't run this program as it stands because it is written in our language, Java, and not the horribly compacted Java byte code that the Java Virtual Machine (the program that actually runs it) can read.
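A minimal sketch of what such a program looks like (an assumption about its shape, not necessarily the exact code that was posted; the class name HelloWorld is the one used later in this post):

```java
// Minimal hello-world sketch; the printed text matches the message described above.
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world!!!");
    }
}
```

Saved as HelloWorld.java, it can be compiled with javac HelloWorld.java, which produces the byte code file HelloWorld.class.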

We can then take this program that we have written and give it to the compiler, which will compress and simplify everything beyond belief. The compiler will compile the code into Java byte code. This is what our hello world program looks like in byte code:

Note that the many periods and spaces aren't really periods and spaces, but characters which our (or at least my) browser can't display, so it shows them as the next best thing. This example isn't very hard to read, but as programs get more complex, the byte code becomes virtually indecipherable. Despite this, the computer has no trouble reading it and is easily able to execute it.
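To show why byte code looks like that in a browser, here is a small sketch that looks at the first eight bytes of a compiled .class file. Every Java class file starts with the four bytes CA FE BA BE, followed by a two-byte minor and a two-byte major version number. The sample bytes below are written out by hand for illustration (major version 51, i.e. Java 7), not read from a real file, and the class and method names are made up.

```java
// Sketch: inspecting the header of a Java class file.
class ClassFileHeader {
    // Every compiled Java class begins with the bytes CA FE BA BE.
    static boolean hasJavaMagic(byte[] data) {
        return data.length >= 4
                && (data[0] & 0xFF) == 0xCA && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA && (data[3] & 0xFF) == 0xBE;
    }

    // Bytes 4-5 hold the minor version, bytes 6-7 the major version (high byte first).
    static int majorVersion(byte[] data) {
        return ((data[6] & 0xFF) << 8) | (data[7] & 0xFF);
    }

    // Show the bytes the way a text viewer might: printable characters as-is,
    // everything else replaced with a period.
    static String asText(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (byte b : data) {
            int value = b & 0xFF;
            out.append(value >= 0x20 && value <= 0x7E ? (char) value : '.');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written sample header, not taken from a real file.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x33};
        System.out.println("Looks like a class file? " + hasJavaMagic(header));
        System.out.println("Major version: " + majorVersion(header));
        System.out.println("As text: " + asText(header));
    }
}
```

The asText method is where the periods come from: most of those bytes simply don't correspond to any character you can put on a screen.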

Now this is where the decompiling comes in. Let's say that I release this program to everyone, and everyone loves it and gives me money for it, but I never release the source code, so no one knows exactly how it works. All of a sudden, Steve Jobs, the copy cat that he is, also wants to earn money off of this. He'll use a decompiler program to take apart my byte code program and rewrite it in a way that we can read as Java code. It won't be 100% accurate, but it will be accurate enough to figure out how it works. Thus, he will rebuild it and sell it for some outrageous price, and people would love it and buy it even more than they did from me... thank god that he's dead.

This is how our HelloWorld program looks after I used Java Decompiler to decompile it:
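As a rough sketch only (an assumption about the general shape of decompiler output, not the actual result for this program), decompiled code for a hello-world class tends to look something like this. The parameter name paramArrayOfString and the brace layout are guesses at typical decompiler style; the decompiler can't know the original names or spacing, so it makes up its own.

```java
import java.io.PrintStream; // decompilers often add imports the original source never wrote

// Sketch of decompiler-style output; names and layout are assumptions.
class HelloWorld
{
  public static void main(String[] paramArrayOfString)
  {
    System.out.println("Hello world!!!");
  }
}
```

It compiles and behaves just like the original would, even though comments, formatting, and some names are gone.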

by xsvMix on Tue Sep 18, 2012 4:28 pm ([msg=69452]see Re: Sorry If This Is A Stupid Question...[/msg])

I am no expert in any way, but to put this in a very short and simple format... It involves changing the way a program works and making it do things it probably wasn't originally designed to do. Hope it helps you

by Exaelia on Tue Sep 18, 2012 4:41 pm ([msg=69453]see Re: Sorry If This Is A Stupid Question...[/msg])

xsvMix wrote:I am no expert in any way, but to put this in a very short and simple format... It involves changing the way a program works and making it do things it probably wasn't originally designed to do. Hope it helps you

Also, impressive post centip3de

Oh. And I agree, it was so impressive my stupid brain couldn't handle it. Sorry if I'm trolling... I hope I haven't irritated anyone with my question.

by WallShadow on Tue Sep 18, 2012 5:59 pm ([msg=69455]see Re: Sorry If This Is A Stupid Question...[/msg])

Exaelia wrote:Oh. And I agree, it was so impressive my stupid brain couldn't handle it. Sorry if I'm trolling... I hope I haven't irritated anyone with my question.

No one gets annoyed with these questions; in fact, there are probably at least a dozen more people here who had the same question but didn't want to ask it. The only questions that irritate everyone are questions like 'how 2 hack facebook?' or 'how 2 hack friends computer?'.

by Exaelia on Tue Sep 18, 2012 6:03 pm ([msg=69456]see Re: Sorry If This Is A Stupid Question...[/msg])

WallShadow wrote:

Exaelia wrote:Oh. And I agree, it was so impressive my stupid brain couldn't handle it. Sorry if I'm trolling... I hope I haven't irritated anyone with my question.

No one gets annoyed with these questions; in fact, there are probably at least a dozen more people here who had the same question but didn't want to ask it. The only questions that irritate everyone are questions like 'how 2 hack facebook?' or 'how 2 hack friends computer?'.

by centip3de on Wed Sep 19, 2012 12:12 am ([msg=69459]see Re: Sorry If This Is A Stupid Question...[/msg])

WallShadow wrote:To add on to centipede's wonderful and detailed post, here is an example:

This is the Java source code of a program that prints 'Hello world!!!'. The computer can't run this program as it stands because it is written in our language, Java, and not the horribly compacted Java byte code that the Java Virtual Machine (the program that actually runs it) can read.

...

As you can see, we have basically the same exact thing except that the java compiler has added code for the 'import java.io.PrintStream;' line and has erased much of my precious whitespace.

-WallShadow <3

Thanks, I only wrote it in about 10 minutes, so I'm glad I did so well. Anyway, what I think I covered in my post was that the problem with these programs that attempt to reverse the program and translate that ASM into workable code is that, if you just have an executable, you don't know the language it was originally written in. So, if I were to write my program in pure ASM and someone tried to convert it to Java, things would go wrong. Granted, there are certain things you could look for in the disassembled code, such as what libraries it's calling or what it's linked against (though you may need the original source code to do this, or to have compiled it with debugging in mind, e.g. with the -g argument in gcc), but many times it can get confusing, fast.

xsvMix wrote: impressive post centip3de

Thanks!

Exaelia wrote:Oh. And I agree, it was so impressive my stupid brain couldn't handle it. Sorry if I'm trolling... I hope I haven't irritated anyone with my question.

Is there anything that we can help you understand?


by limdis on Wed Sep 19, 2012 9:32 am ([msg=69464]see Re: Sorry If This Is A Stupid Question...[/msg])

This thread right here is why I absolutely love HTS. Exaelia, stick with it. Don't ever stop asking questions! Once the pieces begin to fall into place and the clarity kicks in, the feeling is beyond rewarding.

Cent, this reminded me of your mention of making some ASM videos. Did you scrap that project or is it still in the works?

"The quieter you become, the more you are able to hear..." "Drink all the booze, hack all the things."