I find that more and more often binaries are packed with executable protectors such as UPX, ASPack, etc. I tried to follow a few tutorials on how to unpack them, but the examples are often quite easy while my targets are not.

I am looking for good resources and any hints/tips on how to unpack targets.

The lena tutorials will work through a lot of the techniques you will see in many packers. There are some tutorials that focus more on cracking, but those generally have good information as well.
– amccormack Mar 23 '13 at 5:30

3 Answers

Tracing through the unpacking code is not difficult with simple packers but can be tricky with the more advanced ones. They may employ timing checks (rdtsc), exception-based control transfer, debug registers used for calculations, etc. Running the target in a VM or an emulator usually helps against most of them.
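
To illustrate why timing checks get in the way of tracing, here is a minimal sketch of the idea in Python. A real packer would read the CPU timestamp counter with rdtsc; `timing_check`, `threshold_ns` and the injectable `clock` are names I made up for the illustration, and `time.perf_counter_ns` merely stands in for the counter.

```python
import time

def timing_check(work, threshold_ns, clock=time.perf_counter_ns):
    """Return True if `work` took suspiciously long (e.g. it was single-stepped)."""
    start = clock()           # first timestamp (rdtsc in a real packer)
    work()                    # the code the packer suspects is being traced
    elapsed = clock() - start
    return elapsed > threshold_ns
```

Single-stepping the code between the two timestamps inflates the delta far past any sane threshold. An emulator or a VM that controls the timestamp source can keep the delta small, which is presumably why they help here.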

2. Find the original entry point (OEP)

There are many ways to do this. Sometimes the jump to OEP is obvious when it follows a chunk of looping code and there's nothing reasonable-looking after it. Or you may recognize the code at OEP if you're familiar with the entry points produced by different compilers. A few other tricks:

If the packer saves the original registers before unpacking, set a hardware breakpoint on their location in the stack. This way you'll break right when they're restored before jumping to OEP.

If during tracing you can identify the memory where the unpacked code is being written, set a page execution breakpoint on that memory range. It will trigger after the jump. IDA allows you to set such a breakpoint, and I think OllyDbg does too.

Set breakpoints on common APIs used by startup code, e.g. GetCommandLine or GetVersionEx. This won't get you the exact OEP, but you can usually walk back up the call stack and find it more or less easily.
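
As a static complement to the tricks above, the "restore registers, then jump" pattern can sometimes be spotted in the bytes themselves. The following is only a sketch, assuming a 32-bit x86 stub that restores registers with POPAD (0x61) and then uses a JMP rel32 (0xE9), as UPX-style stubs typically do. `find_tail_jumps`, `base` and `window` are hypothetical names, and since this is a raw byte scan rather than a real disassembly, expect false positives.

```python
import struct

POPAD = 0x61      # x86 (32-bit) opcode that restores all general registers
JMP_REL32 = 0xE9  # x86 near jump with a 32-bit signed displacement

def find_tail_jumps(code, base, window=16):
    """Yield (jmp_address, target) for a JMP rel32 within `window` bytes after a POPAD."""
    for i, byte in enumerate(code):
        if byte != POPAD:
            continue
        # Stop early enough that the 4-byte displacement is still inside `code`.
        for j in range(i + 1, min(i + 1 + window, len(code) - 4)):
            if code[j] == JMP_REL32:
                (rel,) = struct.unpack_from("<i", code, j + 1)
                # The displacement is relative to the end of the 5-byte jump.
                yield base + j, (base + j + 5 + rel) & 0xFFFFFFFF
                break
```

Any target that lands far away from the stub, inside the section that was being written during unpacking, is a candidate OEP worth confirming in the debugger.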

3. Dump the unpacked code

If you're using IDA, you don't actually need to dump the process into a separate file. It's enough to take a memory snapshot, which copies the bytes from memory to the database so you can analyze them later. One thing to keep in mind here is that if the packer used dynamically allocated memory, you need to mark it as "loader" so it gets included in the snapshot.

4. Restore imports

I'm not very familiar with how it's done in Olly or other debuggers, but AFAIK you need to use a tool like ImpREC on your dump and a copy of the process in memory.

It's somewhat simpler (IMO) in IDA. You just need to find the import table and rename the pointers according to the functions they are currently pointing to (this should be done while the debugger is active). You can use either the renimp.idc script or the UUNP "manual reconstruct" feature.

For finding import table there are two tricks I sometimes use:

Follow some calls in the startup code at OEP to find external APIs; this should lead you to the import table. Usually the start and the end of the table are obvious.

During unpacking, set a breakpoint on GetProcAddress and see where the results are written. This, however, won't work with packers that do manual import resolution using the export directory. Putting a read BP on kernel32's export table might help here.
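
The "start and end are obvious" observation can be sketched in code: a resolved import table shows up as a run of consecutive pointer-sized slots that all point at exported functions. This is a rough illustration only. `find_pointer_runs` and its parameters are names I invented, and `known_apis` (address to name) is assumed to come from whatever your debugger reports for the loaded modules.

```python
import struct

def find_pointer_runs(memory, base, known_apis, ptr_size=4, min_len=3):
    """Return [(start_address, [names...])] for runs of slots that point at known APIs."""
    fmt = "<I" if ptr_size == 4 else "<Q"
    runs, current, start = [], [], None
    for off in range(0, len(memory) - ptr_size + 1, ptr_size):
        (value,) = struct.unpack_from(fmt, memory, off)
        name = known_apis.get(value)
        if name is not None:
            if not current:
                start = base + off      # first slot of a new run
            current.append(name)
        else:
            # A non-API value (often a null terminator between DLLs) ends the run.
            if len(current) >= min_len:
                runs.append((start, current))
            current = []
    if len(current) >= min_len:
        runs.append((start, current))
    return runs
```

Each run it reports is a candidate block of import pointers that you can then rename by hand or with a script.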

5. Clean up

This is optional but it may be useful to remove the remains of the packer code that would only distract you. In IDA, you should also apply a compiler FLIRT signature if you recognize the compiler used.

6. Making an unpacked executable

I don't do this step as I rarely need to run the unpacked file, but in general you need to fix up the PE header so that the file offsets of the sections match where their data actually sits in the dump.
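
For a dump that was written out exactly as it was laid out in memory, one common way to do that fixup is to point each section's raw data at its virtual address. Here is a minimal sketch under that assumption; `align_raw_to_virtual` is a name I made up, the field offsets are the standard PE/COFF ones, and a real fixup may also have to round sizes up to FileAlignment, set the entry point to the OEP, and repair the import directory.

```python
import struct

def align_raw_to_virtual(dump):
    """Return a copy of `dump` whose section headers use the in-memory layout on disk."""
    data = bytearray(dump)
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)       # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    table = pe_off + 24 + opt_size                          # first section header
    for i in range(num_sections):
        hdr = table + i * 40                                # each header is 40 bytes
        virt_size, virt_addr = struct.unpack_from("<II", data, hdr + 8)
        # SizeOfRawData (+16) and PointerToRawData (+20) now mirror memory.
        struct.pack_into("<II", data, hdr + 16, virt_size, virt_addr)
    return bytes(data)
```

After this, a tool that reads the file by raw offsets sees the same bytes the process had in memory.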

Now, there are many variations and tricks not covered by the above steps. For example, some packers don't fully resolve imports initially but put in jumps to stubs that resolve the import on first call and then patch it so it goes directly to the target next time. Then there is the "stolen code" approach, which makes it harder to find and recover the OEP. Sometimes the packer runs a copy of itself and debugs it, so that you can't attach your own debugger to it (this can be solved by using an emulator or a tool that doesn't rely on the debugging APIs, such as Intel PIN). Still, the outlined steps can cover quite a lot of what's out there.

Good answer (+1), and undoubtedly IDA plays a big role in RCE in general, but I think you shouldn't limit your answer to just IDA (yeah, I saw the mentions of ImpREC and OllyDbg).
– 0xC0000022L♦ Mar 22 '13 at 22:54

@0xC0000022L: I am unfortunately not familiar with unpacking in OllyDbg, I only know of it in theory. But I think most of my answer can be used with any debugger (in fact, I wouldn't say it's "limited" to IDA at all). You could add your own answer specifically about unpacking in OllyDbg, though!
– Igor Skochinsky♦ Mar 22 '13 at 23:26

@IgorSkochinsky I'm actually very happy that you covered IDA here because frankly, there is a ton of info on how to do this in Olly/x64 on Tuts4You and elsewhere, but not much on how to do this in IDA Pro. I am very thankful for this as I learned an entirely new way to handle this problem completely in IDA Pro. Do you have any more IDA Pro suggestions to solve this problem newer than this post (plugins/blogs/etc)? Thank you Igor.
– the_endian Nov 22 '17 at 9:31

Igor's answer is very good. However, the outlined techniques rely on the assumption that at some point the executable is unpacked in memory. This is not always true. Virtualization obfuscators compile the original binary into a custom instruction set, which is executed by a simulator at runtime. If you encounter a binary obfuscated in this way, you have no choice but to write a disassembler that translates the custom instruction set into an instruction set that you understand.
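
To give an idea of what such a disassembler looks like, here is a toy sketch. The opcode table is entirely made up for illustration. With a real protector you would have to recover the opcodes and operand sizes by reversing the simulator's dispatch loop, and they often differ per protected binary.

```python
# Hypothetical opcode table for a made-up VM: opcode -> (mnemonic, operand bytes).
OPCODES = {
    0x01: ("push", 4),
    0x02: ("add", 0),
    0x03: ("jmp", 2),
    0xFF: ("ret", 0),
}

def disassemble(bytecode):
    """Return a list of (offset, text) pairs for a toy bytecode stream."""
    out, pc = [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        mnemonic, width = OPCODES.get(op, (None, 0))
        if mnemonic is None:
            out.append((pc, f"db 0x{op:02x}"))   # unknown byte, emit as data
            pc += 1
            continue
        operand = bytecode[pc + 1:pc + 1 + width]
        if len(operand) < width:
            out.append((pc, f"{mnemonic} <truncated>"))
            break
        if width:
            value = int.from_bytes(operand, "little")
            out.append((pc, f"{mnemonic} 0x{value:x}"))
        else:
            out.append((pc, mnemonic))
        pc += 1 + width
    return out
```

The listing it produces is only the first step; you still need to lift the VM instructions back to something like x86 semantics to understand the protected code.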

Well, technically even the VM protectors leave the file unpacked in memory, i.e. they still work as "wrapping packers". The only difference is that it is much harder to make sense of the code which is protected by a VM, even if it's "in plain sight".
– newgre Apr 2 '13 at 22:06

@newgre: not all of them will unpack everything at once, though. So you may end up with bits and pieces only.
– 0xC0000022L♦ Apr 3 '13 at 17:31

@newgre: can you explain how step 3 of Igor's process is possible when the only thing in memory is bytecode for a randomized instruction set? The only way to dump an executable protected in this way is to write a disassembler for the bytecode.
– 94c3 Apr 7 '13 at 18:38