.NET Framework IL: What is Obfuscation?

Q: What is Obfuscation?

A: When a well-written obfuscator tool goes to work on readable program instructions, a likely side effect is that the output will not only confuse a human interpreter, it will break a decompiler. While the forward (executable) logic has been preserved, the reverse semantics have been rendered non-deterministic. As a result, any attempt to reverse-engineer the instructions to a “programming dialect” like C# or VB will likely fail because the translation is ambiguous. Deep obfuscation creates a myriad of decompilation possibilities, some of which might produce incorrect logic if recompiled. The decompiler, as a computing machine, has no way of knowing which of the possibilities could be recompiled with valid semantics. Humans write and employ decompilers to automate decompilation algorithms that are too challenging for the mind to follow. It is safe to say that any obfuscator that confuses a decompiler will pose even more of a deterrent to a less-capable human attempting the same undertaking.

Primitive obfuscators essentially rename identifiers found in the code to something unreadable. They may hash the names or arithmetically offset their characters into unreadable or unprintable ranges. While superficially effective, offset schemes are trivially reversible, and renaming of either kind leaves the structure of the program untouched, so these techniques are hardly protective. PreEmptive’s obfuscation tools go far beyond this primitive renaming approach with additional ingenious ways of “creating confusion” that make it nearly impossible (and certainly not worth the effort) to reverse-engineer someone else’s intellectual property.
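
To make the point about reversibility concrete, here is a minimal sketch of an arithmetic character-offset rename. It is written in Python purely for illustration; the function names, the shift value, and the sample identifier are all hypothetical and are not taken from any real obfuscator.

```python
# Hypothetical illustration of a "primitive" rename: every character of an
# identifier is shifted by a fixed amount into an unreadable Unicode range.
# The shift value 0x2000 is an arbitrary assumption for this sketch.
SHIFT = 0x2000


def offset_rename(name: str, shift: int = SHIFT) -> str:
    """Shift each character's code point upward by a fixed amount."""
    return "".join(chr(ord(ch) + shift) for ch in name)


def undo_offset(name: str, shift: int = SHIFT) -> str:
    """Reverse the rename by subtracting the same shift."""
    return "".join(chr(ord(ch) - shift) for ch in name)


original = "CalculateInvoiceTotal"        # made-up identifier
scrambled = offset_rename(original)       # unreadable, but same length
recovered = undo_offset(scrambled)        # identical to the original again
```

Anyone who works out the shift gets every original name back, which is why a rename of this kind offers little real protection.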

Why do I need obfuscation?

The .NET platform realizes Microsoft’s vision for the next paradigm in Windows computing: multiple programming languages interacting harmoniously, sharing an enriched object-based framework, contained within a common runtime engine, and executed through just-in-time compilation. While not exactly the Java platform concept, it is obvious that the .NET architecture shares some common ground.
One concept that Java and .NET share is the use of expressive file syntax for delivery of executable code: bytecode in the case of Java, MSIL (Microsoft Intermediate Language) for .NET. Being much higher-level than binary machine code, the intermediate files are laden with identifiers and algorithms that are immediately observable and ultimately understandable. .NET ups the ante by including readable metadata that explains the intended runtime behavior of the file. Add the mechanized assistance of decompilers and you have a situation that clearly exposes intellectual property to compromise and invites security breaches.
Organizations concerned with their intellectual property need to take a hard look at this issue when considering the .NET platform. Obfuscation is a technique that provides for seamless renaming of symbols in your assemblies as well as other tricks to foil decompilers. Properly applied obfuscation can increase the protection against decompilation by many orders of magnitude, while leaving the application intact.

Obfuscated code is also more compact. A positive side effect of renaming is size reduction. For example, if you have a name that is 20 characters long, renaming it to a() saves a lot of space (specifically 19 characters). This also saves space by conserving string heap entries. Renaming everything to “a” means that “a” is stored only once, and each method or field renamed to “a” can point to it. Overload Induction enhances this effect because the shortest identifiers are continually reused. Typically, an Overload Induced project will have up to 70% of the methods renamed to a().
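
The size argument can be sketched roughly as follows. This is a simplified Python illustration of the idea of reusing the shortest name wherever parameter signatures differ. It is not PreEmptive’s actual algorithm, and the method names and signatures are invented. A real tool must also respect constraints such as inherited and interface methods, which this sketch ignores.

```python
from collections import defaultdict


def short_name(index: int) -> str:
    """Map 0 -> 'a', 1 -> 'b', ..., 25 -> 'z', 26 -> 'aa', and so on."""
    name = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("a") + rem) + name
    return name


def induce_overloads(methods):
    """Give each method the shortest name not yet used for its signature.

    Methods with different parameter lists can share a name because they
    are then just overloads of one another.
    """
    used_per_signature = defaultdict(int)
    mapping = {}
    for name, params in methods:
        mapping[(name, params)] = short_name(used_per_signature[params])
        used_per_signature[params] += 1
    return mapping


# Hypothetical methods: (original name, parameter types).
methods = [
    ("CalculateInvoiceTotal", ("int",)),
    ("FormatCustomerName", ("string",)),
    ("ValidateOrder", ("int", "string")),
    ("ResetCache", ()),
    ("LookupCustomer", ("int",)),  # same signature as the first method
]

mapping = induce_overloads(methods)

# Each distinct name is stored once in the string heap, so the heap cost is
# the total length of the distinct names.
heap_before = sum(len(n) for n in {name for name, _ in methods})
heap_after = sum(len(n) for n in set(mapping.values()))
renamed_to_a = sum(1 for new in mapping.values() if new == "a")
```

In this made-up input only LookupCustomer needs a second name, because its signature collides with CalculateInvoiceTotal. Every other method becomes a(), and the distinct names to be stored shrink to “a” and “b”.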