Saturday, October 29, 2011

Protecting Code

As the world shifts from compiled languages such as C, C++ and
Pascal to scripting languages such as Python, Perl, PHP and
JavaScript, the exposure of intellectual property (the source code)
grows with it. While the “fat clients” of the past, usually written
in C and C++, were compiled machine-code executables, more modern
applications written in .NET and Java consist of bytecode. Java
bytecode “is the intermediate representation of Java programs”
(Peter Haggar, 2001) and retains much more of the original program
structure than machine code does. The same applies to .NET
applications, which can be disassembled using tools shipped with
the .NET Framework SDK (such as ILDASM) and decompiled back into
source code (Gabriel Torok and Bill Leach, 2003). With web
technologies such as HTML, JavaScript and Cascading Style Sheets
(CSS), where the source has to be downloaded to the client in order
to be executed by the web browser, the end user has unrestricted
access to the entire source code.

The ability to access source code can serve both legitimate and
malicious purposes. For example, security tools use the ability to
decompile Java applets and Flash to “[perform] static analysis to
understand their behaviours” (Telecomworldwire, 2009). Moreover,
software developers can use disassembly for debugging. On the other
hand, the same ability can be used to reverse engineer the
software, which directly undermines the protection of the
intellectual property.

One obvious way to try to protect the source code, and thus the
intellectual property it carries, is to use obfuscation (Gabriel
Torok and Bill Leach, 2003; Peter Haggar, 2001; Tony Patton, 2008).
Regardless of the language used to develop the application,
obfuscation usually means:

replacement of variable names with meaningless
character strings

replacement of constants with expressions

replacement of decimal values with hexadecimal,
octal and binary representations

addition of dummy functions and loops

removal of comments

concatenation of all lines in the source code
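Three of the transformations listed above (renaming, replacing constants with expressions, and removing comments) can be illustrated with a minimal Python sketch built on the standard `ast` module. It is only an illustration under stated assumptions, not how any particular commercial obfuscator works: the sample function `area`, the class name `Obfuscator`, the helper `obfuscate` and the `OFFSET` value are all hypothetical, and the naive renaming assumes the input uses no builtins or global names. It requires Python 3.9 or later for `ast.unparse`.

```python
import ast

# Hypothetical sample input; any small, self-contained function would do.
SOURCE = '''
def area(width, height):
    # Area of a rectangle plus a fixed margin.
    margin = 10
    return width * height + margin
'''


class Obfuscator(ast.NodeTransformer):
    """Renames arguments and variables and hides integer constants."""

    OFFSET = 7  # arbitrary value used to disguise constants

    def __init__(self):
        self.aliases = {}

    def _alias(self, name):
        # Map every original name to a meaningless one: _0, _1, ...
        return self.aliases.setdefault(name, f"_{len(self.aliases)}")

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Assumption: the code uses no builtins or globals, which this
        # naive renaming would break.
        node.id = self._alias(node.id)
        return node

    def visit_Constant(self, node):
        if type(node.value) is int:
            # Replace n with the expression (n + OFFSET) - OFFSET.
            return ast.BinOp(
                left=ast.Constant(value=node.value + self.OFFSET),
                op=ast.Sub(),
                right=ast.Constant(value=self.OFFSET),
            )
        return node


def obfuscate(source):
    tree = Obfuscator().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    # Comments never reach the AST, so unparsing drops them for free.
    return ast.unparse(tree)


if __name__ == "__main__":
    print(obfuscate(SOURCE))
```

The function name is left untouched so that callers keep working, which is a common compromise: anything that forms part of a public interface usually cannot be renamed. The obfuscated function still computes the same result as the original.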

In a way, the process of obfuscation changes the source code to
make it difficult for the “reader” to understand the logic behind
it. Obfuscation could be seen as “kid sister” encryption, that is,
“cryptography that will stop your kid sister from reading your
files” (Bruce Schneier, 1996). Of course, a persistent “reader” can
invest enough time and resources to reproduce the source code
(deobfuscate) by applying obfuscation principles in reverse.
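As a rough illustration of how little effort some of this reversal takes, the Python sketch below undoes one obfuscation step, the replacement of constants with expressions, by folding constant arithmetic back into plain values. The names `ConstantFolder` and `fold_constants` and the sample input are hypothetical, only integer addition, subtraction and multiplication are handled, and Python 3.9 or later is assumed for `ast.unparse`.

```python
import ast
import operator

# Integer operations this sketch knows how to fold.
FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def _is_int(node):
    return isinstance(node, ast.Constant) and type(node.value) is int


class ConstantFolder(ast.NodeTransformer):
    """Collapses arithmetic on integer literals into a single literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested expressions first
        fold = FOLDABLE.get(type(node.op))
        if fold and _is_int(node.left) and _is_int(node.right):
            value = fold(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold_constants(source):
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    # Hypothetical obfuscated line: a hex literal inside an expression.
    print(fold_constants("_2 = 0x11 - 7"))
```

Because the parser already normalises hexadecimal, octal and binary literals to plain integers, that transformation is reversed as a side effect. Meaningful variable names and comments, on the other hand, are simply gone from the obfuscated code, and the “reader” has to reconstruct them by studying the logic.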