Binary Literacy: Static Reverse Engineering

As the title implies, this course is about analyzing software systems without executing them, as though one were reading a novel. Starting from the basic letters (assembly language instructions), words (basic blocks) are constructed; from these, sentences (functions) are put together. Sentences are organized into paragraphs (modules) which, taken together, form the bulk of chapters (executable objects). Finally, a collection of chapters makes up a book (software system).

The course begins by systematically examining how C code is compiled into assembly language, and how assembly language can be manually decompiled back into C. All of the examples used come from real-world binaries. Prior experience teaching this course shows that this gives students a good grounding in reading assembly language.
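
As a rough illustration of this round trip, the sketch below shows a small C function, a hand-written and simplified x86-64 listing of the kind a compiler might emit for it, and a literal manual decompilation of that listing back into C. The function names `sum_to` and `sum_to_decompiled` are hypothetical, the listing assumes the System V AMD64 calling convention (first integer argument in `edi`, return value in `eax`), and it is not the output of any particular compiler or taken from the course material.

```c
/* Original C source: sum the integers 1..n (hypothetical example). */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/*
 * Hand-written, simplified x86-64 listing (Intel syntax) for sum_to.
 * Assumption: System V AMD64 ABI, so n arrives in edi and the
 * result is returned in eax.
 *
 *   sum_to:
 *       xor   eax, eax        ; total = 0
 *       mov   ecx, 1          ; i = 1
 *   .loop:
 *       cmp   ecx, edi        ; compare i with n
 *       jg    .done           ; leave the loop when i > n
 *       add   eax, ecx        ; total += i
 *       inc   ecx             ; i++
 *       jmp   .loop
 *   .done:
 *       ret
 */

/* Manual decompilation, first pass: one C statement per instruction,
 * with registers kept as variable names and jumps kept as gotos. */
int sum_to_decompiled(int edi)
{
    int eax = 0;                 /* xor eax, eax */
    int ecx = 1;                 /* mov ecx, 1   */
loop:
    if (ecx > edi) goto done;    /* cmp / jg     */
    eax += ecx;                  /* add eax, ecx */
    ecx++;                       /* inc ecx      */
    goto loop;                   /* jmp .loop    */
done:
    return eax;                  /* ret          */
}
```

The literal first pass is deliberately ugly: recognizing that the `goto` pair is a `for` loop, and that `eax` and `ecx` are really `total` and `i`, is the second step of manual decompilation.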

Understanding the structure of a sentence is not enough to understand its actual meaning, just as understanding one sentence is not enough to understand a paragraph, and so on. Decompilation alone is therefore not enough: the human analyst needs techniques to comprehend the code that he or she is seeing. We will thus proceed with techniques to derive semantic meaning from assembly code.
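
To make the gap between structure and meaning concrete, here is a minimal sketch. The first function is what a structurally faithful decompilation might look like; the second is the same code after the analyst has recognized its purpose. All names are hypothetical: `sub_401000`, `a1`, and `v1` imitate the placeholder names that disassemblers commonly generate, and `string_length` is simply a descriptive name chosen for illustration.

```c
#include <stddef.h>

/* Structurally correct decompilation: the control flow is right,
 * but the placeholder names reveal nothing about intent. */
size_t sub_401000(const char *a1)
{
    const char *v1 = a1;
    while (*v1 != 0)
        v1++;
    return (size_t)(v1 - a1);
}

/* The same code after semantic analysis: it walks to the terminating
 * NUL byte and returns the distance travelled, i.e. it behaves like
 * the standard strlen. */
size_t string_length(const char *str)
{
    const char *end = str;
    while (*end != '\0')
        end++;
    return (size_t)(end - str);
}
```

Both functions compile to essentially the same instructions; only the second tells the reader what the code is for.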

With the above in hand, we are prepared to statically analyze any C-compiled binary of our choosing, and we shall spend the rest of the class reverse engineering binaries in both live and individual sessions. These binaries will consist of live malware, but it must be stressed that this is not a course on malware specifically: it is a course on reverse engineering in general, and its techniques are applicable to all sub-fields thereof (e.g. malware, security, interoperability).