June 12, 2011

I've been busy with a new project involving data formats. It also serves as my C++ training. In the past I developed a lot in C, mostly working on file format parser modules. This project is new for me in two ways. Firstly, it's C++. Secondly, it's about data formats and data structures. Although it will contain parser modules, it is not about parsers.

Let's assume a crime. One piece of evidence is a file fragment. It might be hard to imagine, but let's say the fragment was recovered from a corrupted disk. You need to extract evidence from that file. You need to come up with the best explanation of what the file is for. Where is it from? Who or what produced it? Maybe the meaning of the file depends on a proprietary application, but that application is not available, and you need to extract as much information as possible to draw a conclusion. How would you analyze that binary fragment? You might open it in your hex editor, run entropy analysis on it, extract the printable strings, and look for padding bytes, possible header fields, IA32 or ARM code, or constant values. That's all very nice, but doing everything manually is time consuming, and the murderer is still out there.
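Two of those manual steps, entropy analysis and printable string extraction, are simple enough to sketch in C++. This is only a minimal illustration and not code from the tool itself; the function names `shannon_entropy` and `printable_strings`, and the default minimum string length of 4, are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Shannon entropy of a byte buffer in bits per byte (range 0.0 to 8.0).
// Values near 8 hint at compressed or encrypted data; low values hint
// at padding, text, or other highly regular content.
double shannon_entropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;

    std::size_t counts[256] = {};
    for (std::uint8_t b : data) ++counts[b];

    const double n = static_cast<double>(data.size());
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;  // unused byte values contribute nothing
        const double p = static_cast<double>(c) / n;
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Collect runs of printable ASCII (0x20..0x7e) that are at least
// min_len bytes long. min_len = 4 is an assumed default.
std::vector<std::string> printable_strings(
        const std::vector<std::uint8_t>& data, std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string current;
    for (std::uint8_t b : data) {
        if (b >= 0x20 && b <= 0x7e) {
            current.push_back(static_cast<char>(b));
        } else {
            if (current.size() >= min_len) found.push_back(current);
            current.clear();
        }
    }
    // A run may extend to the very end of the fragment.
    if (current.size() >= min_len) found.push_back(current);
    return found;
}
```

Running both over a sliding window rather than the whole fragment would probably be more useful in practice, since a change in entropy or a cluster of strings often marks a boundary between fields or sections.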

This tool is about analyzing binary objects, including unknown ones, and discovering their structure. The forensic example above is just one use case; I could just as well have talked about how important it is to discover the underlying structure of objects for targeted fuzzing.