Saturday 1 November 2008

Quickpost: Fingerprinting PDF Files

Per request, a more detailed post on how I use my pdf-parser stats option.

I have two malicious PDF files with a different title, different size (100K and 700K) and different content. But they share an identical internal PDF structure, because they have exactly the same number and type of fundamental elements:

These statistics were generated with the following command:

pdf-parser.py --stats malware.pdf

As both malicious PDF files produce identical stats (or fingerprint), I can assume they share the same origin.