
Sunday, January 13, 2013

Backtrack Forensics: PDF analysis with peepdf

This is by far the best PDF analysis tool I have seen for Linux: an all-in-one utility for analyzing PDF files. It shows all the objects and elements in the PDF, supports most encodings and filters, and can parse the PDF file. If you install Spidermonkey and Libemu, it provides JavaScript and shellcode analysis too. You can run string searches and see the physical structure (offsets) of the file, the logical tree, the changelog, and the object streams. It also offers basic PDF creation, filter and object modification, string and name obfuscation, and much more. Again, this is a great tool.

Usage:

It can be used in two modes, interactive and non-interactive; the interactive one is the more powerful. Before starting, it's worth running an update:

./peepdf.py -u

Then we can start the analysis. The most basic mode displays summary information about the PDF: the number of objects, suspicious elements, whether there is any JavaScript and where it is, the MD5 hash, etc.

./peepdf.py /root/forensics/pdf/msf.pdf
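The MD5 hash in the summary can be cross-checked independently of peepdf, which is handy when comparing a sample against other reports. Below is a minimal sketch using only the Python standard library; the function name md5_of_file is my own and has nothing to do with peepdf itself.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large samples do not have to fit
    in memory. `md5_of_file` is a hypothetical helper, not part of peepdf.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example call (the path is the sample used in this post):
# print(md5_of_file("/root/forensics/pdf/msf.pdf"))
```

The value it prints should match the MD5 line of the peepdf summary for the same file.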

I'm using the malicious PDF I created earlier with Metasploit; it exploits a getIcon vulnerability and carries a reverse_tcp Meterpreter as the payload. We can see that peepdf found right away that the PDF is probably malicious, and it also provided the CVE number. We can switch to interactive mode to discover more:

./peepdf.py -i /root/forensics/pdf/msf.pdf

It displays the same information again, but with more colors, and we get a PPDF> prompt from which we can run commands. Let's look at a couple of the options:

PPDF> help - will display the available commands

PPDF> info - displays the same basic information that we saw before

PPDF> metadata - displays metadata information of the file

PPDF> changelog - shows the changes made to the PDF

PPDF> tree - displays the logical structure of the file

PPDF> offsets - displays the physical structure (offsets) of the file

PPDF> object [ID] - shows the decoded content of an object

PPDF> rawobject [ID] - shows the raw content of an object

PPDF> stream [ID] - shows the decoded content of a stream

PPDF> rawstream [ID] - shows the raw content of a stream
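To make the raw versus decoded distinction concrete: streams in a PDF are usually compressed with a filter, most commonly FlateDecode, which is plain zlib. The sketch below is not peepdf code; it fabricates a small stream with zlib and decodes it again, just to show what separates the two views. The JavaScript string is made up for the demo.

```python
import zlib

# Stand-in for the content of a stream whose object dictionary declares
# /Filter /FlateDecode. The payload is invented for this example.
javascript = b"app.alert('hello');"

# What sits in the file between "stream" and "endstream": compressed
# bytes, comparable to what rawstream shows.
raw_stream = zlib.compress(javascript)

# After applying the filter: readable content, comparable to what
# stream shows.
decoded_stream = zlib.decompress(raw_stream)

print(repr(raw_stream))
print(decoded_stream.decode("ascii"))
```

Real samples often chain several filters, which is why it helps to let the tool do the decoding.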

PPDF> search [string] - searches for a string in the PDF and shows in which object it can be found

PPDF> references to|in [ID] - shows the references to the object (to) or the references inside the object (in)

PPDF> js_analyse object [ID] - analyzes the JavaScript in the object, if Spidermonkey is installed
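The same commands can also be run without typing them at the prompt. The sketch below assumes that your peepdf version accepts a script file through the -s option (check ./peepdf.py -h to confirm); the helper names and the script file name are my own invention. It writes a few of the commands listed above into a file and builds the command line for it.

```python
import subprocess


def write_peepdf_script(script_path, commands):
    """Write one console command per line to `script_path`."""
    with open(script_path, "w") as handle:
        handle.write("\n".join(commands) + "\n")
    return script_path


def peepdf_argv(pdf_path, script_path, peepdf="./peepdf.py"):
    """Build the argument list; assumes peepdf supports -s SCRIPTFILE."""
    return [peepdf, "-s", script_path, pdf_path]


def run_peepdf_script(pdf_path, script_path, peepdf="./peepdf.py"):
    """Run peepdf with the script file and return whatever it printed."""
    result = subprocess.run(
        peepdf_argv(pdf_path, script_path, peepdf),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


# Example (hypothetical script file name, sample path from this post):
# write_peepdf_script("triage.txt", ["info", "metadata", "tree"])
# print(run_peepdf_script("/root/forensics/pdf/msf.pdf", "triage.txt"))
```

This is useful for running the same first-pass checks over a whole folder of samples.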