MetaData: I’ll Bet You Thought That Was Private?

Metadata (meta data, or sometimes metainformation) is "data about data", of any sort in any media.

So I hear you thinking: who cares? Well, for starters: you should.

MetaData contains a lot more information than "data about data". Documents such as .PDF, .DOC, .XLS, .PPT, ... contain information such as

Revision history of files (in case of Word documents)

Usernames of the person creating/editing the file

Paths to where the file was/is located

Software version used (Word 5.0, Word 10.0, ...)

Public network shares

...

If you're still saying "so what?", ask yourself the following question: should this data really be public? Should everyone really know my username to my computer? Or everyone who contributed to a certain file? Or where I saved it, and what software I used?

If I were a malicious person, I could use that information for a targetted attack: I can send you a phishing e-mail, with the name of some of your colleagues in it, or one of those names as the FROM-address, so it looks legitimate. I could use that software version number to attach a very specific software exploit, so I can gain control over your system. I can use your username to brute-force your password.

See a trend there? The MetaData is giving out a lot of info that can be abused, and there are plenty of ways to get it. Consider our good friend Google for a second, they have some very nifty filters you can use in order to search efficiently. Ever searched for the string "site:microsoft.com filetype:doc"? It gives you a list of all .DOC files, found on the microsoft.com site.

By using publicly available information, I can get enough information to get an idea of the internal layout of a company. And I haven't even set foot inside it yet. Tools such as Metagoofil simplify the act of getting this information, by searching Google for you -- and extracting the metadata.