Saturday, June 19, 2010

Data recovery tutorial using Ubuntu Linux

I have an old hard disk with corrupted NTFS volumes and I want to recover the data. I'm not sure how the volumes got corrupted, but chkdsk /f cannot fix them. Fortunately, there's a plethora of open source data recovery tools available. Two such tools are foremost and photorec, which specialize in combing through hard drive partitions to recover files based on header information. Because they work from the file contents rather than from the file system's own records, they can even recover files after accidental reformatting.
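To make "recovering files based on header information" concrete, here is a minimal sketch of the idea in Python. It is a much-simplified illustration and not how foremost or photorec are actually implemented. The function name carve_jpegs and the max_size limit are my own assumptions for the example. It scans a block of raw bytes for the JPEG start marker, then looks for the JPEG end marker within a size limit.

```python
from typing import List

# JPEG files start with the bytes FF D8 FF and end with FF D9.
JPEG_START = b"\xff\xd8\xff"
JPEG_END = b"\xff\xd9"


def carve_jpegs(data: bytes, max_size: int = 20 * 1024 * 1024) -> List[bytes]:
    """Return candidate JPEG byte strings found in raw data.

    Hypothetical helper for illustration: it pairs each start marker with
    the first end marker that follows it, within max_size bytes.
    """
    found = []
    pos = 0
    while True:
        start = data.find(JPEG_START, pos)
        if start == -1:
            break  # no more headers in the data
        # Look for the footer after the header, but only within max_size bytes.
        end = data.find(JPEG_END, start + len(JPEG_START), start + max_size)
        if end == -1:
            pos = start + 1  # header without a usable footer; keep scanning
            continue
        found.append(data[start:end + len(JPEG_END)])
        pos = end + len(JPEG_END)
    return found
```

A real image can contain the end marker inside an embedded thumbnail, and a fragmented file will not be contiguous on disk, so a naive scan like this would return truncated or broken images in those cases. The point is only to show why file names are not needed for this style of recovery: the file system is never consulted.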

Foremost is a tool originally developed by the U.S. Air Force and is available via sudo apt-get install foremost. It can recover common file types such as txt, jpg, avi, etc. Foremost was last updated in 2008, which means that its knowledge of file headers is, at best, two years out of date.

Photorec is part of the testdisk suite and is available via sudo apt-get install testdisk. Testdisk is a tool that not only "tests your disk" but also rebuilds your partition table. This is the tool to use if your hard drive's master boot record or partition table is corrupt. Photorec, like testdisk, is a poorly named command-line tool. Like foremost, photorec recovers files (not just photos) based on file headers. In fact, photorec supports more file types and is more up-to-date than foremost, which is evident from the fact that I was able to recover more files with photorec than with foremost.

The problem with both foremost and photorec is that they recover file content but not file names. So you end up with directories of randomly named files with only the file extension preserved. It's not ideal but it's still better than not having the data at all.
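Since the extension is the only metadata that survives, one way to make the results browsable is to group the recovered files by extension. Below is a rough sketch of that, assuming the recovered files sit somewhere under a single directory (for example photorec's recup_dir.N folders). The function name sort_by_extension and the "no_extension" folder name are made up for this example.

```python
import shutil
from pathlib import Path


def sort_by_extension(recovered_dir, sorted_dir):
    """Copy every file under recovered_dir into sorted_dir/<extension>/.

    Hypothetical helper: files are copied, not moved, so the original
    recovery output stays untouched. Returns a dict of extension -> count.
    """
    recovered = Path(recovered_dir)
    target_root = Path(sorted_dir)
    counts = {}
    # sorted() collects the full file list before any copying starts.
    for path in sorted(recovered.rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower().lstrip(".") or "no_extension"
        dest_dir = target_root / ext
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / path.name
        n = 1
        # Avoid overwriting when two recovered files share a name.
        while dest.exists():
            dest = dest_dir / f"{path.stem}_{n}{path.suffix}"
            n += 1
        shutil.copy2(path, dest)
        counts[ext] = counts.get(ext, 0) + 1
    return counts
```

Calling sort_by_extension("recovered", "sorted") would leave you with folders such as sorted/jpg and sorted/txt. It's best to keep the sorted directory outside the recovered one, and on a different disk from the one being recovered.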
