Monday, February 24, 2014

Recover from a bad superblock

When things go really bad, you may not be able to recover a disk. In those times, think of salvaging the data, reformat and live to fight another day. Consider how valuable the data is versus the time spent on repairing something that is damaged and may not be salvagable. testdisk photorec ddrescue are the tools to think of when you come that decision
But I do enjoy a challenge and when a USB disk was brought to me with mounting problems, I just couldn't pass it up. It was an uncommon setup. The USB stick had two partitions, one with an ext3 filesystem and the other with FAT32. I decided to focus on the ext3 filesystem first.

To cut a long story short, my efforts to mount the disk met with screens full of error messages and cryptic clues as to what went wrong. Running fsck seemed to clean it first but it still would not mount the partition. Running fsck again would yield more and a different set of errors. My previous boss love to used the expression "time to decide: Fish or cut bait". It was one of those times.
This is probably the last ditch effect before you make that fateful decision. This is the line in the sand and the one you have to cross before deciding to put your effort in getting the data out and start all over again.
The recovery process involves rewriting the information about the partition. Specifically, reinitializing the superblock and group descriptors. However, reinitializing does not touch the data part of the partition. It does not touch things like the inodes and the blocks themselves. So by starting out with a 'fresh' set of information that is used to mount the disk, there is a possibility that the data may still be readable. After that, the data part gets checked and hopefully what you end up with is a filesystem that can be mounted properly.
The process can only be done when the partition is not mounted. If you have tried other ways, it most probably isn't. Mine wasn't, obviously.
So here's the process.
1. First, figure out the block size of the USB drive (in this case /dev/sdf1). I need that information to re-build the partition information. Run the commanddumpe2fs /dev/sdaf1 | grep 'block size' -iBlock size: 4096

2. Then format the superblocks. The command below won't format the whole partition, only the superblocks. It is critical that you use the correct block size gathered from the previous stepmke2fs -S -b 4096 -v /dev/sdf1

3. Now that the partition information is 'fresh', I checked the inodes to figure out what else could be wrong with the filesystem. Remember ext3 = ext2+journalling. So, ext2 tools still worke2fsck -y -f -v -C 0 /dev/sdf1

4. Now that I've done with one element of the ext3 equation, it time to fix the journalling system or more specifically the journal data .tune2fs -j /dev/sdf1

5. Re-attempt to mount the partition. If everything went well, you should be able to mount the partition and read the data.

After that, for hard disks, you have to determine whether the disk has reached it's threshold limits. Things like SMART properties will help you get that information.