How a Corrupted USB Drive Was Saved by GNU/Linux

When I actually started looking, however, I wasn't really sure
whether this was FAT16 or FAT12. The drive's capacity of 512MB
suggested it could be either FAT16 or FAT32. I also somehow had the
impression that the partition could have contained a FAT32 filesystem
in the same partition type. As I continued to look through the
filesystem, I noticed this:

On a side note, I recently discovered the hard way that CMD |
less doesn't do what you want it to if the output of CMD is too long.
In this case it was okay to use, but it isn't always; this probably is
system-dependent. If you have enough space on your hard drive, it may pay to do something
like this:

# od -Ax -w8 -tx1 -tc /tmp/r1 > /tmp/r2; less /tmp/r2

or

# hexdump -C /tmp/r1 > /tmp/r2; less /tmp/r2

So this looks like the start of a directory. Immediately above
that area, though, I saw this:

That looked like an allocation chain with 16-bit entries.
If these had taken the form 31 dd 00 00 32 dd 00 00
rather than 31 dd 32 dd, I might have thought I was
looking at FAT32.
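
To make the 16-bit versus 32-bit reading concrete, here is a minimal Python sketch that decodes the same bytes both ways. The helper name decode_entries is made up for illustration, and the only assumption is that FAT entries are stored little-endian, as they are on disk.

```python
import struct

def decode_entries(raw: bytes, bits: int) -> list[int]:
    """Interpret raw bytes as little-endian FAT entries of the given width."""
    if bits not in (16, 32):
        raise ValueError("only 16- and 32-bit entries are handled here")
    size = bits // 8
    count = len(raw) // size
    code = "H" if bits == 16 else "I"
    # "<" selects little-endian with no padding between fields.
    return list(struct.unpack_from("<" + code * count, raw))

# Bytes as seen in the dump: 31 dd 32 dd
seen = bytes.fromhex("31dd32dd")
as_fat16 = decode_entries(seen, 16)   # two consecutive cluster numbers
as_fat32 = decode_entries(seen, 32)   # one large, implausible number

# The form that would have suggested FAT32: 31 dd 00 00 32 dd 00 00
hypothetical = bytes.fromhex("31dd000032dd0000")
hypothetical_fat32 = decode_entries(hypothetical, 32)

print([hex(v) for v in as_fat16])
print([hex(v) for v in as_fat32])
print([hex(v) for v in hypothetical_fat32])
```

Read as 16-bit entries, the bytes give two consecutive cluster numbers, which is what an unfragmented allocation chain looks like. Read as a 32-bit entry, they give a single value far larger than a 512MB drive could need.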

I had heard somewhere that typically two FATs can be found together,
one right after the other. I told less(1) to find another line
resembling the line at 0x42460, by typing ?31 dd 32 dd 33 dd.
In response, less(1) showed me this:
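
The same search can be done on the raw bytes instead of the hex dump text. The Python sketch below is only an illustration: find_all is a made-up helper, and the small synthetic buffer with made-up offsets stands in for the real image, which is not reproduced here.

```python
def find_all(image: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which pattern occurs in image."""
    offsets = []
    pos = image.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(pattern, pos + 1)
    return offsets

# The byte sequence typed into less(1): 31 dd 32 dd 33 dd
pattern = bytes.fromhex("31dd32dd33dd")

# Synthetic stand-in for the filesystem image: the same run of
# entries placed at two made-up offsets, as two FAT copies would be.
buf = bytearray(4096)
buf[0x460:0x466] = pattern
buf[0xC60:0xC66] = pattern

hits = find_all(bytes(buf), pattern)
print([hex(h) for h in hits])
```

For a real image, the bytes could be read with open("/tmp/r1", "rb").read(), memory permitting. The distance between two hits then hints at the size of one FAT.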

Luckily, this looked okay too. In fact, FAT#2 might be completely okay
even though the first 40KB or so of FAT#1 had been corrupted.

Repair Attempt #1

All of this has been interesting, but the point of this exercise
was to repair the filesystem and read the data. So I now turned to my
friend fsck for the repair work, in particular fsck.msdos, also known as dosfsck(8).
I took the filesystem image and did what needed to be done with a spare
loop device:

# losetup /dev/loop2 /tmp/r1
# fsck.msdos /dev/loop2

But according to fsck.msdos(8), the "disk" claimed to have something
near 165 FATs, whereas fsck.msdos only supports two. Apparently, some
filesystem parameters were messed up severely.


It was an enlightening article for me. However, I have a slightly different problem. I have a USB disk (1GB) which, when I run fsck.vfat on it, shows the following:
dosfsck 3.0.3, 18 May 2009, FAT32, LFN
/dev/sdb1: 379 files, 55267/62958 clusters

I cannot see the files on the drive. Moreover, when I do:
$ sudo sfdisk -l -f /dev/sdb1

I was having issues with a flash drive not being detected just now. It wouldn't mount, and fsck.msdos gave a funny error. Following your steps to view the partition table, and with a bit of help from Google, I found that the active flag was set to 81H. So I used fdisk to toggle it and now it works. :)

I bought a 49GB ameco flash drive. After two months of use, I formatted it as NTFS in Windows by mistake. Since then it shows up as a RAW filesystem. Windows still shows the drive, but will not format it as FAT/FAT32. Then I tried it in Linux, but the drive does not show up. A friend of mine said that it has a problem with sector 0. What can I do?

This makes sense. I once had a friend who had a Microsoft Word document on his flash drive which was corrupted on Windows. I stuck the flash drive in my laptop running Ubuntu v 6.06 (Freeware/shareware Linux) and opened the file up without any problems whatsoever. It just worked.

The many FF FF entries in the beginning of the FAT were probably correct:

In FAT systems, defragmenting utilities usually place directories at the beginning of the partition. If there are not too many entries in a directory, the 32-byte entries in the directory all fit in one block (e.g. in a system with 8k-blocks, there are 256 entries per block), so the directory "file" has the length of one block. It is marked in the FAT with FFF (FAT12), FF FF (FAT16) or FF FF FF FF (FAT32).

So I would expect to see many FF bytes in the start of a FAT. Did you ever look whether the FAT contained FF bytes at the beginning or not?

The boot sector of a FAT disk partition (not the MBR) contains drive parameters; if the FAT boot sector is damaged, you will have some problems. I suppose Windows damaged that sector due to some bug, not the FAT, writing there e.g. the information that there are about 165 FATs ;-)

The disk parameters consist of head/cylinder/sector information (which can safely be ignored, and is ignored, I guess, even by Windows, since it is purely redundant, the relevant data being in the MBR), plus some routine information (e.g. bytes per sector, which is virtually always 512), plus vital information such as the number of reserved sectors at the start of the partition, which has to be known to calculate the start of the first FAT. Whether the information about the number of FATs is vital depends on the probability that someone created a FAT system with only one FAT ...
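
For readers who want to look at these parameters themselves, here is a hedged Python sketch that reads the first few BIOS parameter block fields from a FAT12/FAT16 boot sector. The field offsets follow the standard FAT layout. The names Bpb, parse_bpb, first_fat_offset and looks_sane are made up for illustration, and the sample values are synthetic, not taken from the article's drive.

```python
import struct
from typing import NamedTuple

class Bpb(NamedTuple):
    bytes_per_sector: int      # offset 11, 2 bytes
    sectors_per_cluster: int   # offset 13, 1 byte
    reserved_sectors: int      # offset 14, 2 bytes
    num_fats: int              # offset 16, 1 byte
    root_entries: int          # offset 17, 2 bytes
    total_sectors_16: int      # offset 19, 2 bytes
    media: int                 # offset 21, 1 byte
    sectors_per_fat_16: int    # offset 22, 2 bytes (0 on FAT32)

def parse_bpb(boot_sector: bytes) -> Bpb:
    """Decode the little-endian parameter fields starting at offset 11."""
    return Bpb(*struct.unpack_from("<HBHBHHBH", boot_sector, 11))

def first_fat_offset(bpb: Bpb) -> int:
    """Byte offset of the first FAT from the start of the partition."""
    return bpb.reserved_sectors * bpb.bytes_per_sector

def looks_sane(bpb: Bpb) -> bool:
    """Rough plausibility check; an assumption, not what fsck.msdos does."""
    return (bpb.bytes_per_sector in (512, 1024, 2048, 4096)
            and bpb.sectors_per_cluster in (1, 2, 4, 8, 16, 32, 64, 128)
            and bpb.reserved_sectors >= 1
            and bpb.num_fats in (1, 2))

# Synthetic boot sector with made-up but plausible values.
sector = bytearray(512)
struct.pack_into("<HBHBHHBH", sector, 11, 512, 32, 1, 2, 512, 0, 0xF8, 250)
good = parse_bpb(bytes(sector))

# The same sector with the FAT count clobbered, as in the article.
sector[16] = 165
bad = parse_bpb(bytes(sector))

print(good, first_fat_offset(good), looks_sane(good))
print(bad.num_fats, looks_sane(bad))
```

On a real image the first 512 bytes of the partition would be passed to parse_bpb. A FAT count of 165 fails the check immediately, which matches the complaint from fsck.msdos.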

Collin - Thank you for posting this article! Yesterday, I accidentally shut down my Windows XP computer without ejecting my Lexar Jumpdrive. Afterward, I got the "Drive not Formatted" error. Fortunately I found your article, and have a second computer running Redhat. Following essentially the same procedure you described, I was able to recover all the data from my disabled Jumpdrive. You saved the day. Thanks again!!

Hey! I just happened to jump to this page, nice comments here. My problem is that this very thing happened to me 2 weeks back. My 512 MB Kingston USB disk got corrupted. I have tried numerous programs to read it. It is shown in Windows XP as a removable drive but it cannot be formatted. The filesystem is shown as RAW. Some hard disk tools do show the exact size of the disk, but can anyone tell me how to get the disk back? I will appreciate replies to my email at sash@highnoon.com.pk

I totally agree i knew i shouldn't have wasted my dollars on this crap software it completely fucked up my drive and none of the files it recovered actually worked!!!!!!!! useless piece of crap software i am demanding a refund but these fuckers will probably get away with it!!!

The information here was invaluable, because it gave me some encouragement in trying to recover the 250GB USB drive that Win2K suddenly didn't want to accept after I missed that a VMWare virtual machine had mounted the drive...

I lost the partition table and have no disk large enough to create an image to.

With the above info I started dumping 50MB of the disk and found that no physical damage was evident, but still no way of finding the appropriate data for the approach above.
Knowing that it was a single 250GB FAT32 partition I downloaded GPart (http://www.stud.uni-hannover.de/user/76201/gpart/).

It guessed correctly, I believe, and after a reboot the disk was mountable (in fact it automounted, which maybe wasn't the best approach). I ran fsck.vfat with no reported errors!

Once again, I had great help of the info you supplied and hope that more people in desperate need will find your article.

Where do you find this info? This is way beyond my meager knowledge. Any references would be *really* useful.

Sorry, I don't remember. But I remember reading somewhere that what FATxx means is that the cluster numbers take xx bits. Hence seeing two consecutive 16-bit numbers (31 dd 32 dd) gave me the clue that it was FAT16. Unless a FAT filesystem is severely fragmented, one would expect to see consecutive cluster numbers either in a file allocation chain (cluster numbers within a file) or the free list.

NOTE: I haven't verified the full technical accuracy of the following links, your mileage may vary, not a statement of my employer, don't try this at home, etc etc etc., but these are just links that appear useful, just based on a casual glance:

The card is now mountable, I see all directories and all files, *g*
but only files smaller than 0x4000 bytes (16384) can be read without
read/write errors. dosfsck will truncate all files to <=16384 and
is no help. Is this 0x4000 limit based on the cluster size, meaning that
the information needed to jump to the next cluster is missing?
With fsck I also got the error message that multiple files or
directories use the same cluster.

ls -lRA is working, but some files and dirs have the date of the
crash day, although they are older and were not written on that day.

So "ls -lRA" is not the final "Acid Test" for the recovery, a "cp -Rp *"
would not work in my case.

-What can I do instead of msfsck?
-When I see something with "ls -lRA", are FAT1/FAT2 still there?
-How can I extract the starting addresses for each file I see
with ls -lRA?

I know that there are tools like autopsy, but I hope that
the FATs are still recoverable.

From what I understood, you've overwritten both FATs with "clean" ones? Bad move, sorry..
FAT = File Allocation Table .. basically, holds the information of where different parts of files are on a disk.

The root directory (which is located right after the FATs), however, holds some other info: file names, attributes, sizes, dates, and starting clusters' numbers, that's why you can see the files and can access them, but only the first 16kB, since that is the size of a cluster on that particular disk..
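
A small Python sketch may help show why only the first cluster remains readable. It follows a FAT16 cluster chain from a starting cluster, as a directory entry would supply it. The function name follow_chain and both tiny tables are made up for illustration, and the "clean" table assumes that a freshly written FAT marks every data cluster as free (0).

```python
import struct

def follow_chain(fat: bytes, start: int) -> list[int]:
    """Collect the clusters of one file by walking a FAT16 table."""
    n_entries = len(fat) // 2
    chain: list[int] = []
    seen: set[int] = set()
    cluster = start
    # Data clusters are numbered 0x0002..0xFFEF; anything else
    # (free, reserved, bad, end-of-chain) stops the walk.
    while 2 <= cluster <= 0xFFEF and cluster < n_entries and cluster not in seen:
        chain.append(cluster)
        seen.add(cluster)            # guards against loops in a damaged FAT
        (cluster,) = struct.unpack_from("<H", fat, cluster * 2)
    return chain

# Intact table: a file occupying clusters 2 -> 3 -> 4.
intact = struct.pack("<16H", 0xFFF8, 0xFFFF, 3, 4, 0xFFFF, *[0] * 11)

# "Clean" table: only the two reserved entries are set.
clean = struct.pack("<16H", 0xFFF8, 0xFFFF, *[0] * 14)

intact_chain = follow_chain(intact, 2)
clean_chain = follow_chain(clean, 2)
print(intact_chain, clean_chain)
```

With the intact table the walk yields all three clusters. With the clean table it stops after the starting cluster, so with 16kB clusters only the first 16kB of each file can be reached.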

This piece is confusing: "According to Lexar tech support, there is a bug with Windows 2000 (that MS never bothered to fix) and can corrupt the drive when it is removed without proper eject.".

Does this mean that the drive was damaged because of this bug (and you could say good bye my $70 USB), or was it the filesystem on this drive (so that reformatting that USB would make it usable again)?

There are various products, from Image Rescue to Data Rescue. They are not open source, but there are trial versions available for download which will allow you to recover one file at a time. Of course all of this is for the Mac only. They are famous for their official Novell client for Mac as well.

Nice article, printed and saved for if I ever need to do something similar. A very quick point though - beware of mounting under /tmp on systems that aren't yours; some badly written tmp cleanup scripts can go in and clean up your newly mounted filesystem :-( I've lost filesystems to this a couple of times before I worked out what was going on...

I approached a similar problem I had differently - I knew I was looking for JPEG images and that there was a chance they were intact. I assumed the FAT was unreliable, and admittedly forgot about the second FAT copy.

I also assumed there was no fragmentation, and then proceeded to 'dd' anything that looked like an EXIF header +1.5MB to disk as individual files.
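
A rough Python sketch of that kind of carving follows. It is not the commenter's actual procedure: it looks for the generic JPEG start-of-image marker (FF D8 FF) rather than an EXIF header, it assumes files begin on 512-byte sector boundaries, and the names find_jpeg_starts and carve are made up. The synthetic buffer stands in for a real image.

```python
JPEG_SOI = b"\xff\xd8\xff"   # JPEG start-of-image marker plus next marker byte

def find_jpeg_starts(image: bytes, sector_size: int = 512) -> list[int]:
    """Return sector-aligned offsets that begin with the JPEG marker."""
    return [
        off
        for off in range(0, len(image), sector_size)
        if image[off:off + len(JPEG_SOI)] == JPEG_SOI
    ]

def carve(image: bytes, chunk_size: int, sector_size: int = 512) -> list[bytes]:
    """Cut a fixed-size chunk at each candidate start (no fragmentation assumed)."""
    return [image[off:off + chunk_size]
            for off in find_jpeg_starts(image, sector_size)]

# Synthetic stand-in for the image: two aligned markers and one
# unaligned decoy that should be ignored.
buf = bytearray(4096)
buf[512:515] = JPEG_SOI
buf[700:703] = JPEG_SOI
buf[2048:2051] = JPEG_SOI
data = bytes(buf)

starts = find_jpeg_starts(data)
chunks = carve(data, chunk_size=1024)
print(starts, [len(c) for c in chunks])
```

On real data the chunk size would be something like the 1.5MB mentioned above. Image viewers generally tolerate trailing junk after the end of a JPEG, which is why a generous fixed size can work.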

I wish I had known about this about a month ago. My Father had a 128MB Sandisk USB flash disk which he had been using under Windows 2000 and it died in a rather spectacular way, the only difference here though is that when I fed it to my linux workstation, it did not find any partition records - it was royally goosed. I had an identical USB flash disk to my Father's and tried to overwrite the partition records from my chip to his, but with no success. Extremely interesting article though and very well researched. Congratulations on recovering your data - I challenge any Windows user to even come close to that without having to write a suite of programs.

I've done this type of recovery myself, using basically the same approach: look at hex dumps and compare the corrupted media with uncorrupted media, looking for areas that match, then copying individual sectors in an attempt to produce something readable. I wouldn't have thought of using the backup FAT table for fsck.msdos, though. :) Usually, fixing a partition table would do it. Or I would've copied FAT#2 on top of FAT#1 and just tried to mount it...

Overwriting FAT#1 with FAT#2 (or vice-versa) - this isn't necessarily a good idea if there's a chance that both are more or less corrupted.
In such cases, trying to "merge" them (with extreme care, of course) would be better.
Well, just using copy #1 at one time, and #2 at the other, while both times trying to access & restore files and seeing which time you get better results ("proper" file contents), would be the most cautious way.
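
Before choosing or merging, it can help to see exactly where the two copies disagree. The Python sketch below compares two FAT copies sector by sector. The name differing_sectors is made up, and the two small buffers are synthetic stand-ins for copies that would have to be extracted from a real image first.

```python
def differing_sectors(fat1: bytes, fat2: bytes, sector_size: int = 512) -> list[int]:
    """Return the indices of sectors whose contents differ between two FAT copies."""
    if len(fat1) != len(fat2):
        raise ValueError("both FAT copies must have the same length")
    return [
        offset // sector_size
        for offset in range(0, len(fat1), sector_size)
        if fat1[offset:offset + sector_size] != fat2[offset:offset + sector_size]
    ]

# Synthetic stand-ins: copy #2 is intact, copy #1 has its first
# two sectors overwritten with FF bytes.
fat2 = bytes(range(256)) * 8            # 2048 bytes, i.e. 4 sectors
fat1 = b"\xff" * 1024 + fat2[1024:]

damaged = differing_sectors(fat1, fat2)
print(damaged)
```

The list only says where the copies differ, not which one is right. Deciding that still means looking at the bytes or testing file contents, as suggested above.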

Thanks for posting your comment! Those are both good ideas, which I didn't think of. Well, I thought of trying to fix the partition table but I didn't know enough about the boot sector contents (and still don't).

The important thing is that GNU/Linux distros give us enough tools so that we can approach these problems in different ways... even problems caused by closed operating systems!

"I recovered once a whole laptop for someone in the office with half a years work of source code on it"

Can we infer from your comment that your colleague whose work you saved is a software developer who had not backed up his or her work in six months? Yeesh! I'm getting queasy just thinking about this; I need to go lie down for a while ....

My usb-disk (Trekstore, 8 GB) was damaged by unplugging during a write-operation.
I tried some ways, mentioned above. Nothing worked.
Finally I found Testdisk and it worked greatly!
Many thanks for that advice!
