Posted by timothy on Tuesday September 20, 2011 @01:12PM
from the all-your-priceless-ascii-art dept.

First time accepted submitter Zilog writes "The end of the 3.5-inch floppy and the disappearance of associated drives showed me the need to back up the tens of diskettes that accompanied my youth. Carefully preserved, these diskettes have proved readable for the most part, even though some are approaching 20 years old. However, some diskettes have shown surface defects in areas with compressed archives (zip). Any ideas on how to recover (as much as possible) these bad sectors?"

I like to use ddrescue [gnu.org], not to be confused with dd_rescue - which is not as powerful. Note that if you use Debian, they have the names alllllllllll sorts of screwed up - search for gddrescue. Same with Ubuntu.

And it's not a non-destructive method, so when it doesn't work, you've altered the media such that future recovery is more difficult. Also, the way it's described as operating is technically impossible, so the whole thing is voodoo magic.

It seeks to the bad sectors from every track on the disk, hoping to get tiny differences in the position of the head relative to the track when it gets there. It does this several times and votes on each byte in the sector.

I'm not sure that seeking from more than three or four tracks away would make much difference on a floppy but the theory is sound. Maybe you could vibrate the head between two tracks for a bit instead of doing long seeks.
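For what it's worth, the "vote on each byte" part is easy to sketch. This is only an illustration of per-byte majority voting, not how SpinRite actually does it (nobody here has seen its internals). The function name `vote_bytes` is made up, and it assumes you already have several raw reads of the same sector from somewhere, which a normal OS read won't hand you on a CRC failure.

```python
from collections import Counter


def vote_bytes(reads):
    """Combine several reads of one sector by majority vote at each byte offset."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")
    voted = bytearray()
    # zip(*reads) walks the reads column by column: one tuple per byte offset.
    for column in zip(*reads):
        value, _count = Counter(column).most_common(1)[0]
        voted.append(value)
    return bytes(voted)
```

With three reads where each one has a different single-byte glitch, the vote returns the byte string that two out of three agree on at every position.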

I think to get it to do a good job on floppies, you need version 4 or 5. Those still talk directly to the floppy controller. The recovery is definitely non-destructive. It simply keeps reading the same sector after approaching it by seeking from different sides IIRC. The data that's read is passed through a statistical model that tries to predict the original values of bytes. It does work sometimes, and would be worth a try. Versions 6 and onwards don't talk directly to hardware and are pretty useless methi

That's fine -- AFTER it recovers. There's nothing wrong with that. A successful recovery means that the CRC had matched. Of course you can get a match on wrong data, but there's a way of running spinrite without having it write anything. It'll simply report what was readable/recoverable and what wasn't.

No, that is not fine. First off, if the disk is damaged, why would you trust a different spot? If the sector spinrite writes to is also bad you have recovered nothing. If the disk is really bad there may not be any sectors that are safe to write to.

Second, if the filesystem is damaged or spinrite doesn't understand it (ie, anything other than FAT), then it has no idea of what sectors contain useful data and which are free. Spinrite can very easily write over other data that you want to recover.

Which effectively means that spinrite is incapable of any recovery in that situation. If it can't relocate data it has no ability to put it somewhere else. You trust it to put the recovered data back in the SAME bad sector? Spinrite can't fix bad sectors, all that magnetic crap it claims to be able to do is just that, crap. Seriously, run it against a USB flash drive sometime and watch all the BS magic magnetic info it gives you.

v6 uses BIOS and only BIOS. v5 uses BIOS only if it can't talk to something via IDE/ATAPI or through the floppy controller. v5 does recognize a limited number of hard drives and reconfigures them into a "diagnostic" mode where built-in data recovery is disabled, and the software has better access to raw sector data.

Also he should use dd or something similar to take a raw image of the disk as a first step. You've got a gazillion times as much space now, so store an image in case the disk gets damaged in the recovery process - plus you can keep that image if you want to try the destructive recovery process.
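If you'd rather see what a first imaging pass amounts to, here is a rough sketch in Python. It is not dd; it just copies sector by sector, writes zeros where a read fails, and remembers which sectors were bad. The function name `image_device` is made up, and on Linux the source would typically be something like `/dev/fd0` with 2880 sectors for a 1.44 MB disk (both are assumptions about your setup).

```python
import os

SECTOR_SIZE = 512  # usual sector size on PC floppies


def image_device(src_path, dst_path, total_sectors, sector_size=SECTOR_SIZE):
    """Copy src to dst one sector at a time; return the list of unreadable sector numbers."""
    bad = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        with open(dst_path, "wb") as out:
            for n in range(total_sectors):
                try:
                    os.lseek(fd, n * sector_size, os.SEEK_SET)
                    data = os.read(fd, sector_size)
                except OSError:
                    data = b""  # the drive reported a read error
                if len(data) != sector_size:
                    bad.append(n)
                    data = b"\x00" * sector_size  # placeholder keeps offsets aligned
                out.write(data)
    finally:
        os.close(fd)
    return bad
```

Keeping the bad-sector list next to the image means later attempts only have to revisit those sectors instead of grinding over the whole disk again.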

The read errors are statistical phenomena, and it also depends on which direction the head is moving*, so there's no point using dd. dd won't preserve the original's physical features, so a lot of (ambiguous) data is lost.

* I'm not a recovery expert, but that's what I learned from reading other comments.

Depends. If the error is caused by some inaccuracy in the disk reading process, then a dd image won't be as useful as the original. If the error is caused by a problem with the disk itself, then a dd image will be as good as the original for any purely software-based recovery purposes.

I second the suggestion for spinrite. It worked wonders for me back then.
The biggest problem you have is the compression. Since the compression algorithm depends on previous data in the file, once you get to an unreadable sector everything after is lost. For some algorithms, if any part is missing, all is lost.
You might be able to read the raw disk sectors under linux. I remember doing something like that long ago but don't remember how to do it anymore. Reading it under linux has the benefit of not

That would be incorrect, the magnetic bits on the disk can often be "weak" enough that it can't quite be read, but if you retry reading it often enough it may actually register correctly. Also, since the disk track may not be directly under the head, forcing the drive to change tracks and back can quite often put the bits that are weak to be directly under the head causing them to give just slightly stronger (correct) reading.

Actually, it will. I've read data off of hundreds of old 3.5" floppies over the years. Using recovery programs like recoverdisk from FreeBSD or ddrescue, I've found that maybe two dozen of those were readable only with enough retries, on the order of 1000.

A couple I've not been successful with, but I've been able to read the troublesome sectors if I try reading it on other drives enough times.

Maybe I've been lucky. I used to believe that if you couldn't read the media after 10 retries just giv

This approach also works with floppies. You have to know what you are doing though. I looked into this 20 years back. The S/N ratio on floppies is pretty high, so even a severely degraded signal may be recoverable this way.

Side note: User 460244 is talking gibberish and does not understand the problem. The magnetization areas in floppies are way above the flux areal size limits. You just read the analog signal from the heads.

If you can find one use a Superdisk [wikipedia.org] to read a floppy. The heads are much more sensitive and narrow and can read ordinary floppies better than a regular floppy drive. I have used this to recover data from floppy disks that were old/worn.

About a year ago a friend gave me a floppy disk and asked if I could get the data off of it with a floppy drive I had laying around. I tried the obvious approach: drag the files off using whatever file browser I was using. This failed because of at least one bad sector, and so I lost one file of about seven.

I attempted to work around this by writing my own file copier that attempted to read the file in question in byte segments. This was not effective (though it narrowed down the bad bytes), nor was it acce

There's no point in reading a disk byte by byte, as the disk is read by sectors, and the read errors you're getting are from the CRC mismatch in the sector you've read. Floppy sectors are usually 512 bytes, but could be something else for weird formats like 2M (why do I still remember this stuff?)

Sometimes it helps to intercalate reads of sectors other than the one you're trying to read, in order to make the head move. That can help with reading bad sectors as disk heads have some positioning imprecision, s

Is it possible to coax it into returning the data it thinks is bad? I'm wondering if you could re-read the sector repeatedly, recording the results to separate files (or however you want to store them) then looking to either test the file with the different segments, or perhaps figure out the parts that are changing only and do the same?
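The "figure out the parts that are changing" step could look roughly like this, assuming you did manage to capture several raw copies of the same sector (whether the OS or controller will give you those is a separate question). `unstable_offsets` is a made-up name for this sketch.

```python
def unstable_offsets(reads):
    """Return the byte offsets whose value differs between at least two reads."""
    return [
        offset
        for offset, column in enumerate(zip(*reads))
        if len(set(column)) > 1
    ]
```

Only the offsets it returns need to be guessed at; everything else read the same way every time.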

I know this has been discussed before, but it really begs the question of how to preserve digital data for long periods of time. Stone tablets last for thousands of years; paper for hundreds (or more, if in climate-controlled storage). What have we got for (large amounts of) digital data?

Don't think you have any idea what "begging the question" means other than improperly using it as verbal filling material. Sorry, nothing personal, just had to be said.

Boring monthly/weekly /. topic. Short answer is to copy it to new media yearly and keep all the old copies in storage as "backups of backups", and at that annual copy time, verify the contents of the backups if you can.
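The "verify the contents" part can be as simple as keeping a checksum list next to each backup and comparing it at copy time. A minimal sketch, with made-up function names and SHA-256 picked arbitrarily as the hash:

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 hex digest of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def changed_files(old_manifest, root):
    """List files from the old manifest that are now missing or no longer match."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in old_manifest.items() if current.get(name) != digest
    )
```

Build the manifest when you make the copy, store it with the backup, and run `changed_files` against it each year before trusting the old media.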

Stone tablets do NOT last thousands of years... it's just the tiny fraction that survived happen to be that old. Ditto th

Give them to my ex-wife, and tell her that they somehow say that I said it makes her look fat. NOONE in the world will ever ever forget the contents of those disks ever for as long as the human race survives.

It depends on how long you think the data needs to be preserved. Archival DVDs are pretty robust, but you might have other ideas about the best optical storage medium. And if you want it to survive natural and man-made disasters, you should make several copies in different physical locations, preferably on different continents and in different climates. If it's really important, build a few pyramids, drill some very deep holes, and get a couple of copies up on the Moon. Aside from that, periodic integrity c

Clean and align the drive first, before you screw up any (more) disks.

To give an analogy that kids nowadays can easily understand, it's like trying to insert an old-fashioned flash drive into a USB port full of peanut butter. It might work, it might even work most of the time, but it'll work better if it's clean.

Due to the digital capture effect or whatever, you might only need one dB more signal or one dB less noise to go from a sector having a read error somewhere every time you read it, to having an errorless read.

If you have way more time and/or money than you know what to do with, you break out the oscilloscope and align the drive to that individual disk/track. Yes this takes a lot of time and gear, but if you really gotta do it... Basically you align to best SNR on that individual disk rather than to an alignment disk. If the drive that wrote the disk was technically out of alignment, this will save you. If the drive that originally wrote the disk was in perfect alignment, then this is a waste of time.

At the very least, clean the freaking drive. Use kimwipes and undenatured pure ethanol on the heads. You drink some ethanol as a toast to the computing gods after success or failure, doesn't matter which, either way you're doing a shot or it's bad luck and the next disk will shed its oxide for sure. Everclear is supposedly pure enough to clean drive heads, and supposedly it's drinkable. All I remember from my only experience with everclear was yelling some lines from a cartoon and throwing up, and there are disks I have not been able to recover, so take my cleaning solvent suggestion with a grain of salt. Kimwipes are hard to explain and may no longer be manufactured, but they used to be like a dustless, lintless fabric q-tip, at least in concept, sorta. I don't mean they were like a q-tip in that they were of a certain dia, length, and color, but more the general idea of a perfect cleaning fabric at the end of a non-conductive stick.

There are various data recovery apps out there for this purpose. The problem is complicated by files being zipped. It is possible to dump the files as binary images then manually edit the binaries of the individual files in a hex editor etc. so that they can be read by software that works with a given format. However, I'm not sure if or how well that will work with compressed files. Are they encrypted as well? The problems are not 'unsolvable' but can very quickly move into the realm of needing a superco

Modern floppies were made much more cheaply. I have a 20yr old computer that still boots from 5.25" floppies, and it works fine. I also had 3.5" floppies in the 2003-2005 range that lost all their data if you looked at them funny.

I have C64 Floppies that are 30+ years old that are still good. Modern floppies suck. I remember copying files onto a 3.5" disk, turning to another computer and the disk was bad. The old 5.25" disks seem to last forever.

I know the OP is talking PC, but I know many /. users grew up on the C64, so on a similar note, I converted all mine with a product called ZoomFloppy.

One of the reasons we don't use floppies anymore is that they're inconsistent. It could be that the disks have gone bad, or sections of them have, or it could just be an alignment issue. Unfortunately the easiest way to fix that would be to use the original drive that wrote the disks in the first place.

I've got the X-Wing disks that I borrowed from a friend in an attempt to dump them to HDD. And I think 3 out of 5 of them have unreadable files on them.

Floppies last at least as long as cheap writable CDs, in my experience, as long as you store them in a nice metal box with a tight-fitting lid. Don't leave them in the sun, on top of a speaker magnet, or in the baby's diaper bag - treat 'em just like 1600 bpi 9-track streamers.

Commercial music CDs, though, those things seem to last forever. Or at least, I've never had one wear out unless it was physically damaged. I've got CDs from the 1980s and early 90s that play fine.

"Call me precious I don't mind / 78s are hard to find / You just can't get the shellac since the war" -- "Don't Sit on My Jimmy Shands", Richard Thompson

I have lost commercial game CDs to corrosion though.

A game CD from around 2002 or 2003 is almost black on the outer half, and unreadable

The music CDs seem to last forever only because you don't mind bits being wrong here and there, and a lot of music players are designed to compensate for it. If you wanted a bit-perfect copy of those CDs you'd have the same problems.

Write a copy tool that fills in all possible bit combinations for the bad sectors and spits out 100s of zip files instead of just 1. At a max of 1.44MB/zip file, it still shouldn't be much space in modern terms. Then just try to decompress them all and see what the results are.
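Before anyone writes that tool, it's worth doing the arithmetic: one lost 512-byte sector is 4096 unknown bits, so "all possible combinations" is 2^4096 candidates, not hundreds. Brute force only makes sense when a handful of bytes are in doubt. A sketch of both points, with made-up function names. It checks candidates against a CRC-32 of the plain data, so for a compressed zip member you would also have to decompress each candidate first, which this does not do.

```python
import itertools
import zlib


def candidate_count(unknown_bytes):
    """Number of byte patterns to try when `unknown_bytes` bytes are unknown."""
    return 256 ** unknown_bytes


def brute_force_bytes(data, offsets, expected_crc):
    """Try every value at the given offsets; yield the candidates whose CRC-32 matches."""
    buf = bytearray(data)
    for values in itertools.product(range(256), repeat=len(offsets)):
        for offset, value in zip(offsets, values):
            buf[offset] = value
        if zlib.crc32(buf) == expected_crc:
            yield bytes(buf)
```

Two unknown bytes is 65,536 tries and finishes quickly; every extra unknown byte multiplies the work by 256, and a match on a 32-bit CRC stops being convincing long before you get near a full sector.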

The good news is that as I recall, floppy disk records have a CRC appended. The bad news is that my sometimes faulty memory tells me that MSDOS floppy disk drivers used the CRC to correct reads and didn't tell the user that the record did not read properly. I think that means that any record reported as being in error has at least two errors. But maybe I'm wrong.

Sometimes using a different drive helps.

I do seem to recall that it was sometimes possible to read a faulty record multiple times and patch the

You might have some luck if you try the disks in one or more different drives. The head alignment and other small factors like that are unique to each drive. Usually they are close enough that a freshly formatted disk will work in most other drives. But when there is a small defect in how it was recorded to the disk (power surge, controller glitch, etc) or a small media defect, then a different drive may have better (or worse) luck. I've seen my fair share of disks that will read fine in one drive but not a

See this howto [wendycarlos.com] at Wendy Carlos' blog. She recovered the original Tron soundtrack this way.

Magnetic media like tapes and floppies use a binder (glue) that becomes corrupted with moisture over time, allowing the metal-oxide particles to flake off. Dehydrating the media can reverse the condition if you haven't already tried to access it.

These are very likely not surface defects, just lost magnetization. There is not a lot you can do, besides reading these areas with forensic disk reader hardware. This requires reading the analog signal from the surface and then reconstructing the signal using digital signal processing. Depending on the floppy, you may need a sampling rate of up to 8MHz and should use 8 bit or larger resolution. A low-noise fast preamp may also be needed.

I do not know any source of these. Back in the day I did enough invest

I wonder how soon we will see the posts asking how to recover information from IDE drives - most modern motherboards lack this interface and in a few years IDE will be entirely abandoned (at least in the consumer oriented market).

IDE-to-USB should be around for a while longer, I expect. Buy a stand-alone adapter, buy a USB enclosure kit for an IDE drive, or just rip apart an old USB hard drive - the drive inside it is either IDE or SATA.

If you can't get the data off using a standard data dumper (e.g. dd) you are pretty much hosed, although sometimes different disk drives will have the heads aligned a little differently, resulting in different results. If dd can't read it, you're done.

This DOS based utility had the ability to control every little detail of the floppy drive's mechanism, it managed to read "most" of a bad track leaving just one or few sectors out, saved many floppies with that thing back in the day.
http://en.wikipedia.org/wiki/HDCopy [wikipedia.org]

I've recovered hundreds of floppies over the years. Here's what I've done to good effect.

(1) Find a machine with a floppy drive. If this machine hasn't had its floppy used in a while, either read/write a bunch of disks, or get it cleaned/aligned. I've opted for the former with good effect, but drives are getting old enough now that the latter may be increasingly necessary. For older 5.25" drives, I'd definitely try to clean the heads, but be sure to do research so you don't grind the heads away by using the wrong methods. The reason I use the read/write method of a few disks that are new is that it gives you a chance to see if the drive is working on disks that don't matter. It might also allow you to have a minor cleaning effect from this to remove oxides from accumulated sitting time, but I'm unsure if that's what's going on. I have used different drives when the first tests failed, but never paid to have the broken drives fixed. There's just too many surplus floppy drives around. It might also help to have multiple drives.

(2) I have used both ddrescue and recoverdisk. The former is a GNU thing, the latter is included with FreeBSD. Both will incrementally read the disk and optionally write out data about what's been read. Both programs try to read as much data as possible in large blocks, then switch to smaller size reads for the damaged areas to try to get as much data off as quickly as possible with as few read-head passes as possible. Having said that, oftentimes there are a few stubborn sectors that just need to be tried a lot. For ddrescue, you may need to crank up the retry count to 1000 or more. recoverdisk does this automatically. I've had several disks placed in my hand that people swore were totally unreadable, and I've been able to recover them. I've been able to recover most of them by retrying between 100 and 1000 times. When that fails, and it has in maybe 2 or 3 of the hundreds of disks I've done, I've taken the log files about what had been recovered to a different machine with a different drive and tried to read the (usually 1-4) missing sectors there. This hasn't failed me yet for disks that are hard to read merely because they are "old." My experience has been more concentrated on the 3.5" floppies than the older 5.25" floppies too. Different rules may apply there.
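The retry stage described in (2) boils down to something like the sketch below: go back over only the sectors that failed, try each one many times, and patch the image whenever a read finally succeeds. This is an illustration, not what either tool does internally. The function name `retry_bad_sectors` is made up, and the read of sector 0 before each attempt is just a crude way to make the head travel. On a real device the kernel may answer that read from cache, in which case the head never moves, and this sketch does nothing about that.

```python
import os


def retry_bad_sectors(src_path, image_path, bad_sectors, retries=1000, sector_size=512):
    """Re-read only the listed sectors, patching the image on success; return those still bad."""
    still_bad = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        with open(image_path, "r+b") as image:
            for n in bad_sectors:
                recovered = None
                for _attempt in range(retries):
                    try:
                        # Touch another part of the disk first, then come back.
                        os.lseek(fd, 0, os.SEEK_SET)
                        os.read(fd, sector_size)
                        os.lseek(fd, n * sector_size, os.SEEK_SET)
                        data = os.read(fd, sector_size)
                    except OSError:
                        continue  # read error: try again
                    if len(data) == sector_size:
                        recovered = data
                        break
                if recovered is None:
                    still_bad.append(n)
                else:
                    image.seek(n * sector_size)
                    image.write(recovered)
    finally:
        os.close(fd)
    return still_bad
```

Whatever comes back in `still_bad` is the list to carry over to a second machine with a different drive.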

I guess I should caveat the above advice with "for disks that are just old". Disks that have been damaged over the years, or have had magnets run over them, etc all bets are off short of "extreme" options that might not even work.

Many of these techniques also work for reading damaged audio CDs, DVDs, etc.

1) an error inside a zip file (or any compressed archive format) means that any file inside the archive that is stored on the corrupted part of the disk is corrupted. Compare this to the situation without a zip file -- any file stored on the corrupted part of the disk is corrupted.

The rest of the files in the zip file, the ones stored on parts of the disk that aren't corrupted, are recoverable.

Now, if the table of contents of the zip file is corrupted but the data itself is OK, then you can still recover the data, but it becomes more difficult -- and you'll lose the names of the files. Compare this to the situation where the directory data for the diskette is corrupted but the rest of the disk is fine -- same thing.

The only important difference between files stored in a zip file that are corrupted and files just on the disk that are corrupted here is that if there's an error in the middle of the compressed data in the zip file, that means the file is corrupt from that point on for a file compressed in a zip archive, but that only those blocks are corrupt in the case of a file just on the disk. Does it make a difference how much of the file is corrupt? Maybe. If it's a text file that can't be recovered, yes. But if it's an executable or some data file that just can't be loaded either way -- not really.

2) the zip archive means that the data probably requires less space on the disk. It may not have even fit on the floppy at all without compression. That alone is a pretty important reason to use compression in archives. If you can cram twice as much data on a single floppy -- you could possibly store it on two floppies instead, giving you a backup in case one floppy fails.

3) being compressed means that the file took less space on the disk -- therefore the odds of one of its blocks becoming corrupt goes down similarly. (Assuming that just a handful of blocks have become corrupt. If the whole disk goes bad, you're screwed either way. Of course, with compression, losing an entire disk means you've lost even more data. But I'm not sure using 360 KB floppies rather than 1.4 MB floppies is really an appropriate data saving measure either.)

4) compressed archives almost always have checksums of some sort which will tell you if their data is corrupted. Of course, some archive formats that don't involve compression have checksums too -- but many don't.

It's very good to be able to tell quickly and programmatically if your data has been corrupted.
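Points 1) and 4) together mean you can let the archive's own checksums sort the members into good and bad. A small sketch using Python's standard `zipfile` module; the function name `split_members` is made up, and it assumes the archive's directory is still intact enough to open.

```python
import zipfile
import zlib


def split_members(zip_path):
    """Return (good, bad) lists of member names, based on whether each reads back cleanly."""
    good, bad = [], []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            try:
                archive.read(info)  # decompresses the member and checks its CRC-32
                good.append(info.filename)
            except (zipfile.BadZipFile, zlib.error, OSError, EOFError):
                bad.append(info.filename)
    return good, bad
```

Everything in the first list can be extracted normally; only the names in the second list need the heroic measures.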

Yeah, I think zip and rar files are a little more flexible than, say, tar.gz, since there are still some partial recovery options you can do to extract as many files as you can. With tar.gz, you pretty much lose everything after the corrupted data.
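One of those partial recovery options for zip, sketched under assumptions: the ZIP format puts a small local header (including the file name) in front of each member's data, so even with the directory at the end of the file gone you can scan for the `PK\x03\x04` signature and try each entry on its own. The function names are made up, only stored and deflated members are handled, and entries written in streaming mode (sizes in a trailing data descriptor) are skipped.

```python
import struct
import zlib

LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # fixed 30-byte local file header
SIGNATURE = b"PK\x03\x04"


def scan_local_entries(raw):
    """Yield (name, data) for every entry whose local header and payload still check out."""
    pos = raw.find(SIGNATURE)
    while pos != -1:
        entry = _try_entry(raw, pos)
        if entry is not None:
            yield entry
        pos = raw.find(SIGNATURE, pos + 1)


def _try_entry(raw, pos):
    if pos + LOCAL_HEADER.size > len(raw):
        return None
    (_sig, _ver, flags, method, _time, _date,
     crc, csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(raw, pos)
    if flags & 0x08:
        return None  # sizes/CRC stored after the data; not handled in this sketch
    name_start = pos + LOCAL_HEADER.size
    data_start = name_start + name_len + extra_len
    payload = raw[data_start:data_start + csize]
    if len(payload) != csize:
        return None  # entry runs past the end of what we have
    try:
        if method == 0:        # stored
            data = payload
        elif method == 8:      # deflate, raw stream without zlib header
            data = zlib.decompress(payload, -15)
        else:
            return None
    except zlib.error:
        return None
    if zlib.crc32(data) != crc:
        return None
    name = raw[name_start:name_start + name_len].decode("cp437", "replace")
    return name, data
```

Feed it the raw bytes of the damaged archive (or of the whole disk image) and write out whatever it yields; members that fail the CRC are silently skipped rather than returned half-broken.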

But I suppose what subby really wants is some magical software that will read the archive checksum, then try all permutations of possible values that could fill the corrupted portion and satisfy that checksum, until he can reconstruct the original zipfile.

You could just use a block compression algorithm. Doing so, only the damaged block is lost. You can read around it. Including something approaching Reed-Solomon code means (depending on how much was damaged) you could recover the data!
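A toy version of the block idea, to show why damage stays local. The container format here is invented for the sketch (a length and CRC-32 in front of each independently compressed block), the names are made up, and there is no Reed-Solomon or any other error correction in it, only detection. It also assumes the damage hits a block's payload; if a length field is hit, the framing after it is lost.

```python
import struct
import zlib

BLOCK_SIZE = 4096  # arbitrary choice for this sketch


def pack_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block on its own, prefixed with its compressed length and CRC-32."""
    out = bytearray()
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        comp = zlib.compress(block)
        out += struct.pack("<II", len(comp), zlib.crc32(block)) + comp
    return bytes(out)


def unpack_blocks(packed):
    """Return one entry per block: the original bytes, or None where the block is damaged."""
    blocks, pos = [], 0
    while pos + 8 <= len(packed):
        length, crc = struct.unpack_from("<II", packed, pos)
        comp = packed[pos + 8:pos + 8 + length]
        pos += 8 + length
        try:
            block = zlib.decompress(comp)
        except zlib.error:
            blocks.append(None)
            continue
        blocks.append(block if zlib.crc32(block) == crc else None)
    return blocks
```

Flip a byte inside one block and only that block comes back as `None`; the blocks before and after it decompress as usual. The price is a somewhat worse compression ratio, since each block starts with no history.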

5.25" disks are almost certainly still good if they've been stored properly. HD 3.5" disks have their magnetic domains packed really close together, over the years there's crosstalk and data loss. Most 3.5" disks I've tried from that era are bad. I don't think I've come across one 5.25" disk that wouldn't read.

I have a good number of 5.25" Apple DOS 3.3 & ProDOS disks (143K) from 1982 - 1988 that still work. There are some with errors, but the majority of them work fine. My Apple 3.5" disks (800K) haven't survived nearly as well.

I gave up on my collection of DOS (as in FAT) floppy disks of any variety years ago. They never seemed all that reliable even when newish.

Check out garage sales or swapmeets to pick up a vhs. That or repair your machines, even hi end ones are pretty simple to repair. I was poor growing up so every single vcr I had was someone else's trash, if a 10 year old can repair one I'm sure you can figure it out once you pop it open. Craigslist even has machines as low as $10.

Are there any pro-places that don't charge an arm and a leg to copy off VHS to DVD or bluray?

No. At least, not if they actually do it well. Recording the VHS output is the easy part, cleaning up the combing, noise, chroma nonsense, warping from tape distortion, etc takes a fair bit of skill to not muck up royally.

Figure out why they're borked. Sometimes happens because the old tapes shed/spray oxide all over the inside of the VCR. In that case, there are voodoo solutions to "fix" shedding oxide, but pretty much your best bet is heave all the stuff in the trash and forget about it, unless you have an incredibly high tolerance for frustration and lots of spare time and money. Another popular failure mode is the grease in the convoluted tape path sticks / dries out / gets covered in dust, in which case an ex-vcr rep

I'm not aware of any cable that will allow you to connect an Apple II floppy directly to a modern PC. If you still have an Apple II computer however, there is a way to dump your disks onto PC. Check out ADTPro. It's client server software that runs on Apple IIs and Java, and allows you to dump floppies to disk images over a serial connection or the tape port. Try it. It's free and it's awesome!

Unfortunately no computer to read it... though now that I think of it, the person I sold my Apple clone to over 20 years ago is a hoarder so maybe he still has it. I was aware of the program but forgot the name, so thanks for saving me some searching.

SpinRite is completely worthless for modern HDDs. A long SMART selftest per year or so has the same effect. And nothing can rewrite the servo tracks. Non-spinning HDDs are unsuitable as long-term archival medium.

Some diskettes were made with an oxide binder that softens with time (basically, it absorbs moisture). I've seen old diskettes where the oxides came off on the first try leaving the head coated and at least one track of the disk destroyed.

Anyway, I've heard of people actually capturing head output so they can rebuild missing data by analyzing the output better than the average floppy controller can.

I did some investigation into the latter 20 years ago. Not that difficult or expensive if you can do it yourself. But if you do not have advanced skills with electronics, that is not an option. You can try to see whether some professional data recovery outfits still offer this service, but it may be expensive. I remember a rate of 500USD/floppy.