Changelog:

Made a plain-C package available, to support 64-bit OSes (as well as OS/X and Cygwin users).

March 9, 2009:

Made a FUSE-based filesystem that transparently uses these tools.

June 14, 2010:

Fixed memory issues reported by Valgrind - now works with all GCC versions.

Shield my files? Why?

You know why!

Have you never lost a file because of storage media failure? That is,
has a hard drive or USB stick of yours never lost a bunch of sectors (bad sectors) that
simply happened to be the ones hosting parts (or all) of your file?

I have. More than once... :‑)

The way storage media quality has been going in recent years,
it is bound to happen to you, too. When it does, believe me,
you'll start to think seriously about ways to protect your data.
And you'll realize that there's quite a lot of technology to choose
from...

"Backup, backup, backup! And take it with you!"
This is a valid and wise suggestion, but it doesn't address the
details of backing up... There's more than one way to back up;
my personal favorite has a lot
of features you'll probably like. Unfortunately,
backups themselves are also stored in some kind of storage,
so the question is: how can you be certain that your backup storage
won't fail? And equally important, how often will you back up?
If you do it once per week, you might end up losing a week's
worth of work - is that acceptable in your line of work?

Others will advocate RAID. Use a RAID scheme on more than one
disk, and when one fails, the machine will keep on working
with the rest - at least in theory. In practice, faulty RAID
controllers (especially the on-board garden variety) can
wreak havoc just as much as the faulty storage media can.
If you decide to go for RAID, I suggest you use your OS's
support for software RAID (e.g. Linux md), and I also suggest
using the simplest possible building blocks: RAID1 (mirrors),
or if you really need speed, RAID10 (stripes of mirrors).
Even with RAID though, nothing can protect you from "silent corruption"
errors - e.g. IDE cable errors between the controller and the
drives, faulty power supplies, etc.

Then again, neither backup nor RAID would save you from
accidental deletions or file corruptions. Today's word processors
(thinking of Mr. Clippy, not LaTeX) are such complex beasts
that their crashing is considered a normal everyday activity
(which is why they were "enhanced" years ago with periodic
auto-saves).
On one of these crashes, chances are you'll find your document
corrupted. A solution, you ask? Simple: Use version control...
Subversion and Git are wonders of the world - the former even has
nice GUIs for non-technical folk. You can then recover from deletions and
corruptions, since your repository would provide the file again.
Then again, you may be forced to work in "lone wolf"
mode... Working with your laptop in airport lounges and
dark, secluded caves (known as "hotels"). Access to the web
may be missing or firewalled, and therefore there may be no way to
hook your laptop to your company's repository...

Burning to re-writable DVDs? Chances are that when disaster strikes,
you will find your precious backup DVD is scratched... Or that your
USB stick in your keychain didn't survive the constant scratching
from your keys...

My point?

There's no such thing as "enough protection" for your data - the more
you have, the better the chances that your data will survive disasters.

What follows is a simple description of a way I use to additionally
"shield" my important files, so that even if some sectors hosting them
are lost, I still end up salvaging everything.

Algorithm

The idea behind this process is to use
error-correcting codes, such as
the ubiquitous Reed-Solomon.
With Reed-Solomon, parity bytes are used to protect a block of data
from a specified maximum number of errors per block. In the tools
described below, a block of 223 bytes is shielded with 32 bytes of parity.
The original 223 bytes are then morphed into 255 "shielded" ones,
and can be recovered even if 16 bytes from inside the "shielded"
block turn to noise...
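To make the arithmetic concrete, here is a small Python sketch of the block parameters mentioned above. The constant and function names are my own choices for illustration, and the size estimate is only an approximation: it ignores any extra framing or padding that the real tools may add.

```python
import math

DATA_BYTES = 223     # payload bytes per block (from the text)
PARITY_BYTES = 32    # Reed-Solomon parity bytes per block (from the text)

# Each block of 223 data bytes becomes 255 "shielded" bytes.
BLOCK_LEN = DATA_BYTES + PARITY_BYTES

# With 2t parity bytes, Reed-Solomon can correct up to t corrupted
# bytes whose positions are unknown: 32 parity bytes -> 16 errors.
CORRECTABLE = PARITY_BYTES // 2


def shielded_size(original_size):
    """Estimate the shielded size in bytes.

    Assumption: no headers and no padding beyond whole blocks.
    """
    blocks = math.ceil(original_size / DATA_BYTES)
    return blocks * BLOCK_LEN
```

The price for this protection is a size overhead of 255/223, that is, roughly 14%.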

Storage media are of course block devices, that work or fail
on 512-byte sector boundaries (for hard disks and floppies, at
least - in CDs and DVDs the sector size is 2048 bytes). This is why the shielded stream must
be interleaved every N bytes (that is, the bytes of each encoded block must be
placed in the shielded file N bytes apart: the first block at offsets 1, 1+N, 1+2N, ...,
the second at offsets 2, 2+N, etc):
In this way, 512 different shielded blocks pass through each sector
(for 512-byte sectors), and if a sector becomes defective, only
one byte is lost in each of the shielded 255-byte blocks
that pass through this sector. The algorithm can handle 16 of
those errors per block, so data will only be lost if more than 16 of the
sectors that a given block passes through are lost! Taking into account
the fact that sector errors are local events (in terms of storage space),
chances are quite high that the file will be completely recovered,
even if a large number of sectors (in this implementation: up to 127
consecutive ones) are lost.
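The effect of interleaving on a burst of bad sectors can be checked with a few lines of Python. This is only a sketch under stated assumptions: the layout (byte j of block b stored at offset j*DEPTH + b inside a frame) is my simplified model and not necessarily the exact on-disk format of the tools, DEPTH is assumed to equal the default of 4080 for parameter R mentioned in the changeset section, and all names are made up for the example.

```python
from collections import Counter

BLOCK_LEN = 255    # bytes per shielded Reed-Solomon block (from the text)
MAX_ERRORS = 16    # corrupted bytes each block can tolerate (from the text)
DEPTH = 4080       # interleave depth (assumed equal to the default R)
SECTOR = 512       # hard disk sector size in bytes


def errors_per_block(burst_start, burst_len, depth=DEPTH):
    """Count how many bytes of each block fall inside a lost byte range.

    Assumed layout: a frame holds `depth` blocks, and byte j of block b
    sits at offset j*depth + b within the frame.
    """
    frame_len = depth * BLOCK_LEN
    hits = Counter()
    for offset in range(burst_start, burst_start + burst_len):
        frame_no, pos = divmod(offset, frame_len)
        hits[(frame_no, pos % depth)] += 1
    return hits


def recoverable(burst_start, burst_len, depth=DEPTH):
    """True if no block lost more bytes than Reed-Solomon can correct."""
    hits = errors_per_block(burst_start, burst_len, depth)
    return max(hits.values(), default=0) <= MAX_ERRORS
```

In this model, `recoverable(1000 * SECTOR, 127 * SECTOR)` holds while `recoverable(1000 * SECTOR, 128 * SECTOR)` does not, and without interleaving (`depth=1`) a single lost sector is already too much for the blocks it covers.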

I implemented this scheme back in 2000
for my diskettes (remember them?). Recently, I discovered that
Debian comes with a similar utility called rsbep, which
after a few modifications is perfect for providing adequate
shielding to your files.

Download

Here is the source
code for my customization of rsbep, a utility that implements
the kind of Reed-Solomon-based "shielding" that we talked about
(the customized code is also
available from my GitHub repo).
The package includes 32-bit x86 assembly that makes it an order of magnitude
faster than plain C; if however you are not on a 32-bit x86 platform,
it will fall back to a portable C version instead
(a lot slower, unfortunately).
rsbep is part of dvbackup, so some Debian users might
already have it installed; my version however addresses some issues
toward the goal we are seeking here, which is error-resiliency
for files against the common, bursty types of media errors.
More information on what was changed is below.

The package is easily installed under Linux, Mac OS/X, Windows (Cygwin)
and Free/Net/OpenBSD, with the usual autoconf/automake build steps.

For those of you that don't speak UNIX,
what you see above is a simple exercise in destruction:
we "shield" a file with the freeze.sh script,
which is part of my package; we then melt.sh
the frozen file, and verify (through md5sum)
that the newly generated file is exactly the same as the
original one. We then proceed to deliberately destroy almost 64KB of
the shielded file (that's a lot of consecutive sectors!),
using dd to overwrite 127 sectors with zeros. We invoke
melt.sh again, and we see that the newly generated
file (data3) has the same MD5 sum as the original one - it
was recovered perfectly.
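If you want to repeat the destructive part of this experiment without dd and md5sum, the following Python sketch does the same two things: it overwrites a range of sectors with zeros, and it computes the MD5 sum of a file. The function names are hypothetical, and the calls to freeze.sh and melt.sh are deliberately left out, since they are run exactly as in the shell session.

```python
import hashlib

SECTOR = 512  # bytes per hard disk sector


def md5_of(path):
    """Return the hex MD5 digest of a file, as md5sum would print it."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so that large files need little memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def zero_sectors(path, first_sector, count):
    """Overwrite `count` sectors with zeros, in place.

    The file is opened without truncation; the caller should make sure
    the range lies inside the file, otherwise the file will grow.
    """
    with open(path, "r+b") as f:
        f.seek(first_sector * SECTOR)
        f.write(b"\0" * (count * SECTOR))
```

Calling `zero_sectors(shielded_file, 1000, 127)` wipes 127 consecutive sectors starting at sector 1000 (the starting sector is an arbitrary choice); comparing `md5_of()` on the original and on the melted output then tells you whether the recovery was perfect.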

Reed-Solomon FS (a FUSE-based filesystem)

Based on these tools, I did a quick implementation of a Reed-Solomon
protected filesystem, using Python/FUSE bindings:

bash$ poorZFS.py -f /reed-solomoned-data /strong

This command will mount a FUSE-based filesystem in /strong (using
the /reed-solomoned-data directory to store the actual files and
their "shielded" versions). Any file you create in /strong will
in fact exist under /reed-solomoned-data and will also be shielded
there (via freeze.sh). When any file in /strong is opened for reading,
data corruption is detected (via melt.sh) and in case of corruption
the file will be corrected using the Reed-Solomon "shielded" version
of the file (which is stored alongside the original, and named
originalFilename.frozen.RS). The .frozen.RS versions
of the files are not visible in the /strong directory, and
are automatically created (in /reed-solomoned-data) when a file
(opened for writing or appending) is closed.
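The bookkeeping described above (one .frozen.RS companion per file, hidden from directory listings) can be sketched in a few lines of Python. This is not the code of poorZFS.py; the helper names are hypothetical and only illustrate the naming convention.

```python
import os

SUFFIX = ".frozen.RS"  # suffix of the shielded copies (from the text)


def shielded_path(backing_dir, name):
    """Path of the shielded companion stored next to the original file."""
    return os.path.join(backing_dir, name + SUFFIX)


def visible_entries(entries):
    """Filter a directory listing so the shielded copies stay hidden."""
    return [entry for entry in entries if not entry.endswith(SUFFIX)]
```

A directory-listing handler in the filesystem would pass the real listing of the backing directory through something like `visible_entries()`, while the handler that runs when a file is closed would write to `shielded_path()`.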

I coded this mini-fs using Python-FUSE in a couple of hours on a boring Sunday
afternoon, so don't trust your bank account data with it... It's just
a proof of concept (not to mention dog-slow - due to the necessary data interleaving).
Still, if your machine is only equipped with one drive, this will in fact
transparently shield you against bad sectors, faulty power supplies,
messy IDE cabling, etc.

Note: I coded this filesystem adding 20 or so lines of
Python (spawning my freeze/melt scripts) into the Python/FUSE basic
example. Anyone who has ever coded a filesystem driver for Windows knows
why this justifies a heart attack - FUSE (and Python/FUSE) rock!

Changeset from original rsbep

In case you are wondering why I had to modify rsbep,
here's where my version differs from the original...

The original version wrote 3 parameters of Reed-Solomon
as a single line before the "shielded" data, and this made the
stream fragile (if this information was lost, decoding failed...)

My version uses a default value of 16*255=4080 for parameter R,
and it can thus tolerate 4080*16=65280 consecutive bytes
being lost anywhere in the stream, and still recover...

My version adds file size information to the shielded stream,
so the recovery process re-creates an exact copy of the
original.

I added autoconf/automake support, to detect whether a fast 32bit x86
asm version can be used and otherwise fall back to a plain C (slow)
implementation. The tools thus compile and install cleanly on many
operating systems (Linux, Mac OS/X, Free/Net/OpenBSD, even Windows
with Cygwin).

Python-FUSE support.
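To see why the file size item in the list above matters: Reed-Solomon works on fixed blocks of 223 data bytes, so the last block usually has to be padded, and without the original length the decoder cannot tell padding from data. The Python sketch below shows the idea with a hypothetical 8-byte length field in front of the data; the actual layout used inside the shielded stream may well be different.

```python
import struct

DATA_BYTES = 223               # payload bytes per Reed-Solomon block (from the text)
HEADER = struct.Struct(">Q")   # hypothetical 8-byte big-endian length field


def add_size_and_pad(data):
    """Prefix the data with its length, then zero-pad to whole blocks."""
    framed = HEADER.pack(len(data)) + data
    padding = -len(framed) % DATA_BYTES
    return framed + b"\0" * padding


def strip_size_and_pad(framed):
    """Recover exactly the original bytes, dropping header and padding."""
    (size,) = HEADER.unpack_from(framed)
    return framed[HEADER.size:HEADER.size + size]
```

With this in place, even a file that itself ends in zero bytes comes back exactly as it was, because the recovery relies on the stored length and not on guessing where the padding starts.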

Conclusion

These tools work fine for me, and I always use them when
I back up data or move it around (e.g. from work to home).
As an example, when I move my Git repository around,
I always...
