Monthly Archives: November 2013

I’ve recently heard many folk talking about Time Machine backup strategies. To do it well, you really do need to backup your backup, as Time Machine can “eat itself”, especially doing network backups.

Regardless of whether your Time Machine backup is to a locally attached disk or a network drive, when you make a backup of your backup, you want to make sure it’s valid, otherwise you’re propagating a corrupt backup.

So how do you know if your backup is corrupt? You could read it from beginning to end. But this would only protect you from data corruptions that can be detected by the drive itself. Disk verify, fsck, and others go further and validate the file system structures, but still not your actual data.

There are “silent corruptions”, which is where the data you wrote to the disk comes back corrupted (different data, not a read error). “That never happens”, you might say, but how would you know?

I have two servers running SmartOS using data stored on ZFS. I ran a data scrub on them, and both reported checksum errors. This is exactly the silent data corruption scenario.

ZFS features full checksumming of all data when stored, and if your data is in a RAIDZ or mirror configuration, it will also self-heal. This means that instead of returning an error, ZFS will go fetch the data from a good drive and also make another clean copy of that block so that its durability matches your setup.

Here’s the specifics of my corruptions:

On a XEON system with ECC RAM, the affected drive is a Seagate 1TB Barracuda 7200rpm, ST31000524AS, approximately 1 year old.