If It's Instant, It Isn't A Backup

Pretty much every storage vender supports the concept of snapshots. Often these storage snapshots are presented to companies by the vendor as a way to take SQL Server backups without having to worry about eating large amounts of space. Some companies will even use this instead of tape backups. Due to the way these snapshots work they do not make reliable backup.

Remember for a moment what the cause of the large majority of database corruption is … it is a problem with the storage, where the storage either incorrectly writes a block to the disk or a physical disk fails causing the data to be lost. When using snapshots, the snapshots use a technique called “copy on first write” where, as blocks are written to for the first time by the disk array, the original block is copied to another location to be stored as part of the snapshot. Because the snapshot isn’t a full copy of the volume, and at first contains no blocks with just the new blocks being written, the snapshot isn’t a full copy of the database. As the snapshot isn’t a full copy of the disks the snapshot can’t be trusted to be a valid backup. If the source disks were to fail and the data on the production disk (where the database is stored) was lost, the snapshots would be useless as the blocks which haven’t been written to yet would have been lost.

The same would apply if a block which hasn’t been written to before, and therefor does not exist in the snapshot was to become corrupt, either by a cluster on the disk going bad, or the array writing data incorrectly to the disks, etc. the block wouldn’t be available for backups and the backup would not be valid.

This risk can be reduced but not eliminated by replicating the LUN which contains the storage we want to use snapshots for a backup as to another storage array, then taking the snapshots on that synchronously replicated array. This however may not completely protect you. If the production array (the one that the SQL Server is talking to directly) was to write a bad block to the array, that bad write could be replicated to the remote site.

The next problem with only using snapshots for backup is that you only have a finite amount of space which can be used for backups as SAN storage is very expensive. In the long run tape is much cheaper for long term storage as the tapes can be sent offsite and kept for effectively ever. Having the same data retention policy on active storage simply isn’t a cost effect solution for most people as eventually your snapshots will end up being the same size as your source disk so for a 100 Gig database, and you have snapshots which are taken daily, and you keep one years’ worth of snapshots you will end up having terabytes of space tied up in the snapshot backups, which could end up costing hundreds of thousands of dollars in disk space. With tapes on the other hand the tapes are very inexpensive and can hold hundreds of gigs per tape which can then be sent offsite to another location (usually via an offsite storage company) for long term safe keeping.

Discuss this Blog Entry 1

While I agree that the storage vendor's snapshot utilities do not create a reliable backup for databases, I'm pretty sure that you have not described snapshot properly. A snapshot DOES provide a reliable copy of every block, either by copying them all, if no baseline backup exists, or copying changed blocks, if there is a baseline. A snapshot that only captures new writes, without reference to a baseline backup, would obviously be unusable for any kind of restore (databases or just the simplest text file.)
It's true that in order to provide a semblance of consistency, snapshot DOES capture writes that occur during the snapshot process.. It does this so that it can capture the old values before they are written, assuring that it the backup corresponds to a moment in time.
You may have confused the notion of snapshot with a continuous (aka live or realtime) backup. Still, even this kind of backup depends on a baseline backup. Also, it writes the NEW block values, so that the backup has the latest data.
The reason that snapshots are not usable as database backups is not because some blocks were not backed up. Rather, it is because there is no guarantee that at a moment in time a single database data file is internally consistent. On a live database, the transaction log must be used with the data file to guarantee consistency.