Backup Strategy: Which is Better, Tape or Disk?

Cost versus performance, of course, is the primary deciding factor when devising your backup strategy. But what if it wasn't?

Historically, tapes offered high density at a tremendously lower cost than disk. These days, with 2TB disks selling for a few hundred dollars, tapes are no longer the cheaper alternative. They are potentially a safer backup medium, however.

The Hassle of Tapes

Huge cabinet-size tape libraries, with robots feeding fibre channel-attached tape drives, arranged in a row so that tapes can be passed between cabinets, and staff still manually removing tapes for off-site transport? This cannot be the pinnacle of backup technology.

On a smaller scale, tapes must sometimes be swapped daily, which can be a real pain for smaller IT departments. Depending on the amount of data, the compression level, and how many tapes a changer holds, the tape changing burden can be reduced with careful planning. If, however, you wish to have daily off-site backups, tape copies must be extracted from the changer daily.
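That planning is mostly arithmetic. The sketch below is one rough way to estimate it, assuming each night's backup starts on fresh tapes and that the compression ratio is something you have measured on your own data. The function names, the 800GB default capacity, and the example figures are illustrative assumptions, not values from any particular backup product.

```python
import math


def tapes_needed(data_gb, tape_capacity_gb=800, compression_ratio=1.0):
    """Return how many tapes one backup run consumes.

    compression_ratio is the assumed average ratio for your data
    (1.0 means no compression, 2.0 means data shrinks by half).
    """
    effective_capacity_gb = tape_capacity_gb * compression_ratio
    return math.ceil(data_gb / effective_capacity_gb)


def nights_per_load(changer_slots, nightly_data_gb,
                    tape_capacity_gb=800, compression_ratio=1.0):
    """Return how many nightly backups fit before the changer must be reloaded.

    Assumes every night starts on fresh tapes (no appending to a
    partially filled tape), which is a pessimistic simplification.
    """
    per_night = tapes_needed(nightly_data_gb, tape_capacity_gb, compression_ratio)
    return changer_slots // per_night
```

With a hypothetical 24-slot changer and 3TB written each night, nights_per_load(24, 3000) gives six nights between reloads, and twelve if the data compresses 2:1. None of this helps with daily off-site copies, which still have to leave the building every day.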

And then there is tape performance. LTO-4 tapes, running at a theoretical 240MB/s with 2:1 compression (120MB/s native) and storing 800GB uncompressed, simply pale in comparison to 2TB SATA disks, whose 3Gb/s interface tops out around 300MB/s and whose transfer rates scale roughly linearly when striped in RAID-0. We are at a point now where SATA storage is about the same price as LTO-4 tapes. IT shops often run into a scenario where data cannot be written to tape fast enough, and the backup window extends far into the morning. This impacts performance of live systems, and the only way to shorten the backup window (aside from backing up less data) is to have more disk cache where data is temporarily stored before being flushed to tape.
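To see whether a given setup will overrun its window, divide the data by the sustained throughput. The following is a minimal sketch of that calculation; the function names are made up for illustration, and the throughput you pass in should be a rate you have actually measured, since real drives rarely sustain their rated speed.

```python
def backup_hours(data_gb, throughput_mb_s, streams=1):
    """Estimate hours needed to write data_gb at a sustained rate.

    throughput_mb_s is the per-stream rate in MB/s; streams is the
    number of drives or disks written in parallel. Assumes perfect
    scaling across streams, which is optimistic.
    """
    total_mb = data_gb * 1000
    seconds = total_mb / (throughput_mb_s * streams)
    return seconds / 3600


def fits_window(data_gb, throughput_mb_s, window_hours, streams=1):
    """Return True if the estimated backup time fits inside the window."""
    return backup_hours(data_gb, throughput_mb_s, streams) <= window_hours
```

As a rough example, 10TB written to a single device at a sustained 120MB/s takes about 23 hours, which will not fit in an eight-hour overnight window; spread across four parallel streams at the same rate, it takes a little under six.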

Tapes, however, may be safer. One common backup strategy involves setting aside a certain number of tapes for quarterly snapshots. These tapes contain a full backup of the infrastructure, and are archived for 10 years or longer. With a wholly disk-based backup solution, this is not advisable. Disks fail quite reliably, and if they are powered up for 10 years you can almost guarantee that most will fail. This, then, is a perfect time to think about a hybrid model for backups.

Disk Backup Options

In the past, it was easy to argue against disk-based backups. The entire backup software infrastructure would have to be scrapped, as most major solutions from Veritas, Sun, and others did not support anything other than tape libraries for the final storage location. Now, all major backup solutions support raw disk storage in addition to tapes.

Disk backup is exactly what it sounds like: instead of writing out backup archives to tape, it stores them on disk. Advantages include:

Much faster data transfer rates

Versatility and ubiquity

Flexibility, in that remote backups can be transferred much more easily

Instead of paying a vendor to physically pick up tapes or host remote tape libraries, with disk backup you can easily place a server and SATA array in a remote location to accept copies of backup data. Not only does this save money, it also saves time and lessens the maintenance burden.

Backups serve as a point-in-time snapshot, preserving data and protecting it against accidental deletion. If, however, your goal is merely to protect against hardware failure, a large part of your backup strategy can exclude tapes. Real-time file system replication with tools like DRBD in Linux enables easy multi-site replication to protect against hardware failure. Likewise, some distributed file systems also allow for multiple copies to be stored, avoiding a single point of failure. Regardless of how redundantly the data is replicated, point-in-time snapshots for backup purposes must still be taken. Accidental deletion or file system corruption damages the live data, and that damage is then faithfully replicated to every copy. Without a real read-only backup, data would be lost forever.
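The difference between a replica and a snapshot is easy to demonstrate. The sketch below, which uses only the Python standard library, writes a timestamped archive of a directory and marks it read-only, so that a later deletion in the live directory does not touch it. It is an illustration of the idea and not a substitute for real backup software: the function name and file naming scheme are assumptions, and it does not handle open files, two runs within the same second, or off-site copies.

```python
import stat
import tarfile
import time
from pathlib import Path


def snapshot(source_dir, dest_dir):
    """Write a timestamped, read-only tar.gz archive of source_dir.

    Returns the path of the archive. The archive is independent of
    the live directory, so later deletions there do not affect it.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Timestamp in the file name makes each run a separate point in time.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # Drop write permission so the archive is not modified by accident.
    archive.chmod(stat.S_IRUSR | stat.S_IRGRP)
    return archive
```

A replicated file system would have propagated an accidental delete to every site within moments; the archive written by snapshot() still holds the file as it was when the snapshot ran.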

Virtual Tape Libraries

Initially, to compensate for backup software being unable to speak with anything but magnetic tape drives, various virtual tape libraries started showing up on the market. Virtual tape libraries generally contain tons of disk space, as well as the capability to "speak tape" with your back-end tape library infrastructure. They sit between the backup host and the tape library, presenting themselves to the backup host as a tape library. Policies can be set to configure how often (if ever) to flush data to tape.

Virtual tape libraries do offer disk-based storage with RAID assurance against a single disk failure, but their usefulness is debatable if your backup software is already disk-aware. Having to configure both backup scheduling in the backup software, and disk-to-tape aging rules in the virtual tape library, is a hassle. Most likely, your backup software can handle everything now.

The decision, in the end, comes down to a few surprisingly simple factors, since disk and tape prices are roughly similar right now. Tape infrastructure, however, is the deciding one. New disks can be added to very cheap storage arrays or even servers themselves, but more tape space requires towers, robots, and tape drives: oh my!

Do you overrun your backup window because tape transfer speeds just aren't fast enough? You can buy more of those expensive tape drives and optimize the backup rotation, or switch to disk-based backups.

Do you spend an inordinate amount of time swapping tapes due to lack of tower space? Does your tape infrastructure need upgrading? Your previous investment will not be wasted if you switch, as you will certainly want quarterly or monthly snapshots on tape.

Are you physically shipping tapes off-site? Instead of investing in tape technology or a hosted solution in a remote data center, switch to disk-based backups.

Charlie Schluting is the author of Network Ninja, a must-read for every network engineer.