A lot of sales may be linked to the growth of Electronic Medical Records, which are going to be required for full Medicare reimbursement in a few years. Database backups are a must for those systems and tape is probably the most cost-effective answer.

I've used tape for 25 years, across a variety of vendors and capacities and dozens of drives, and every single unit I've ever had has failed: not only the tapes themselves, but also the drives. People can't remember to cycle tapes, tapes die and people don't notice, and you can't buy the tapes 3 years down the line.

Disk is much simpler, and more robust.

We finally realized that we were backing up a very reliable medium with a very unreliable one.

Finally we switched to compressed backups stacked on cheap redundant network attached disk drives in small external enclosures. They can sit anywhere, even INSIDE the fireproof vault.

The software for this is readily available from a number of sources and you can use your same "tower of Hanoi" media cycling schemes as you might for tape. Because backups are bundled into one large file they can be stored, cataloged, archived, rotated, and purged via automated means.
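For anyone who hasn't run into it, here is a rough sketch of one common form of the "tower of Hanoi" rotation mentioned above: set A is reused every 2nd session, B every 4th, C every 8th, and so on, with the last set absorbing the remainder. The function name and the lettered set labels are made up for illustration; real backup software may number or schedule its sets differently.

```python
import string


def media_set_for_session(session: int, num_sets: int) -> str:
    """Return the letter of the media set to use for a 1-based backup session.

    The set index is the number of trailing zero bits in the session number,
    capped at the last available set. This is one textbook variant of the
    Tower of Hanoi rotation, not any particular product's implementation.
    """
    if session < 1:
        raise ValueError("session numbers start at 1")
    if not 1 <= num_sets <= 26:
        raise ValueError("this sketch labels sets A-Z, so num_sets must be 1..26")
    # (session & -session) isolates the lowest set bit; its position is the level.
    level = (session & -session).bit_length() - 1
    return string.ascii_uppercase[min(level, num_sets - 1)]


if __name__ == "__main__":
    # Three media sets over eight sessions.
    print(" ".join(media_set_for_session(n, 3) for n in range(1, 9)))
    # prints "A B A C A B A C"
```

The same function works whether a "set" is a tape, a disk in a caddy, or a directory holding one big backup file, which is the point being made above.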

Example: My last gig was with the DOE working on the US site for the CMS experiment for the Large Hadron Collider. We had around 5PB of spinning disk and 17PB of LTO4 tape storage for the detector data (you can't really back up 17PB offsite for a reasonable cost). We'd have bad tapes quite often, and it didn't matter if you did a verify at the end of the tape write before it was stored by the robotics.

Amanda works by backing up filesystems to dump files on the backup server, then writing those dump files all in one go. It might take an ancient system an hour to spool its dump to the backup server, but the tape doesn't have to worry about that.

Yep, but that's what it takes to keep it running above minimum speed. Writing a single large filesystem volume takes longer than writing 4 volumes almost as large, because the drive shoeshines with a single job. Minimum rate for the drive is 40MB/s; the best I have done with tuning on a 72-drive vraid6 volume is 35MB/s, and the average is closer to 25MB/s sustained, but 4 jobs from the same array on different volumes will give me 100-120MB/s. All volumes are spanned across all disks, so it's not a matter of more spindles being available; it's the latency in all the metadata lookups.
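As a back-of-the-envelope check on those numbers, the sketch below estimates how many parallel jobs are needed to stay above a drive's minimum streaming rate. It assumes aggregate throughput scales roughly linearly with the number of jobs, which is only loosely supported by the 4-job figure quoted above; the function name is hypothetical.

```python
import math


def min_parallel_jobs(drive_min_mb_s: float, per_job_mb_s: float) -> int:
    """Smallest number of concurrent jobs whose combined rate reaches the
    drive's minimum streaming rate, assuming rates simply add up."""
    if drive_min_mb_s <= 0 or per_job_mb_s <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(drive_min_mb_s / per_job_mb_s)


if __name__ == "__main__":
    # 40MB/s drive minimum, 25MB/s sustained from a single job.
    print(min_parallel_jobs(40, 25))  # prints 2
    # Four such jobs would give about 4 * 25 = 100MB/s under the same assumption.
    print(4 * 25)  # prints 100
```

A single job at 25-35MB/s never reaches 40MB/s, which is consistent with the shoeshining described above.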

While the data isn't backed up all in one spot, it can be a) reconstructed from other data, b) regathered from the 800+ other facilities we distribute chunks of the data to, or c) recollected. It's cheaper than the $8-12 million it would cost to back up all 17PB offsite (and that's taxpayer money).

IBM recently announced LTFS (Linear Tape File System), which allows one to operate LTO-5 tapes as if they were a normal file system.

That's a very exciting technology which allows for the standardization of tape formats -- its specs are freely available on the LTO Consortium website [trustlto.com] and the implementation has been released under the GNU LGPL (see the LTFS website [ibm.com] for links).

The media may be cheap, but the drives are expensive and sometimes proprietary. So you'd best be a big enough outfit to buy multiple drives. Not to mention that you need to replace tapes regularly. At $2000/drive and needing at least three, plus needing 60 tapes per year at $30 each... you could buy around 30-40 1TB hard drives, with carry cases or trays. And you need to lay out that $7500 right at the start, plus the $1800/year. That's a lot of money for a small business with under 20 employees.

(And most tape drives are more like $3k to $4k each.)
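To make the arithmetic above easy to replay with your own prices, here is a small sketch. The drive and tape figures come from the post; the $150 price for a 1TB disk is an assumption borrowed from elsewhere in this thread, and the $7500 up-front figure isn't itemized in the post, so the sketch only reproduces the parts that are.

```python
DRIVE_PRICE = 2000     # dollars per tape drive (from the post)
DRIVES_NEEDED = 3      # "needing at least three"
TAPES_PER_YEAR = 60    # replacement tapes per year
TAPE_PRICE = 30        # dollars per tape
DISK_PRICE = 150       # dollars per 1TB disk (assumption, not from this post)


def tape_costs() -> tuple[int, int]:
    """Return (up-front drive cost, yearly tape cost) in dollars."""
    return DRIVE_PRICE * DRIVES_NEEDED, TAPES_PER_YEAR * TAPE_PRICE


def disks_for_budget(budget: int, disk_price: int = DISK_PRICE) -> int:
    """How many whole disks the given budget buys, ignoring cases and trays."""
    return budget // disk_price


if __name__ == "__main__":
    upfront, yearly = tape_costs()
    print(upfront, yearly)            # prints 6000 1800
    print(disks_for_budget(upfront))  # prints 40
```

At the higher $3k to $4k drive prices, only `DRIVE_PRICE` needs changing.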

The big problem with tape for smaller shops is simply up-front cost. For $150, they can buy a single 1TB drive and write backups to that. Each week, they buy a new drive until they are rotating 5 or 6 of them. Or if they really need to get data offsite daily, they'll do a delta-backup over the WAN links. Or spend enough to have 10 hard drives in rotation.

- most backup software is designed to deal with tape libraries, not so much with shuffling B2D media around

- most archive companies are built around storing tapes; though I suspect there are ones which could deal with hard disks in external caddies

- tapes deal with stress from being transported continuously better than mechanical drives (also wear and tear of plugging and unplugging the interfaces all the time)

- I think unused tapes age better than unused hard disks, but I've nothing to back that up

Bandwidth to the tape drive itself rarely seems to be an issue for actual backups, since network and file I/O latency seem to be more significant issues. We never get anywhere near the maximum speed out of our LTO-4 drive, even when we're just duplicating data from the local array to the tape.

That's not much use if you want to be able to restore the individual files from the backup, which is nearly always desirable.

Disaster-recovery-only backups are okay, but if you're spending the money to archive your data you normally want a bit more flexibility.

Additionally there's the obvious problem of taking the server offline while you do the backup...

If you're pulling individual files off of tape, you're probably doing it wrong.

Back up across the network, to disk, first. You can build or buy a wide variety of arrays to do this for less than your tape drive costs, on average. Go large and rotate the storage mount points. We keep five days 'on line', and overwrite by schedule.

Write THAT data to tape, to be sent offsite.

On the upside, you can get any file from the last five days in less than an hour, without leaving your desk. More like fifteen minutes, really. Disks are for retrieval, tapes are for archive and disasters. Very clean, very simple, auditors love it.

On the downside you've doubled your costs, have additional overhead, and are probably adding lag to your tapes by extending the time-to-tape by a full day.
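The "five days on line, overwrite by schedule" rotation described above can be sketched roughly like this. The `/backup` base path and the `slotN` directory names are hypothetical, and picking the slot from the calendar day is just one simple way to do it, not necessarily how the poster's setup works.

```python
from datetime import date, timedelta


def slot_for(day: date, slots: int = 5) -> int:
    """Map a calendar day to one of `slots` rotating backup slots.

    Consecutive days land in different slots, and each slot is reused
    (overwritten) every `slots` days.
    """
    if slots < 1:
        raise ValueError("need at least one slot")
    return day.toordinal() % slots


def mount_point_for(day: date, base: str = "/backup", slots: int = 5) -> str:
    """Return the (hypothetical) mount point that holds the backup for `day`."""
    return f"{base}/slot{slot_for(day, slots)}"


if __name__ == "__main__":
    today = date.today()
    # Show where each of the last five days' backups would live.
    for back in range(5):
        day = today - timedelta(days=back)
        print(day.isoformat(), mount_point_for(day))
```

Whatever gets written to a slot is then what goes to tape for offsite storage, before the slot comes around to be overwritten again.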