Some companies have gone completely out of business because of an inadequate or nonexistent backup process (notice: I said "process", not software or script). Today I will review how I'm currently archiving "nearline" backups. I will purposely not cover storage, software or tape inventory in this thread, only hardware. The reason for sharing this is not simply that it could benefit others; perhaps selfishly, I also hope some of you might help me improve the current implementation.

As far as the software is concerned, whatever solution you adopt should fit your own needs and retention policy. My needs are likely to be very different from yours, which is why I'm not going to give details on which software is being used, although I will give the following short overview:

Backups are handled in multiple ways here: nearline storage gets replicated remotely, and backups are performed on a "large SGI". We use Amanda for some servers, as it has worked well for us in the past, but also some specific (commercial) storage archival software which works adequately with our NFS and FCP servers (no, not FTP; FCP: Fibre Channel Protocol).

The tape drive used is an HP LTO-4 (model 1760, mounted externally, SCSI ID #3, no terminator). For the sake of testing, I plugged this same drive into a few machines using a few different cards and ran some tests with an LTO-4 tape. I will review the results I got archiving a 45GB directory.

The CPU load on the Tezro was 100% on only one CPU during tar (~10% on the Altix, with 2% IOwait), so I think this is the current bottleneck. Tar (at least the GNU implementation) cannot be threaded; however, thanks to hundreds of thousands of users running it on different architectures and operating systems, it is reliable. A few attempts have been made to replace GNU tar, and it seems that all have failed so far.

During the xfsdump test, the load on the Tezro was shared amongst processors, which explains the discrepancy. Nevertheless, fewer people run XFS (and xfsdump) than tar, so I'm not completely sure it would be more reliable than tar.

Regarding APD (asynchronous personality daemon), the tests were done on a tezro with apd installed, but disabled. For consistency, test #8 was done on a tezro where apd has never been installed.

mia wrote: My conclusions on the results above are as follows: 37MB/s is bearable, if not satisfactory.

Be interesting to see how a fiber channel connected tape drive would fit into those results. An FC connection might also allow the use of a fiber channel switch to share a single tape drive between multiple systems.

To test both of those ideas I recently acquired (but haven't yet tested) a 1Gb FC-to-SCSI bridge from the Keeper. That one is a bare PCB I plan to install in a 2U tape drive enclosure, but FC-to-SCSI bridges can be had in free-standing rackmount enclosures. I haven't done any research into compatibility or function, so solely as an illustration of the type of device I'm describing, here's an example.

I would like to move to LTO-5 or LTO-6, although external or rackmount (no library) FC drives are difficult to find, even LTO-4 ones; it's all SAS now. Note that I have APD installed but disabled. In any case, I don't think APD supports LTO-4 drives.

I'm just as surprised and confused by this discrepancy. I have removed APD on this machine, yet I am unsure why I do not get identical results. Similarly, I get the same results on another Tezro, but this could be because both (and the others) were installed from the same IRIX netinstall.

Recondas, I have followed your procedure, to the same outcome.

Hhoffman, what throughput do you get when writing to a tape, either with tar or xfsdump (or dd, really), and which SCSI card are you using?

My mt status reports the following firmware: Ultrium 4-SCSI W51D, while Hhoffman is running W54D. This is the only possible explanation at this time.

I've upgraded the drive's firmware to W62D, which is the latest firmware available for the 1760, and then repeated the test on a very fast Linux/amd64 machine. The results are astounding: 19m04s (39MB/s) for a tar archive of this same directory (which was placed on very fast storage).

I have redone the "tar" test on the Tezro using this new firmware, and it completed in 24m00s (31 MB/s), which is really good. I'm fairly certain the firmware changed a few things, so I'm going to try different SCSI parameters now.
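As a sanity check on the MB/s figures quoted in this thread, here is a minimal Python sketch that recomputes average throughput from an elapsed time. It assumes the 45GB directory means 45 decimal gigabytes (1 GB = 1000 MB); the helper names parse_elapsed and throughput_mb_s are made up for illustration and do not come from any tool used here.

```python
def parse_elapsed(text):
    """Convert a duration written like '19m04s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


def throughput_mb_s(size_gb, elapsed):
    """Average throughput in MB/s, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / parse_elapsed(elapsed)


if __name__ == "__main__":
    # Elapsed times reported above for the same 45GB directory.
    for label, elapsed in [("Linux/amd64", "19m04s"), ("Tezro", "24m00s")]:
        print(f"{label}: {throughput_mb_s(45, elapsed):.1f} MB/s")
```

Under that assumption both results land within a fraction of a MB/s of the 39MB/s and 31 MB/s reported.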

I've repeated the tests, this time with a different tar blocking factor, as such:

timex tar cvb 4096 -f /dev/rmt/tps4d3 /path/to/files
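For reference, tar counts its blocking factor in 512-byte blocks, so the value given to b sets the size of each record written to the tape. A quick Python sketch of the arithmetic follows; record_size_bytes is only an illustrative name, and whether a given drive or driver accepts a record this large is something to verify on your own system.

```python
TAR_BLOCK_BYTES = 512  # tar's fixed block size


def record_size_bytes(blocking_factor):
    """Size of one tar record for a given blocking factor."""
    return blocking_factor * TAR_BLOCK_BYTES


print(record_size_bytes(4096))  # 2097152 bytes, i.e. 2 MiB per record
print(record_size_bytes(20))    # 10240 bytes, the GNU tar default
```

Larger records mean fewer, bigger writes to the drive, which is one plausible reason the throughput changes with this setting.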

Tezro: 15m53s (47MB/s)
"Very fast" Intel*: 19m04s (39MB/s)

Now I'm happy. I'll run multiple tests overnight to collect some statistics with different blocking factors. This is really much better: I can now write an 800GB tape in less than 5 hours. The payload written to tape is hardly compressible, so I'm not going to bother trying compression on this dataset; that will be the subject of another test.
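The "less than 5 hours" estimate can be checked with a short Python sketch. It assumes 800 decimal gigabytes per tape (the LTO-4 native capacity) and a sustained rate with no stops or repositioning; hours_to_fill is a name invented for this example.

```python
def hours_to_fill(capacity_gb, rate_mb_s):
    """Hours needed to write capacity_gb at a sustained rate_mb_s."""
    seconds = capacity_gb * 1000 / rate_mb_s
    return seconds / 3600


# Rates measured in this thread, in MB/s.
for rate in (31, 39, 47):
    print(f"{rate} MB/s -> {hours_to_fill(800, rate):.2f} h")
```

At 47 MB/s this comes to roughly 4.7 hours, while 39 MB/s would take about 5.7 hours.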

Yes, of course. I tried to get support from HP for the unit with IRIX, but they were not willing to help. My version of the scsi file simply came out of the LTO 3 version in the 'LTO Anyone thread'. Anyway, with my scsi file entry, I noticed some issues with tar (original IRIX version) and mt. But cpio (with the IRIX backup interface) and GNU tar worked fine. With mia's version, these issues disappeared.