
Slow backup (1MiB/s) of local machine (large DLE)

Ok, I am just starting with Amanda, per my post here ([url]http://forums.zmanda.com/showthread.php?t=2227[/url]). I have some large systems that I want to have a backup system locally attached to (due to their size). Locally I can back up to tape (LTO4) with GNU tar (v1.19 / Ubuntu 8.04.3) at ~105-110MiB/s with no problems. The drive array consists of 64 drives on hardware RAID controllers, using LVM to stripe the various RAID-6 sets. Raw I/O to the volume is over 500MiB/s.

When using Amanda with the configs below, the estimate phase takes a very long time (it fails to complete on /var/ftp), and from iostat it seems to be doing only 1-2MiB/s, which may itself be the problem. When the dump finally /does/ start on the disks, the backup runs at about 1-2MiB/s (nearly 10 hours to back up less than 100GiB).

I did notice that Amanda does NOT set any blocking factor for tar, so I wrapped tar in a script that adds a blocking factor of 1024 (512KiB). That is the same factor I use for direct tar backups, and it's what gave me the >100MiB/s numbers above.

Since this is so pathetic, I am hoping someone can spot something here, as I just can't believe that Amanda or any backup utility could be this bad by design.

I also noticed that all the taper 'PART' sections of the log mention 10240 kbp, which may be a hard-coded limit somewhere, but I can't find it or tell whether it even matters (it does seem very low, though).

----(amanda.conf)
org "BackupSetAA" # your organization name for reports
mailto "xxxxxx" # space separated list of operators at your site
dumpcycle 12weeks # the number of days in the normal dump cycle
runspercycle 12 # the number of amdump runs in dumpcycle days
tapecycle 40 # the number of tapes in rotation
runtapes 40 # number of tapes to be used in a single run of amdump
tpchanger "chg-manual" # the tape-changer glue script
tapedev "/dev/nst0" # the no-rewind tape device
changerfile "/etc/amanda/BackupSetAA/chg-manual.conf" # tape changer configuration parameter file
changerdev "/dev/null" # tape changer configuration parameter device
tapetype LTO4-HWC # what kind of tape it is
labelstr "^AA[0-9][0-9][0-9][0-9]*$" # label constraint regex: all tapes must match
dtimeout 1800 # number of idle seconds before a dump is aborted
ctimeout 30 # max number of seconds amcheck waits for each client
etimeout 3000 # number of seconds per filesystem for estimates
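The LTO4-HWC tapetype isn't shown above; something along these lines is what I have in mind, with blocksize raised from the small default as a guess at the streaming problem (the values are illustrative, not measured with amtapetype):

```
define tapetype LTO4-HWC {
    comment "LTO-4 w/ hardware compression (illustrative values)"
    length 800 gbytes      # native LTO-4 capacity
    blocksize 512 kbytes   # raise from the default to help keep the drive streaming
}
```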

Ok, after much playing I got it to about 40MiB/s, which is better, but still nowhere near the speed of a raw tar backup and right at the starvation point of the tape drive.

Another item I noticed is that program "DUMP" does not work with XFS volumes. It seems that 2.6.1p2 tries to use /sbin/dump for /any/ filesystem, whether or not that's correct (or it doesn't properly detect the filesystem type).
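To sidestep /sbin/dump entirely on the XFS DLEs, a dumptype forcing GNU tar should work; the dumptype name and the disklist entries below are just a sketch based on my layout, not anything Amanda ships:

```
define dumptype xfs-gnutar {
    program "GNUTAR"   # never fall back to /sbin/dump for these DLEs
    compress none      # let the LTO4 hardware compression do the work
    index yes
}

# disklist entries (paths from this setup):
# localhost /home     xfs-gnutar
# localhost /var/ftp  xfs-gnutar
```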

So it looks like there is still a major performance problem somewhere, though I can't find it in any of the docs, nor anyone really talking about pushing real speeds (i.e. >100MiB/s) with Amanda.

So far I've found that a holding disk slows down the backup, or at best has zero positive effect, and no DLE will ever fit on a single tape anyway. As for XFS, I tried compiling Amanda myself, but it still seems to head for /sbin/dump regardless of filesystem. I manually linked xfsdump to /sbin/dump, only to find that Amanda does not pass xfsdump's -e option to exclude files.

Currently the best performance I can get is with the config above, which, frankly, sucks: consistently 1/4 to 1/5 the performance of a raw GNU tar v1.19 backup of the same volumes.

Only the large partitions are XFS: /home and /var/ftp. The others (/ and /boot) are ext3.

The compilation found xfsdump, but for some reason it kept trying to use /sbin/dump anyway.

Anyway, I have since removed Amanda, mainly due to the performance issues, and moved over to Bacula, which is giving me 105-115MiB/s with very little CPU (<50% of two 2.4GHz cores). At these figures I can easily add another 1-2 LTO4 drives without affecting file serving.