I wonder if anyone has experience with this. Sorry if the question sounds stupid; I haven't made an attempt yet.
This is probably the most knowledgeable forum out there

The idea is to set up a 'cheap' store of virtual machines on ESXi for offsite backup.
How much time, and how much load, would it take to pipe ~4TB through gzip and onto an external drive via USB 2.0 once a week?

I'm glad you asked. Since you didn't tell us what kind of data you have, I assumed it's mostly JPEG, MP3 or AVI. In short, porn. Since those are already compressed formats, the compression you're likely to see even from gzip is minimal, so I assumed the full 4TB goes over the wire.

Now, USB 2.0 transfers at 480Mbit/s max. Ignoring filesystem and other hardware effects, and taking a leap of faith in assuming you'll actually sustain 480Mbit/s, you'd be transferring about 3.36e+7 Mbit of data, which works out to roughly 69,900s, i.e. just under 19.5 hours, on USB 2.0.
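For the curious, here's that back-of-envelope arithmetic spelled out, assuming 4 TiB of data and the theoretical bus rate (a real USB 2.0 link sustains much less, so treat this as a floor):

```shell
# 4 TiB expressed in (binary) Mbit, pushed at the theoretical 480 Mbit/s.
mbits=$((4 * 1024 * 1024 * 8))   # 4 TiB in Mbit
secs=$((mbits / 480))            # seconds at the full bus rate
echo "$mbits Mbit -> $secs s (~$((secs / 3600)) h)"
```

That prints roughly 69,905 s, about 19.4 hours, before any real-world overhead.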

If your data is already compressed (video, MP3, JPEG, etc.), you won't gain anything from compression; you'll actually lose, since such files on average come out slightly bigger. So if you pipe everything through gzip and half of the stuff is compressible and the rest is not, you come out with maybe a tiny bit of gain overall.

If it needs to be done in less than the ~20h you already got as an answer, go with SCSI/SAS/eSATA or USB 3.0, whichever is cheaper for you. With U320 at ~250MB/s (muhahaha) you will still need almost 5h...
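For reference, the same back-of-envelope for U320, assuming the full ~250MB/s (which, as the laughing suggests, you will never see once seeks kick in):

```shell
# 4 TiB over U320 SCSI at the theoretical 250 MB/s (decimal megabytes).
bytes=$((4 * 1024 * 1024 * 1024 * 1024))   # 4 TiB in bytes
secs=$((bytes / (250 * 1000 * 1000)))      # roughly 4.9 hours
echo "$secs s (~$((secs / 60)) min)"
```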

Why the laughing? Because seeking will destroy even that number. (There is a reason why modern desktop HDDs have trouble feeding old 10MB/s DLT drives: they are quick until they have to seek, and then everything hits rock bottom.)

Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.

After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).

Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.

Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)

On consumer-grade equipment, expect to see a random error every 10TB or so.
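A minimal sketch of that procedure in shell, assuming GNU md5sum and findutils; the SRC/DST paths here are placeholders for your datastore and the backup mount:

```shell
# Verify a backup by checksumming both trees and diffing the sorted lists.
# SRC and DST are placeholders; point them at your source and backup.
SRC=${SRC:-/vmfs/volumes/datastore1}
DST=${DST:-/mnt/backup}

# Checksum each tree; sort by path so traversal order doesn't matter.
(cd "$SRC" && find . -type f -exec md5sum {} + | sort -k2) > /tmp/src.md5
(cd "$DST" && find . -type f -exec md5sum {} + | sort -k2) > /tmp/dst.md5

# No diff output means every file made it over intact.
diff /tmp/src.md5 /tmp/dst.md5 && echo "backup verified"
```

Swap in sha1sum or similar if you prefer; the shape is the same.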

This is probably the best solution anyway: some TeraStation or similar with NFS support. I tested an 'off the shelf' external USB drive yesterday and it wouldn't be recognized by ESXi; it creates the device node and then destroys it straight away. Only a few USB drives are supported and known to work.

Akkara wrote:

Before you start your backup, do a recursive checksum of everything (md5sum or whatever else you prefer). Put this into a file which you also back up.

After your backup is done, checksum the backup and diff the two checksum files. (You may have to sort them - the traversal order often ends up different).

Find any differences? No? Consider yourself lucky. Yes? Examine *carefully* before doing anything rash: the error might have occurred during your 1st checksumming and the file is in fact OK. In that case, update both checksum files.

Yes, this will nearly triple the time required to do the backup and check. No, I don't know of a shortcut (short of using zfs), at least not if you value your data. (If you don't care as much, ignore this advice.)

On consumer-grade equipment, expect to see a random error every 10TB or so.

Thanks for the tip! I haven't gone into scripting yet, but this is something to consider as long as it stays below 24h. ESXi doesn't ship with e.g. rsync by default...

You are storing virtual machine images? Those compress decently. You really need to take a look at your dataset and determine whether compression is worth it, because as said before, already-compressed data usually grows slightly.
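If you want to check before committing to it, gzip a representative sample and compare sizes; a quick sketch (the filename is a placeholder for one of your own images):

```shell
# Rough compressibility check: compare original vs. gzipped size for one file.
f=sample.vmdk                   # placeholder; use a real image of yours
orig=$(wc -c < "$f")            # original size in bytes
comp=$(gzip -c "$f" | wc -c)    # size after piping through gzip
echo "original: $orig bytes, gzipped: $comp bytes"
```

Sparse or mostly-zero image files will shrink a lot; images full of already-compressed payloads will barely move, or grow.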