Is there a way with rsync to copy a large directory to multiple USB drives?

I have an 8.2 TB directory that we need to deliver to a client, and the client only wants 2 TB drives.

So my question is: is there a way to have rsync fill up the first USB drive and then, after it prompts for the next drive or I swap drives manually, know where it left off and continue copying to the second USB drive from that point?

But my real solution for you is to inspect the source directory by hand and break the content up by subdirectories in a way that makes sense. For example, you might have /src/engineering_files, /src/hr_docs, /src/ceo_data. Just sync each tree to a separate disk by hand.
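Before splitting by hand, it helps to know how big each subdirectory actually is. The following is only a rough sketch in Python: the function names and the `DRIVE_BYTES` figure are assumptions made for illustration, and a "2 TB" drive will hold somewhat less than this once it is formatted.

```python
import os

# Assumed usable capacity of a "2 TB" drive; adjust for the real formatted size.
DRIVE_BYTES = 2 * 10**12


def tree_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # do not count symlink targets
                total += os.path.getsize(full)
    return total


def subdir_sizes(src):
    """Map each immediate subdirectory of src to its total size in bytes."""
    sizes = {}
    for entry in os.scandir(src):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = tree_size(entry.path)
    return sizes


def oversized(sizes, capacity=DRIVE_BYTES):
    """Return the names of subdirectories that cannot fit on a single drive."""
    return sorted(name for name, size in sizes.items() if size > capacity)
```

Calling `subdir_sizes("/src")` gives the per-tree totals, and anything listed by `oversized()` has to be broken up further before it can go onto one disk.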

Also, tar is good at creating multi-volume archives. There might be some solution with tar that works for you.

It seems to me that this is almost identical to the old problem of packing files into size-limited news and mail messages.

Have a look at the various SHAR (shell archive) programs (there were lots of them), which not only packaged files but also split them into groups with a total size limit on each group. Some let you extract specific files from one specific group. Files that were too big were split into smaller segments over multiple 'messages'.

I am sure that software is still around.

For another method you could try the RAR archiver, which can generate a large split archive.
It can also recover files from just a few of the 'pieces'.

Please be sure to let us know whatever solution you do come up with!
It has a lot of relevance, not just to USB sticks, but to CD and DVD data storage as well.

ASIDE: this is actually known as the 'bin packing' problem, and its decision version has been shown to be NP-complete. That is, no known algorithm is guaranteed to find a 'perfect' solution in polynomial time. However, today's computers are fast enough that this is typically no barrier in any 'practical' situation.
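In practice a simple heuristic is usually good enough. Here is a minimal sketch of "first-fit decreasing" in Python: sort the items largest first and drop each one into the first drive that still has room. The function name is made up for this example, and the sizes below are hypothetical figures (in GB) chosen only so that they add up to 8.2 TB.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack named items into as few fixed-size volumes as this heuristic finds.

    sizes maps an item name to its size; capacity is the size of one volume.
    Returns a list of volumes, each a list of item names.
    """
    bins = []  # each entry is [free_space, [names]]
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} ({size}) does not fit on one volume")
        for b in bins:
            if size <= b[0]:  # first volume with enough room wins
                b[0] -= size
                b[1].append(name)
                break
        else:
            bins.append([capacity - size, [name]])  # start a new volume
    return [names for _free, names in bins]


# Hypothetical directory sizes in GB, packed onto 2000 GB drives.
example = {
    "engineering_files": 1900, "video": 1800, "scans": 1500,
    "ceo_data": 1200, "misc": 800, "hr_docs": 600, "mail": 400,
}
for number, names in enumerate(first_fit_decreasing(example, 2000), start=1):
    print(f"drive {number}: {', '.join(names)}")
```

For these made-up sizes the heuristic needs five drives, which is also the minimum possible, since 8.2 TB cannot fit on four 2 TB drives.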

Will keep you posted. Currently I am working on a Perl script. If this works, is there a good place to post something like this so the masses can have it? Is there anything else that needs to be done to the script before putting it to general use (e.g. putting GNU info in it)?

                ### rsync the files to the destination; optionally send the output to a log file for reading ###
                #`$rsync $bdir$sdir/$diff $ddir1 >> $logdir$ofile1$datestring.txt`; ### Use this line if you want to log it
                `$rsync $bdir$sdir/$diff $ddir1`; ### Comment this line out if you use logging
                print "$rsync $bdir$sdir/$diff $ddir1\n";
            }
            ### Changed this line to add 2nd if condition ###
            else
            {
                print "Free disk space (KB): $free2\n";

                ### rsync the files to the destination; optionally send the output to a log file for reading ###
                #`$rsync $bdir$sdir/$diff $ddir2 >> $logdir$ofile2$datestring.txt`; ### Use this line if you want to log it
                `$rsync $bdir$sdir/$diff $ddir2`; ### Comment this line out if you use logging
                print "$rsync $bdir$sdir/$diff $ddir2\n";
            }
        }
    }
}
else
{
    ### Print to screen that there is nothing left to rsync ###
    print "\nNothing to rsync\n\n";
}

No, I did not know this; good to know. The biggest issue I had was using rsync to copy files from disk to USB and then, when USB drive #1 fills up, rolling over to the second USB drive and so on. If rsync will do this as well, please by all means post the command-line arguments LOL.
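The roll-over idea itself (fill one drive, remember what has been copied, carry on with the next drive) can be sketched without rsync at all. This is only an illustration in Python, not the Perl script from this thread: `fill_drive`, the manifest file, and the `reserve` safety margin are all assumptions made for the example, and file names containing newlines would confuse the manifest.

```python
import os
import shutil


def pending_files(src, done):
    """Yield (relative_path, size) for files under src not yet copied, in a stable order."""
    for root, dirs, files in os.walk(src):
        dirs.sort()  # visit subdirectories in a repeatable order
        for name in sorted(files):
            full = os.path.join(root, name)
            rel = os.path.relpath(full, src)
            if rel not in done:
                yield rel, os.path.getsize(full)


def fill_drive(src, dest, manifest, reserve=64 * 2**20):
    """Copy not-yet-copied files from src to dest while they still fit.

    manifest is a text file (kept on the source machine) with one copied
    relative path per line; reserve is free space to leave untouched on dest.
    Returns the list of relative paths copied during this run.
    """
    done = set()
    if os.path.exists(manifest):
        with open(manifest) as fh:
            done = {line.rstrip("\n") for line in fh}
    copied = []
    with open(manifest, "a") as log:
        for rel, size in pending_files(src, done):
            if size + reserve > shutil.disk_usage(dest).free:
                continue  # does not fit here; leave it for a later drive
            target = os.path.join(dest, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(os.path.join(src, rel), target)
            log.write(rel + "\n")  # record success so the next run skips it
            log.flush()
            copied.append(rel)
    return copied
```

The idea is to run it once per drive: mount drive #1 and call `fill_drive(src, mount_point, manifest)`, swap in drive #2 and call it again with the same manifest, and stop when it returns an empty list.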

It is an integral part of rsync to only transfer the changes. It was specifically designed with slow modems in mind. This is what makes it different from a normal 'file copy' such as scp, cp, tar, cpio, and so on.

Rsync only replaces a file on the destination (breaking any hardlinked copies) if that file's data changes, which is why you can create large numbers of 'snapshots' (even once an hour) using very little disk space.
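By default rsync decides which files to skip with a "quick check" that compares size and last-modified time. The Python sketch below imitates only that check, not rsync's delta-transfer algorithm; the function name is invented for the example.

```python
import os


def needs_transfer(src_file, dst_file):
    """Return True if dst_file is missing or differs from src_file in size or mtime."""
    if not os.path.exists(dst_file):
        return True
    s, d = os.stat(src_file), os.stat(dst_file)
    # Compare whole seconds, since some filesystems store coarser timestamps.
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)
```

A copy that preserves timestamps (for example `shutil.copy2`) makes the destination pass this check, so a second run over the same tree has nothing left to transfer.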

Such rsync backups are not compressed, which allows each snapshot to look almost exactly like a simple full working copy of the directories that were backed up. That is, it is easy to search and access any file in any snapshot. You do not have to search through multiple incremental compressed backup files just to recover a specific bit of data, perhaps without knowing the exact filename that data is in. Just search for it directly as you normally would, across all the snapshots. It is the hard linking of unchanged files that gives the rsync multi-snapshot backup method such good 'compression'.
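To make the hard-linking idea concrete, here is a small Python sketch of a snapshot step: files unchanged since the previous snapshot are hard-linked to it, and everything else is copied. It is similar in spirit to what rsync's `--link-dest` option does, but it is not rsync's implementation, and the function names are made up for the example.

```python
import os
import shutil


def unchanged(a, b):
    """Quick check: same size and same modification time (to the second)."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def snapshot(src, new, prev=None):
    """Create snapshot directory new from src, hard-linking unchanged files from prev.

    Returns a (linked, copied) pair of file counts.
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(src):
        rel_dir = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(new, rel_dir)), exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            n = os.path.normpath(os.path.join(new, rel_dir, name))
            p = os.path.normpath(os.path.join(prev, rel_dir, name)) if prev else None
            if p and os.path.isfile(p) and unchanged(s, p):
                os.link(p, n)  # no new data blocks used for this file
                linked += 1
            else:
                shutil.copy2(s, n)  # new or changed file: store a real copy
                copied += 1
    return linked, copied
```

Each snapshot directory still looks like a complete copy of the source, yet only new or changed files take up additional space.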

However, hardlinks only work within the same disk storage mount, so each USB drive would have to have at least one full copy of the files being backed up. Also, hardlinked snapshotting requires... hard links, which require a UNIX-style filesystem. USB sticks typically only use a low-level VFAT filesystem (no hardlinks, and DOS file attributes) for maximum compatibility.

As such, USB sticks may need a different filesystem for this to work well, and larger USB drives with, say, an ext4 filesystem tend to work better. That allows more hardlinked snapshots from the initial full copy (or last snapshot, depending on how you look at it), and thus higher disk space savings (hardlink compression) per snapshot.
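One quick way to find out whether a mounted drive can hold hardlinked snapshots is simply to try making a hard link on it. A minimal Python sketch follows; the function name is an assumption, and the directory passed in must be writable.

```python
import os
import tempfile


def supports_hardlinks(directory):
    """Return True if a hard link can be created inside directory."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        original = os.path.join(tmp, "original")
        link = os.path.join(tmp, "link")
        with open(original, "w") as fh:
            fh.write("test")
        try:
            os.link(original, link)
        except OSError:
            return False  # typical for VFAT and similar filesystems
        return os.stat(original).st_nlink == 2
```

Pass it the mount point of the USB drive before starting; if it returns False, the drive probably needs a different filesystem for snapshot-style backups.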

ASIDE: The use of a cloud-based filesystem (like Dropbox) also precludes the use of hardlinks. As such, snapshotting to such a filesystem does not compress well, as you do not get hardlink sharing of files across individual snapshots.

However, making snapshot backups on a local machine of a (possibly encrypted) cloud-based 'working' filesystem that can be shared across devices should work very well.

That one local machine keeps 'snapshot backups' (perhaps working automatically in the background), while the cloud allows access to the actual working directory from multiple locations.

If something happens to the cloud, or your working directory gets corrupted for some reason, you have your highly hardlinked snapshots to recover from. It will then be straightforward to copy the last good snapshot to a new replacement cloud provider.