topics = why backup, what to backup, how to backup, backup devices and media

12.1 Why backup?

Backups are needed in case a file or a group of files is lost. The reasons for
losing files include

hardware failure, such as a disk breaking,

accidentally deleting the wrong file and

the computer being stolen.

Backups help in all the above situations. In addition, it may be good to have
access to older versions of files. For example, a configuration file worked a
week ago, but it has been changed since then, nobody remembers how, and it
just does not work anymore.

There are other solutions, and they are good to have if you can afford them.
These include

redundant disks (RAID 1 or 5), so that one disk can break without loss of data,

using an undelete system (or not making mistakes when deleting files :-) and

locking up computers.

These help, but if there is anything you do not want to lose on the computer,
make sure there are backups and they can be restored.

This chapter is more general in nature than the others, which are specific to
Debian GNU/Linux. There are so many different backup devices and backup
programs that it is difficult to go into details without assuming, for
example, a SCSI tape drive and GNU tar used to write backups. Section 12.3.1,
Floppy, gives three detailed examples; use them as a guide in doing backups to
other kinds of media.

12.2 What to backup?

If there is room on the backup media, and time limits permit running backups
long enough, it probably is wisest to back up everything. You may skip
/tmp or other places where it is known there are only temporary
files that nobody wants to back up.

If space or time limits place restrictions, consider not backing up the
following:

Files that come directly from a CD or other removable media. It may even be
faster to copy them again from CD than restoring from backup media.

Files that can be regenerated easily. For example, object files that can be
made with make. Just make sure all the source files and compilers
are backed up.

Files that are easy to download again, if the Internet connection is fast.
Just keep a list of the files and where to download them from.
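
As a minimal sketch of leaving out files that can be regenerated, GNU tar's
--exclude option can skip object files while the sources are archived. The
scratch directory, the file names and the choice of the pattern *.o are
assumptions made for illustration only.

```shell
#!/bin/sh
# Build a small scratch tree with one source file and one object file.
# All paths here are made up for the example.
work=$(mktemp -d)
mkdir -p "$work/project"
echo 'int main(void) { return 0; }' > "$work/project/main.c"
echo 'object code placeholder' > "$work/project/main.o"

# Archive the tree, leaving out object files that make can rebuild.
tar -czf "$work/project.tgz" --exclude='*.o' -C "$work" project

# List the archive to see what was actually stored.
tar -tzf "$work/project.tgz"
```

The same option can be repeated, for example to skip a directory of
temporary files as well.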

12.3 Backup devices and media

You need some media to store the backups. It is preferable to use removable
media, so that the backups can be stored away from the computer and the
storage for backups is in effect "unlimited".

If the backups are on-line, they can be wiped out by mistake. If the backups
are on the same disk as the original data, they do not help at all if the disk
fails and is not readable anymore. If the backup media is cheap, it is
possible to take a backup every day and store them indefinitely.

The following subsections discuss different kinds of backup media. As a
hands-on example, a floppy is used to back up directory trees in different
ways. Use these examples as a guideline when using other media.

12.3.1 Floppy

Floppy disks are cheap, and on PC computers there usually is a floppy disk
drive. On the other hand, it is not very fast to write to a floppy, and the
capacity of 1.4MBytes is not very much. If the backup does not fit on one
floppy, taking backups becomes an arduous task, what with having to stand by
and change the floppy every now and then.

However, if the data to be backed up fits on one floppy, floppies are a
reasonable alternative. Compressing the data usually means it uses about 50%
of the original size. Thus, you can expect to get almost 3MBytes on one 1.4M
floppy disk with compression.

The examples use tar, because it is available on almost all Unix
versions and also on some other operating systems. It can preserve file
ownerships and date stamps, and it can write directly to a device or to a file.

There are different ways to use the floppy with tar:

Create a Linux filesystem on the floppy, mount, write like to any Linux disk
and unmount the floppy,

write directly to the device /dev/fd0 and

use the floppy with DOS file system, and copy the tar file there with
mcopy (see info file mtools with command info
mtools).

These three methods correspond to three different classes of media:

Random access or direct access, mostly disks. The media is like any disk or
directory tree, it is possible to do ls, cp and other
commands accessing files. It is easy to restore one file from the backup media
by simply copying it back.

Serial access, like a tape drive. Reading or writing the media starts from the
beginning and goes to the end. It is not possible to start directly in the
middle.

Media that is a DOS file system. This is readable on all kinds of operating
systems, which is useful if you need to read the backup on some other kind of
computer.

12.3.1.1 Backup example 1, suitable for disk media

Here is an example of backing up the ~/Work/Debian-doc directory tree
to floppy. First check how large the directory tree is, to see whether it will
fit on the floppy.
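
A minimal sketch of such a size check with du. The fallback to the current
directory and the limit of 1440 kB (the nominal size of a 1.4MByte floppy) are
assumptions made for this example.

```shell
#!/bin/sh
# Directory to back up; falls back to the current directory if the
# example tree ~/Work/Debian-doc does not exist on this machine.
src="$HOME/Work/Debian-doc"
[ -d "$src" ] || src=.

# du -sk prints the total size in kilobytes, followed by the path.
size_kb=$(du -sk "$src" | cut -f1)
echo "$src uses $size_kb kB"

# 1440 kB is the nominal capacity of a 1.4 MByte floppy (assumed here);
# the filesystem itself needs a little of that space.
if [ "$size_kb" -le 1440 ]; then
    echo "should fit on one floppy uncompressed"
else
    echo "needs compression or several floppies"
fi
```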

This example uses the floppy as removable media with Linux filesystem.

Now a minix filesystem is created on the floppy. This is the filesystem Linux
uses for floppies and other small media. Note that all data on the media is
lost when it is formatted. The first floppy drive is device
/dev/fd0.
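
A minimal sketch of creating the filesystem, assuming the mkfs.minix program
from util-linux is installed. To keep the sketch safe to try, it formats a
1440 kB image file by default; the variable TARGET is an assumption of this
example, and setting it to /dev/fd0 (as root) would format the real floppy.

```shell
#!/bin/sh
# Where to create the filesystem. The default is a scratch image file;
# run with TARGET=/dev/fd0 as root to format the first floppy drive.
target="${TARGET:-$(mktemp)}"

# For an image file, first make it exactly as large as a floppy.
if [ ! -b "$target" ]; then
    dd if=/dev/zero of="$target" bs=1024 count=1440 2>/dev/null
fi

# mkfs.minix usually lives in /sbin, which may not be in a user's PATH.
PATH="$PATH:/sbin:/usr/sbin"
if command -v mkfs.minix >/dev/null 2>&1; then
    # This destroys whatever was stored on the target before.
    mkfs.minix "$target"
else
    echo "mkfs.minix not found" >&2
fi
```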

Now the floppy disk is mounted and the backup is run and tested. Note that the
mount point /A must already exist (see mount(8)). To
allow the ordinary user tale write access to the filesystem on this
floppy, the owner and group owner are changed.
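
The mount and ownership steps could be sketched as follows. Both commands need
root privileges, so the sketch only prints them unless the variable RUN is set
to an empty value. RUN and the group name tale are assumptions of this
example, not something the text above prescribes.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Run with RUN= (empty) as root to execute the commands for real.
RUN="${RUN-echo}"

floppy=/dev/fd0   # first floppy drive
mnt=/A            # mount point, must already exist
owner=tale        # ordinary user who will write the backup
group=tale        # assumed group name; adjust to your system

mount_floppy() {
    $RUN mount -t minix "$floppy" "$mnt"
    # Hand the top directory of the floppy filesystem to the user.
    $RUN chown "$owner:$group" "$mnt"
}

mount_floppy
```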

Now tar is used to copy the whole directory tree to the floppy.
It is better to use tar, because it preserves file ownerships and
permissions. If you try to use cp -r you will notice the backup
is not identical.
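
A minimal sketch of this step, written as a shell function so that it can be
tried on any directory. The function name backup_tree and the archive name
derived from the directory name are assumptions of this example.

```shell
#!/bin/sh
# backup_tree SRC DEST_DIR
# Write SRC as a gzip-compressed tar archive into DEST_DIR and list it.
backup_tree() {
    src=$1
    dest=$2
    name=$(basename "$src")
    # -C makes the archive hold relative paths such as Debian-doc/...
    tar -czf "$dest/$name.tgz" -C "$(dirname "$src")" "$name" || return 1
    # Reading the listing back is a cheap check that the archive is readable.
    tar -tzf "$dest/$name.tgz"
}

# With the floppy mounted on /A as described above:
#   backup_tree "$HOME/Work/Debian-doc" /A
#   umount /A        # as root, before removing the floppy
```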

We could read the listing (option t for tar) from the
floppy, so we can assume the backup is OK. Note that if you plan to read this
floppy back later, you must remember how it was written. If you forget that it
was written with tar and with compression, you will spend a lot of
time figuring it out. The same applies to all media that you store for any
longer period of time, and especially if you send the media to someone else.

The second method, writing with tar directly to the device
/dev/fd0 without creating a filesystem, is also usable with tape
drives. Replace the device name /dev/fd0 with the device name of
the tape drive, and you can use tar as a Tape ARchiver.
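
Writing directly to the device could be sketched like this. The variable DEV
and the scratch files are assumptions of this example; by default the archive
goes to a scratch file so that nothing is overwritten by accident.

```shell
#!/bin/sh
# Target of the archive. The default is a scratch file so the sketch can
# be tried safely; set DEV=/dev/fd0 (or a tape device) for a real backup.
dev="${DEV:-$(mktemp)}"

# Small scratch tree standing in for the data to back up (made up).
work=$(mktemp -d)
mkdir -p "$work/Debian-doc"
echo 'chapter text' > "$work/Debian-doc/backup.sgml"

# Write the compressed archive straight to the device: no filesystem,
# no mount. Whatever was on the media before is overwritten.
tar -czf "$dev" -C "$work" Debian-doc

# Read the table of contents back from the same device.
tar -tzf "$dev"
```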

If you have or want to use DOS formatted floppy disks, it is possible to use
them as in example 1: mount them with the flag -t msdos to tell
the mount command the filesystem type. In this example, however, we use DOS
floppies with the mtools commands.

The backup is not written directly to the floppy. It is first created in the
/tmp directory and copied from there to the floppy with the command
mcopy. For more information, use info mtools.
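
A minimal sketch of the two steps. The archive is really created, but the
mtools commands are only printed unless the variable RUN is set to an empty
value; RUN, the scratch tree and the archive name backup.tgz are assumptions
of this example.

```shell
#!/bin/sh
# Dry run by default: the mtools commands are printed, not executed.
# Run with RUN= (empty) and a DOS floppy in the drive to execute them.
RUN="${RUN-echo}"

# Small scratch tree standing in for the data to back up (made up).
work=$(mktemp -d)
mkdir -p "$work/Debian-doc"
echo 'chapter text' > "$work/Debian-doc/backup.sgml"

# Create the archive on disk first. The name fits the DOS 8.3 limit.
archive="${TMPDIR:-/tmp}/backup.tgz"
tar -czf "$archive" -C "$work" Debian-doc

# Copy the archive to the DOS floppy (a: is the first drive in mtools)
# and list the floppy to confirm the file is there.
$RUN mcopy "$archive" a:
$RUN mdir a:
```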

This method can be used when the backup needs to be read back on some other
computer, not necessarily running Linux. This is also useful for sending files
to some poor soul not running Linux. In this case, it is better not to use
tar; programs like zip, gzip and
zoo are available on most operating systems.

12.3.2 High Capacity Floppies

High capacity floppy disks are about the size of a floppy disk, store
100MBytes to 200MBytes, and their drives are faster and more expensive than
floppy drives. The drives are connected to an EIDE port, a parallel port, SCSI
or USB. LS-120 and Sony SuperFloppy drives can read and write ordinary
1.4MByte floppy disks.

You can use these drives as in the above examples where a floppy disk was
used, but you have to install the devices and the device driver software before
they can be used. The device name then depends on what kind of connection the
drive uses.

There is more info on using the above high capacity floppies on Linux in the
HOWTO documents. (reference to HOWTOs???)

Somebody with experience from the above devices: please confirm my
guessing above or send info on how they can be used.

12.3.3 CD-R and CD-RW

CD writers can be used as backup devices. Writable CD media is either writable
exactly once (CD-R), or erasable and rewritable (CD-RW). CD-R disks can be
read on ordinary CD drives, but CD-RW disks need Multi-Read capability from the
reader. This is good to know if you plan to read the CD back on some other
computer.

Assuming the CD writer is installed and configured correctly, and you have the
necessary software to write to the CD, taking backups is best done with the
first method in the floppy disk example above, i.e. creating a Linux file
system on the CD. Since the CD is a disk, i.e. a random access device, using
it this way is easy. Just mount it and copy files or whole directory trees
there.

Restoring is also straightforward, since the CD can be examined with ordinary
file system commands like ls, and it is easy to copy a single file
back. You can also compare the files in the backup to the files on the hard
disk, with diff for example.
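
A minimal sketch of such a comparison and of restoring a single file. Two
scratch directories stand in for the mounted CD and the hard disk; the
function name compare_backup and all paths are assumptions of this example.

```shell
#!/bin/sh
# compare_backup BACKUP_DIR ORIGINAL_DIR
# Recursively compare two trees; prints differences, returns 0 if none.
compare_backup() {
    diff -r "$1" "$2"
}

# Scratch trees standing in for the mounted CD and the hard disk (made up).
work=$(mktemp -d)
mkdir -p "$work/cd/doc" "$work/disk/doc"
echo 'version 1' > "$work/cd/doc/notes.txt"
echo 'version 1' > "$work/disk/doc/notes.txt"

if compare_backup "$work/cd" "$work/disk"; then
    echo "backup matches the hard disk"
fi

# Restoring a single file is an ordinary copy; -p keeps mode and timestamps.
cp -p "$work/cd/doc/notes.txt" "$work/disk/doc/notes.txt"
```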

The problem with CD-R disks is their write-once nature. They need to be
written all in one go, and then closed. After closing, it is not possible to
modify the CD, so if there is something wrong there it has to be thrown away.

CD-RW disks can be erased, but my understanding is that the whole disk must be
erased. In addition, it looks like formatting a CD-RW takes about one hour.

It looks like the program to write CDs is X-CD-Roast, available as the Debian
GNU/Linux package xcdroast. More information is available from the
X-CD-Roast web page.

12.3.4 Tapes

Tape drives are popular backup devices. The media is relatively cheap per
gigabyte, and tape capacities go up to several tens of gigabytes. On the other
hand, the tape drives may be expensive and write speeds slower than disks.

Tape drives with a SCSI connector should work with Linux. So-called floppy
tape drives that connect to the floppy disk interface may work if the ftape
driver supports the particular model.

12.4 Backup methods and software

Backup methods include simply copying files to another medium, using dd, tar
or a similar program to create an archive, and using special backup programs.

12.4.1 Network backups

In an enterprise environment there may be a backup server running some network
backup software. If there are GNU/Linux clients available for that software,
it is possible to install them, configure the client machine on the backup
server and start taking backups over the network. This is a low cost solution
if the backup server is already there, since GNU/Linux clients are sometimes
free to download.

12.4.1.1 Installing EMC NetWorker Client

EMC
NetWorker is a backup system formerly known as Legato. EMC supplies
NetWorker Client for GNU/Linux, but with almost no technical support. The
applications are available in RPM binary packages from the Legato FTP Site. The
tarball will uncompress to several RPM packages. A system that will only send
files to the backup server will need lgtoclnt-7.3-1.i686.rpm and
lgtoman-7.3-1.i686.rpm.

The client package will allow the backup server to connect to your system and
to request the files that need to be backed up. The configuration is almost
all done on the backup server. The client system only needs to know which
backup server to allow. Your contract should allow you to install the client
packages on your system as long as you have the server licence, but please
check with your EMC representative if you have doubts.

To install these packages on a Debian GNU/Linux system, they must be converted
to Debian .deb format. The files supplied by Legato are relocatable
.rpm files, and alien version 6.27 and earlier
cannot correctly convert these. Use alien version 6.28 or later.
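
The conversion could be sketched as follows, assuming the two RPM files named
above are in the current directory. alien normally has to be run as root, so
the sketch only prints the commands unless the variable RUN, an assumption of
this example, is set to an empty value.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run with RUN= (empty) as root, in the directory holding the RPMs.
RUN="${RUN-echo}"

convert_networker_rpms() {
    for rpm in lgtoclnt-7.3-1.i686.rpm lgtoman-7.3-1.i686.rpm; do
        # --to-deb asks alien for a Debian package.
        $RUN alien --to-deb "$rpm"
    done
}

# Show which alien is installed first; it must be 6.28 or later.
command -v alien >/dev/null 2>&1 && alien --version
convert_networker_rpms
```

The resulting .deb files can then be installed with dpkg -i.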

The conversion from RPMs is not perfect and you will have to perform a few
steps by hand to have everything running.

12.4.1.2 Setting up and Configuring

To get backups with Legato, there must be nsrexecd running on the
backup client host. This process communicates with the Legato server. On a
client-only system, other Legato processes do not need to be started by hand;
nsrexecd starts them as needed.

You need to tell nsrexecd from which backup server it can accept
connections. This can be done from the command line or from a resource file.
The preferred way is from a resource file:

Note that NetWorker uses the directory /nsr/, which is not LSB
compliant. There is no documented way to tell it to look in /etc/.

If you are using a firewall, you need to open the NetWorker ports. NetWorker
uses remote procedure calls based on Sun RPCs with its own portmapper. You
need to open the portmapper ports (7937:7938) and the RPC port range
(10001:10100). With shorewall you would do it by putting rules for those
ports in /etc/shorewall/rules.
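
A sketch of what such rules might look like, written to a scratch file for
review. The zone name loc for the backup server, the use of $FW for the
client itself, and opening both tcp and udp are assumptions of this example;
check them against your own shorewall zones and your NetWorker documentation.

```shell
#!/bin/sh
# Write the rules to a scratch file for review; once the zone names are
# right, append them to /etc/shorewall/rules as root.
rules="${RULES:-$(mktemp)}"

# Columns: ACTION SOURCE DEST PROTO DEST-PORT(S)
# "loc" (zone of the backup server) is an assumed zone name, and $FW is
# shorewall's own name for the firewall host; the quoted EOF keeps the
# shell from expanding it.
cat > "$rules" <<'EOF'
ACCEPT  loc  $FW  tcp  7937:7938
ACCEPT  loc  $FW  udp  7937:7938
ACCEPT  loc  $FW  tcp  10001:10100
ACCEPT  loc  $FW  udp  10001:10100
EOF

cat "$rules"
```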

Next go to the backup server, and do a test run, where nothing is actually
saved to tape, but the Legato server contacts the client and probes the file
systems. This way you can check most of the functionality and can see what
Legato would do when started for real.

Now Legato should be set up properly, and automatic backups run as configured
on the server.

12.4.2 Tar et al

Meta: Tar, dump, dd, cpio

Now for some examples.

In Debian GNU/Linux the tar program is GNU tar, which has several
extra features. Among them is support for compressing the tar file while it is
created. On average, compression squeezes the file to about 50% of the
uncompressed size. Your mileage may vary: files that are already compressed,
like *.zip and *.gif files, do not compress at all, and some files compress
especially well. C source code files can go to 25% of the uncompressed size.
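
A minimal sketch of taking a compressed tar and comparing the sizes before and
after. The function name compressed_backup is an assumption of this example.

```shell
#!/bin/sh
# compressed_backup SRC_DIR ARCHIVE
# Create a gzip-compressed tar of SRC_DIR and report both sizes in kB.
compressed_backup() {
    src=$1
    archive=$2
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "original:   $(du -sk "$src" | cut -f1) kB"
    echo "compressed: $(du -sk "$archive" | cut -f1) kB"
}

# For the home directories (as root, to be able to read every file):
#   compressed_backup /home /tmp/home.tgz
```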

The tar file /tmp/home.tgz can be copied to another
disk, or to another computer. If you do not have any backup device, but have
two computers with free disk space, take a compressed tar of the
most important files and copy the tar to the other computer.

12.4.3 Backup software

Meta: amanda, other backup software in Debian

I do not have time to study these now, contribution would be welcome.

afbackup

amanda

dds2tar

floppybackup

ftape-module

jaztool

kbackup

mirrordir

tob

12.5 Types of backup

Meta: Full, incremental, differential, network, dump, level 0--9,

There are different kinds of backups. The following lists some of them:

Full

Full backup means backing up everything.

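Incremental

Incremental backup means backing up everything that has changed since the last
backup of any kind, full or incremental.

Differential

Differential backup means backing up everything that has changed since the
last full backup, regardless of any backups taken in between.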

Network

Network backup usually means backing up a client to a backup server. The
client sends the files to the server and the server writes them to the backup
medium.

Dump

Dump backups are not ordinary file by file backups. The whole disk partition
or file system is "dumped" to the backup medium as is. This means it
is also necessary to restore the whole partition or file system at one go. The
dump backup may be a disk image, which means it must be restored to a similar
disk with same disk geometry and bad blocks in same places. Watch out for
this.

Level 0 -- 9

Level 0 to 9 backups are a finer grained version of incremental backups. Level
0 is a full backup, and a level N backup means backing up everything that has
changed since the last backup of a lower level.
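
Backup levels can be sketched with GNU tar's --listed-incremental option,
which records the state of the tree in a snapshot file. This is one way to get
level 0 and level 1 backups with tar, not the behaviour of the dump program
itself; the scratch tree and all file names are assumptions of this example.

```shell
#!/bin/sh
# Scratch tree standing in for real data (all names are made up).
work=$(mktemp -d)
mkdir -p "$work/data"
echo 'first'  > "$work/data/a.txt"

# Level 0: full backup. tar records what it saw in the snapshot file.
tar -czf "$work/level0.tgz" --listed-incremental="$work/snap0" \
    -C "$work" data

sleep 1
echo 'second' > "$work/data/b.txt"

# Level 1: work on a copy of the level 0 snapshot, so that the next
# level 1 run would again be relative to the full backup.
cp "$work/snap0" "$work/snap1"
tar -czf "$work/level1.tgz" --listed-incremental="$work/snap1" \
    -C "$work" data

# The level 1 archive should hold the new file but not the unchanged one.
tar -tzf "$work/level1.tgz"
```

Restoring means extracting the level 0 archive first and then the level 1
archive on top of it.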

Meta: Check the backup can be restored, with original file owners, permissions
and timestamps.

To be useful, a backup must be restorable. Very often not only the contents of
the files are important, but also their time stamps, permissions and owners.
Check that you can restore the backup so that all these are preserved.
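
A minimal sketch of such a check: a file with a known mode and an old time
stamp is backed up with tar, restored to another directory and compared. The
scratch directories and file names are assumptions of this example, and
checking that owners are restored requires running the extraction as root.

```shell
#!/bin/sh
# Scratch tree with a known mode and an old timestamp (names made up).
work=$(mktemp -d)
mkdir -p "$work/orig" "$work/restored"
echo 'secret' > "$work/orig/file.txt"
chmod 640 "$work/orig/file.txt"
touch -t 200001011200 "$work/orig/file.txt"

tar -czf "$work/backup.tgz" -C "$work/orig" .

# -p restores the stored permissions; restoring owners as well
# requires running the extraction as root.
tar -xzpf "$work/backup.tgz" -C "$work/restored"

# Compare content, then list both copies to compare mode, owner and time.
cmp "$work/orig/file.txt" "$work/restored/file.txt" && echo "content identical"
ls -l "$work/orig/file.txt" "$work/restored/file.txt"
```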