Backing Up Files in NT

Mention the word backup to a typical computer user, and
you'll probably hear something like, "I never do it." The concept of
backing up files is often disregarded and poorly understood in computer circles.
Unfortunately, the consequences of not properly backing up your files can put
you out of business.

Starting with this article, the Windows NT Magazine Lab digs into
the topic of backing up files in Windows NT. Over the coming months, the Lab
will review various backup applications for NT. Last year, few NT backup
applications existed. Today, vendors provide numerous NT backup programs for
everything from protecting the data files on your workstation to enterprise
Hierarchical Storage Management (HSM).

This month, I'll concentrate on the importance of backing up your files
correctly. I'll also look at backup solutions and hardware considerations for
different network environments.

The Importance of Backups
Backing up your files is pointless unless you establish and follow a backup
strategy and take all the necessary precautions. Let me give you a real-world
example. I consulted for a small company (about 15 employees) that performed
daily tape backups of its server and copied critical files to notebook computers
using batch files. The company established a procedure to back up the latest
files from the server and take the backup tapes off site every Friday. However,
the company became lazy and did not take any tapes or the notebooks off site.
Instead, the company left the backup files on site. This approach seemed
relatively foolproof. After all, the employees were performing backups in a way
that provided redundancy in case something happened to the master files.

Despite taking these precautions, the company suffered a catastrophic loss
when a fire burned down the building. A firefighter saved the server 20 minutes
before the fire would have destroyed it--the backup tapes and notebooks were
useless because they burned in the fire. Fortunately, I was able to recover the
files from the damaged server, and the network was up and running the next day.

So what's the moral of the story? Although the company was backing up its
files, it wasn't taking all the necessary precautions.

This example points out some common misconceptions about performing backups.
First, never assume that your company is following its prescribed backup
policy. Compliance is an important part of backup. You need to establish the
optimal strategy for your backups and then stick to it. The company in the
example performed its backups faithfully but skipped the step of moving the
tapes and notebooks off site. Considering that roughly 80 percent of all
businesses that lose their database go out of business, this company was very
lucky.

Second, don't be ashamed to copy files to a hard disk (e.g., using
notebooks to maintain copies of data)--tapes are not the only medium that works.
Finally, don't assume that just because you've backed up your files, your data
is safe. Store your backups off site.

Likewise, don't rely on fault-tolerant systems as your backup systems. As
the name suggests, fault tolerance lets systems continue to function when
something goes wrong. When all fault tolerance fails, you use your backup tape
to restore the system and data. Backups are the last-ditch effort to save data
and systems.

But suppose the tape is no good? A common problem with backup procedures is
the failure to verify tapes. If you expect the data on a tape to be good for
five years, you are probably in for a surprise. (An employee at Digital
Equipment lost years of email as a result of a bad tape header. Although
specialized companies can now recover such data for you, the cost is high.) How
can you tell if the data on the tapes is OK? Simply restore the tape to confirm
that its contents are readable, and keep as much redundancy as you deem
appropriate (e.g., if necessary, have more than one
tape with the same information). All these steps (performing backups on a
regular basis, storing your backups off site, and verifying the quality of the
data on the backup tapes) are essential to a sound backup strategy.
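
To make the verification step concrete, here is a minimal Python sketch that compares a test restore against the original files by checksum. It assumes you have already restored the tape to a scratch directory; the function names (file_digest, verify_restore) and the directory layout are hypothetical, and the sketch doesn't represent any particular backup application.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns a sorted list of relative paths (POSIX style) that are
    missing from restored_dir or whose contents differ.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for source_file in source_dir.rglob("*"):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(relative.as_posix())   # file never made it back
        elif file_digest(source_file) != file_digest(restored_file):
            problems.append(relative.as_posix())   # contents do not match
    return sorted(problems)
```

An empty result suggests the tape restored cleanly. Files that changed after the backup ran will also show up as mismatches, so a list of checksums recorded at backup time is probably a better reference than the live files.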

The Point of the Exercise
Backups not only let you circumvent computer disaster but also work with fault
tolerance to let you rebuild a crashed server; frankly, backups can save your
job. You need to back up your data for the following reasons:

Catastrophic losses: Disasters, such as the fire I described
previously, can happen.

User-induced errors: Users can accidentally delete files or lose code
because of an improper command.

Hardware failures: Hard disks can fail, and power supplies can short out.

Vandalism and security failures: Hackers can destroy or alter files.

Software failures: Entire databases can become corrupt.

Audits: You need to produce archived data for legal purposes.

You need to take all backups seriously. File backups can have serious legal
repercussions because a court can subpoena them with due cause. Be
careful about what you back up.

Successful backup strategies must be set as policy at the company level.
Nothing is more frustrating than establishing a backup strategy without company
support--without it, you will probably have a hard time accomplishing your
objective, and you may not receive adequate funding for proper backups.

The Right Strategy for Your Environment
When you consider your backup strategy, you need to account for the effect
on the bandwidth of your network. Many companies use hubs rather than switches
on their networks, and hubs are notorious for creating bottlenecks. If you use
Ethernet, I recommend you add switches with full-duplex capability. For a
network backbone, try using at least 100Base-T, and for large environments, I
suggest asynchronous transfer mode (ATM) or a comparable backbone.

After you have done your best to minimize the bottlenecks, you need to
decide what backup drives and devices are best for your environment. You can
define levels of backup based on network complexity. The types and sizes of
files needing backup can also affect your decision. For example, if a site is
concerned primarily with programming, backing up many small-sized files can be a
serious issue. Many backup applications do not handle small-sized files
efficiently. Other businesses may require long-term storage and retrieval of
files. In these environments, backup applications must maintain large databases
of files and information for the user with transparent retrieval of data (i.e.,
the system must be able to let you retrieve data even if the user isn't
present). Let's look at backup solutions for five distinct environments.

Standalone workstation. In a standalone-workstation
environment, the user needs to back up the files on a local machine. For such an
environment, NT's native Backup program is usually sufficient. NT Backup lacks
some of the features of third-party backup applications but is sufficient for
desktop backup (NT Backup is not for network backups because it has problems
backing up remote Registries). Simply copying files to a second hard disk can
also work well.
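
As a rough illustration of copying files to a second hard disk, the following Python sketch copies only the files that are new or have changed since the last run. I use Python here rather than a batch file, and the function name copy_changed_files and its arguments are hypothetical. The sketch treats a file as changed when its modification time is newer than the backup copy's.

```python
import shutil
from pathlib import Path


def copy_changed_files(source_dir, backup_dir):
    """Copy files that are new or newer than their backup copy.

    Returns the list of relative paths (POSIX style) that were copied.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        target = backup_dir / relative
        if target.exists() and target.stat().st_mtime >= source_file.stat().st_mtime:
            continue  # backup copy is already up to date
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)  # copy2 also preserves timestamps
        copied.append(relative.as_posix())
    return copied
```

Because the copy sits on a disk in the same machine, this approach does nothing for off-site protection. It also never deletes files from the backup directory, so files removed from the source remain in the copy.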

Workgroup. In a workgroup environment, you might have to
back up files for small groups or single systems. Most users attempt to use NT
Backup for workgroups. Unfortunately, NT Backup will not copy remote Registries,
and the batch files that allow remote backup typically expose usernames and
passwords. Therefore, you need to consider using different backup applications.
A workgroup environment will most likely have PCs, but it can also have
Macintoshes and Sun Microsystems SPARC stations. Depending on your workgroup
configuration, you need to incorporate a backup application that includes all
the appropriate agents for your various systems.

Department. A medium-scale client/server department
environment typically consists of fewer than 500 machines. In this environment,
the system automatically saves many files to a relatively small database (10GB
to 20GB) on strategic servers that you need to back up. One concern in this
environment is user compliance because users can save files locally instead of
saving them to a network server. You must account for these local files if they
are important to your business.

Backing up 500 machines is a long process. You need to consider the number
of backup servers, the types of backup devices, use of centralized control, and
the heterogeneity of the desktop and server operating systems. One solution for
this moderately complex environment is to use batch files to copy changed files
to a central system that you can back up. Such setups require serious attention
and a dedicated staff. Alternatively, you can insert several backup servers and
do multiple backups at once. In this situation, you want to maintain a high
backup rate--I suggest you shoot for a backup rate of more than 30MB per minute
(an easy task if you have adequate bandwidth and hardware).
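
To see what that rate means for the backup window, consider this back-of-the-envelope Python sketch. It assumes 1GB equals 1024MB and that the load divides evenly across backup servers; the function name backup_minutes is hypothetical, and real throughput will vary with file sizes and network load.

```python
def backup_minutes(data_gb, rate_mb_per_min, parallel_servers=1):
    """Estimate minutes needed to back up data_gb gigabytes.

    Assumes 1GB = 1024MB and that the load divides evenly across
    parallel_servers, each sustaining rate_mb_per_min.
    """
    if rate_mb_per_min <= 0 or parallel_servers < 1:
        raise ValueError("rate and server count must be positive")
    total_mb = data_gb * 1024
    return total_mb / (rate_mb_per_min * parallel_servers)


# A 20GB departmental database at the 30MB-per-minute target:
single = backup_minutes(20, 30)      # about 683 minutes (roughly 11.4 hours)
# The same data split across three backup servers:
split = backup_minutes(20, 30, 3)    # about 228 minutes (roughly 3.8 hours)
```

At 30MB per minute, a full backup of 20GB on one server won't fit in a typical overnight window, which is one reason to treat that figure as a floor and to run several backups at once.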

Enterprise. The enterprise environment consists of a
network with large-scale systems and fewer than 1000 units. The features of this
environment are similar to the departmental network, but the database in the
enterprise environment typically approaches 100GB. In such an environment, you
must incorporate all backup resources to reduce workload. Backup becomes a
physical issue--you must back up more information in a smaller timeframe. In
this situation, you can migrate files to magneto-optical (MO) storage
towers using defined criteria. For example, you can state a rule to migrate all
.doc files not opened in the past three months to the file repository if the
files' hard disk reaches 40 percent of its capacity. This process is transparent
and is part of many HSM implementations. HSM is the process of automatically
storing data on the lowest-cost devices (magnetic disk, optical disk, and tape)
that can support the performance that the applications require. Users see the
storage as one logical unit, and file access is completely transparent. This
approach minimizes storage costs while optimizing performance. In such
circumstances, you use backup with file storage maintenance.
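
The migration rule in the example can be sketched in a few lines of Python. This sketch is only an illustration of the selection logic, not the way any HSM product works: the FileInfo record, the select_for_migration function, and the approximation of three months as 90 days are all my assumptions.

```python
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileInfo:
    name: str
    last_access: float  # seconds since the epoch


def select_for_migration(files, disk_used_fraction, now=None,
                         extension=".doc", idle_days=90,
                         capacity_threshold=0.40):
    """Return names of files the example rule would migrate.

    Nothing is selected until the disk reaches capacity_threshold.
    After that, files with the given extension that have not been
    accessed for idle_days are selected.
    """
    if disk_used_fraction < capacity_threshold:
        return []
    now = time.time() if now is None else now
    cutoff = now - idle_days * SECONDS_PER_DAY
    return [f.name for f in files
            if f.name.lower().endswith(extension) and f.last_access < cutoff]
```

A real HSM implementation would also leave a placeholder behind so that users can still open the migrated file transparently; the sketch covers only the decision about which files qualify.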

If you use an active tape library device, look for one that requires as
little user intervention as possible. Numerous tape devices are available; some
even use RAID configurations and are fast. These devices are expensive, and you
can justify the cost of these units only in large enterprise environments.

Enterprise-plus. The enterprise-plus environment is
large-scale (with more than 1000 systems), multiplatform, and built on a
heterogeneous mix of operating systems (e.g., UNIX, NT, NetWare, Mac, Windows
95, and Windows 3.x). This type of environment is hard to maintain and
political in nature, with users wanting to control their systems rather than
operating under centralized control. Although these problems exist on
small-sized networks, the size of the user base in this environment creates
serious IS problems.

Database sizes in this environment can be almost limitless, so backup
strategies take on a new complexity. Centralized control is difficult across
distant LANs and WANs; thus, you must segment backup into logical groupings. In
this environment, you perform backups at the departmental level. In many cases,
you have to add HSM to maintain adequate system control. In addition, you have
to back up the HSM data. Standardizing on applications and desktops is optimal
in such an environment, but standardization is a company policy issue. The
complexities of these large networks make backup difficult. Every aspect of
backup can spell the difference between success and failure.

In addition to selecting the right backup strategy for your environment, an
important aspect of backup is selecting the right hardware. You need to choose
carefully. Most IS managers selecting backup hardware are limited by monetary
constraints. However, the type of device dictates the success or failure of the
backup. Without proper equipment, backup can become a nightmare because the time
available for backup is always diminishing, but the amount of data is always
dramatically increasing.

The Right Hardware for Your Environment
Many vendors offer backup drives and libraries for NT (for a list of NT
backup vendors, see "Buyer's Guide for NT Backup Solutions").
I've looked at the following devices:

HP SureStore DAT8--DDS-2 drive

HP SureStore DAT24--DDS-3 drive

HP SureStore DAT24x6--DDS-3 autoloader

Exabyte 8700--8mm drive

Exabyte EXB-8505--8mm drive

Exabyte 210--Tape library with two 8505 drives and barcode reader

Exabyte Eliant 820--8mm drive

HP 20XT--MO jukebox (for temporary storage or HSM)

HP SureStore 40--MO jukebox

HP SureStore DLT30--DLT drive

Qualstar TLS-4000 series--8mm tape libraries

Finding a backup device suitable for your environment doesn't have to be
hard. For standalone backups, anything goes. I prefer the SCSI tape units
because the drivers are more common than the quarter-inch cartridge (QIC--for a
list of backup-related terms, see the sidebar, "Backup Terms and Technologies")
drivers in NT. If you decide to use an IDE unit or a special card for
compression, make sure it comes with an NT driver. If you use NT Backup,
remember that it does not support software compression.

For small environments, the new 4mm and 8mm drives offer increased storage
and speed. HP's DDS-3 drives offer data transfer rates that are comparable to
digital linear tape (DLT) drives using DLT2000XL tapes. The HP SureStore DAT24x6
autoloader offers up to 144GB of storage. These devices are ideal for small- and
medium-sized networks. Old drives, such as the Exabyte EXB-8505 and the HP
SureStore DAT8, still work well but are not as cost effective as the new drives
because they're slow. If you choose a jukebox, remember that you must still
unmount and expel the backup tapes and store them off site or in protected
(fireproof) and cooled tape safes.

Large environments require specific hardware. Large library add-ons are
available for such devices as DLT drives that use DLT7000 tapes. For example,
some companies run six DLT units that share 100 tapes. The backup rate in such
environments is amazing. These systems are often limited only by the lack of
adequate bandwidth on the controlling computer. Each jukebox requires several
SCSI IDs or logical unit numbers (LUNs--more than one device per SCSI ID). You
need an ID for the robotics arm and one for each tape drive. The HP SureStore
DAT24x6 uses logical unit assignments SCSI ID# LUN 0 and SCSI ID# LUN 1.

Another strategy is to place backup devices in strategic sites throughout
the environment and control them through a centralized staging area. (Most
backup applications are adopting this type of approach.) The backup device you
choose depends on the type of data, whether users are responsible for backing up
their machines, and the size of the backups.

Choosing tape libraries is an important decision in a large network
environment. I have extensive experience with the Exabyte 210 and the Qualstar
TLS-4000 series. The Exabyte 210 is an industry standard. Unfortunately, you
must use proprietary drives in the Exabyte 210 library. In addition, the Exabyte
210 uses belts and gears to move the robotics that insert and remove tapes. This
process is tedious, and any inventory takes considerable time (e.g., 5 minutes).
Likewise, you cannot open the case to remove or add tapes or to look at tape
labels without taking the unit offline. Every time you open the case, the
system has to re-inventory the tapes. Despite these limitations, the Exabyte 210
runs well when you set it up properly.

The Qualstar TLS-4000 series differs from the Exabyte 210. The TLS-4000
libraries have a slot (I/O port) in the front that lets you insert tapes without
opening the unit (this feature is also available on some Exabyte units). Of more
significance is the design of the unit. The robotics use a lead-screw mechanism
to move tapes. This method is much faster than belt and gear units. In addition,
the TLS-4000 libraries have inventory sentries that minimize offline time. You
can open the door on these libraries to read tape labels without having to reset
the tape inventory.

The TLS-4000 series can use any standard tape drive that Qualstar has
qualified. This feature simplifies replacing standard drives in the libraries.
Finally, the TLS-4000 libraries have non-volatile RAM that stores system data
when you lose power. For environments requiring large tape libraries, the
Qualstar units deserve serious attention because of their device support from
4mm to 8mm. In my experience, the Qualstar TLS-4000 series libraries are more
amenable to enterprise usage than comparable tape libraries such as the Exabyte
210 (the Exabyte 210 is as easy to maintain, but maintenance takes more time).

The use of MO storage towers is essential in large environments because
these towers are bigger and faster than traditional tape backup devices. The
applications that control these devices can dump a lot of data on these units,
making them ideal for data repositories and HSM. The main advantage of these
devices is that they retrieve data faster than traditional tape backup devices. However,
NT does not natively handle MO devices gracefully. An application has to control
the device as a service (or similar) to make the device function as a jukebox.
You can expect to see backup applications use more and more jukeboxes for
storage.

Many vendors continue to develop other types of devices that might provide
speed in critical situations. These devices include RAID tape configurations and
tape and drive systems. With the RAID systems, the hardware in the unit stripes
the tapes. With the tape and drive systems, the hardware backs up the data to a
hard disk and then dumps the data offline onto a tape. This tape and drive
approach is much faster than traditional tape-based backups.

Finally, hard disks continue to decrease in price, and many companies are
forgoing tape backups and simply copying their files to additional hard disks.
Although this strategy is suitable in most cases, it does not fulfill the needs
of a proper backup (i.e., most hard disks aren't readily removable). Copying or
backing up files to removable drives is also becoming popular, particularly
among end-users. How well the removables will function at the enterprise level
is hard to tell.

In the coming months, the Lab will examine some backup and HSM applications
that are readily available for NT. We'll start by looking at high-end
applications and then address other solutions.