14.3. rm and Its Dangers

Under Unix, you use the rm command to delete files. The command is
simple enough; you just type rm followed by a list of files. If
anything, rm is too simple. It's easy to delete more than you want,
and once something is gone, it's permanently gone. There are a few
hacks that make rm somewhat safer, and we'll get to those
momentarily. But first, here's a quick look at some of the dangers.

To understand why it's impossible to reclaim deleted files, you need
to know a bit about how the Unix filesystem works. The system
contains a "free list," which is a list of disk blocks that aren't in
use. When you delete a file, its directory entry (which gives it its
name) is removed. If there are no more links (Section 10.3) to the
file (i.e., if the file had only one name), its inode (Section 14.2)
is added to the list of free inodes, and its data blocks are added to
the free list.
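
The link-counting behavior described above can be seen in a short Bourne shell sketch. The scratch directory and the filenames plan and plan.alias are made up for illustration, and the script assumes your system has mktemp -d:

```shell
#!/bin/sh
# Sketch: a file's data stays reachable until its last name is removed.
# The names "plan" and "plan.alias" are hypothetical.
demo=$(mktemp -d)                 # scratch directory, so nothing real is at risk
cd "$demo" || exit 1

echo "business plan" > plan       # a file with one name (one link)
ln plan plan.alias                # a second name for the same inode

rm plan                           # removes one directory entry only
after_first_rm=$(cat plan.alias)  # the data is still there under the other name

rm plan.alias                     # last link removed: inode and blocks are freed
if [ -e plan ] || [ -e plan.alias ]; then
    remaining=yes
else
    remaining=no
fi

echo "after first rm: $after_first_rm"
echo "any name left:  $remaining"

cd / && rmdir "$demo"             # the scratch directory is empty by now
```

After the second rm, no name refers to the data any more, and that is the point at which its blocks go back on the free list.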

Well, why can't you get the file back from the free list? After all,
there are DOS utilities that can reclaim deleted files by doing
something similar. Remember, though, Unix is a multitasking operating
system. Even if you think your system is a single-user system, there
are a lot of things going on "behind your back": daemons are writing
to log files, handling network connections, processing electronic
mail, and so on. You could theoretically reclaim a file if you could
"freeze" the filesystem the instant your file was deleted -- but
that's not possible. With Unix, everything is always active. By the
time you realize you made a mistake, your file's data blocks may well
have been reused for something else.

When you're deleting files, it's important to use wildcards
carefully. Simple typing errors can have disastrous consequences.
Let's say you want to delete all your object (.o) files. You want to
type:

% rm *.o

But because of a nervous twitch, you add an extra space and type:

% rm * .o

It looks right, and you might not even notice the error. But before
you know it, all the files in the current directory will be gone,
irretrievably.
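
One way to see the difference between the two commands without deleting anything is to let the shell expand the wildcards for echo instead of rm. This is only a sketch: it runs in a scratch directory, and the four filenames are invented for the example.

```shell
#!/bin/sh
# Sketch: preview what rm would receive by echoing the expanded command line.
# The filenames below are hypothetical.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch main.c main.o util.c util.o

intended=$(echo rm *.o)      # what you meant to type
mistyped=$(echo rm * .o)     # what the extra space produces

echo "$intended"
echo "$mistyped"

# Clean up by naming each file explicitly, then remove the empty directory.
rm -f -- main.c main.o util.c util.o
cd / && rmdir "$demo"
```

In the mistyped version, the lone * expands to every file in the directory, and .o is passed along as a literal filename that doesn't exist.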

If you don't think this can happen to you, here's something that
actually did happen to me. At one point, when I was a relatively new
Unix user, I was working on my company's business plan. To be
"secure," the executives had decided to set the business plan's
permissions so that you had to be root (Section 1.18) to modify it.
(A mistake in its own right, but that's another story.) I was using a
terminal I wasn't familiar with and accidentally created a bunch of
files with four control characters at the beginning of their names.
To get rid of these, I typed (as root):

# rm ????*

This command took a long time to execute. When
about two-thirds of the directory was gone, I realized (with horror)
what was happening: I was deleting all files with four or more
characters in the filename.
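
The same echo trick shows how broad that pattern is. In this sketch the three filenames are hypothetical; only the pattern comes from the story above.

```shell
#!/bin/sh
# Sketch: ????* means "any four characters, then anything", so it matches
# every (non-dot) name that is at least four characters long.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch abc plan notes.txt     # 3, 4, and 9 characters long

matched=$(echo ????*)        # expand the pattern without deleting anything
echo "$matched"

# Clean up by naming each file explicitly, then remove the empty directory.
rm -f -- abc plan notes.txt
cd / && rmdir "$demo"
```

Only abc, the one name shorter than four characters, escapes the match.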

The story got worse. They hadn't made a backup in about five months.
(By the way, this article should give you plenty of reasons for
making regular backups (Section 38.3).) By the time I had restored
the files I had deleted (a several-hour process in itself; this was
on an ancient version of Unix with a horrible backup utility) and
checked (by hand) all the files against our printed copy of the
business plan, I had resolved to be very careful with my rm commands.

[Some shells have safeguards that work against Mike's first
disastrous example -- but not the second one. Automatic safeguards
like these can become a crutch, though: they won't help when you use
another shell temporarily and don't have them, or when you type an
expression like Mike's very destructive second example. I agree with
his simple advice: check your rm commands carefully! -- JP]