VIRUSES AND OTHER MALICIOUS CODE

By themselves, programs are seldom security threats. The programs operate on
data, taking action only when data and state changes trigger it. Much of the
work done by a program is invisible to users, so they are not likely to be aware
of any malicious activity. For instance, when was the last time you saw a bit?
Do you know in what form a document file is stored? If you know a document
resides somewhere on a disk, can you find it? Can you tell if a game program
does anything in addition to its expected interaction with you? Which files are
modified by a word processor when you create a document? Most users cannot
answer these questions. However, since computer data are not usually seen
directly by users, malicious people can make programs serve as vehicles to
access and change data and other programs. Let us look at the possible effects
of malicious code and then examine in detail several kinds of programs that can
be used for interception or modification of data.

Why Worry About Malicious Code?

None of us likes the unexpected, especially in our programs. Malicious code
behaves in unexpected ways, thanks to a malicious programmer's intention.
We think of the malicious code as lurking inside our system: all or some of a
program that we are running or even a nasty part of a separate program that
somehow attaches itself to another (good) program.

Sidebar 3-3 Nonmalicious Flaws Cause Failures

In 1989 Crocker and Bernstein [CRO89] studied the root causes of the known
catastrophic failures of what was then called the ARPANET, the predecessor of
today's Internet. From its initial deployment in 1969 to 1989, the authors
found 17 flaws that either did cause or could have caused catastrophic failure
of the network. They use "catastrophic failure" to mean a situation
that causes the entire network or a significant portion of it to fail to deliver
network service.

The ARPANET was the first network of its sort, in which data are communicated
as independent blocks (called "packets") that can be sent along
different network routes and are reassembled at the destination. As might be
expected, faults in the novel algorithms for delivery and reassembly were the
source of several failures. Hardware failures were also significant. But as the
network grew from its initial three nodes to dozens and hundreds, these problems
were identified and fixed.

More than ten years after the network was born, three interesting
nonmalicious flaws appeared. The initial implementation had fixed sizes and
positions of the code and data. In 1986, a piece of code was loaded into memory
in a way that overlapped a piece of security code. Only one critical node had
that code configuration, and so only that one node would fail, which made it
difficult to determine the cause of the failure.

In 1987, new code caused Sun computers connected to the network to fail to
communicate. The first explanation was that the developers of the new Sun code
had written the system to function as other manufacturers' code did, not
necessarily as the specification dictated. It was later found that the
developers had optimized the code incorrectly, leaving out some states the
system could reach. But the first explanation, designing to practice rather
than to specification, is a common failing.

The last reported failure occurred in 1988. When the system was designed in
1969, developers specified that the number of connections to a subnetwork, and
consequently the number of entries in a table of connections, was limited to
347, based on analysis of the expected topology. After 20 years, people had
forgotten the (undocumented) limit, and a 348th connection was added, which
caused the table to overflow and the system to fail. But the system derived this
table gradually by communicating with neighboring nodes. So when any node's
table reached 348 entries, it crashed, and when restarted it started building
its table anew. Thus, nodes throughout the system would crash seemingly randomly
after running perfectly well for a while (while their tables were not yet full).

None of these flaws were malicious nor could they have been exploited by a
malicious attacker to cause a failure. But they show the importance of the
analysis, design, documentation, and maintenance steps in development of a
large, long-lived system.

How can such a situation arise? When you last installed a major software
package, such as a word processor, a statistical package, or a plug-in from the
Internet, you ran one command, typically called INSTALL or SETUP. From there,
the installation program took control, creating some files, writing in other
files, deleting data and files, and perhaps renaming a few that it would change.
A few minutes and quite a few disk accesses later, you had plenty of new code
and data, all set up for you with a minimum of human intervention. Other than
the general descriptions on the box, in the documentation files, or on the web
pages, you had absolutely no idea exactly what "gifts" you had
received. You hoped all you received was good, and it probably was. The same
uncertainty exists when you unknowingly download an application, such as a Java
applet or an ActiveX control, while viewing a web site. Thousands or even
millions of bytes of programs and data are transferred, and hundreds of
modifications may be made to your existing files, all occurring without your
explicit consent or knowledge.

Malicious Code Can Do Much (Harm)

Malicious code can do anything any other program can, such as writing a
message on a computer screen, stopping a running program, generating a sound, or
erasing a stored file. Or malicious code can do nothing at all right now; it can
be planted to lie dormant, undetected, until some event triggers the code to
act. The trigger can be a time or date, an interval (for example, after 30
minutes), an event (for example, when a particular program is executed), a
condition (for example, when communication occurs on a modem), a count (for
example, the fifth time something happens), some combination of these, or a
random situation. In fact, malicious code can do different things each time, or
nothing most of the time with something dramatic on occasion. In general,
malicious code can act with all the predictability of a two-year-old child: We
know in general what two-year-olds do, we may even know what a specific
two-year-old often does in certain situations, but two-year-olds have an amazing
capacity to do the unexpected.

Malicious code runs under the user's authority. Thus, malicious code can
touch everything the user can touch, and in the same ways. Users typically have
complete control over their own program code and data files; they can read,
write, modify, append, and even delete them. And well they should. But malicious
code can do the same, without the user's permission or even knowledge.

Malicious Code Has Been Around a Long Time

The popular literature and press continue to highlight the effects of
malicious code as if it were a relatively recent phenomenon. It is not. Cohen
[COH84] is sometimes credited with the discovery of viruses, but in fact Cohen
gave a name to a phenomenon known long before. For example, Thompson, in his
1984 Turing Award lecture, "Reflections on Trusting Trust" [THO84],
described code that can be passed by a compiler. In that lecture, he refers to
an earlier Air Force document, the Multics security evaluation [KAR74, KAR02].
In fact, references to virus behavior go back at least to 1970. Ware's 1970
study (publicly released in 1979 [WAR79]) and Anderson's planning study for
the U.S. Air Force [AND72] (to which Schell also refers) still accurately
describe threats, vulnerabilities, and program security flaws, especially
intentional ones. What is new about malicious code is the number of
distinct instances and copies that have appeared.

So malicious code is still around, and its effects are more pervasive. It is
important for us to learn what it looks like and how it works, so that we can
take steps to prevent it from doing damage or at least mediate its effects. How
can malicious code take control of a system? How can it lodge in a system? How
does malicious code spread? How can it be recognized? How can it be detected?
How can it be stopped? How can it be prevented? We address these questions in
the following sections.

Kinds of Malicious Code

Malicious code or a rogue program is the general name for
unanticipated or undesired effects in programs or program parts, caused by an
agent intent on damage. This definition eliminates unintentional errors,
although they can also have a serious negative effect. This definition also
excludes coincidence, in which two benign programs combine for a negative
effect. The agent is the writer of the program or the person who causes
its distribution. By this definition, most faults found in software inspections,
reviews, and testing do not qualify as malicious code, because we think of them
as unintentional. However, keep in mind as you read this chapter that
unintentional faults can in fact invoke the same responses as intentional
malevolence; a benign cause can still lead to a disastrous effect.

You are likely to have been affected by a virus at one time or another,
either because your computer was infected by one or because you could not access
an infected system while its administrators were cleaning up the mess one made.
In fact, your virus might actually have been a worm: The terminology of
malicious code is sometimes used imprecisely. A virus is a program that
can pass on malicious code to other nonmalicious programs by modifying them. The
term "virus" was coined because the affected program acts like a
biological virus: It infects other healthy subjects by attaching itself to the
program and either destroying it or coexisting with it. Because viruses are
insidious, we cannot assume that a clean program yesterday is still clean today.
Moreover, a good program can be modified to include a copy of the virus program,
so the infected good program itself begins to act as a virus, infecting other
programs. The infection usually spreads at a geometric rate, eventually
overtaking an entire computing system and spreading to all other connected
systems.

A virus can be either transient or resident. A transient virus has a
life that depends on the life of its host; the virus runs when its attached
program executes and terminates when its attached program ends. (During its
execution, the transient virus may have spread its infection to other programs.)
A resident virus locates itself in memory; then it can remain active or
be activated as a stand-alone program, even after its attached program ends.

A Trojan horse is malicious code that, in addition to its primary
effect, has a second, nonobvious malicious effect.

A logic bomb is a class of malicious code that "detonates"
or goes off when a specified condition occurs. A time bomb is a logic
bomb whose trigger is a time or date.
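The distinction between dormancy and detonation can be made concrete with a
small, strictly benign sketch. The trigger predicate below returns a flag
rather than running any payload, and the activation date is a made-up value
for illustration:

```python
from datetime import date

def trigger_reached(today, trigger_date):
    """A time bomb's trigger predicate: dormant until the date arrives.

    Benign illustration only: it returns a flag rather than running
    any payload. trigger_date is a hypothetical activation date.
    """
    return today >= trigger_date

# Dormant before the trigger date; "detonates" on or after it.
trigger = date(2030, 1, 1)
print(trigger_reached(date(2029, 12, 31), trigger))  # False: still dormant
print(trigger_reached(date(2030, 1, 1), trigger))    # True: condition met
```

The same shape serves a general logic bomb: replace the date comparison with
any condition, such as a counter reaching a value or a particular program
being executed.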

A trapdoor or backdoor is a feature in a program by which
someone can access the program other than by the obvious, direct call, perhaps
with special privileges. For instance, an automated bank teller program might
allow anyone entering the number 990099 on the keypad to process the log of
everyone's transactions at that machine. In this example, the trapdoor
could be intentional, for maintenance purposes, or it could be an illicit way
for the implementer to wipe out any record of a crime.

A worm is a program that spreads copies of itself through a network.
The primary difference between a worm and a virus is that a worm operates
through networks, and a virus can spread through any medium (but usually uses
copied program or data files). Additionally, the worm spreads copies of itself
as a stand-alone program, whereas the virus spreads copies of itself as a
program that attaches to or embeds in other programs.

White et al. [WHI89] also define a rabbit as a virus or worm
that self-replicates without bound, with the intention of exhausting some
computing resource. A rabbit might create copies of itself and store them on
disk, in an effort to completely fill the disk, for example.

These definitions match current careful usage. The distinctions among these
terms are small, and often the terms are confused, especially in the popular
press. The term "virus" is often used to refer to any piece of
malicious code. Furthermore, two or more forms of malicious code can be combined
to produce a third kind of problem. For instance, a virus can be a time bomb if
the viral code that is spreading will trigger an event after a period of time
has passed. The kinds of malicious code are summarized in Table 3-1.

TABLE 3-1 Types of Malicious Code.

Virus: Attaches itself to a program and propagates copies of itself to other programs
Trojan horse: Contains unexpected, additional functionality
Logic bomb: Triggers action when condition occurs
Time bomb: Triggers action when specified time occurs
Trapdoor: Allows unauthorized access to functionality
Worm: Propagates copies of itself through a network
Rabbit: Replicates itself without limit to exhaust a resource

Because "virus" is the popular name given to all forms
of malicious code and because fuzzy lines exist between different kinds of
malicious code, we will not be too restrictive in the following discussion. We
want to look at how malicious code spreads, how it is activated, and what effect
it can have. A virus is a convenient term for mobile malicious code, and so in
the following sections we use the term "virus" almost exclusively. The
points made apply also to other forms of malicious code.

How Viruses Attach

A printed copy of a virus does nothing and threatens no one. Even executable
virus code sitting on a disk does nothing. What triggers a virus to start
replicating? For a virus to do its malicious work and spread itself, it must be
activated by being executed. Fortunately for virus writers, but unfortunately
for the rest of us, there are many ways to ensure that programs will be executed
on a running computer.

For example, recall the SETUP program that you initiate on your computer. It
may call dozens or hundreds of other programs, some on the distribution medium,
some already residing on the computer, some in memory. If any one of these
programs contains a virus, the virus code could be activated. Let us see how.
Suppose the virus code were in a program on the distribution medium, such as a
CD; when executed, the virus could install itself on a permanent storage medium
(typically, a hard disk), and also in any and all executing programs in memory.
Human intervention is necessary to start the process; a human being puts the
virus on the distribution medium, and perhaps another initiates the execution of
the program to which the virus is attached. (It is possible for execution to
occur without human intervention, though, such as when execution is triggered by
a date or the passage of a certain amount of time.) After that, no human
intervention is needed; the virus can spread by itself.

A more common means of virus activation is as an attachment to an e-mail
message. In this attack, the virus writer tries to convince the victim (the
recipient of an e-mail message) to open the attachment. Once the viral
attachment is opened, the activated virus can do its work. Some modern e-mail
handlers, in a drive to "help" the receiver (victim), will
automatically open attachments as soon as the receiver opens the body of the
e-mail message. The virus can be executable code embedded in an executable
attachment, but other types of files are equally dangerous. For example, objects
such as graphics or photo images can contain code to be executed by an editor,
so they can be transmission agents for viruses. In general, it is safer to force
users to open files on their own rather than automatically; it is a bad idea for
programs to perform potentially security-relevant actions without a user's
consent.

Appended Viruses

A program virus attaches itself to a program; then, whenever the program is
run, the virus is activated. This kind of attachment is usually easy to program.

In the simplest case, a virus inserts a copy of itself into the executable
program file before the first executable instruction. Then, all the virus
instructions execute first; after the last virus instruction, control flows
naturally to what used to be the first program instruction. Such a situation is
shown in Figure 3-4.

This kind of attachment is simple and usually effective. The virus writer
does not need to know anything about the program to which the virus will attach,
and often the attached program simply serves as a carrier for the virus. The
virus performs its task and then transfers to the original program. Typically,
the user is unaware of the effect of the virus if the original program still
does all that it used to. Most viruses attach in this manner.

Viruses That Surround a Program

An alternative to the attachment is a virus that runs the original program
but has control before and after its execution. For example, a virus writer
might want to prevent the virus from being detected. If the virus is stored on
disk, its presence will be given away by its file name, or its size will affect
the amount of space used on the disk. The virus writer might arrange for the
virus to attach itself to the program that constructs the listing of files on
the disk. If the virus regains control after the listing program has generated
the listing but before the listing is displayed or printed, the virus could
eliminate its entry from the listing and falsify space counts so that it appears
not to exist. A surrounding virus is shown in Figure 3-5.

Integrated Viruses and Replacements

A third situation occurs when the virus replaces some of its target,
integrating itself into the original code of the target. Such a situation is
shown in Figure 3-6. Clearly, the virus writer has to know the exact structure
of the original program to know where to insert which pieces of the virus.

Finally, the virus can replace the entire target, either mimicking the effect
of the target or ignoring the expected effect of the target and performing only
the virus effect. In this case, the user is most likely to perceive the loss of
the original program.

Document Viruses

Currently, the most popular virus type is what we call the document
virus, which is implemented within a formatted document, such as a written
document, a database, a slide presentation, or a spreadsheet. These documents
are highly structured files that contain both data (words or numbers) and
commands (such as formulas, formatting controls, links). The commands are part
of a rich programming language, including macros, variables and procedures, file
accesses, and even system calls. The writer of a document virus uses any of the
features of the programming language to perform malicious actions.

The ordinary user usually sees only the content of the document (its text or
data), so the virus writer simply includes the virus in the commands part of the
document, as in the integrated program virus.

How Viruses Gain Control

The virus (V) has to be invoked instead of the target (T). Essentially, the
virus either has to seem to be T, saying effectively "I am T" (like
some rock stars, where the target is the artiste formerly known as T) or the
virus has to push T out of the way and become a substitute for T, saying
effectively "Call me instead of T." A more blatant virus can simply
say "invoke me [you fool]." The virus can assume T's name by
replacing (or joining to) T's code in a file structure; this invocation
technique is most appropriate for ordinary programs. The virus can overwrite T
in storage (simply replacing the copy of T in storage, for example).
Alternatively, the virus can change the pointers in the file table so that the
virus is located instead of T whenever T is accessed through the file system.
These two cases are shown in Figure 3-7.

The virus can supplant T by altering the sequence that would have invoked T
to now invoke the virus V; this invocation can be used to replace parts of the
resident operating system by modifying pointers to those resident parts, such as
the table of handlers for different kinds of interrupts.

Homes for Viruses

The virus writer may find these qualities appealing in a virus:

It is hard to detect.

It is not easily destroyed or deactivated.

It spreads infection widely.

It can reinfect its home program or other programs.

It is easy to create.

It is machine independent and operating system independent.

Few viruses meet all these criteria. The virus writer chooses from these
objectives when deciding what the virus will do and where it will reside.

Just a few years ago, the challenge for the virus writer was to write code
that would be executed repeatedly so that the virus could multiply. Now,
however, one execution is enough to ensure widespread distribution. Many viruses
are transmitted by e-mail, using either of two routes. In the first case, some
virus writers generate a new e-mail message to all addresses in the
victim's address book. These new messages contain a copy of the virus so
that it propagates widely. Often the message is a brief, chatty, non-specific
message that would encourage the new recipient to open the attachment from a
friend (the first recipient). For example, the subject line or message body may
read "I thought you might enjoy this picture from our vacation." In
the second case, the virus writer can leave the infected file for the victim to
forward unknowingly. If the virus's effect is not immediately obvious, the
victim may pass the infected file unwittingly to other victims.

Let us look more closely at the issue of viral residence.

One-Time Execution

The majority of viruses today execute only once, spreading their infection
and causing their effect in that one execution. A virus often arrives as an
e-mail attachment of a document virus. It is executed just by being opened.

Boot Sector Viruses

A special case of virus attachment, but formerly a fairly popular one, is the
so-called boot sector virus. When a computer is started, control begins
with firmware that determines which hardware components are present, tests them,
and transfers control to an operating system. A given hardware platform can run
many different operating systems, so the operating system is not coded in
firmware but is instead invoked dynamically, perhaps even by a user's
choice, after the hardware test.

The operating system is software stored on disk. Code copies the operating
system from disk to memory and transfers control to it; this copying is called
the bootstrap (often boot) load because the operating system
figuratively pulls itself into memory by its bootstraps. The firmware does its
control transfer by reading a fixed number of bytes from a fixed location on the
disk (called the boot sector) to a fixed address in memory and then
jumping to that address (which will turn out to contain the first instruction of
the bootstrap loader). The bootstrap loader then reads into memory the rest of
the operating system from disk. To run a different operating system, the user
just inserts a disk with the new operating system and a bootstrap loader. When
the user reboots from this new disk, the loader there brings in and runs another
operating system. This same scheme is used for personal computers, workstations,
and large mainframes.

To allow for change, expansion, and uncertainty, hardware designers reserve a
large amount of space for the bootstrap load. The boot sector on a PC is
slightly less than 512 bytes, but since the loader will be larger than that, the
hardware designers support "chaining," in which each block of the
bootstrap is chained to (contains the disk location of) the next block. This
chaining allows big bootstraps but also simplifies the installation of a virus.
The virus writer simply breaks the chain at any point, inserts a pointer to the
virus code to be executed, and reconnects the chain after the virus has been
installed. This situation is shown in Figure 3-8.
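The chained bootstrap behaves like a linked list, which makes the insertion
easy to model. The following sketch (all block names and keys are
hypothetical) shows that redirecting one link runs an extra block while every
original block still executes, so the boot appears normal:

```python
# Each bootstrap block: (name, key of next block); None ends the chain.
def follow_chain(blocks, start):
    """Return block names in the order the chained loader would run them."""
    order, key = [], start
    while key is not None:
        name, next_key = blocks[key]
        order.append(name)
        key = next_key
    return order

# A legitimate three-block bootstrap chain: 0 -> 1 -> 2.
blocks = {0: ("boot-1", 1), 1: ("boot-2", 2), 2: ("boot-3", None)}
print(follow_chain(blocks, 0))  # ['boot-1', 'boot-2', 'boot-3']

# Break the chain after block 0: point it at the inserted block, and
# point the inserted block at block 0's old successor. All original
# blocks still run, so the system appears to boot normally.
blocks[99] = ("inserted", 1)
blocks[0] = ("boot-1", 99)
print(follow_chain(blocks, 0))  # ['boot-1', 'inserted', 'boot-2', 'boot-3']
```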

The boot sector is an especially appealing place to house a virus. The virus
gains control very early in the boot process, before most detection tools are
active, so that it can avoid, or at least complicate, detection. The files in
the boot area are crucial parts of the operating system. Consequently, to keep
users from accidentally modifying or deleting them with disastrous results, the
operating system makes them "invisible," omitting them from a normal listing of
stored files and thereby discouraging accidental deletion. Thus, the virus code
is not readily noticed by users.

Memory-Resident Viruses

Some parts of the operating system and most user programs execute, terminate,
and disappear, with their space in memory being available for anything executed
later. For very frequently used parts of the operating system and for a few
specialized user programs, it would take too long to reload the program each
time it was needed. Such code remains in memory and is called
"resident" code. Examples of resident code are the routine that
interprets keys pressed on the keyboard, the code that handles error conditions
that arise during a program's execution, or a program that acts like an
alarm clock, sounding a signal at a time the user determines. Resident routines
are sometimes called TSRs or "terminate and stay resident" routines.

Virus writers also like to attach viruses to resident code because the
resident code is activated many times while the machine is running. Each time
the resident code runs, the virus does too. Once activated, the virus can look
for and infect uninfected carriers. For example, after activation, a boot sector
virus might attach itself to a piece of resident code. Then, each time the virus
was activated it might check whether any removable disk in a disk drive was
infected and, if not, infect it. In this way the virus could spread its
infection to all removable disks used during the computing session.

Other Homes for Viruses

A virus that does not take up residence in one of these cozy establishments
has to fend more for itself. But that is not to say that the virus will go
homeless.

One popular home for a virus is an application program. Many applications,
such as word processors and spreadsheets, have a "macro" feature, by
which a user can record a series of commands and repeat them with one
invocation. Such programs also provide a "startup macro" that is
executed every time the application is executed. A virus writer can create a
virus macro that adds itself to the startup directives for the application. It
also then embeds a copy of itself in data files so that the infection spreads to
anyone receiving one or more of those files.

Libraries are also excellent places for malicious code to reside. Because
libraries are used by many programs, the code in them will have a broad effect.
Additionally, libraries are often shared among users and transmitted from one
user to another, a practice that spreads the infection. Finally, executing code
in a library can pass on the viral infection to other transmission media.
Compilers, loaders, linkers, runtime monitors, runtime debuggers, and even virus
control programs are good candidates for hosting viruses because they are widely
shared.

Virus Signatures

A virus cannot be completely invisible. Code must be stored somewhere, and
the code must be in memory to execute. Moreover, the virus executes in a
particular way, using certain methods to spread. Each of these characteristics
yields a telltale pattern, called a signature, that can be found by a
program that knows to look for it. The virus's signature is important for
creating a program, called a virus scanner, that can automatically detect
and, in some cases, remove viruses. The scanner searches memory and long-term
storage, monitoring execution and watching for the telltale signatures of
viruses. For example, a scanner looking for signs of the Code Red worm can
search for the worm's characteristic pattern of characters.

When the scanner recognizes a known virus's pattern, it can then block
the virus, inform the user, and deactivate or remove the virus. However, a virus
scanner is effective only if it has been kept up-to-date with the latest
information on current viruses. Sidebar 3-4 describes how viruses were the
primary security breach among companies surveyed in 2001.
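At its core, signature scanning is a byte-pattern search over the data being
checked. A minimal sketch follows; the signature database and its byte string
are made up for illustration, not real virus patterns:

```python
def scan(data, signatures):
    """Return names of all known signatures whose byte pattern occurs in data."""
    return [name for name, pattern in signatures.items() if pattern in data]

# Hypothetical signature database; the byte strings are made up, not
# real virus patterns.
signatures = {"example-virus": b"\x90\x90XYZZY"}

print(scan(b"ordinary program bytes", signatures))        # []
print(scan(b"header \x90\x90XYZZY payload", signatures))  # ['example-virus']
```

Keeping the scanner up-to-date means, in these terms, keeping the signature
database current; an empty or stale database finds nothing.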

Sidebar 3-4 The Viral Threat

Information Week magazine reports that viruses, worms, and Trojan
horses represented the primary method for breaching security among the 4,500
security professionals surveyed in 2001 [HUL01c]. Almost 70 percent of the
respondents noted that virus, worm, and Trojan horse attacks occurred in the 12
months before April 2001. Second were the 15 percent of attacks using denial of
service; telecommunications or unauthorized entry was responsible for 12 percent
of the attacks. (Multiple responses were allowed.) These figures represent
establishments in 42 countries throughout North America, South America, Europe,
and Asia.

Storage Patterns

Most viruses attach to programs that are stored on media such as disks. The
attached virus piece is invariant, so that the start of the virus code becomes a
detectable signature. The attached piece is always located at the same position
relative to its attached file. For example, the virus might always be at the
beginning, 400 bytes from the top, or at the bottom of the infected file. Most
likely, the virus will be at the beginning of the file, because the virus writer
wants to obtain control of execution before the bona fide code of the infected
program is in charge. In the simplest case, the virus code sits at the top of
the program, and the entire virus does its malicious duty before the normal code
is invoked. In other cases, the virus infection consists of only a handful of
instructions that point or jump to other, more detailed instructions elsewhere.
For example, the infected code may consist of condition testing and a jump or
call to a separate virus module. In either case, the code to which control is
transferred will also have a recognizable pattern. Both of these situations are
shown in Figure 3-9.

A virus may attach itself to a file, in which case the file's size
grows. Or the virus may obliterate all or part of the underlying program, in
which case the program's size does not change but the program's
functioning will be impaired. The virus writer has to choose one of these
detectable effects.

The virus scanner can use a code or checksum to detect changes to a file. It
can also look for suspicious patterns, such as a JUMP instruction as the first
instruction of a system program (in case the virus has positioned itself at the
bottom of the file but wants to be executed first, as in Figure 3-9).
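Checksum-based change detection can be sketched briefly. The choice of
SHA-256 here is an assumption, one reasonable cryptographic hash among
several; the point is that any later modification, appended or overwritten in
place, yields a different checksum than the recorded baseline:

```python
import hashlib

def checksum(data):
    """Fingerprint of a file's bytes; SHA-256 is one reasonable choice."""
    return hashlib.sha256(data).hexdigest()

# Record a baseline while the program is known to be clean...
baseline = checksum(b"original program bytes")

# ...then any later change, whether appended or overwritten in place,
# produces a different checksum.
print(checksum(b"original program bytes") == baseline)        # True
print(checksum(b"original program bytes+virus") == baseline)  # False
```

Note that this catches even a virus that obliterates part of the program
without changing its size, since the content, not the length, is fingerprinted.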

Execution Patterns

A virus writer may want a virus to do several things at the same time,
namely, spread infection, avoid detection, and cause harm. These goals are shown
in Table 3-2, along with ways each goal can be addressed. Unfortunately, many of
these behaviors are perfectly normal and might otherwise go undetected. For
instance, one goal is modifying the file directory; many normal programs create
files, delete files, and write to storage media. Thus, there are no key signals
that point to the presence of a virus.

Most virus writers seek to avoid detection for themselves and their
creations. Because a disk's boot sector is not visible to normal operations
(for example, the contents of the boot sector do not show on a directory
listing), many virus writers hide their code there. A resident virus can monitor
disk accesses and fake the result of a disk operation that would show the virus
hidden in a boot sector by showing the data that should have been in the
boot sector (which the virus has moved elsewhere).
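A toy model of this stealth technique, with the disk reduced to a Python dictionary of sector numbers; the sector numbers and contents here are invented for illustration:

```python
# Toy model of a stealth virus: the "disk" is a dict of sector -> bytes.
CLEAN_BOOT = b"ORIGINAL BOOT CODE"
disk = {0: b"VIRUS LOADER", 37: CLEAN_BOOT}  # virus moved real boot code to sector 37

def raw_read(sector):
    return disk[sector]

def hooked_read(sector):
    # The resident virus intercepts reads of the boot sector and returns
    # the relocated clean copy, hiding its presence from inspection tools.
    if sector == 0:
        return raw_read(37)
    return raw_read(sector)

print(hooked_read(0))  # b'ORIGINAL BOOT CODE' -- the infection is invisible
```

Any tool that inspects the disk through the hooked interface sees only clean data; detection requires reading below the interception point.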

There are no limits to the harm a virus can cause. On the modest end, the
virus might do nothing; some writers create viruses just to show they can do it.
Or the virus can be relatively benign, displaying a message on the screen,
sounding the buzzer, or playing music. From there, the problems can escalate.
One virus can erase files, another an entire disk; one virus can prevent a
computer from booting, and another can prevent writing to disk. The damage is
bounded only by the creativity of the virus's author.

TABLE 3-2 Virus Effects and Causes.

Virus Effect                       How It Is Caused
---------------------------------  ------------------------------------------------
Attach to executable program       Modify file directory
                                   Write to executable program file
Attach to data or control file     Modify directory
                                   Rewrite data
                                   Append to data
                                   Append data to self
Remain in memory                   Intercept interrupt by modifying interrupt
                                   handler address table
                                   Load self in nontransient memory area
Infect disks                       Intercept interrupt
                                   Intercept operating system call (to format
                                   disk, for example)
                                   Modify system file
                                   Modify ordinary executable program
Conceal self                       Intercept system calls that would reveal
                                   self and falsify result
                                   Classify self as "hidden" file
Spread infection                   Infect boot sector
                                   Infect systems program
                                   Infect ordinary program
                                   Infect data ordinary program reads to
                                   control its execution
Prevent deactivation               Activate before deactivating program and
                                   block deactivation
                                   Store copy to reinfect after deactivation

Transmission Patterns

A virus is effective only if it has some means of transmission from one
location to another. As we have already seen, viruses can travel during the boot
process, by attaching to an executable file or traveling within data files. The
travel itself occurs during execution of an already infected program. Since a
virus can execute any instructions a program can, virus travel is not confined
to any single medium or execution pattern. For example, a virus can arrive on a
diskette or from a network connection, travel during its host's execution
to a hard disk boot sector, reemerge next time the host computer is booted, and
remain in memory to infect other diskettes as they are accessed.

Polymorphic Viruses

The virus signature may be the most reliable way for a virus scanner to
identify a virus. If a particular virus always begins with the string 47F0F00E08
(in hexadecimal) and has string 00113FFF located at word 12, it is unlikely that
other programs or data files will have these exact characteristics. For longer
signatures, the probability of a correct match increases.
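A scanner's test for this particular signature might look like the following sketch (assuming, purely for illustration, 16-bit words, so that word 12 begins at byte offset 24):

```python
PREFIX = bytes.fromhex("47F0F00E08")     # signature at the start of the virus
AT_WORD_12 = bytes.fromhex("00113FFF")   # signature at word 12
WORD = 2  # assumed 16-bit words, so word 12 starts at byte offset 24

def matches_signature(code: bytes) -> bool:
    off = 12 * WORD
    return (code.startswith(PREFIX)
            and code[off:off + len(AT_WORD_12)] == AT_WORD_12)

# A contrived sample that carries both signature fragments.
sample = PREFIX + b"\x00" * (24 - len(PREFIX)) + AT_WORD_12 + b"\x00" * 8
print(matches_signature(sample))          # True
print(matches_signature(b"\x00" * 40))    # False
```

The longer and more specific the byte patterns, the lower the chance that an innocent file matches by accident.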

If the virus scanner will always look for those strings, then the clever
virus writer can cause something other than those strings to be in those
positions. For example, the virus could have two alternative but equivalent
beginning words; after being installed, the virus will choose one of the two
words for its initial word. Then, a virus scanner would have to look for both
patterns. A virus that can change its appearance is called a polymorphic
virus. (Poly means "many" and morph means
"form".) A two-form polymorphic virus can be handled easily as two
independent viruses. Therefore, the virus writer intent on preventing detection
of the virus will want either a large or an unlimited number of forms so that
the number of possible forms is too large for a virus scanner to search for.
Simply embedding a random number or string at a fixed place in the executable
version of a virus is not sufficient, because the signature of the virus is just
the constant code excluding the random part. A polymorphic virus has to randomly
reposition all parts of itself and randomly change all fixed data. Thus, instead
of containing the fixed (and therefore searchable) string "HA! INFECTED BY
A VIRUS," a polymorphic virus has to change even that pattern sometimes.

Trivially, assume a virus writer has 100 bytes of code and 50 bytes of data.
To make two virus instances different, the writer might distribute the first
version as 100 bytes of code followed by all 50 bytes of data. A second version
could be 99 bytes of code, a jump instruction, 50 bytes of data, and the last
byte of code. Other versions are 98 code bytes jumping to the last two, 97 and
three, and so forth. Just by moving pieces around the virus writer can create
enough different appearances to fool simple virus scanners. Once the scanner
writers became aware of these kinds of tricks, however, they refined their
signature definitions.
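This repositioning trick can be sketched with stand-in bytes. The single-byte "jump" below is a placeholder; a real jump instruction also encodes a target offset.

```python
CODE = b"C" * 100   # 100 bytes of virus code (stand-in)
DATA = b"D" * 50    # 50 bytes of virus data (stand-in)
JUMP = b"J"         # placeholder for a jump instruction

def variant(split):
    # First `split` code bytes, a jump over the data, the data,
    # then the remaining code bytes the jump lands on.
    return CODE[:split] + JUMP + DATA + CODE[split:]

layouts = {variant(s) for s in range(90, 101)}
print(len(layouts))  # 11 distinct byte layouts of the same 150-byte virus
```

Each layout defeats a naive scanner looking for one fixed byte string, which is why scanner writers moved to signatures that tolerate such rearrangement.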

A more sophisticated polymorphic virus randomly intersperses harmless
instructions throughout its code. Examples of harmless instructions include
addition of zero to a number, movement of a data value to its own location, or a
jump to the next instruction. These "extra" instructions make it more
difficult to locate an invariant signature.

A simple variety of polymorphic virus uses encryption under various keys to
make the stored form of the virus different. These are sometimes called
encrypting viruses. This type of virus must contain three distinct parts:
a decryption key, the (encrypted) object code of the virus, and the
(unencrypted) object code of the decryption routine. For these viruses, the
decryption routine itself or a call to a decryption library routine must be in
the clear, and so that becomes the signature.
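The weakness of encrypting viruses can be demonstrated with a toy XOR scheme; the stub bytes and payload here are invented stand-ins, not real virus code:

```python
def xor(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)

DECRYPTOR_STUB = b"DECODE-LOOP"   # stand-in for the in-the-clear decryption routine
PAYLOAD = b"virus body: infect, conceal, trigger"

def stored_form(key: int) -> bytes:
    # The key and the encrypted body differ from copy to copy, but the
    # stub cannot be encrypted: something must run first to decrypt the rest.
    return DECRYPTOR_STUB + bytes([key]) + xor(PAYLOAD, key)

copy1, copy2 = stored_form(0x5A), stored_form(0xC3)
print(copy1[len(DECRYPTOR_STUB):] != copy2[len(DECRYPTOR_STUB):])  # True: bodies differ
print(copy1[:len(DECRYPTOR_STUB)] == copy2[:len(DECRYPTOR_STUB)])  # True: constant stub
```

However many keys the virus uses, every copy begins with the same unencrypted stub, and that constant prefix is exactly what the scanner searches for.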

To avoid detection, not every copy of a polymorphic virus has to differ from
every other copy. If the virus changes occasionally, not every copy will match a
signature of every other copy.

The Source of Viruses

Since a virus can be rather small, its code can be "hidden" inside
other larger and more complicated programs. Two hundred lines of a virus could
be separated into one hundred packets of two lines of code and a jump each;
these one hundred packets could be easily hidden inside a compiler, a database
manager, a file manager, or some other large utility.

Virus discovery could be aided by a procedure to determine if two programs
are equivalent. However, theoretical results in computing are very discouraging
when it comes to the complexity of the equivalence problem. The general
question, "are these two programs equivalent?" is undecidable
(although that question can be answered for many specific pairs of
programs). Even ignoring the general undecidability problem, two modules may
produce subtly different results that may or may not be security
relevant. One may run faster, or the first may use a temporary file for work
space whereas the second performs all its computations in memory. These
differences could be benign, or they could be a marker of an infection.
Therefore, we are unlikely to develop a screening program that can separate
infected modules from uninfected ones.

Although the general is dismaying, the particular is not. If we know that a
particular virus may infect a computing system, we can check for it and detect
it if it is there. Having found the virus, however, we are left with the task of
cleansing the system of it. Removing the virus in a running system requires
being able to detect and eliminate its instances faster than it can spread.

Prevention of Virus Infection

The only way to prevent the infection of a virus is not to share executable
code with an infected source. This philosophy used to be easy to follow because
it was easy to tell if a file was executable or not. For example, on PCs, a
.exe extension was a clear sign that the file was executable. However, as
we have noted, today's files are more complex, and a seemingly
nonexecutable file may have some executable code buried deep within it. For
example, a word processor may have commands within the document file; as we
noted earlier, these commands, called macros, make it easy for the user to do
complex or repetitive things. But they are really executable code embedded in
the context of the document. Similarly, spreadsheets, presentation slides, and
other office- or business-related files can contain code or scripts that can be
executed in various ways and thereby harbor viruses. And, as we have seen,
the applications that run or use these files may try to be helpful by
automatically invoking the executable code, whether you want it run or not!
Against the principles of good security, e-mail handlers can be set to
automatically open (without performing access control) attachments or embedded
code for the recipient, so your e-mail message can have animated bears dancing
across the top.

Another approach virus writers have used is a little-known feature in the
Microsoft file design. Although a file with a .doc extension is expected
to be a Word document, in fact, the true document type is hidden in a field at
the start of the file. This convenience ostensibly helps a user who
inadvertently names a Word document with a .ppt (PowerPoint) or any
other extension. In some cases, the operating system will try to open the
associated application but, if that fails, the system will switch to the
application of the hidden file type. So, the virus writer creates an executable
file, names it with an inappropriate extension, and sends it to the victim,
describing it as a picture or a necessary code add-in or something else
desirable. The unwitting recipient opens the file and, without intending to,
executes the malicious code.

More recently, executable code has been hidden in files containing large data
sets, such as pictures or read-only documents. These bits of viral code are not
easily detected by virus scanners and certainly not by the human eye. For
example, a file containing a photograph may be highly granular; if every
sixteenth bit is part of a command string that can be executed, then the virus
is very difficult to detect.
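A much-simplified sketch of the idea, hiding one payload bit in the low-order bit of every sixteenth byte of image data (rather than literally every sixteenth bit); the payload and "image" are invented:

```python
def embed(cover: bytearray, payload_bits, stride=16):
    # Hide one payload bit in the low-order bit of every stride-th byte.
    for i, bit in enumerate(payload_bits):
        cover[i * stride] = (cover[i * stride] & 0xFE) | bit
    return cover

def extract(cover, nbits, stride=16):
    return [cover[i * stride] & 1 for i in range(nbits)]

payload = [1, 0, 1, 1, 0, 0, 1, 0]    # 8 bits of a hidden "command string"
image = bytearray(range(256))          # stand-in photo data
stego = embed(image, payload)
print(extract(stego, len(payload)) == payload)  # True
```

Because only low-order bits change, the picture looks unaltered to the eye, and a scanner searching for contiguous signature strings finds nothing.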

Since you cannot always know which sources are infected, you should assume
that any outside source is infected. Fortunately, you know when you are
receiving code from an outside source; unfortunately, it is not feasible to cut
off all contact with the outside world.

In their interesting paper comparing computer virus transmission with human
disease transmission, Kephart et al. [KEP93] observe that individuals'
efforts to keep their computers free from viruses lead to communities that are
generally free from viruses because members of the community have little
(electronic) contact with the outside world. In this case, transmission is
contained not because of limited contact but because of limited contact outside
the community. Governments, for military or diplomatic secrets, often run
disconnected network communities. The trick seems to be in choosing one's
community prudently. However, as use of the Internet and the World Wide Web
increases, such separation is almost impossible to maintain.

Nevertheless, there are several techniques for building a reasonably safe
community for electronic contact, including the following:

Use only commercial software acquired from reliable, well-established
vendors. There is always a chance that you might receive a virus from a
large manufacturer with a name everyone would recognize. However, such
enterprises have significant reputations that could be seriously damaged by even
one bad incident, so they go to some degree of trouble to keep their products
virus-free and to patch any problem-causing code right away. Similarly, software
distribution companies will be careful about products they handle.

Test all new software on an isolated computer. If you must use
software from a questionable source, test the software first on a computer with
no hard disk, not connected to a network, and with the boot disk removed. Run
the software and look for unexpected behavior, even simple behavior such as
unexplained figures on the screen. Test the computer with a copy of an
up-to-date virus scanner, created before running the suspect program. Only if
the program passes these tests should it be installed on a less isolated
machine.

Open attachments only when you know them to be safe. What
constitutes "safe" is up to you, as you have probably already learned
in this chapter. Certainly, an attachment from an unknown source is of
questionable safety. You might also distrust an attachment from a known source
but with a peculiar message.

Make a recoverable system image and store it safely. If your
system does become infected, this clean version will let you reboot securely
because it overwrites the corrupted system files with clean copies. For this
reason, you must keep the image write-protected during reboot. Prepare this
image now, before infection; after infection it is too late. For safety, prepare
an extra copy of the safe boot image.

Make and retain backup copies of executable system files. This
way, in the event of a virus infection, you can remove infected files and
reinstall from the clean backup copies (stored in a secure, offline location, of
course).

Use virus detectors (often called virus scanners) regularly and update
them daily. Many of the virus detectors available can both detect and
eliminate infection from viruses. Several scanners are better than one, because
one may detect the viruses that others miss. Because scanners search for virus
signatures, they are constantly being revised as new viruses are discovered. New
virus signature files, or new versions of scanners, are distributed frequently;
often, you can request automatic downloads from the vendor's web site. Keep
your detector's signature file up-to-date.

Truths and Misconceptions About Viruses

Because viruses often have a dramatic impact on the computer-using community,
they are often highlighted in the press, particularly in the business section.
However, there is much misinformation in circulation about viruses. Let us
examine some of the popular claims about them.

Viruses can infect only Microsoft Windows systems.
False. Among students and office workers, PCs are popular
computers, and there may be more people writing software (and viruses) for them
than for any other kind of processor. Thus, the PC is most frequently the target
when someone decides to write a virus. However, the principles of virus
attachment and infection apply equally to other processors, including Macintosh
computers, Unix workstations, and mainframe computers. In fact, no writeable
stored-program computer is immune to possible virus attack. As we noted in
Chapter 1, this situation means that all devices containing computer
code, including automobiles, airplanes, microwave ovens, radios, televisions,
and radiation therapy machines have the potential for being infected by a
virus.

Viruses can modify "hidden" or "read only"
files. True. We may try to protect files by using two
operating system mechanisms. First, we can make a file a hidden file so that a
user or program listing all files on a storage device will not see the
file's name. Second, we can apply a read-only protection to the file so
that the user cannot change the file's contents. However, each of these
protections is applied by software, and virus software can override the native
software's protection. Moreover, software protection is layered, with the
operating system providing the most elementary protection. If a secure operating
system obtains control before a virus contaminator has executed, the
operating system can prevent contamination as long as it blocks the attacks the
virus will make.

Viruses can appear only in data files, or only in Word documents, or
only in programs. False. What are data? What is an executable
file? The distinction between these two concepts is not always clear, because a
data file can control how a program executes and even cause a program to
execute. Sometimes a data file lists steps to be taken by the program that reads
the data, and these steps can include executing a program. For example, some
applications contain a configuration file whose data are exactly such steps.
Similarly, word processing document files may contain startup commands to
execute when the document is opened; these startup commands can contain
malicious code. Although, strictly speaking, a virus can activate and spread
only when a program executes, in fact, data files are acted upon by programs.
Clever virus writers have been able to make data control files that cause
programs to do many things, including pass along copies of the virus to other
data files.

Viruses spread only on disks or only in e-mail.
False. File-sharing is often done as one user provides a copy
of a file to another user by writing the file on a transportable disk. However,
any means of electronic file transfer will work. A file can be placed in a
network's library or posted on a bulletin board. It can be attached to an
electronic mail message or made available for download from a web site. Any
mechanism for sharing files (programs, data, documents, and so
forth) can be used to transfer a virus.

Viruses cannot remain in memory after a complete power off/power on
reboot. True. If a virus is resident in memory, the virus is
lost when the memory loses power. That is, computer memory (RAM) is volatile, so
that all contents are deleted when power is lost. However, viruses
written to disk certainly can remain through a reboot cycle and reappear after
the reboot. Thus, you can receive a virus infection, the virus can be written to
disk (or to network storage), you can turn the machine off and back on, and the
virus can be reactivated during the reboot. Boot sector viruses gain control
when a machine reboots (whether it is a hardware or software reboot), so a boot
sector virus may remain through a reboot cycle because it activates immediately
when a reboot has completed.

Viruses cannot infect hardware. True. Viruses can
infect only things they can modify: memory, executable files, and data are the
primary targets. If hardware contains writeable storage (so-called firmware)
that can be accessed under program control, that storage is subject to
virus attack. There have been a few such attacks.

Viruses can be malevolent, benign, or benevolent. True.
Not all viruses are bad. For example, a virus might locate uninfected
programs, compress them so that they occupy less memory, and insert a copy of a
routine that decompresses the program when its execution begins. At the same
time, the virus is spreading the compression function to other programs. This
virus could substantially reduce the amount of storage required for stored
programs, possibly by up to 50 percent. However, the compression would be done
at the request of the virus, not at the request, or even knowledge, of the
program owner.

To see how viruses and other types of malicious code operate, we examine four
types of malicious code that affected many users worldwide: the Brain, the
Internet worm, the Code Red worm, and web bugs.

First Example of Malicious Code: The Brain Virus

One of the earliest viruses is also one of the most intensively studied. The
so-called Brain virus was given its name because it changes the label of any
disk it attacks to the word "BRAIN." This particular virus, believed
to have originated in Pakistan, attacks PCs running a Microsoft operating
system. Numerous variants have been produced; because of the number of variants,
people believe that the source code of the virus was released to the underground
virus community.

What It Does

The Brain, like all viruses, seeks to pass on its infection. This virus first
locates itself in upper memory and then executes a system call to reset the
upper memory bound below itself, so that it is not disturbed as it works. It
traps interrupt number 19 (disk read) by resetting the interrupt address table
to point to it and then sets the address for interrupt number 6 (unused) to the
former address of the interrupt 19. In this way, the virus screens disk read
calls, handling any that would read the boot sector (passing back the original
boot contents that were moved to one of the bad sectors); other disk calls go to
the normal disk read handler, through interrupt 6.

The Brain virus appears to have no effect other than passing its infection,
as if it were an experiment or a proof of concept. However, variants of the
virus erase disks or destroy the file allocation table (the table that shows
which files are where on a storage medium).

How It Spreads

The Brain virus positions itself in the boot sector and in six other sectors
of the disk. One of the six sectors will contain the original boot code, moved
there from the original boot sector, while two others contain the remaining code
of the virus. The remaining three sectors contain a duplicate of the others. The
virus marks these six sectors "faulty" so that the operating system
will not try to use them. (With low-level calls, you can force the disk drive to
read from what the operating system has marked as bad sectors.) The virus allows
the boot process to continue.

Once established in memory, the virus intercepts disk read requests for the
disk drive under attack. With each read, the virus reads the disk boot sector
and inspects the fifth and sixth bytes for the hexadecimal value 1234 (its
signature). If it finds that value, it concludes the disk is infected; if not,
it infects the disk as described in the previous paragraph.
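The virus's own infection test amounts to a two-byte comparison, sketched here with the sector layout simplified:

```python
BRAIN_SIGNATURE = bytes.fromhex("1234")

def is_infected(boot_sector: bytes) -> bool:
    # Brain inspects the fifth and sixth bytes (offsets 4-5) for 0x1234.
    return boot_sector[4:6] == BRAIN_SIGNATURE

clean_sector = bytes(512)                                  # all zeros
marked_sector = bytes(4) + BRAIN_SIGNATURE + bytes(506)    # signature at offset 4
print(is_infected(clean_sector), is_infected(marked_sector))  # False True
```

The same check that keeps Brain from reinfecting a disk also hands defenders a trivial detection test, which is characteristic of early signature-based viruses.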

What Was Learned

This virus uses some of the standard tricks of viruses, such as hiding in the
boot sector, and intercepting and screening interrupts. The virus is almost a
prototype for later efforts. In fact, many other virus writers seem to have
patterned their work on this basic virus. Thus, one could say it was a useful
learning tool for the virus writer community.

Sadly, its infection did not raise public consciousness of viruses, other
than a certain amount of fear and misunderstanding. Subsequent viruses, such as
the Lehigh virus that swept through the computers of Lehigh University, the nVIR
viruses that sprang from prototype code posted on bulletin boards, and the
Scores virus that was first found at NASA in Washington, D.C., circulated more
widely and with greater effect. Fortunately, most viruses seen to date have a
modest effect, such as displaying a message or emitting a sound. That is,
however, a matter of luck, since the writers who could put together the simpler
viruses obviously had all the talent and knowledge to make much more malevolent
viruses.

There is no general cure for viruses. Virus scanners are effective against
today's known viruses and general patterns of infection, but they cannot
counter tomorrow's variant. The only sure prevention is complete isolation
from outside contamination, which is not feasible; in fact, you may even get a
virus from the software applications you buy from reputable vendors.

Another Example: The Internet Worm

On the evening of 2 November 1988, a worm was released to the Internet, causing serious damage to the network. Not only were many systems
infected, but when word of the problem spread, many more uninfected systems
severed their network connections to prevent themselves from getting infected.
Gene Spafford and his team at Purdue University [SPA89] and Mark Eichin and Jon
Rochlis at M.I.T. [EIC89] studied the worm extensively.

The perpetrator was Robert T. Morris, Jr., a graduate student at Cornell
University who created and released the worm. He was convicted in 1990 of
violating the 1986 Computer Fraud and Abuse Act, section 1030 of U.S. Code Title
18. He received a fine of $10,000, a three-year suspended jail sentence, and was
required to perform 400 hours of community service.

What It Did

Judging from its code, Morris programmed the Internet worm to accomplish
three main objectives:

determine to where it could spread

spread its infection

remain undiscovered and undiscoverable

What Effect It Had

The worm's primary effect was resource exhaustion. Its source code
indicated that the worm was supposed to check whether a target host was already
infected; if so, the worm would negotiate so that either the existing infection
or the new infector would terminate. However, because of a supposed flaw in the
code, many new copies did not terminate. As a result, an infected machine soon
became burdened with many copies of the worm, all busily attempting to spread
the infection. Thus, the primary observable effect was serious degradation in
performance of affected machines.

A second-order effect was the disconnection of many systems from the
Internet. System administrators tried to sever their connection with the
Internet, either because their machines were already infected and the system
administrators wanted to keep the worm's processes from looking for sites
to which to spread or because their machines were not yet infected and the staff
wanted to avoid having them become so.

The disconnection led to a third-order effect: isolation and inability to
perform necessary work. Disconnected systems could not communicate with other
systems to carry on the normal research, collaboration, business, or information
exchange users expected. System administrators on disconnected systems could not
use the network to exchange information with their counterparts at other
installations, so status and containment or recovery information was
unavailable.

The worm caused an estimated 6,000 installations to shut down or disconnect
from the Internet. In total, several thousand systems were disconnected for
several days, and several hundred of these systems were closed to users for a
day or more while they were disconnected. Estimates of the cost of damage range
from $100,000 to $97 million.

How It Worked

The worm exploited several known flaws and configuration failures of Berkeley
version 4 of the Unix operating system. It accomplished, or had code that
appeared to try to accomplish, its three objectives.

Where to spread. The worm had three techniques for locating potential
machines to victimize. It first tried to find user accounts to invade on the
target machine. In parallel, the worm tried to exploit a bug in the finger
program and then to use a trapdoor in the sendmail mail handler.
All three of these security flaws were well known in the general Unix
community.

The first security flaw was a joint user and system error, in which the worm
tried guessing passwords and succeeded when it found one. The Unix password file
is stored in encrypted form, but the ciphertext in the file is readable by
anyone. (This visibility is the system error.) The worm encrypted various
popular passwords and compared their ciphertext against the ciphertext of the
stored password file. The worm tried the account name, the owner's name,
and a short list of 432 common passwords (such as "guest,"
"password," "help," "coffee," "coke,"
"aaa"). If none of these succeeded, the worm used the dictionary file
stored on the system for use by application spelling checkers. (Choosing a
recognizable password is the user error.) When it got a match, the worm could
log in to the corresponding account by presenting the plaintext password. Then,
as a user, the worm could look for other machines to which the user could obtain
access. (See the article by Robert T. Morris, Sr. and Ken Thompson [MOR79] on
selection of good passwords, published a decade before the worm.) The second
flaw concerned fingerd, the program that runs continuously to respond to
other computers' requests for information about system users. The security
flaw involved causing the input buffer to overflow, spilling into the return
address stack. Thus, when the finger call terminated, fingerd
executed instructions that had been pushed there as another part of the
buffer overflow, causing the worm to be connected to a remote shell.
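The worm's dictionary attack on passwords can be sketched as follows. The hash function here is a stand-in for Unix crypt(3), and the account, salt, and stored password are invented for the demo:

```python
import hashlib

def crypt_like(password: str, salt: str) -> str:
    # Stand-in for Unix crypt(3): a salted one-way hash.
    return hashlib.sha256((salt + password).encode()).hexdigest()

# The world-readable password file: account -> (salt, ciphertext).
shadow = {"alice": ("xZ", crypt_like("coffee", "xZ"))}

GUESSES = ["guest", "password", "help", "coffee", "coke", "aaa"]

def crack(account):
    salt, stored = shadow[account]
    # Encrypt each guess and compare ciphertexts; no decryption is needed.
    for guess in [account] + GUESSES:
        if crypt_like(guess, salt) == stored:
            return guess
    return None

print(crack("alice"))  # 'coffee'
```

The attack works precisely because the ciphertext is readable by anyone (the system error) and the password is a guessable word (the user error).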

The third flaw involved a trapdoor in the sendmail program.
Ordinarily, this program runs in the background, awaiting signals from others
wanting to send mail to the system. When it receives such a signal, sendmail
gets a destination address, which it verifies, and then begins a dialog to
receive the message. However, when running in debugging mode, the worm caused
sendmail to receive and execute a command string instead of the
destination address.

Spread infection. Having found a suitable target machine, the worm
would use one of these three methods to send a bootstrap loader to the target
machine. This loader consisted of 99 lines of C code to be compiled and executed
on the target machine. The bootstrap loader would then fetch the rest of the
worm from the sending host machine. There was an element of good computer
security (or stealth) built into the exchange between the host and the
target. When the target's bootstrap requested the rest of the worm, the
worm supplied a one-time password back to the host. Without this password, the
host would immediately break the connection to the target, presumably in an
effort to ensure against "rogue" bootstraps (ones that a real
administrator might develop to try to obtain a copy of the rest of the worm for
subsequent analysis).

Remain undiscovered and undiscoverable. The worm went to considerable
lengths to prevent its discovery once established on a host. For instance, if a
transmission error occurred while the rest of the worm was being fetched, the
loader zeroed and then deleted all code already transferred and exited.

As soon as the worm received its full code, it brought the code into memory,
encrypted it, and deleted the original copies from disk. Thus, no traces were
left on disk, and even a memory dump would not readily expose the worm's
code. The worm periodically changed its name and process identifier so that no
single name would run up a large amount of computing time.

What Was Learned

The Internet worm sent a shock wave through the Internet community, which at
that time was largely populated by academics and researchers. The affected sites
closed some of the loopholes exploited by the worm and generally tightened
security. Some users changed passwords. COPS, an automated security-checking
program, was developed to check for some of the same flaws the worm exploited.
However, as time passes and many new installations continue to join the
Internet, security analysts checking for site vulnerabilities find that many of
the same security flaws still exist. A new attack on the Internet would not
succeed on the same scale as the Internet worm, but it could still cause
significant inconvenience to many.

The Internet worm was benign in that it only spread to other systems but did
not destroy any part of them. It collected sensitive data, such as account
passwords, but it did not retain them. While acting as a user, the worm could
have deleted or overwritten files, distributed them elsewhere, or encrypted them
and held them for ransom. The next worm may not be so benign.

The worm's effects stirred several people to action. One positive
outcome from this experience was development in the United States of an
infrastructure for reporting and correcting malicious and nonmalicious code
flaws. The Internet worm occurred at about the same time that Cliff Stoll
[STO89] reported his problems in tracking an electronic intruder (and his
subsequent difficulty in finding anyone to deal with the case). The computer
community realized it needed to organize. The resulting Computer Emergency
Response Team (CERT) at Carnegie Mellon University was formed; it and similar
response centers around the world have done an excellent job of collecting and
disseminating information on malicious code attacks and their countermeasures.
System administrators now exchange information on problems and solutions.
Security comes from informed protection and action, not from ignorance and
inaction.

More Malicious Code: Code Red

Code Red appeared in the middle of 2001, to devastating effect. On July 29,
the U.S. Federal Bureau of Investigation proclaimed in a news release that
"on July 19, the Code Red worm infected more than 250,000 systems in just
nine hours ... This spread has the potential to disrupt business and personal
use of the Internet for applications such as e-commerce, e-mail and
entertainment." [BER01] Indeed, "the Code Red worm struck faster than
any other worm in Internet history," according to a research director for a
security software and services vendor. The first attack occurred on July 12;
overall, 750,000 servers were affected, including 400,000 just in the period
from August 1 to 10. [HUL01] Thus, of the 6 million web servers running code
subject to infection by Code Red, about one in eight were infected. Michael
Erbschloe, vice president of Computer Economics, Inc., estimates that Code
Red's damage will exceed $2 billion. [ERB01] Code Red was more than a worm;
it included several kinds of malicious code, and it mutated from one version to
another. Let us take a closer look at how Code Red worked.

What It Did

There are several versions of Code Red, malicious software that propagates
itself on web servers running Microsoft's Internet Information Server (IIS)
software. Code Red takes two steps: infection and propagation. To infect a
server, the worm exploits a vulnerability in IIS: it overflows a buffer in the
dynamic link library idq.dll, installing itself in the server's memory. Then,
to propagate, Code Red probes port 80 at other IP addresses to see whether the
web servers there are also vulnerable.

What Effect It Had

The first version of Code Red was easy to spot, because it defaced web sites
with the following text:

HELLO! Welcome to http://www.worm.com ! Hacked by Chinese!

The rest of the original Code Red's activities were determined by the
date. From day 1 to 19 of the month, the worm spawned 99 threads that scanned
for other vulnerable computers, starting at the same IP address. Then, on days
20 to 27, the worm launched a distributed denial-of-service attack at the U.S.
web site www.whitehouse.gov. A denial-of-service attack floods a site with so
many messages that the site is overwhelmed and slows down or stops responding.
Finally, from day 28 to the end of the month, the worm did nothing.
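The worm's month-long cycle amounts to a simple dispatch on the day of the
month. A minimal sketch in Python, for illustration only (the function and
phase names are invented; the actual worm was x86 machine code, not Python):

```python
from datetime import date

def code_red_phase(today: date) -> str:
    """Return which activity the original Code Red performed on a
    given day of the month (phase names invented for illustration)."""
    if today.day <= 19:
        return "propagate"   # days 1-19: spawn scanning threads
    elif today.day <= 27:
        return "attack"      # days 20-27: flood www.whitehouse.gov
    else:
        return "dormant"     # day 28 onward: do nothing
```

Because the trigger was purely calendar-based, every infected machine switched
phases in lockstep, which is why the flood against one target arrived all at
once.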

However, there were several variants. The second variant was discovered near
the end of July 2001. It did not deface the web site, but its propagation was
randomized and optimized to infect servers more quickly. A third variant,
discovered in early August, seemed to be a substantial rewrite of the second.
This version injected a Trojan horse in the target and modified software to
ensure that a remote attacker could execute any command on the server. The worm
also checked the year and month, so that it would automatically stop propagating
in October 2002. Finally, the worm rebooted the server after 24 or 48 hours,
wiping itself from memory but leaving the Trojan horse in place.

How It Worked

The Code Red worm looked for vulnerable personal computers running Microsoft
IIS software. Exploiting the unchecked buffer overflow, the worm crashed Windows
NT-based servers but executed code on Windows 2000 systems. The later versions
of the worm created a trapdoor on an infected server; then, the system was open
to attack by other programs or malicious users. To create the trapdoor, Code Red
copied %windir%\cmd.exe to four locations on the server's disk.

Code Red also included its own copy of the file explorer.exe, placing
it on the c: and d: drives so that Windows would run the malicious copy, not the
original copy. This Trojan horse first ran the original, untainted version of
explorer.exe, but it modified the system registry to disable certain
kinds of file protection and to ensure that some directories had read, write,
and execute permission. As a result, the Trojan horse had a virtual path that
could be followed even when explorer.exe was not running. The Trojan
horse continued to run in the background, resetting the registry every 10
minutes; thus, even if a system administrator noticed the changes and undid
them, the malicious code soon applied them again.

To propagate, the worm created 300 or 600 threads (depending on the variant)
and tried for 24 or 48 hours to spread to other machines. After that, the system
was forcibly rebooted, flushing the worm from memory but leaving the backdoor
and Trojan horse in place.

To find a target to infect, the worm's threads worked in parallel.
Although the early version of Code Red targeted www.whitehouse.gov, later
versions chose a random IP address close to the host computer's own
address. To speed its performance, the worm used a nonblocking socket so that a
slow connection would not slow down the rest of the threads as they scanned for
a connection.
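The nonblocking-socket technique can be illustrated with a short, benign
Python sketch (my own, not the worm's code): it merely checks which of a list
of hosts answer on a given port, without letting any one slow or dead host
hold up the rest.

```python
import errno
import selectors
import socket
import time

def probe_hosts(addresses, port=80, timeout=2.0):
    """Try to open a TCP connection to `port` on every address at once,
    using non-blocking sockets so that one slow or unreachable host
    cannot stall the others. Returns the addresses that accepted."""
    sel = selectors.DefaultSelector()
    reachable = set()
    for addr in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        err = s.connect_ex((addr, port))      # returns without waiting
        if err == 0:                          # connected immediately
            reachable.add(addr)
            s.close()
        elif err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            sel.register(s, selectors.EVENT_WRITE, addr)
        else:                                 # refused or unreachable
            s.close()
    deadline = time.monotonic() + timeout
    while sel.get_map() and time.monotonic() < deadline:
        for key, _ in sel.select(timeout=0.1):
            sel.unregister(key.fileobj)
            # SO_ERROR of zero means the TCP handshake completed
            if key.fileobj.getsockopt(socket.SOL_SOCKET,
                                      socket.SO_ERROR) == 0:
                reachable.add(key.data)
            key.fileobj.close()
    for key in list(sel.get_map().values()):  # close any stragglers
        key.fileobj.close()
    sel.close()
    return reachable
```

The worm's version did the same in 300 or 600 threads at once; the key idea is
that connect returns immediately and the thread learns the outcome later,
instead of blocking while a distant host times out.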

What Was Learned

As of this writing, more than 6 million servers use Microsoft's IIS
software. The Code Red variant that allowed unlimited root access made Code Red
a virulent and dangerous piece of malicious code. Microsoft offered a patch to
fix the overflow problem and prevent infection by Code Red, but many
administrators neglected to apply the patch. (See Sidebar 3-5.) Some security
analysts suggested that Code Red might be "a beta test for information
warfare," meaning that its powerful combination of attacks could be a
prelude to a large-scale, intentional effort targeted at particular countries or
groups. [HUL01a] For this reason, users and developers should pay closer
attention to the security of their systems. Forno [FOR01] warns that
such security threats as Code Red stem from our general willingness to buy and
install code that does not meet minimal quality standards and from our
reluctance to devote resources to the large and continuing stream of patches and
corrections that flows from the vendors. As we will see in Chapter 9, this
problem is coupled with a lack of legal standing for users who experience
seriously faulty code.

Malicious Code on the Web: Web Bugs

With the web pervading the lives of average citizens everywhere, malicious
code in web pages has become a very serious problem. But the malice is not
always clear; code can be used to good or bad ends, depending on your
perspective. In this section, we look at a generic type of code called a web
bug, to see how it can affect the code in which it is embedded.

What They Do

A web bug, sometimes called a pixel tag, clear gif,
one-by-one gif, invisible gif, or beacon gif, is a hidden
image in any document that can display HTML tags, such as a web page, an HTML
e-mail message, or even a spreadsheet. Its creator intends the bug to be
invisible, unseen by users but very useful nevertheless because it can track
the activities of a web user.

Sidebar 3-5 Is the Cure Worse Than the Disease?

These days, a typical application program such as a word processor or
spreadsheet package is sold to its user with no guarantee of quality. As
problems are discovered by users or developers, patches are made available to be
downloaded from the web and applied to the faulty system. This style of
"quality control" relies on the users and system administrators to
keep up with the history of releases and patches and to apply the patches in a
timely manner. Moreover, each patch usually assumes that earlier patches have
been applied; ignore a patch at your peril.

For example, Forno [FOR01] points out that an organization hoping to secure a
web server running Windows NT 4.0's IIS had to apply over 47 patches, some
bundled in a service pack and others available as individual downloads from
Microsoft. Such stories suggest that it may cost more to maintain an
application or system than it cost to buy the application or system in the
first place! Many organizations, especially small businesses, lack the
resources for such an effort. As a consequence, they neglect to fix known
system problems, which can then be exploited by hackers writing malicious code.

Blair [BLA01] describes a situation shortly after the end of the Cold War
when the United States discovered that Russia was tracking its nuclear weapons
materials by using a paper-based system. That is, the materials tracking system
consisted of boxes of paper filled with paper receipts. In a gesture of
friendship, the Los Alamos National Lab donated to Russia the Microsoft software
it used to track its own nuclear weapons materials. However, experts at the
renowned Kurchatov Institute soon discovered that over time some files became
invisible and inaccessible! In early 2000, they warned the United States. To
solve the problem, the United States told Russia to upgrade to the next version
of the Microsoft software. But the upgrade had the same problem, plus a security
flaw that would allow easy access to the database by hackers or unauthorized
parties.

Sometimes patches themselves create new problems as they are fixing old ones.
It is well known in the software reliability community that testing and fixing
sometimes reduce reliability, rather than improve it. And with the complex
interactions between software packages, many computer system managers prefer to
follow the adage "if it ain't broke, don't fix it," meaning
that if there is no apparent failure, they would rather not risk causing one
with what seems like an unnecessary patch. So there are several ways that the
continual bug-patching approach to security may actually lead to a less secure
product than the one you started with.

For example, when you visit the Blue Nile home page, www.bluenile.com,
a web bug is automatically downloaded as a one-by-one pixel image from
Avenue A, a marketing agency.

What Effect They Have

Suppose you are surfing the web and load the home page for
Commercial.com, a commercial establishment selling all kinds of
housewares on the web. If this site contains a web bug for Market.com, a
marketing and advertising firm, then the bug places a file called a cookie
on your system's hard drive. This cookie, usually containing a numeric
identifier unique to you, can be used to track your surfing habits and build a
demographic profile. In turn, that profile can be used to direct you to
retailers in whom you may be interested. For example, Commercial.com may
create a link to other sites, display a banner advertisement to attract you to
its partner sites, or offer you content customized for your needs.
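The mechanism behind this is ordinary HTTP cookie handling. A hypothetical
sketch in Python of what a tracker like the fictional Market.com might send
the first time your browser fetches its image (the domain and cookie name here
are invented for illustration):

```python
import uuid
from http import cookies

def issue_tracking_cookie():
    """Build a Set-Cookie header carrying an opaque identifier that is
    unique to this browser but contains no personal information.
    On every later request the browser sends the value back, letting
    the tracker link the visits together into a profile."""
    c = cookies.SimpleCookie()
    c["visitor_id"] = uuid.uuid4().hex             # opaque unique id
    c["visitor_id"]["domain"] = ".market.example"  # hypothetical tracker
    c["visitor_id"]["path"] = "/"
    return c.output(header="Set-Cookie:")
```

Note that the identifier itself reveals nothing; the privacy exposure comes
from correlating the many requests that carry it.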

How They Work

On the surface, web bugs do not seem to be malicious. They plant numeric data
but do not track personal information, such as your name and address. However,
if you purchase an item at Commercial.com, you may be asked to supply
such information. Thus, the web server can capture such things as

your computer's IP address

the kind of web browser you use

your monitor's resolution

other browser settings, such as whether you have enabled Java
technology

connection time

previous cookie values

and more.

This information can be used to track where and when you read a document,
what your buying habits are, or what your personal information may be. More
maliciously, the web bug can be cleverly used to review the web server's
log files and determine your IP address, opening your system to hacking via
the target IP address.

What Was Learned

Web bugs raise questions about privacy, and some countries are considering
legislation to protect users specifically from probes by web bugs. In the
meantime,
the Privacy Foundation has made available a tool called Bugnosis to locate web
bugs and bring them to a user's attention.
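A detector in the spirit of Bugnosis can be sketched in a few lines of Python:
scan a page's HTML for images declared one pixel square, the classic web bug
signature. This is a crude heuristic of my own, not the real tool's logic,
which applies more elaborate tests.

```python
from html.parser import HTMLParser

class WebBugFinder(HTMLParser):
    """Collects the src of every <img> tag declared one pixel square,
    the classic signature of a web bug."""
    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        if a.get("width") == "1" and a.get("height") == "1":
            self.suspects.append(a.get("src", ""))

def find_web_bugs(html_text):
    finder = WebBugFinder()
    finder.feed(html_text)
    return finder.suspects
```

A browser renders such an image as a single transparent pixel, so the page
looks unchanged while the request to the third-party host, cookies and all,
goes out silently.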

In addition, users can invoke commands from their web browsers to block
cookies or at least make the users aware that a cookie is about to be placed on
a system. Each option offers some inconvenience. Cookies can be useful in
recording information that is used repeatedly, such as name and address.
Requesting a warning message can mean almost continual interruption as web bugs
attempt to place cookies on your system. Another alternative is to allow cookies
but to clean them off your system periodically, either by hand or by using a
commercial product.