Imagine a series of events unfolding on a single day. First, 20 million U.S. smart phones stop working. Next follow outages in wireline telephone service, problems with air traffic control, disruptions to the New York Stock Exchange, and eventually severe loss of power on America's East Coast. What could cause such crippling outcomes?

You might at first think these are isolated events, just coincidentally occurring on the same day. But with several things happening at once, you next start to look for common causes. Perhaps the various organizations providing these services bought some of their software from the same vendor, and the software is failing because of a shared flaw. Possibly this situation is like the Y2K problem, when people were concerned that on January 1, 2000, computer systems would crash because they stored years as only two digits (98, 99) and would fail when computer clocks rolled over the century boundary. Or maybe dependencies in one sector trigger actions that cause the initial failure to cascade into other sectors, for example:

A software defect causes disruption in mobile phone service.

Consequently, those who need to use phones revert to their wireline service, thereby overloading circuits.

Air traffic controllers in some parts of the country depend on wireline communication, so overloaded circuits lead to air traffic control problems.

Similarly, the New York Stock Exchange is severely debilitated by its brokers' inability to place and verify trades.

At the same time, the power grid experiences problems because its controllers, no longer able to exchange information by using mobile phones, shut down because of a flawed protocol.
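The Y2K concern mentioned above is easy to illustrate. The following sketch is a hypothetical illustration, not code from any actual affected system; it simply shows how arithmetic on two-digit years breaks at the century rollover while four-digit years behave correctly:

```python
# Hypothetical illustration of the Y2K flaw: date arithmetic on
# two-digit years fails when the century rolls over.

def years_elapsed_two_digit(start_yy, end_yy):
    """Naive elapsed-time calculation using two-digit years (the Y2K flaw)."""
    return end_yy - start_yy

def years_elapsed_four_digit(start_yyyy, end_yyyy):
    """Correct calculation using full four-digit years."""
    return end_yyyy - start_yyyy

# An account opened in 1998, checked in 2000:
assert years_elapsed_four_digit(1998, 2000) == 2   # correct: 2 years elapsed
assert years_elapsed_two_digit(98, 0) == -98       # flawed: time appears to run backward
```

Any program that used such a difference to compute interest, an age, or an expiration date would suddenly produce nonsense on January 1, 2000.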

There is yet another scenario, used by the Bipartisan Policy Center in its February 2010 Cyber ShockWave exercise: malicious computer software or malware, "planted in phones months earlier through a popular 'March Madness' basketball bracket application, disrupts mobile service for millions" [BPC10].

It is difficult—sometimes impossible—to distinguish between an accident and an attack. Consider, for example, an online gambling site that received a flood of blank incoming email messages that overwhelmed servers and slowed customer traffic to a crawl. Blank messages could easily come from a software or hardware problem: a mail handler caught in a loop with one malformed message that it dispatches over and over. Shortly thereafter, the company received email written in broken English. It told the company to wire $40,000 to ten different accounts in Eastern Europe if it wanted its computers to stay online [MCA05]. So much for the "just an accident" theory.
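The "accident" hypothesis above is plausible because a looping mail handler is a classic failure mode. The sketch below is a hypothetical simplification (the queue, the cap on attempts, and the blank-message fallback are all invented for illustration) of how one malformed message could generate an endless stream of blank mail:

```python
# Hypothetical sketch of a mail handler caught in a loop: a malformed
# message raises an error before it is removed from the queue, so the
# handler retries it on every pass, emitting a blank message each time.

queue = [{"body": "hello"}, {"body": None}]  # None simulates a malformed message
sent = []
attempts = 0

while queue and attempts < 5:        # capped at 5 here so the sketch terminates
    msg = queue[0]
    attempts += 1
    try:
        sent.append(msg["body"].upper())  # fails on the malformed message
        queue.pop(0)                      # removed from the queue only on success
    except AttributeError:
        sent.append("")                   # a blank message goes out anyway
```

After the loop, `sent` holds one good message followed by a stream of blanks, and the malformed message is still at the head of the queue, ready to trigger the same failure again. From the recipient's side, this accidental loop is indistinguishable from a deliberate flood.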

Are these scenarios realistic or implausible? And are cyber security exercises such as these and the ones described in Sidebar 1-1 designed to confirm our readiness (a security blanket) or exacerbate our worries (security theater)? What is the likelihood we will be able to determine the causes of these kinds of failures and then prevent or mitigate their effects?

Sidebar 1-1: Testing Cyber Security Readiness

Governments and the private sector have organized many "cyber security exercises." Although the nature of each exercise varies, the goals of such exercises are similar: to anticipate unwelcome cyber events so that prevention and mitigation plans can be made, to make both public and private officials aware of cyber security risks, and to test existing response plans for both coverage and effectiveness.

For example, in November 2010, the European Union ran its first cyber security "stress test," Cyber Europe 2010. Its objective was to "test Europe's readiness to face online threats to essential critical infrastructure used by citizens, governments and businesses." The activities involved 22 participating nations and 8 observers. Among the lessons learned:

The private sector must be involved.

Testing of pan-European preparedness measures is lacking because each member nation is still refining its national approach.

The exercise is a first step in building trust at a pan-European level. More cooperation and information exchange are needed.

Incident handling varied a lot from one nation to another because of the different roles, responsibilities, and bodies involved in the process. Some nations had difficulty understanding how similar incidents are managed in other member nations.

A new pan-European directory of contacts need not be created. The existing directories are sufficient but need to be updated and completed regularly.

Other cyber security exercises have been run around the world. The U.S. Department of Homeland Security involves both public and private sector organizations in its biannual Cyber Storm process. And the Bipartisan Policy Center engaged former U.S. government officials in real-time reaction to its simulated cyber attack. Private enterprise and business sector groups also run cyber security exercises; however, they do not usually make their results public, for fear of revealing problems to possible attackers.

No matter what your work or family responsibilities are, it is important for you to understand the nature of these scenarios, make reasoned judgments about their likelihood, and take prudent actions to protect yourself and the people, data, and things you value.

One way to develop an understanding is to imagine how you might interpret a situation and then react to it. For example, in the unfolding events from mobile phone outage to East Coast power failure, consider these roles:

You are using your mobile phone to talk with your friend, and the connection drops. You redial repeatedly but never connect. You then try to call your friend on your land line, but again there is no connection. How long does it take you to realize that the problem affects far more people than just you and your friend? Do you contact the telephone company? (And how? You cannot phone, and your Internet connection may very well depend on your telephone carrier!) By the time the power goes out, how do you know the power failure is related to your phone problems? When do you take any action? And what do you do?

You are using your mobile phone to call your stockbroker because your company's initial public offering (IPO) is scheduled for today—so your company's viability depends on the resulting stock price and the volume of sales. As you begin your conversation with the stockbroker, the connection drops. You redial repeatedly, but never connect. You then try to call your broker on the land line, but again there is no connection. How long does it take you to realize that the problem affects your company? Your broker? Others? Whom do you call to report a problem? And when the power goes out, what action do you take?

You are a government official involved with air traffic control. All morning, you have heard rumors of telephone problems around the country. On your secure government line, you get a call confirming those problems and reporting widening problems with the air traffic control system. How do you determine what is wrong? To whom do you report problems? When you realize that problems with air traffic control may be dangerous to aircraft and their passengers, how do you react? Can you ground all aircraft until the sources of the problems are located and corrected?

You are a government official involved with regulating the power grid. All morning, you have heard rumors of telephone problems around the country. Your web-based reporting system begins to report sporadic power outages on the East Coast. On your secure government line, you get a call confirming those problems and reporting widening problems with the air traffic control system. How do you determine what is wrong? To whom do you report problems? When you realize that problems with the power grid may threaten the viability of the entire nation's power system, how do you react? The power grid is owned by the private sector. Does the government have authority to shut down the grid until the sources of the problems are located and corrected?

The last situation has precedents. During World War I, the U.S. government took over the railroads [WIL17] and the telephone-telegraph system by presidential proclamations:

I, Woodrow Wilson, President of the United States, ... do hereby take possession and assume control and supervision of each and every telegraph and telephone system, and every part thereof, within the jurisdiction of the United States, including all equipment thereof and appurtenances thereto whatsoever and all materials and supplies [WIL18].

During World War II, the U.S. government encouraged the automotive industry to redirect production toward jeeps, trucks, and airplane parts. The Automotive Council for War Production was formed at the end of 1941, and automobile production was suspended entirely in 1942 so that the industry's total capacity could focus on the war effort. So possible reactions to our complex scenario could indeed range from inaction to private sector coordination to government intervention. How do you determine cause and effect, the severity of impact, and the time period over which the effects will unfold? The answers are important in suggesting appropriate actions.

Analyzing Computer Security will assist you in understanding the issues and choosing appropriate responses to address these challenges.

In this chapter, we examine our dependence on computers and then explore the many ways in which we are vulnerable to computer failure. Next, we introduce the key concepts of computer security, including attacks, vulnerabilities, threats, and controls. In turn, these concepts become tools for understanding the nature of computer security and our ability to build the trustworthy systems on which our lives and livelihoods depend.

How Dependent Are We on Computers?

You drive down the road and suddenly your car brakes to a stop—or accelerates uncontrollably. You try to withdraw money from your bank and find that your account is overdrawn, even though you think it should contain plenty of money. Your doctor phones to tell you a recent test showed that your usually normal vitamin D level is a fraction of what it should be. And your favorite candidate loses an election that should have been a sure victory. Should you be worried?

There may be other explanations for these events, but any of them may be the result of a computer security problem. Computers are embedded in products ranging from dogs to spaceships; computers control activities from opening doors to administering the proper dose of radiation therapy. Over the last several decades, computer usage has expanded tremendously, and our dependence on computers has increased similarly. So when something goes awry, it is reasonable to wonder if computers are the source of the problem.

But can we—and should we—depend on computers to perform these tasks? How much can we entrust to them, and how will we determine their dependability, safety, and security? These questions continue to occupy policy makers, even as engineers, scientists, and other inventors devise new ways to use computers.

From one perspective, these failures are welcome events because we learn a lot from them. Indeed, engineers are trained to deal with and learn from past failures. So engineers are well qualified to build large structures on which many of us depend. For example, consider bridges; these days, bridges seldom fail. An engineer can study stresses and strengths of materials, and design a bridge that will withstand a certain load for a certain number of years; to ensure that the bridge will last, the engineer can add a margin of safety by using thicker or stronger materials or adding more supports. You can jump up and down on a bridge, because the extra force when you land is well within the tolerance the engineer expected and planned for. When a bridge does fail, it is usually because some bridge component has been made of defective materials, design plans were not followed, or the bridge has been subjected to more strain than was anticipated (which is why some bridges have signs warning about their maximum load).

But computer software is engineered differently, and not all engineers appreciate the differences or implement software appropriately to address a wide variety of security risks. Sidebar 1-2 illustrates some of these risks.

Sidebar 1-2: Protecting Software in Automobile Control Systems

The amount of software installed in a new automobile grows larger from year to year. Most cars, especially more expensive ones, use dozens of microcontrollers to provide a variety of features aimed at enticing buyers. These digital cars use software to control individual subsystems, and then more software to connect the systems into a network.

Whitehorn-Umphres [WHI01] points out that this kind of software exhibits a major difference in thinking between hardware designers and software designers. "As hardware engineers, they [the automobile designers] assumed that, perhaps aside from bolt-on aftermarket parts, everything else is and should be a black box." But software folks have a different take: "As a software designer, I assume that all digital technologies are fair game for being played with ... it takes a special kind of personality to look at a software-enabled device and see the potential for manipulation and change—a hacker personality." That is, hardware engineers do not expect their devices to be opened and changed, but software engineers—especially security specialists—do.

As a result, the hardware-trained engineers designing and implementing automotive software see no reason to protect it from hackers. According to a paper by Koscher and other researchers from the University of Washington and University of California San Diego [KOS10], "Over a range of experiments, both in the lab and in road tests, we demonstrate the ability to adversarially control a wide range of automotive functions and completely ignore driver input—including disabling the brakes, selectively braking individual wheels on demand, stopping the engine, and so on. We find that it is possible to bypass rudimentary network security protections within the car, such as maliciously bridging between our car's two internal subnets. We also present composite attacks that leverage individual weaknesses, including an attack that embeds malicious code in a car's telematics unit and that will completely erase any evidence of its presence after a crash." Their paper presents several laboratory attacks that could have devastating effects if performed on real cars on a highway.

Koscher and colleagues observe that "the future research agenda for securing cyber-physical vehicles is not merely to consider the necessary technical mechanisms, but to also inform these designs by what is feasible practically and compatible with the interests of a broader set of stakeholders."

Security experts have long sought to inform designers and developers of security risks and countermeasures. Unfortunately, all too often the pleas of the security community are ignored in the rush to add and deliver features that will improve sales.

Like bridges, computers can fail: Some moving parts wear out, electronic hardware components stop working or, worse, work intermittently. Indeed, computers can be made to fail without even being physically touched. Failures can happen seemingly spontaneously, when unexpected situations put the system into a failing or failed state. So there are many opportunities for both benign users and malicious attackers to cause failures. Failures can be small and harmless, like a "click here" button that does nothing, or catastrophic, like a faulty program that destroys a file or even erases an entire disk. The effects of failures can be readily apparent—a screen goes blank—or stealthy and difficult to find, such as a program that covertly records every key pressed on the keyboard.

Computer security addresses all these types of failures, including the ones we cannot yet see or even anticipate. The computers we consider range from small chips to embedded devices to stand-alone computers to gangs of servers. So too do we include private networks, public networks, and the Internet. They constitute the backbone of what we do and how we do it: commerce, communication, health care, and more. So understanding failure can lead us to improvements in the way we lead our lives.

Each kind or configuration of computer has many ways of failing and being made to fail. Nevertheless, the analytic approach you will learn in this book will enable you to look at each computer system (and the applications that run on it) to determine how you can protect data, computers, networks, and ultimately yourselves.