What happens when networks fail?

Last month, the computer systems of a major retail bank in England went on the fritz. For a few days, their ATMs didn't work, payments were frozen, and customers couldn't make transactions. The headlines, like the howls of protest, were loud.

It was but a small harbinger of a big problem coming our way. Instead of a computer system in one bank in one country failing for a few days, suppose it had been the electric grid for all of North America? Or communications for the global air traffic control system, down for a week or more? Or an entire continent without the Internet?

The basic systems on which modern society relies include food supply, transportation, water supply and quality, power generation and distribution, banking and financial systems, and communications. In most developed countries, there is a reasonable level of oversight of these systems, although their technological complexity has in some cases outrun the ability of the government agencies that supervise them to keep pace.

Recall how near we came to global financial collapse in 2008 as a result of the failure of one investment bank. Lehman Brothers was overleveraged and ready for the morgue; but there was no monitoring capacity that could look at all the computerized connections Lehman had with other players and insulate the rest of the financial system.

At the international level, the stewardship of these complex systems relies on voluntary, cooperative relationships that do not always have clear rules and procedures. The system of worldwide weather reporting and forecasting has strict, well-observed rules and is transparent to its participants. But the global oil market is opaque, lacks clear rules, and is subject to manipulation. The absence of a transparent market and the lack of a tough international monitor to police the computer systems that support that market are an invitation to failure or piracy or both.

Once you get outside the national security field, where there is a full-scale shadow arms race in cyber capabilities, efforts to enforce standards for safety and reliability are patchy.

The road forward is difficult because — as with climate change — it requires that countries act together. Yet there is little alternative. Major nations should require independent safety and reliability audits of crucial systems as a condition of connecting to global networks. And if technological disaster insurance were required of corporations, everyone would want to drive premiums down by insisting on good data protection and backup. Rigorous planning for the inevitable failures should be required of all.

Complex, highly centralized systems are likelier to produce catastrophic failures than simpler, smaller, more dispersed systems. And when these centralized systems are imperfectly understood, then fragility becomes a prominent and dangerous characteristic. Today we are coasting without fully taking stock of how severe the risk is — and without sufficient attention to what we'd do in a crisis. Think of your own family. How well would you survive without access to cash, electricity or shops for a couple of days or weeks?

Making predictions is usually foolhardy. But I will make one that is both safe and, hopefully, useful: In the next five years, we'll see a significant failure of one of these global systems. If we're lucky, it will be contained enough to avoid widespread chaos — but dramatic enough to set us scrambling to shape up the management, transparency and security of the crucial systems on which we all depend.