Why The Airline Industry Could Keep Suffering System Failures Like Delta's

Passengers wait at Hartsfield-Jackson Atlanta International Airport after a computer systems failure on Monday caused Delta to delay or cancel hundreds of flights.

Branden Camp / AP

Delta canceled about 530 flights on Tuesday, on top of about 1,000 canceled a day earlier, after a power outage in Atlanta knocked out the company's computers and brought the airline's operations virtually to a halt.

Seth Kaplan, who follows the airline industry, asks the question on everyone's mind: "If every small business on the corner can manage to keep its website running through a cloud-based server and all those sorts of things, why can't Delta Air Lines with all its resources manage to do that?"

And in fact, Delta isn't the first airline to be downed by a computer malfunction (Southwest, United and others are on the list), and unfortunately, this meltdown is unlikely to be the last.

"It's a fair criticism," says Kaplan, managing partner of Airline Weekly, an independent publication that follows the industry. But, he says, airlines aren't like other businesses.

"Because they have to worry so much about safety and security, they are constrained in ways that other businesses aren't," he says. "Delta can't just host its systems on Joe Blow's cloud server somewhere else in the way that another business might be able to do."

Kaplan says if Delta and other airlines distribute their computing to many different locations, it will make them more vulnerable to, say, hackers or terrorists. In other words, given a choice between more backup systems and more security, airlines are picking security.

The local utility, Georgia Power, says the outage was caused by the failure of a Delta "switchgear" at the airline's Atlanta data center. That's a piece of equipment that connects Delta's computers to the power grid and to the company's backup generators, according to Bob Mann, a former airline executive who is now an aviation consultant.

He says this was a rare malfunction with a part that is usually reliable. "They had Georgia Power available at the site," Mann says. "They had their own generators and batteries available at the site. But the automated transfer switch seems to have failed in a way that allowed them to use neither of those systems."

Even if Delta finds a way to prevent another problem with the switchgear, Mann believes airline computer failures are likely to keep happening.

The systems are increasingly complex. The computers interact with myriad outside systems, such as those used by travel agents and Web apps. Plus, after a spate of mergers, various separate systems have had to be consolidated.

"So the size of the networks, the number of devices on those networks continue to expand," Mann says, "and even the most reliable hardware have some probability of failure, however random and small that is. Thus, having more of them around creates greater potential that any one of them could fail."

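Mann's point can be put in rough numbers. As a minimal sketch, assume every device on a network fails independently, with the same small probability p over some period. Then the chance that at least one of n devices fails is 1 - (1 - p)^n, which climbs as n grows. The independence assumption, the function name and the sample figures below are illustrative only; they are not numbers from Delta or from Mann.

```python
def prob_any_failure(p: float, n: int) -> float:
    """Chance that at least one of n devices fails.

    Assumes (hypothetically) that each device fails independently
    with the same probability p over the period of interest.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Probability that all n devices survive is (1 - p) ** n;
    # the complement is the probability that at least one fails.
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Illustrative per-device failure probability, not a real figure.
    p = 1e-5
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} devices: {prob_any_failure(p, n):.4%}")
```

Under these assumptions, the same very reliable part becomes a likely point of failure somewhere in the network once there are enough copies of it, which is the gist of Mann's warning.
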
The lesson for travelers may be: Don't check the bag with your toothbrush, in case you get stuck.