Big Data

Over the past two weeks we have seen the following computer system crashes:

a three-hour network shutdown on 22 August that paralysed the NASDAQ stock exchange, crippled others, and caused a one-third drop in the daily total of shares traded on American exchanges;

also on 22 August, a blackout of Apple’s iCloud that lasted for 11 hours for some customers;

a trading glitch in the Goldman Sachs computer on 20 August that resulted in a large number of erroneous stock and options trades and cost the firm up to $100 million;

a shutdown of Amazon’s North American retail site on 19 August that lasted almost an hour and resulted in an estimated $2 million in lost sales;

on 16 August, a four-minute global outage of Google’s services, including email, YouTube and its core search engine, that led to a 40 percent drop in global internet traffic.

And last month, in another part of the forest, we had the director of the US National Security Agency, General Keith Alexander, admitting that he still did not know exactly which files whistle-blower Edward Snowden had downloaded and taken with him when he fled the country two months before.

Well, General Alexander didn’t exactly admit it; he just declined to say whether he knew, but that comes to the same thing. Two months after Snowden flew the coop, the NSA still doesn’t know how many more of its embarrassing secrets are out there waiting to be revealed.

This may explain something quite puzzling that happened last week. A Brazilian citizen, David Miranda, was changing planes in London when he was stopped by British police under the Terrorism Act, questioned for nine hours, and then released – but the police kept his computer, two pen drives, an external hard drive, and various other electronic items.

Miranda is the partner of Guardian journalist Glenn Greenwald, who has been working on Snowden’s documents, but the police wouldn’t have gone to all that trouble just to harass him, particularly since their actions were probably illegal: all their questions were about Snowden and the NSA files, not about terrorism. And why would they even bother to confiscate Miranda’s electronics? Don’t they realise there are bound to be copies elsewhere?

It’s less puzzling if you assume that the NSA asked for the operation (of course it did), and that its goal was actually to find out just how much Snowden knows, and can prove. Maybe it found out, maybe it didn’t – but what it tells the rest of us is that the NSA is not really in control of its own data. If Snowden can take it away with him, so can others.

There are 850,000 potential “others” – Americans with top secret clearance and access to the data – and some of them will not have the same high motives as Snowden for stealing the data. In fact, the NSA even catches an average of one employee a year who has been using the system to track a lover or spouse they suspect is straying. God knows how many it doesn’t catch – but if its inability to figure out what Snowden took is any guide, probably a lot.

What the NSA has built is a system that is too big to monitor properly, let alone fully control. The system’s official purposes are bad enough, but it cannot even know the full range of illegitimate private actions that it permits. And this is not a design flaw. It is inherent in the very size of the system and the number of people who have access to it. Which brings us back to NASDAQ, Apple, Goldman Sachs et al.

If it can be done, it will be done. Algorithms will be written for automated trading at speeds measured in fractions of a microsecond, and the competition will have to follow suit. It will become possible to store immense amounts of data in a virtual “cloud”, and the cloud will take shape. It will become theoretically possible to listen in on every conversation in the world, and the surveillance systems to do it will be built.

Every step onward increases the scale and complexity of the systems, until they are too big and complex for any one person to understand. They will run without supervision, for the most part, and when they fail (as they must from time to time) the failure will also be hard to understand. And if you give hundreds of thousands of people access to the system, your secrets will not stay secret for long.

The volume of data moving on the internet and private networks is expanding very fast at the moment – from 6 gigabytes for each person on the planet this year to 16 gigabytes per person per year by 2017 – and system design is just not keeping up. Given time, it may be possible to catch up on that front, if the rate of expansion eventually slows. But it will be much harder, maybe impossible, to build leak-proof surveillance systems.