NSA captures data from Yahoo, Google

The National Security Agency has secretly broken into the main communications links that connect Yahoo and Google data centers around the world, according to documents obtained from former NSA contractor Edward Snowden and interviews with knowledgeable officials.

By tapping those links, the agency has positioned itself to collect at will from among hundreds of millions of user accounts, many of them belonging to Americans. The NSA does not keep everything it collects, but it keeps a lot.

According to a top secret accounting dated Jan. 9, 2013, NSA’s acquisitions directorate sends millions of records every day from Yahoo and Google internal networks to data warehouses at the agency’s Fort Meade headquarters. In the preceding 30 days, the report said, field collectors had processed and sent back 181,280,466 new records — ranging from “metadata,” which would indicate who sent or received emails and when, to content such as text, audio and video.

The NSA’s principal tool to exploit the data links is a project called MUSCULAR, operated jointly with the agency’s British counterpart, GCHQ. From undisclosed interception points, the NSA and GCHQ are copying entire data flows across fiber-optic cables that carry information between the data centers of the Silicon Valley giants.

The infiltration is especially striking because the NSA, under a separate program known as PRISM, has front-door access to Google and Yahoo user accounts through a court-approved process.

The MUSCULAR project appears to be an unusually aggressive use of NSA tradecraft against flagship American companies. The agency is built for high-tech spying, with a wide range of digital tools, but it has not been known to use them routinely against U.S. companies.

White House officials and the Office of the Director of National Intelligence, which oversees the NSA, declined to confirm, deny or explain why the agency infiltrates Google and Yahoo networks overseas.

Google said it was “troubled by allegations of the government intercepting traffic between our data centers, and we are not aware of this activity.”

“We have long been concerned about the possibility of this kind of snooping, which is why we continue to extend encryption across more and more Google services and links,” the company said.

At Yahoo, a spokeswoman said: “We have strict controls in place to protect the security of our data centers, and we have not given access to our data centers to the NSA or to any other government agency.”

Under PRISM, the NSA already gathers huge volumes of online communications records by legally compelling U.S. technology companies, including Yahoo and Google, to turn over any data matching court-approved search terms. That program, which was first disclosed by The Washington Post and the Guardian newspaper, is authorized under Section 702 of the Foreign Intelligence Surveillance Act and overseen by the Foreign Intelligence Surveillance Court.

Intercepting communications overseas has clear advantages for the NSA, with looser restrictions and less oversight. NSA documents about the effort refer directly to “full take,” “bulk access” and “high volume” operations on Yahoo and Google networks. Such large-scale collection of Internet content would be illegal in the United States, but the operations take place overseas, where the NSA is allowed to presume that anyone using a foreign data link is a foreigner.

Outside U.S. territory, statutory restrictions on surveillance seldom apply and the Foreign Intelligence Surveillance Court has no jurisdiction. Senate Intelligence Committee Chairwoman Dianne Feinstein has acknowledged that Congress conducts little oversight of intelligence-gathering under the presidential authority of Executive Order 12333 , which defines the basic powers and responsibilities of the intelligence agencies.

John Schindler, a former NSA chief analyst and frequent defender who teaches at the Naval War College, said it was obvious why the agency would prefer to avoid restrictions where it can.

“Look, NSA has platoons of lawyers and their entire job is figuring out how to stay within the law and maximize collection by exploiting every loophole,” he said. “It’s fair to say the rules are less restrictive under Executive Order 12333 than they are under FISA.”

The operation to infiltrate data links exploits a fundamental weakness in systems architecture. To guard against data loss and system slowdowns, Google and Yahoo maintain fortress-like data centers across four continents and connect them with thousands of miles of fiber-optic cable. These globe-spanning networks, representing billions of dollars of investment, are known as “clouds” because data moves seamlessly around them.

In order for the data centers to operate effectively, they synchronize high volumes of information about account holders. Yahoo’s internal network, for example, sometimes transmits entire email archives – years of messages and attachments – from one data center to another.

Tapping the Google and Yahoo clouds allows the NSA to intercept communications in real time and to take “a retrospective look at target activity,” according to one internal NSA document.

In order to obtain free access to data center traffic, the NSA had to circumvent gold standard security measures. Google “goes to great lengths to protect the data and intellectual property in these centers,” according to one of the company’s blog posts, with tightly audited access controls, heat sensitive cameras, round-the-clock guards and biometric verification of identities.

Google and Yahoo also pay for premium data links, designed to be faster, more reliable and more secure. In recent years, each of them is said to have bought or leased thousands of miles of fiber optic cables for their own exclusive use. They had reason to think, insiders said, that their private, internal networks were safe from prying eyes.

In an NSA presentation slide on “Google Cloud Exploitation,” however, a sketch shows where the “Public Internet” meets the internal “Google Cloud” where their data resides. In hand-printed letters, the drawing notes that encryption is “added and removed here!” The artist adds a smiley face, a cheeky celebration of victory over Google security.

Two engineers with close ties to Google exploded in profanity when they saw the drawing. “I hope you publish this,” one of them said.

For the MUSCULAR project, the GCHQ directs all intake into a “buffer” that can hold three to five days of traffic before recycling storage space. From the buffer, custom-built NSA tools unpack and decode the special data formats that the two companies use inside their clouds. Then the data is sent through a series of filters to “select” information the NSA wants and “defeat” what it does not.

PowerPoint slides about the Google cloud, for example, show that the NSA tries to filter out all data from the company’s “Web crawler,” which indexes Internet pages.

According to the briefing documents, prepared by participants in the MUSCULAR project, collection from inside Yahoo and Google has produced important intelligence leads against hostile foreign governments that are specified in the documents.

Last month, long before The Post approached Google to discuss the penetration of its cloud, vice president for security engineering Eric Grosse announced that the company is racing to encrypt the links between its data centers. “It’s an arms race,” he said then. “We see these government agencies as among the most skilled players in this game.”

Yahoo has not announced plans to encrypt its data center links.

Because digital communications and cloud storage do not usually adhere to national boundaries, MUSCULAR and a previously disclosed NSA operation to collect Internet address books have amassed content and metadata on a previously unknown scale from U.S. citizens and residents. Those operations have gone undebated in public or on the floor of Congress because their existence was classified.

The Google and Yahoo operations call attention to an asymmetry in U.S. surveillance law: While Congress has lifted some restrictions on NSA domestic surveillance on the grounds that purely foreign communications sometimes pass over U.S. switches and cables, it has not added restrictions overseas, where American communications or data stores now cross over foreign switches.

“Thirty five years ago, different countries had their own telecommunications infrastructure, so the division between foreign and domestic collection was clear,” Sen. Ron Wyden, a member of the intelligence committee, said in an interview. “Today there’s a global communications infrastructure, so there’s a greater risk of collecting on Americans when the NSA collects overseas.”

It is not clear how much data from Americans is collected, and how much of that is retained. One weekly report on MUSCULAR says the British operators of the site allow the NSA to contribute 100,000 “selectors,” or search terms. That is more than twice the number in use in the PRISM program, but even 100,000 cannot easily account for the millions of records that are said to be sent back to Fort Meade each day.

In 2011, when the Foreign Intelligence Surveillance Court learned that the NSA was using similar methods to collect and analyze data streams – on a much smaller scale – from cables on U.S. territory, Judge John D. Bates ruled that the program was illegal under the Foreign Intelligence Surveillance Act and inconsistent with the requirements of the Fourth Amendment.