Building Better Internets

A Taxonomy of PRISM Possibilities

I have been fielding a decent number of calls and emails from reporters on the NSA PRISM scandal. A lot of people are trying to synthesize reasonable technical explanations for how the NSA could implement the program described in the leaked PowerPoint deck and keep it secret for so long. In an effort to improve the quality of the public discussion, I have decided to create a taxonomy of the theories that I have seen floated and supply my own commentary in italics.

To be clear, I have no special knowledge or insight into this program. Everything listed below is based upon data contained in the news articles I have seen. I also recognize that many of these theories sound far-fetched, although I have to admit that my personal Overton Window for crazy conspiracy theories has shifted in the last 24 hours.

My goal is to keep this list up to date as more information is published, so please let me know if you have any corrections or additions by leaving a comment or via email. My GPG key is available here.

The list is below the fold…

The Taxonomy of PRISM Possibilities

1. The PRISM program does not exist. – I think we can safely eliminate this possibility due to official confirmations.

2. The PRISM program exists…

   A. …and gathers data only after an individual is targeted. – This would make PRISM a more advanced version of the standard “Law and Order” wiretap, where a target needs to be identified before data collection begins. This seems to be contradicted by the available slides and reporting.

      i. Individuals are targeted with the cooperation of the listed companies. – This runs contrary to the very loud denials by these companies, although some options might fit between the weasel words utilized.

         a. The NSA requests data to be collected using existing facilities. – This would be equivalent to the standard lawful intercept and subpoena process at these companies, although under a different legal basis.

         b. The NSA collects data using dedicated facilities. – An example would be an NSA-operated sniffer in corporate datacenters. This would be a complicated system to implement at the scale of a Facebook or Google, and would be difficult to hide from employees.

      ii. Individuals are targeted without the cooperation of the listed companies.

         a. The NSA captures traffic using backbone/POP sniffing that is activated only after targeting.

            1. The NSA is using a US Government Certificate Authority or Intermediate CA to terminate TLS. – This would be easy for technically sophisticated targets to detect, and wouldn’t work against recent Chrome and Firefox releases thanks to public key pinning.

            2. The NSA has obtained private keys from the listed companies.

            3. The NSA is using a new cryptography breakthrough to impersonate the server. – Being able to factor public keys would work, for example.

         b. The NSA passively sniffs the traffic of targeted individuals.

            1. Only unencrypted traffic is captured.

            2. Encrypted traffic is captured and decrypted. – This is discussed more below.

         c. The NSA captures traffic with the cooperation of last-mile ISPs. – This seems to be incompatible with the claim that this program is mostly/totally used on foreign nationals.

         d. The NSA has surreptitiously installed hardware or software backdoors at the affected companies. – This is the “Aurora Attack” option, and reflects how the PRC and other non-US governments obtain cloud data from US providers. This is both legally dangerous and technically unstable, and does not seem likely from the slides.

   B. …and gathers large amounts of information indiscriminately. – This is how most observers are reading the slide deck, and it is even compatible with the official statements mentioning “minimization” of data after it is collected. Kurt Opsahl of the EFF pointed out to me that the intelligence community does not use the word “collect” in its commonly understood sense.

      i. Broad data sets are gathered with the cooperation of the listed companies.

         a. The NSA can log in to backend databases and “pull” the data they want. – This would be the classic idea of a backdoor.

         b. The companies regularly batch up the data sets and send them to the NSA using existing lawful intercept systems.

         c. The NSA installs hardware on-site that receives data copied to it intentionally by the companies.

      ii. Broad data sets are gathered without the knowledge of the end-providers. – The general idea of the NSA collecting this data in-flight on the network is strongly supported by the PRISM slide discussing the fact that a great deal of international traffic is routed through the US. This would be irrelevant if US companies with overseas datacenters were providing the data directly.

         a. The NSA is passively sniffing huge amounts of traffic on backbones and at interchange points. – We know that the NSA has hardware in place at major ISPs thanks to the EFF’s lawsuit against AT&T. While it would be impossible to copy all of this traffic to NSA datacenters and then store it, it would be feasible to build a multi-tier architecture where data was screened and discarded at multiple levels (although not for $20M). For example, the local sniffing equipment would use ASICs or FPGAs to quickly inspect every packet and filter out useless information, such as YouTube and Netflix streams or unencrypted connections to popular websites. It would keep interesting streams, such as an HTTPS connection to Gmail, and forward them to NSA facilities for storage and further analysis. But what about encryption?

            1. The NSA is only gathering unencrypted traffic. – While a lot of interesting data is still not encrypted in transit, the list of services in the PRISM slides includes several that have been encrypted since launch.

            2. The NSA is decrypting traffic using a non-public breakthrough in cryptanalysis. – This is, coincidentally, the topic of a talk I will be giving (with Tom Ritter, Tom Ptacek, and Javed Samuel) at Black Hat USA this summer. We are not saying that such a breakthrough exists, but recent advances in solving discrete logarithms should make us wary of using RSA and other asymmetric algorithms based upon “multiplication is easy, division is hard” problems. A fast factoring algorithm would allow the NSA to unwrap TLS connections that used an RSA handshake any time after they were collected, although this should not work against connections negotiated with Perfect Forward Secrecy. Google implemented PFS in 2011, although most other companies have not.

            3. The NSA is decrypting traffic using the private keys of these companies.

               - The NSA stole private keys from all of these companies.

               - The NSA convinced these companies to turn over their private keys. – This is a way that these companies could cooperate with the NSA without large numbers of employees being involved.

         b. The NSA is doing active middle-person attacks against large amounts of traffic on backbones and at interchange points. – This seems infeasible to me both technically and politically, and is not compatible with the passive splitter configuration the EFF uncovered at AT&T.

         c. The NSA is intercepting traffic in the datacenters of the listed companies without telling them. – This is not impossible. Perhaps the NSA has installed sniffers at these facilities with the promise of providing collective defense against APT and other attackers and lied about the real purpose. This would mean that these companies were more naive than malicious. This option could avoid cryptography problems if the hardware was installed in the correct place. Again, this would be difficult to implement at the scale of a Google or Facebook without their fully-informed cooperation.

3. The PRISM program is mostly run by private contractors, who turn over data meeting certain standards to the Government. – This would allow the companies involved to make narrow claims about not turning over data directly to the government.

4. The PRISM program is just the last analysis and correlation step performed on data gathered through many different means. – This is one way to accommodate all of the different leaks and denials. Maybe PRISM gives a single view into diverse data sets, such as Skype traffic turned over intentionally and Google data from sniffing. It would also explain the shockingly low price tag of $20M advertised in the slides if that data was gathered by other NSA projects with much larger budgets.
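The rogue-CA option above is detectable precisely because a client can remember what a server’s certificate should look like and notice when it changes. As a rough sketch (not any browser’s actual pinning implementation), the check can be as simple as hashing the presented leaf certificate and comparing it to a stored fingerprint; the pin value below is a placeholder:

```python
import hashlib
import socket
import ssl

# Placeholder pin: in practice this would be the hex SHA-256 digest of the
# DER-encoded certificate (or public key) observed over a trusted path.
EXPECTED_PIN = "0" * 64

def fingerprint(der_cert: bytes) -> str:
    """Hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()

def presented_pin(host: str, port: int = 443) -> str:
    """Fetch the leaf certificate a server presents and fingerprint it."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return fingerprint(tls.getpeercert(binary_form=True))

def is_intercepted(host: str, pin: str = EXPECTED_PIN) -> bool:
    """A fingerprint mismatch suggests something between the client and
    the real server (e.g. a rogue CA) is terminating TLS."""
    return presented_pin(host) != pin
```

Note that this only catches interception performed with a different certificate; it does nothing against an adversary holding the server’s real private key.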
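The multi-tier screening idea from the backbone-sniffing option can be sketched in a few lines. This is a toy model, not a description of any real system; the flow records and filter rules are invented for illustration (and plaintext traffic is dropped wholesale here as a simplification):

```python
# Toy model of multi-tier screening: a cheap first tier discards
# high-volume, low-value flows at line rate; survivors are forwarded
# upstream for storage and deeper analysis.

BORING_HOSTS = {"youtube.com", "netflix.com"}  # illustrative bulk-video sources

def first_tier_keep(flow: dict) -> bool:
    """Cheap per-flow filter, analogous to ASIC/FPGA line-rate screening."""
    if flow["dst_host"] in BORING_HOSTS:
        return False   # discard bulk video streams
    if not flow["encrypted"]:
        return False   # simplification: treat plaintext as low value
    return True        # keep e.g. an HTTPS connection to a webmail host

def screen(flows):
    """Return only the flows worth forwarding for storage and analysis."""
    return [f for f in flows if first_tier_keep(f)]

flows = [
    {"dst_host": "youtube.com", "encrypted": True},
    {"dst_host": "mail.google.com", "encrypted": True},
    {"dst_host": "example.com", "encrypted": False},
]
print(screen(flows))  # only the webmail flow survives
```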
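To make the factoring scenario concrete, here is a textbook-RSA toy (tiny primes, no padding, nothing like a real TLS stack) showing why recovering the factors of a server’s modulus would unwrap previously recorded RSA key exchanges:

```python
# Toy illustration: if an adversary can factor the server's RSA modulus,
# any recorded RSA key exchange can be unwrapped after the fact.
# Textbook RSA with tiny primes; real keys are 2048+ bits.

p, q = 61, 53                      # the secret factors
n, e = p * q, 17                   # the server's public key (n = 3233)

premaster = 42                     # secret the client sent, RSA-encrypted
ciphertext = pow(premaster, e, n)  # what a passive sniffer records

# Later: a hypothetical factoring breakthrough recovers p and q from n.
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                # private exponent rebuilt from the factors
recovered = pow(ciphertext, d, n)

assert recovered == premaster      # the recorded handshake is now open
```

With an ephemeral (EC)DHE handshake, by contrast, the recorded traffic contains no secret encrypted under the long-term RSA key, which is why forward secrecy blunts this attack even retroactively.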

Given that the leak is pretty much PowerPoint slideware, we should also consider the possibility that the program exists but the person who put together the slide deck misunderstood or oversold its capabilities. That would also be consistent with all of the available evidence, though I’m not quite optimistic enough to believe it.


About:

Alex Stamos is the CSO of Facebook, although this is a personal blog and does not reflect an official viewpoint. Previously, Alex was the CISO of Yahoo and co-founded Artemis Internet and iSEC Partners. This blog has been left up for archival purposes; Alex now writes at https://www.facebook.com/alex.stamos