Siri can understand what you say. Google can take dictation. Even your new smart TV is taking verbal orders.

So is there any doubt the National Security Agency has the ability to translate spoken words into text?

But precisely when the NSA does it, with which calls, and how often, is a well-guarded secret.

It’s not surprising that the NSA isn’t talking about it. But oddly enough, neither is anyone else: Over the years, there’s been almost no public discussion of the NSA’s use of automated speech recognition.

One minor exception was in 1999, when a young Australian cryptographer named Julian Assange stumbled across an NSA patent that mentioned “machine transcribed speech.”

Assange, who went on to found WikiLeaks, said at the time: “This patent should worry people. Everyone’s overseas phone calls are or may soon be tapped, transcribed and archived in the bowels of an unaccountable foreign spy agency.”

The strategic advantage, invasive potential and policy implications of being able to turn spoken words into text are not trivial: Suddenly, voice conversations, historically considered ephemeral and unsearchable, can be scanned, catalogued and archived — not perfectly, but well enough to dramatically increase the effective scope of eavesdropping.

Meanwhile, DARPA’s intelligence-world counterpart, IARPA, announced the Babel Program in 2011, with the goal of “developing agile and robust speech recognition technology that can be rapidly applied to any human language in order to provide effective search capability for analysts to efficiently process massive amounts of real-world recorded speech.”

Robert Litt, who as general counsel for the Office of the Director of National Intelligence is the intelligence community’s chief lawyer, was asked about the NSA’s speech-to-text capabilities at a forum on transparency on Capitol Hill on Friday.

He took the opportunity to lash out at The Intercept’s reporting: “I think that story is a great example of what is wrong with a lot of media coverage of this,” he said. “That story made absolutely no distinction between technical capabilities and legal authorities. There are all sorts of technical capabilities that NSA has. I’m not commenting on the existence or nonexistence of any such authority. The question is when are they used and what are the legal authorities under which they are used. And I think that that’s something that a lot of the press reporting completely ignores, including that story you wrote.”

Asked to explain in what ways the use of speech-to-text is limited, Litt repeatedly refused to even acknowledge its existence.

“I’m not saying that the government isn’t using these techniques. I am not acknowledging that these techniques exist even.”

You won’t hear much about the use of speech recognition for surveillance in academe, either.

Researchers in the field are divided between those who don’t take NSA funding, and can only speculate about what goes on over there — and those who do take NSA funding, but won’t say what they know.

“There’s a lot of weird hush-hush that goes on,” said Bhiksha Raj, an associate professor at Carnegie Mellon University’s Language Technologies Institute, who said he does not receive NSA funding. “Academics who work for the NSA must go through various clearances. They sign several papers. They hold closed meetings that are only attended by people with clearances.”

Some non-NSA affiliated academics were once “quite keen” on seeing how the NSA was faring in the face of the technical challenges in the field, Steve Young, a professor of information engineering at the University of Cambridge, recalled. “But unless you actually work for the NSA and you’ve been vetted, you’re not going to get close to the real data.”

Ironically, even GCHQ, NSA’s intelligence partner in the U.K., has complained about DARPA and NSA’s secrecy. A 2009 GCHQ assessment of speech-to-text technology said that “The DARPA evaluation programme, with significant steer from NSA, has been the main driving force behind technology improvements in the field. Unfortunately, the results of the evaluations are not put in the public domain, making reference difficult.”

All the secrecy has an obvious advantage for the NSA. If the NSA can keep its speech-recognition capabilities secret, nobody can tell the agency what to do. And if nobody knows what it is doing, then nobody can tell it to stop.

Senator Ron Wyden, D-Ore., arguably the foremost congressional critic of NSA overreach, wouldn’t comment directly on the question of speech recognition. But, he said through a spokesperson: “After 14 years on the Intelligence Committee, I’ve learned that senators must be constantly on the lookout for secret interpretations of the law and advances in surveillance that Congress isn’t aware of.”

He added: “For centuries, individual privacy was protected in part by the limited resources of governments. It simply wasn’t possible for governments to secretly collect information on every single citizen without investing in massive networks of spies and informants. But in the 21st century mass surveillance is no longer difficult and expensive — it’s increasingly cheap and easy. The only privacy protections that will matter in the future are the ones that are written into law and defended by public demand for freedom and openness.”

Research on the Snowden archive was conducted by Intercept researcher Andrew Fishman.

Illustration by Richard Mia for The Intercept
