Stanford researchers show that telephone metadata – information about calls
and text messages, such as time and length – can alone reveal a surprising
amount of personal detail. The work could help inform future policies for
government surveillance and consumer data privacy.

Most people might not give telephone metadata – the numbers you dial, the
length of your calls – a second thought. Some government officials probably
view it as similarly trivial, which is why this information can be obtained
without a warrant.

The researchers set out to fill knowledge gaps about the National Security
Agency’s current phone metadata program, which has drawn conflicting
assertions about its privacy impacts. The law currently treats call content
and metadata separately and makes it easier for government agencies to
obtain metadata, in part because the law assumes that it shouldn’t be
possible to infer specific sensitive details about people based on metadata
alone.

The findings, reported today in the *Proceedings of the National Academy of
Sciences*, provide the first empirical data on the privacy properties of
telephone metadata. Preliminary versions of the work, previously made
available online, have already played a role in federal surveillance policy
and have been cited in litigation filings and letters to legislators in
both the United States and abroad. The final work could be used to help
make more informed policy decisions about government surveillance and
consumer data privacy.

The computer scientists built a smartphone application that retrieved the
previous call and text message metadata – the numbers, times and lengths of
communications – from more than 800 volunteers’ smartphone logs. In total,
participants provided records of more than 250,000 calls and 1.2 million
texts. The researchers then used a combination of inexpensive automated and
manual processes to illustrate both the extent of the reach – how many
people would be involved in a scan of a single person – and the level of
sensitive information that can be gleaned about each user.

From a small selection of the users, the Stanford researchers were able to
infer, for instance, that a person who placed several calls to a
cardiologist, a local drugstore and a cardiac arrhythmia monitoring device
hotline likely suffers from cardiac arrhythmia. Another study participant
likely owns an AR semiautomatic rifle, based on frequent calls to a local
firearms dealer that prominently advertises AR semiautomatic rifles and to
the customer support hotline of a major firearm manufacturer that produces
these rifles.
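
As a rough illustration of how this kind of inference can work, the Python
sketch below looks up each dialed number in a directory of business
categories and flags a participant when the categories they called match a
sensitive combination. It is a minimal sketch, not the researchers' actual
pipeline, which the article describes only as a mix of automated and manual
processes. The phone numbers, the DIRECTORY table, the SENSITIVE_PATTERNS
rule and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical directory mapping phone numbers to business categories.
DIRECTORY = {
    "555-0101": "cardiologist",
    "555-0102": "pharmacy",
    "555-0103": "cardiac monitoring hotline",
    "555-0199": "pizza delivery",
}

# Hypothetical rule: a label and the set of categories that suggest it.
SENSITIVE_PATTERNS = {
    "possible cardiac condition": {
        "cardiologist",
        "pharmacy",
        "cardiac monitoring hotline",
    },
}


def categories_called(call_log, directory):
    """Count calls per business category, skipping unknown numbers."""
    return Counter(directory[n] for n in call_log if n in directory)


def candidate_inferences(call_log, directory, patterns):
    """Return labels whose required categories all appear in the log."""
    seen = set(categories_called(call_log, directory))
    return [label for label, required in patterns.items() if required <= seen]


# Metadata only: a list of dialed numbers, with no call content.
log = ["555-0101", "555-0102", "555-0101", "555-0103", "555-0199", "555-0000"]
print(candidate_inferences(log, DIRECTORY, SENSITIVE_PATTERNS))
```

A match here is only a candidate inference. In the study's examples, the
conclusions are stated as likely, not certain.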

One of the government’s justifications for allowing law enforcement and
national security agencies to access metadata without warrants is the
underlying belief that it’s not sensitive information. This work shows that
assumption is not true.

“I was somewhat surprised by how successfully we inferred sensitive details
about individuals,” said study co-author Patrick Mutchler, a graduate
student at Stanford. “It feels intuitive that the businesses you call say
something about yourself. But when you look at how effectively we were able
to identify that a person likely had a medical condition, which we consider
intensely private, that was interesting.”

They also found that a large number of people could get caught up in a
single surveillance sweep. When the National Security Agency examines
metadata associated with a suspect’s phone, it is allowed to examine a
“two-hop” net around the suspect. Suspect A calling person B is one hop;
person B calling person C is the second hop. Analysts can then comb the
metadata of anyone within two hops of the suspect.
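
The two-hop rule can be pictured as a short walk over a graph of who has
contacted whom. The Python sketch below is a minimal illustration under
stated assumptions: calls are treated as undirected links, the call records
are invented, and the function names build_contact_graph and within_hops
are hypothetical. It is not a description of any agency's or the study's
actual tooling.

```python
from collections import defaultdict


def build_contact_graph(call_records):
    """Build an undirected contact graph from (caller, callee) pairs.

    Treating calls as undirected is an assumption made for this sketch.
    """
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def within_hops(graph, seed, max_hops=2):
    """Return everyone reachable from seed in at most max_hops hops."""
    reached = {seed}
    frontier = {seed}
    for _ in range(max_hops):
        next_frontier = set()
        for number in frontier:
            # Add contacts of the current frontier not already reached.
            next_frontier |= graph.get(number, set()) - reached
        reached |= next_frontier
        frontier = next_frontier
    reached.discard(seed)
    return reached


# Invented records: A calls B and E, B calls C, C calls D.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
graph = build_contact_graph(records)
print(sorted(within_hops(graph, "A")))  # ['B', 'C', 'E']
```

In this toy example, D is three hops from A and falls outside the net.
Because each hop multiplies the number of contacts, a single well-connected
number in the first hop can pull in a very large second hop.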

By extrapolating participant data, the researchers estimated that the NSA’s
current authorities could allow for surveilling roughly 25,000 individuals
– and possibly more – starting from just one “seed” phone user.

Although the results are not surprising, the researchers said that the raw,
empirical data provide a better-informed starting point for future
conversations between privacy interest groups and policymakers.

For instance, the authors point to the recent shift to reduce the metadata
retrieval window from five years to 18 months. The researchers drew accurate
and sensitive inferences about participants from roughly six months’ worth
of calls and texts, which suggests that metadata are more revealing than
previously thought, even within the shorter window.

Similarly, the government’s two-hop call sweep was previously three hops;
that reduction was implemented to reduce the number of people caught in a
sweep. Shortening the time window could reduce that number further,
Mutchler said.

“If we’re going to pick a sweet spot as society, where we want the privacy
vs. security tradeoff to lie, it’s important to understand the implications
of the policies that we have,” Mutchler said. “In this paper, we have
empirical data, which I think will help people make informed decisions.”

The study, “Evaluating the privacy properties of telephone metadata,” was
co-authored by John C. Mitchell <https://profiles.stanford.edu/john-mitchell>,
the Mary and Gordon Crary Family Professor in the School of Engineering,
and Jonathan Mayer, a scholar in the Stanford School of Engineering and the
Stanford Law School. Mayer is currently detailed from Stanford to the
Federal Communications Commission, where he is serving as Chief
Technologist for the Enforcement Bureau. The project was supported in part
by the National Science Foundation Team for Research in Ubiquitous Secure
Technology Research Center.