Assistant Professor
ECE Department
University of Maryland, College Park

Best Papers From USENIX Security 2016

This year, the Best Paper Awards at USENIX Security went to two papers
from the field of data-driven security. These papers are great examples
of the benefits of systematic data collection and analysis for answering
new questions on topics that have traditionally relied on math and
experimentation. The data sets analyzed in these papers were key to the
research results reported.

The paper “The Million-Key Question—Investigating the Origins of RSA Public Keys”, by Petr Švenda from Masaryk University, Czech Republic, and his
colleagues, teaches us how to fingerprint RSA public keys. This result
is surprising because (i) RSA keys should look like uniform random
numbers and (ii) Švenda’s method works without finding the prime factors of the RSA modulus. In other words,
you don’t have to break RSA to find out which algorithm implementation
generated the key. After testing 22 software libraries and 16 smart
cards and creating 1 million key pairs with each implementation, the
authors discovered artifacts in the most significant bits (MSBs) of the
public keys. These artifacts partition the keys into 13 clusters. Given a
key, the Masaryk researchers can guess the correct cluster with 40%
accuracy on average. While this gives worse odds than a coin flip, it is
roughly five times better than random guessing among the 13 clusters, and
the gap between the distribution of real public keys and the theoretical
space of RSA moduli is striking. Moreover, with 10 keys from the same
implementation, the guessing accuracy increases to 85% on average. This
enables attacks such as quickly searching for keys from a vulnerable
library or linking related Tor hidden service operators. To mitigate
these attacks, the authors have developed a Web service for testing multiple keys and selecting
the most anonymous one. These estimates assume that RSA implementations
are equally popular; however, if all real-world keys come from, say,
OpenSSL, then fingerprinting would not work. To settle this question, the
Masaryk team classified 11 million real keys (for example, from TLS
scans or PGP key dumps) and compared the prevalence of the key clusters
with the market share of the respective products. The numbers matched
within a few percentage points. The paper also discusses seven
implementation choices that bias the MSBs, such as directly manipulating
the high-order bits of the RSA primes or avoiding small factors.
Remarkably, all these implementation choices make factorization more
difficult. However, the lack of agreement about the best way to generate
keys reduces the anonymity set of users who rely on these keys. Unlike
other implementation flaws found by reviewing the code (including a
previous method
to fingerprint OpenSSL keys based on their prime factors), this
threat became clear only after analyzing millions of keys from
real-world RSA implementations. The researchers have released their data
sets publicly.
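The clustering idea can be sketched as follows: treat the high-order bits of each modulus as a feature and pick the implementation profile that best explains a batch of keys. The profiles, library names, and single-byte feature below are invented for illustration; the paper's actual classifier learns much richer bit-pattern statistics from a million keys per source.

```python
import math

def msb_feature(modulus: int) -> int:
    """Return the top 8 bits of an RSA modulus as a feature."""
    return modulus >> (modulus.bit_length() - 8)

# Invented per-implementation distributions over the MSB feature
# (stand-ins for the empirical clusters learned from real key sets):
profiles = {
    "lib_A": {0b10000000: 0.7, 0b11000000: 0.3},
    "lib_B": {0b11111111: 0.6, 0b11100000: 0.4},
}

def classify(moduli):
    """Pick the profile that maximizes the joint likelihood of all keys.
    More keys from the same source sharpen the guess, which mirrors the
    jump from 40% accuracy (1 key) to 85% (10 keys) in the paper."""
    best, best_score = None, float("-inf")
    for name, dist in profiles.items():
        score = sum(math.log(dist.get(msb_feature(n), 1e-9)) for n in moduli)
        if score > best_score:
            best, best_score = name, score
    return best

# A 2048-bit modulus whose top byte is 0xFF matches lib_B's profile:
print(classify([0xFF << 2040]))  # → lib_B
```

Note how the log-likelihood sum makes the batch effect explicit: each additional key from the same implementation adds evidence for its cluster.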

The paper “Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks”,
by William Melicher of Carnegie Mellon University and his colleagues,
describes a machine learning tool for predicting how easy it is to guess a password. The tool is essentially a strength meter that incorporates
recent advances in password research. The past five years have seen a
resurgence of interest in this old topic, focused mainly on measuring password
strength. For example, researchers have shown that the widely
implemented NIST guidelines for evaluating passwords measure the wrong thing
(they don’t account for the fact that password cracking tools become
better with each new password leak, as observed in practice), that
password expiration policies are a measurably bad idea
(as Bruce Schneier has been arguing),
or that humans can learn strong passwords
as long as they don’t have to invent them (after all, we used to
remember whole address books of 10-digit phone numbers). The new
strength metrics focus on guessability, which tries to estimate how
many guesses a skilled hacker would need to break the password.
Unfortunately, evaluating guessability has involved slow, hand-crafted
heuristics (such as the word-mangling rules from popular password
crackers) and gigabyte-sized dictionaries. Instead, the CMU researchers
trained a neural network to predict the next character of a password
given the preceding characters. They used 105 million
passwords, collected in their prior studies or leaked by various
organizations, to train and validate this classifier. The neural network evaluated guessability more accurately than four previous techniques and two client-side password meters (which often overestimate password strength). With a few clever tricks to compress the model, the
classifier can check a password with sub-second latency and requires
only a few hundred kilobytes of storage. This makes it suitable for a
real-time password meter, bundled with a client-side encryption
application (like GPG or TrueCrypt) or embedded in a Web site sign-up form.
The researchers released their tool on GitHub. This
approach has a drawback: unlike hand-crafted rules, the neural network
is a black box that cannot explain why a chosen password is weak or
how to improve it. Nevertheless, as the first paper to apply neural
networks to the problem of measuring password strength, this work opens
a new research direction for the study of passwords.
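The guessability idea behind the CMU tool can be illustrated with a much simpler stand-in: a character-level bigram model that scores a password by chaining conditional next-character probabilities. The tiny corpus, smoothing constant, and alphabet size below are invented for the sketch; the paper uses a recurrent neural network trained on 105 million passwords.

```python
import math
from collections import Counter, defaultdict

END = "\x00"  # end-of-password marker

def train(passwords):
    """Count character bigrams, with "" as the start-of-password context."""
    counts = defaultdict(Counter)
    for pw in passwords:
        prev = ""
        for ch in pw + END:
            counts[prev][ch] += 1
            prev = ch
    return counts

def log_prob(model, pw):
    """Sum the log conditional probabilities of each next character,
    with add-one smoothing over an assumed ~100-character alphabet."""
    total, prev = 0.0, ""
    for ch in pw + END:
        ctx = model[prev]
        total += math.log((ctx[ch] + 1) / (sum(ctx.values()) + 100))
        prev = ch
    return total

# Tiny invented corpus; the real model trains on millions of passwords.
model = train(["password", "password1", "letmein", "12345678"])

# A higher probability means fewer guesses for an attacker, i.e., weaker:
print(log_prob(model, "password") > log_prob(model, "zq9!Xk#m"))  # → True
```

A real meter would convert this probability into an estimated guess number by Monte Carlo sampling, which is one of the tricks the paper uses to report strength in guesses rather than raw probabilities.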

Beyond being instances of good research, these two papers illustrate
the unique insights that data-driven methods can provide into security
problems. Development-time assumptions often do not hold in the real
world, and as a result software behaves in surprising ways that
make security a moving target. Without measuring the statistical
properties of public keys on a large corpus, the Masaryk team would not
have been able to observe the idiosyncrasies of various crypto
libraries. Defenses are also more effective when they account for the
capabilities and limitations of real-world adversaries. Without the
opportunity to learn the features of millions of real passwords, the CMU
neural network would have been an inaccurate model of a hacker’s
guessing ability. Similarly, answering some of the questions raised by these papers
(such as how to make ML-based password meters more usable) will require
further data-driven research.