Information Security with Colin Percival

Michael W. Lucas: Who is Colin Percival, and why should we listen to
him?

Colin Percival: To the first question: I'm a visiting researcher at
Simon Fraser University in Canada, as well as being a deputy security officer
with the FreeBSD project. I started my B.S. in mathematics at age 13
(concurrent with high school) and graduated with a first-class honors degree
in 2001; to the extent possible within the confines of an undergraduate degree
program, I had an emphasis on number theory and computer algorithms. I then
went to Oxford University, where I recently defended my doctoral thesis in
computer science.

To the second question: you should listen to me because I have written a
12-page academic paper presenting and discussing a serious security
vulnerability, and nobody has been able to refute my results. I believe that my
work stands on its own; it doesn't need my name attached to give it
credibility.

MWL: I saw your hyperthreading presentation at BSDCan,
where you demonstrated weaknesses with the combination of cryptographic
software and hyperthreading. Can you explain your work in a way that those
folks who aren't cryptographers, and who find that 12-page paper intimidating,
can understand it?

CP: Well, this is a bit of a strained analogy, but here
goes. Imagine that you work in computer technical support. You're really good
at your job, but you can never answer any questions on your own--instead, you
have a collection of really good computer manuals spread out on your desk, and
every time someone asks you a question, you have to refer to the appropriate
manual to work out what the answer is.

Now, because you're so good at your job, you don't just help one customer at
once; instead, you have two phones, and you try to help two customers at once,
alternating back and forth between them. If the two people you are helping are
asking about completely different problems, then you can have both manuals open
to the right pages, and this works very well; but if you've got two people who
need answers from the same manual, then you have to spend time flipping the
pages back and forth, which means that it takes far longer for you to answer
each question.

Since the two people you are helping can measure how long it takes you to
answer their questions, they can determine if you have to flip pages--in
other words, they can determine if the other customer is asking you questions
for which you need to refer to the same manual in order to answer.

In my paper, "you" are the CPU, the two people you are helping are two
computer programs, and the reference manuals you are using are the main memory
of the computer. By measuring how long it takes for the CPU to perform some
calculations, a program can work out which parts of the computer's main memory
are being used by the other program. Once it knows this ... well, it's hard to
explain further without explaining technical details about how encryption
works, so let it suffice to say that knowing which memory locations are being
accessed while some data is being encrypted or decrypted can very often tell
you what the data, or the secret key being used, is.
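The core of this "prime and probe" idea can be sketched as a toy simulation. Everything here is an illustrative assumption, not the actual attack code: the cache is a handful of abstract sets, and a "slow" probe is just a simulated cache miss rather than a real hardware timing measurement.

```python
# Toy simulation of a cache-timing side channel (no real hardware timing).
# The attacker primes every cache set with its own data, lets the victim
# run, then probes: any set the victim touched evicts the attacker's line,
# so that probe is "slow" (a simulated miss) and leaks which sets were used.

NUM_SETS = 8  # hypothetical tiny cache

class ToyCache:
    def __init__(self):
        self.owner = [None] * NUM_SETS  # whose data occupies each set

    def access(self, who, set_index):
        """Access one set; return True on a hit, and take ownership."""
        hit = self.owner[set_index] == who
        self.owner[set_index] = who
        return hit

def attacker_probe(cache):
    """Return the set indices where the attacker's data was evicted."""
    return [s for s in range(NUM_SETS) if not cache.access("attacker", s)]

cache = ToyCache()
# Prime: the attacker fills every cache set with its own data.
for s in range(NUM_SETS):
    cache.access("attacker", s)

# Victim runs; which sets it touches depends on its secret data
# (here, arbitrarily, sets 2 and 5).
secret_dependent_sets = [2, 5]
for s in secret_dependent_sets:
    cache.access("victim", s)

# Probe: the "slow" sets reveal exactly which memory the victim used.
leaked = attacker_probe(cache)
print(leaked)  # [2, 5]
```

The real attack measures actual access latencies with a cycle counter, and must contend with noise; but the information flow is exactly this.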

MWL: A real system has a lot of things going on at the same
time, though. To stretch your analogy even further, you don't have two callers
on the line--you have dozens, or even hundreds. You were able to recover
hundreds of bits from a key in your environment, but it appears, to me at least, to be a bit different from recovering an actual key on an actual system
in the real world; even the lowest-use system has a lot of processes scheduled.
Isn't this vulnerability largely theoretical?

CP: You're stretching the analogy in the wrong direction.
You're right that there may be dozens or even hundreds of programs running on
a busy system--or, in our analogy, that many customers phoning for technical
support--but what matters is how many programs are running at once. On a
hyperthreaded CPU, there are only ever two programs being run at any given
time; all the rest are waiting for their turn (again, much like technical
support lines!).

All an attacker needs to do is get his code running on the CPU at the same
time as the program he is trying to spy on--and he only needs to do this
once. If he doesn't succeed on his first attempt--that is, if some other
program is selected to run at the same time as his target--then he can just
try again. A computer that is running at 90 percent of capacity is a very heavily
loaded system, but even there the attacker would only need to make ten
attempts on average before he succeeded; and each attempt takes a fraction of a
second and is for all practical purposes undetectable.
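The "ten attempts on average" figure follows from treating each attempt as an independent trial, which is a deliberate simplification: if the machine is at 90 percent load and we take the chance of being co-scheduled with the target on any one attempt to be 10 percent, the expected number of attempts is the mean of a geometric distribution.

```python
# Simplified model: each attempt succeeds (attacker runs alongside its
# target) independently with probability p. At 90% load, take p = 0.1.
p = 0.1
expected_attempts = 1 / p  # mean of a geometric distribution
print(expected_attempts)  # 10.0
```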

In short, the vulnerability is entirely practical. In my paper I make some
assumptions for the purposes of making it easier to explain and demonstrate the
attack, but the difficulties an attacker would encounter in the real
world are far from insurmountable.

MWL: If I recall correctly, at BSDCan you said that you
expected someone with the proper background could write a functional exploit
for this vulnerability in only a few days. It's been a few weeks now; have you
seen such an exploit yet?

CP: No; but if someone was planning on exploiting this, I'm
sure they wouldn't announce it to the world by publishing their code.

MWL: Well, when an exploit comes out I'm sure you'll hear
about it, probably by the guy who has to clean it up.

You had your own ideas about how people could use this. I'm sure that other
people have given you even more ideas. What other interesting, surprising, or
creative ideas have other people given you for exploiting hyperthreading?

CP: Well, I demonstrated an attack against RSA using
hyperthreading; Osvik and Tromer have also performed an attack against AES. (I
don't know all the details, since they haven't published their work yet.)
Beyond these two cryptographic attacks, it becomes harder to define what
constitutes an "exploit"--clearly stealing someone's keys is a problem, but
what about watching the pattern of their keystrokes when they type in their
password? What about determining which records are being accessed in a
database? What about distinguishing between someone running vi and someone
running emacs? The way software is designed at the moment more or less
guarantees that information will be leaked on a hyperthreaded processor--the
only question is how much people care.

The most interesting feedback I've received has not been about new ways of
exploiting hyperthreading, but instead has been historical: it seems that
several people independently investigated the possibility of information
leakage via caches over 20 years ago, but were "discouraged" from further
research by the NSA before they obtained any significant results.

MWL: If I'm understanding you correctly, you think the line
between an exploit and an interesting-but-unimportant security bug is personal.
Obviously we all care when the script kiddies have their point-and-click
RSA-smasher, but there are many degrees of gray beneath that level. Someone may
consider your HT vulnerability moot, but others will find it very serious
indeed. This is something we don't hear enough about in security--it's all
relative to your situation.

CP: I wouldn't say that the line between "dangerous" and
"interesting but unimportant" is personal, since that implies that there is a
subjective element to security analysis. Rather, I'd say that the line between
"dangerous" and "interesting but unimportant" depends upon whether you are
directly affected. You could think of it as being like an automobile accident--if you never leave home, you don't need to worry about being in an
automobile accident, but if you do leave home, it's something that you need to
take precautions against (e.g., learning to drive safely).

MWL: At BSDCan, you mentioned that you had learned a lot of
new and interesting things about yourself in the interval between announcing
the paper and actually releasing it. Someone had posted that you had a vendetta
against Intel, and someone else said that you were manipulating the stock price
for personal gain. How much of that reaction does a security researcher put up
with when he announces his results?

CP: Well, I should start by clarifying here that the
rumors were largely my fault, since they resulted from the unusual disclosure
schedule for this issue. The time at which I released all the details of the
attack was set by the conference schedule--I was giving the first talk of the
conference, at 10 a.m. EDT on Friday the 13th--but I wanted to provide
everybody with a chance to fix this on Friday regardless of which time zone they
happened to be in, so at 8 p.m. EDT the preceding evening I announced that an
unspecified problem existed and that I would be releasing the details the
following morning.

This window of 14 hours resulted in some rather interesting rumors flying
around, since the only evidence people had on which to judge my claims was my
reputation together with the FreeBSD security advisory that was sent out. (No
other advisories were released until SCO sent out their advisory on the 13th.)
Some people claimed that this was all a hoax; others, pointing to the FreeBSD
security advisory and the lack of Linux security advisories, said that
"obviously, Linux is not affected"; but the most amusing claim I heard came
from someone who pointed to my web page where I describe myself as being
"unemployed [because] I wanted to spend my time making sure that this issue was
properly fixed, rather than earning some money" and concluded that I was "a
disgruntled unemployed programmer, trying to make some money by short-selling
Intel shares." (Coincidentally, someone did sell 10 million Intel shares at
10:56 a.m. on May 12--but according to reports I've read, this was
simply a trading error, where the intended sale had been of 10,000
shares.)

In general, security researchers don't have to put up with very much along
these lines. In this particular case, I had to put up with quite a lot of
rumors--but I didn't really mind, since they provided some valuable comic
relief, and once I published the details of my attack, nearly everyone accepted
that it was valid immediately.

MWL: I saw advisories from SCO and from FreeBSD, but I'm
sure you contacted other vendors: Microsoft, Sun, Red Hat, SGI, and so on. How
did people react; did they take you seriously or blow you off? How did they
work with you? I'm especially curious about how Intel reacted to being told
that there is a basic problem with their design.

CP: In general, I had no trouble convincing security people
to take this problem seriously. As one person put it, "Anything which comes
from the FreeBSD Security team immediately has an air of credibility to it,"
and while I was reporting this problem in my personal capacity as a security
researcher, the fact that I am part of the FreeBSD security team certainly
helped. Beyond that, the individual reactions were quite different; but rather
than addressing each vendor's response individually and in detail, which would
take many pages, I'll just give the highlights--in the form of awards for
exceptional performances.

The prize for most professional response goes to SCO. I must admit to
having been rather surprised by this, in light of the public disagreements
between SCO and the free software community, but SCO's response to this issue
was really quite superb. Out of all the members of the Linux vendor security
list, SCO was the first to request further details after I posted to indicate
that there was a problem; they were the first to respond back with detailed and
intelligent questions; when I asked for vendor statements, they were the first
(and only Linux) vendor to respond; and they published an advisory only a few
hours after the embargo on the issue ended.

The prize for most corporate attitude goes to Intel. I had some trouble
establishing contact with them in the first place--not that I can assign much
blame for this, since Intel, unlike operating system vendors, has not had much
experience in dealing with security flaws--but even once I found someone who
was willing to talk to me, our conversations were rather less than useful: as a
general rule, I would ask questions (e.g., Would it be possible for you to
produce a microcode patch as follows ... or How about making the following
changes in future processors ...), and the reply would invariably be "I'm sorry, but I'm not allowed to talk about that." Worse, once it became clear
that my recommendation--and FreeBSD's response--was going to be to disable
hyperthreading by default, Intel shifted completely into damage control mode,
discarding all attempts at a reasoned security-centric response in favor of
treating this simply as a public relations exercise.

The prize for most personally helpful goes to Mike O'Connor of SGI. As
little communication as I had with Intel, I'm sure I would have had even less
were it not for Mike's help: when I explained to him the difficulties I was
having with Intel, he took advantage of the established channels that SGI had,
by virtue of being a large customer, to remind Intel that it was important to
talk to people who discover security vulnerabilities.

The prize for least communicative goes to Microsoft. I was very amused
recently to read the following in a story on eweek.com:

"We respond immediately to the initial vulnerability report and provide the
researcher with contact names, e-mail addresses and phone numbers. We make it
clear we want to work closely with the researcher to pinpoint the problem and
get it fixed. We commit to providing [researchers] with a progress report on
the Microsoft investigation every time they ask for one," [MSRC program manager
Stephen Toulouse] said.

My experience with Microsoft was quite the opposite. When I first reported
this vulnerability to Microsoft, I was thanked, given a ticket number (5834),
and told that it would be handled by "Christopher"--no last name, no phone
number, and no direct email address. Later the issue was transferred to "Brian"--but again, no contact information was provided. Despite comments from
multiple third parties that Microsoft was "very concerned" and had "several
people" working on this issue, Microsoft did not "make it clear they wanted to
work closely" with me--in fact, they ignored all my attempts at cooperation.
Finally, when I sent emails to Microsoft asking for a progress report, I
received no response. Even now, a month after I published the details of this
vulnerability, I have received no communication from Microsoft to say if--let
alone how--they intend to respond to this issue.

Finally, the head in the sand prize goes to Linus Torvalds. On Monday, May
16, three days after I published all the details of my attack, Linus wrote
that he would "be really surprised if somebody is actually able to get a
real-world attack on a real-world pgp key usage or similar out of it (and as to
the covert channel, nobody cares). It's a fairly interesting approach, but it's
certainly neither new nor HT-specific, or necessarily seem all that worrying in
real life." I really don't know where to start with this, except perhaps to say
that I'm very glad that Linus isn't responsible for keeping my computer
secure.

MWL: SCO gave the best response? I'm sure a lot of people
will find that surprising. I guess it just demonstrates that the people doing
the work aren't the same people that are making policy, and that the vendors
who aren't taking it seriously will find out just how real-world this can
be.

CP: I think it's a bit more subtle than that. SCO is the
heir to Unix, so they've had a lot of time to mature; and their customers are
probably highly weighted toward the "upgrade once a decade,"
hyperconservative server end of the spectrum--which is exactly where this is
the most dangerous. Companies tend to adopt the same attitudes as the people
who buy their products, so I'm not at all surprised that a company that deals
with very conservative server-buying customers had a far better response than a
company which deals mostly with security-unaware desktop users.

MWL: To an outsider looking in, it seems that this took a
long time to work out. A layman might think that you would just have an idea,
hammer out some demo code, mail some vendors, and be done with it. I'm sure
it's not that simple. What does a security researcher actually do on a
day-to-day basis?

CP: Well, I'm not a very good example of a security
researcher, in that respect. Most security researchers--or at least, most
people who call themselves security researchers--spend most of their time
combing through source code looking for bugs. This is certainly useful, but it
doesn't require very much skill, and I suspect that this is a task that
will be taken over by computer programs before long, since most security flaws
fall into the "stupid mistake" category and are very easy to recognize if you
look closely enough; I've been particularly impressed in this respect with
results from software produced by Coverity.

As for what I think your real question was--why it took such a long time
before I announced the problem--well, it all comes down to lots of details.
To start with, when I first realized that this was likely to be a problem, I
was in the middle of editing my D.Phil. thesis prior to sending it off to my
examiners. While walking to and from college every day--I was living in a
house about 2 kilometers outside of Oxford, and without an internet connection--I
convinced myself that the problem was probably real, but it wasn't until over a
month later, when I went back to Vancouver for Christmas, that I had time to
sit down and write some code.

By the end of 2004, I had some working code and I had demonstrated that it
could steal enough information to make breaking RSA easy, but this wasn't
enough to write to people yet. Extraordinary claims require extraordinary
evidence, and while I had the necessary evidence, it was scattered between
dozens of files and scraps of paper, and nobody would have the patience to read
and understand it all in its current form. Consequently, I started to write my
paper, "Cache missing for fun and profit," in order to clearly explain why
there was a covert channel between threads executing on the same processor
core, how this channel could be exploited as a side channel, and how it was
possible to defend against this attack.

I finished a first draft of this at the end of February--after being
interrupted partway through by my thesis defense--and started writing to
security people to inform them of this vulnerability. For the next two months,
my role became less that of a researcher and more that of an educator: while my
paper was largely self-explanatory, for nearly every point I made there was at
least one person who needed me to provide additional explanation, and there
were many things--potential fixes, for example, which I had decided wouldn't
actually work--that I didn't mention in my paper but still had to explain to
several people.

Toward the end of April, I went through my paper making substantial
revisions, based on feedback from the various security teams with which I had
been in contact, and then I prepared the patch for FreeBSD--only to end up
feverishly rewriting the patch during the FreeBSD developer summit the day
before my talk, after I realized that my original patch would have inadvertently
ended up disabling dual-core systems.

Of course, while I was doing all of this, daily life continued as usual. As
a FreeBSD deputy security officer, I was responsible for dealing with the more
common "dumb bug" sort of security issues, so I wrote patches and advisories
for half a dozen other security problems (including one I found myself)
during the period that I was working on this.

I guess the most important point to realize here is that it's one thing to
realize that something might be a problem, and another to write the code to
exploit it; but it is quite different, and a lot more work, to liaise with over
a dozen vendors to explain what the problem is and how it should be fixed.

MWL: It sounds as if it would have been easier to write an
exploit for this than to bring this to vendors correctly as you did!

CP: Quite likely, yes. Of course, considering the wide
range of operating systems affected by this, I wouldn't want to distribute
exploit code even if I had written it, so the route I took was the only
reasonable approach to ensuring that everybody could fix the problem (even if
some of them seem to have not bothered).

MWL: You're also a part of the FreeBSD security team. I'm
sure you get a fair number of emails from panicked users, false bug reports,
"security holes" that are actually the result of incorrect sysadmin practice,
and so on. What's it like being on the FreeBSD security team? Are there any
choice tidbits you'd care to share from that experience?

CP: Many people have said that war is "months of boredom
punctuated by a few minutes of hell"; being on the FreeBSD security team isn't
quite that intense, but there is certainly a similarity. Most of the time,
there aren't any major problems that need to be dealt with, but we never know
when the next big attack is going to happen. Of course, that's only one side of
the security work I do with FreeBSD; in addition to handling security issues as
they are found, I've spent much of the past two years improving our
infrastructure for distributing updates. It doesn't help very much to discover
and produce a patch for a security flaw if none of your users actually apply
the patch, for example, so I wrote a tool called FreeBSD Update that allows
people to securely download and install security updates very easily.

As for choice tidbits: there are inevitably some amusing things that happen, but I'd rather not give specifics; people have confidence that they can
write to the security team to tell us about potential problems without having
their correspondence discussed outside of the security team, and I don't want
to betray that trust.

MWL: Your updating tool sounds promising; would you care to
give our readers a brief description and tell us what problems you're trying
to solve?

CP: Sure. To start with, you have to understand that FreeBSD
is an open source operating system; all of the source code used to
build it is available, and anyone who knows what they are doing can take this
source code, make changes to it, recompile it, and thereafter run their own
personal version of FreeBSD.

This is all great, but it leads to a certain disconnect between the
developers--for whom "remove | S_IRGRP | S_IROTH from line 105
of sys/dev/iir/iir_ctrl.c and recompile" makes sense--and normal
users, who aren't interested in writing code but instead simply want a system
that works. Historically, security issues have been handled in FreeBSD by
distributing a source code patch along with a list of instructions for applying
the patch and rebuilding; this had the effect that most users would decide that
applying the security patches was far too much work, and would instead simply
leave their systems insecure. A doctor once remarked to me that it's one thing
to diagnose an illness and another to prescribe the correct treatment, but
that the really hard part is making sure that a patient actually takes the
medicine--the situation is exactly the same with security issues, in that the
really hard part is to make sure that users actually keep their systems up to
date.

This is where FreeBSD Update comes in. I take the source code patches that
fix security problems, and recompile everything; then I check to see which
files have changed, package the new files together, and put them online for
people to download. Instead of needing to apply the source code patch and
recompile everything--which can take well over an hour on a slow machine--users simply run the FreeBSD Update client, and watch as it downloads any
necessary security updates for them--very much the same way as Windows Update
works.
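The "check which files have changed" step can be sketched very simply: hash every file in the old and new build trees and keep only the paths whose contents differ or are new. This is an illustrative simplification, not FreeBSD Update's actual implementation; dicts mapping paths to file contents stand in for real directory trees.

```python
# Sketch of detecting which built files changed between two releases:
# hash both trees and report paths whose contents differ or are new.
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def changed_files(old_tree: dict, new_tree: dict) -> list:
    """Paths whose contents changed, or that exist only in the new tree."""
    return sorted(
        path for path, data in new_tree.items()
        if path not in old_tree or digest(old_tree[path]) != digest(data)
    )

# Hypothetical build trees (path -> file contents).
old = {"bin/ls": b"v1", "lib/libc.so": b"v1", "sbin/init": b"v1"}
new = {"bin/ls": b"v1", "lib/libc.so": b"v2",
       "sbin/init": b"v1", "bin/newtool": b"v1"}
print(changed_files(old, new))  # ['bin/newtool', 'lib/libc.so']
```

Only the files in that list need to be packaged and downloaded, which is why the client's work is so much lighter than a full source rebuild.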

Incidentally, there has been one very interesting spin-off benefit of my
work on FreeBSD Update: In order to speed up the downloading of binary security
updates, I wrote a simple "delta compression" program, which compares two files
and outputs a small "binary patch," which encodes the difference between them.
Since people normally have an old, insecure version of the files for which
they are downloading security updates, it is usually possible to download this
much smaller patch file. Much to my surprise, I discovered that the patches my
utility was producing were several times smaller--and thus faster to download--than patches produced by any other available software. Since then, my binary
delta compression utility, bsdiff, has been increasingly widely used, most
notably in OS X (for distributing software updates) and in the next major
release of Mozilla Firefox.
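The idea behind a binary delta can be sketched naively: instead of shipping the whole new file, ship only the byte ranges where it differs from the old one, plus any appended tail. To be clear, bsdiff itself is far cleverer (it uses approximate matching so that insertions don't misalign everything, and produces near-zero difference data that compresses extremely well); this sketch only shows the shape of the patch-and-apply round trip.

```python
# Naive binary delta: record (offset, bytes) runs where the new file
# differs from the old over their common length, plus the new file's tail.
# This is NOT how bsdiff works internally; it only illustrates the idea.

def make_patch(old: bytes, new: bytes) -> dict:
    edits, i, n = [], 0, min(len(old), len(new))
    while i < n:
        if old[i] != new[i]:
            j = i
            while j < n and old[j] != new[j]:
                j += 1
            edits.append((i, new[i:j]))  # one run of differing bytes
            i = j
        else:
            i += 1
    return {"length": len(new), "edits": edits, "tail": new[n:]}

def apply_patch(old: bytes, patch: dict) -> bytes:
    out = bytearray(old[:patch["length"]])
    for offset, data in patch["edits"]:
        out[offset:offset + len(data)] = data
    out += patch["tail"]
    return bytes(out)

old = b"FreeBSD 5.4 kernel"
new = b"FreeBSD 5.5 kernel plus fix"
patch = make_patch(old, new)
print(apply_patch(old, patch) == new)  # True
```

Since a security update typically changes only a small fraction of each file, the patch is a tiny fraction of the file's size, which is exactly why delta downloads are so much faster.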

MWL: You've provided ideas for people looking to make
crypto useless on hyperthreaded machines, created software used by major
companies to reduce the cost and complexity of software updates, and nailed
down your Ph.D. What's next?

CP: As I think I mentioned before, my attack is just the
latest in a long series of side-channel attacks on cryptographic
implementations. The unfortunate fact is that the existing libraries of
cryptographic functions were either written before side-channel attacks became
well known or were written by people who were largely unaware of the problem.
Each time that a new attack is published, cryptographic libraries are rewritten
to protect against that latest attack, but there has been no attempt made so
far to produce a library that is designed from the ground up to be immune to
entire classes of side-channel attacks. As I mention in my paper, this can be
done by adopting a more restricted model of computation, which prohibits
data-dependent branches or memory accesses. Assuming that I can find some
funding to support this, I'd like to write such a library.
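A standard illustration of that restricted model is constant-time comparison: the loop shape and the memory it touches are identical whether or not the inputs match, so timing reveals nothing about where they first differ. This is only a sketch of the principle; a library of the kind described would be written in C or assembly, where timing behavior is actually under the programmer's control.

```python
# Constant-time byte-string comparison: no data-dependent branches inside
# the loop, so the running time does not depend on where the inputs differ.
def constant_time_eq(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):        # lengths are assumed to be public
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y           # accumulate differences without branching
    return diff == 0

print(constant_time_eq(b"secret", b"secret"))  # True
print(constant_time_eq(b"secret", b"secreT"))  # False
```

A naive comparison that returns at the first mismatch leaks, through timing, how long the matching prefix is; the branch-free version gives an attacker nothing to measure.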

In the longer term ... well, when Guy Consolmagno was appointed the Vatican
astronomer, he was given three words of instruction: "Do good science." That's
the sort of job I'd love to have: one with no strings attached, nobody telling
me what research I should or should not be doing, and nobody telling me what I
can or cannot publish--just absolute freedom to do research. Realistically, I
suspect that the closest I'm ever likely to get to that ideal is if I get a
tenured appointment at a university, so that's where I'm currently hoping to
end up; but wherever I go, I intend to continue doing research. The problems I
do research into are often inspired by practical issues, as with my delta
compression work, so it is quite likely that I'll continue to produce useful
software; but that is simply a side benefit. My real interest is the
research.