The Debian OpenSSL Bug: Backdoor or Security Accident?

On Monday, Ed wrote about Software Transparency, the idea that software is more resistant to intentional backdoors (and unintentional security vulnerabilities) if the process used to create it is transparent. Elements of software transparency include the availability of source code and the ability to read or contribute to a project’s issue tracker or internal developer discussion. He mentioned a case that I want to discuss in detail: in 2008, the Debian Project (a popular Linux distribution used for many web servers) announced that the pseudorandom number generator in Debian’s version of OpenSSL was broken and insecure.
First, some background: A pseudorandom number generator (PRNG) is a program F that, given a short random seed s, gives you a long stream of bits F(s) which appear to be random. If you and I put in the same seed s, we’ll get the same stream of bits. But if I choose s at random and don’t tell you what it is, you can’t predict F(s) at all—as far as you’re concerned, F(s) might as well be random. The OpenSSL PRNG tries to grab some unpredictable information (“entropy”) from the system, such as the current process ID, the contents of some memory that are likely to be different (for example, uninitialized memory which is or might be controlled by other processes) and so on, and turns these into the seed s. Then it gives back the random stream F(s).
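
To make the idea of F(s) concrete, here is a minimal sketch of a toy generator. It is not how OpenSSL builds its PRNG: the name `toy_prng` and the choice of hashing the seed together with a counter using SHA-256 are assumptions made purely for illustration. It shows the one property that matters for this story, which is that the same seed always yields the same stream.

```python
import hashlib

def toy_prng(seed: bytes, n_bytes: int) -> bytes:
    """Toy F(s): expand a short seed into n_bytes of output.

    Illustrative only. Each output block is SHA-256(seed || counter).
    """
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

# Two parties using the same seed get the same "random" stream.
stream_a = toy_prng(b"shared seed", 64)
stream_b = toy_prng(b"shared seed", 64)

# A different seed gives an unrelated-looking stream.
stream_c = toy_prng(b"other seed", 64)

print(stream_a == stream_b)
print(stream_a == stream_c)
```

Everything therefore rests on the seed: whoever can guess s can recompute the whole stream.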

In 2006, in order to fix warnings spit out by a tool that can help find memory access bugs in software, one of the Debian maintainers decided to comment out two lines of code in the OpenSSL PRNG. It turns out that these lines were important: they were responsible for grabbing almost all of the unpredictable entropy that became the seed for the OpenSSL PRNG. Without them, the PRNG only had 32,767 choices for s, so there were only that many possible choices for F(s).
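
A seed space that small can simply be enumerated. The sketch below assumes, for illustration, that the process ID is the only input left feeding the seed; `toy_keygen` and `recover_pid` are hypothetical names and the "key" is just a hash, not a real SSL or SSH key. It shows why 32,767 possible seeds means an attacker can try them all.

```python
import hashlib

MAX_PID = 32767  # number of possible seeds quoted above

def toy_keygen(pid: int) -> bytes:
    """Derive a stand-in 'key' from a seed that depends only on the PID."""
    seed = pid.to_bytes(2, "big")
    return hashlib.sha256(b"toy-key" + seed).digest()

def recover_pid(observed_key: bytes):
    """Try every possible seed until one reproduces the observed key."""
    for pid in range(1, MAX_PID + 1):
        if toy_keygen(pid) == observed_key:
            return pid
    return None

# The victim generates a key; the attacker sees only the key itself.
victim_key = toy_keygen(12345)
guessed = recover_pid(victim_key)

# The complete set of keys this generator can ever produce.
all_keys = {toy_keygen(pid) for pid in range(1, MAX_PID + 1)}

print(guessed, len(all_keys))
```

The loop finishes almost instantly on ordinary hardware, and the full set of possible keys can be computed once and reused against every affected machine.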

And so programs that relied on the OpenSSL random number generator weren’t seeing nearly as much randomness as they thought they were. One such program generates the cryptographic keys used for SSL (secure web browsing) and SSH (secure remote login). Critically, these keys have to be random: if you can guess what my secret key is, you can break into anything I protect using that key. That means you have the ability to read encrypted traffic, log into remote servers, or make forged messages appear authentic. Because the vulnerability was introduced in late 2006 and went unnoticed until 2008, the bug also made its way into Ubuntu (another popular Linux distribution, derived from Debian and widely used for web servers). All told, the bug affected thousands of servers and persisted for a long time because patching the affected servers was not enough to fix the problem: you also had to replace any predictable weak keys you had made while the vulnerability was present.

As an aside, the problem of finding entropy to feed pseudorandom number generators is famously hard. Indeed, it’s still a big challenge to get right even today. Errors in randomness are hard to detect, because if you just eyeball the output, it will look random-ish and will change each time you run the program. Weak randomness can be very hard to spot, but it can render the cryptography in a (seemingly) secure system useless. Still, the Debian bug was obvious enough that it inspired a lot of ridicule in the security community once it was discovered.

So was this problem a backdoor, purposefully introduced? It seems unlikely. The maintainer who made the change, Kurt Roeckx, was later made Secretary of the Debian Project, suggesting that he’s a real and trustworthy person and probably not a fake identity made up by the NSA to insert a vulnerability. The Debian Project is famous for requiring significant effort to reach the inner circle. And in this case, the mistake itself was not completely damning—a cascade of failures made the vulnerability possible and contributed to its severity.

But the vulnerability did happen in a transparent setting. Everything that was done was done in public. And yet the vulnerability still got introduced and wasn’t noticed for a long time. That’s in part because all the transparency made for a lot of noise, so the people to whom the vulnerability would have been obvious weren’t paying attention. But it’s also because the vulnerability was subtle and the system wasn’t designed to make the impact of the change obvious to a casual observer.

Does that mean that software transparency doesn’t help? I don’t think so—lots of people agree that transparent software is more secure than non-transparent software. But that doesn’t mean failures can’t still happen or that we should be less vigilant just because lots of other people can see what’s going on.

At the very least, transparency lets us look back, years later, and figure out what caused the bug—in this case, engineering error and not deliberate sabotage.

Comments

It’s not necessary to rely on some perceived difficulty to enter an (arguably nonexistent) Debian inner circle to verify that Kurt is a real person. Compare the list of everyone who signed his 2012 gpg key http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x2064C53641C25E5D with the list of everyone who signed his old key in the mid-2000s http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x41DC1C907244970B
Anyone on both lists has met him at least twice in person over a span of years. (We have to assume the typical passport check has no value in this kind of case.) Many of the people who have signed his key are themselves very well known, and/or can be vouched for by well-known people. For example, Bdale Garbee is a retired HP VP. And Dan Gillmor (gnu) is related to someone who signed Kurt’s key.

So, transparency didn’t help catch or prevent this vulnerability, but it does allow us to investigate it in quite some detail after the fact.

I know I don’t tend to get responses when I reply to a post; but am I understanding this correctly?

A coder named Kurt working on an open source project (Debian) begins using a debugging tool (Valgrind) to identify bugs in memory and threading usage in the open source project. That sounds plausible at the start.

The debugger issues warnings for a couple of lines of code that are reading parts of memory that the program really should have no reason to read (such as “memory which is or might be controlled by other processes”), and rather than questioning in the transparency process why it was reading memory that way, he just commented out the lines? I can see a coder commenting the lines out to verify that they are where the warning stems from, and then to determine the purpose of those lines of code. Further, I can totally understand why such a debugger would give a warning for such code based on the description here.

But the implausibility starts at the transparency portion; surely someone questioned what those lines of code were for? Coder error no doubt, and perhaps unintentional; but if open source is good for anything, isn’t it for the teamwork? Does any given coder have complete control without feedback from other coders? Or was it simply that everyone was so annoyed by the debugger warnings that no one asked why those lines were there in the first place?

And yet I read in what you linked the communications which show the transparency process I would expect, where the purpose of the lines of code was indeed discussed; and yet Kurt himself was still “thinking of commenting it out” even after identifying what it was for (and apparently ended up doing just that). Kurt may be trustworthy enough to gain entry to some inner circle of the project (infiltration is very much a part of any covert operation anyway). Now I am not accusing Kurt of being covert; just that trust is only as good as one’s understanding of the individual.

[spoiler alert]
The Harry Potter novels come to mind: all through the books, from the first to the last, the reader is left wondering if Professor Snape is a good guy or bad guy… You only find out at the very end that he was a bad guy turned good by love, and yet had to be the ultimate evil to undo the evil and wrongdoing against his love. In other words he was both good and bad, both at once; and perfect at “infiltration” in both the good and bad camps, earning the complete trust of the highest positions of each.
[end spoiler]

Now again, not accusing Kurt of anything here; just that I don’t trust him any more than I trust just some random guy on the street, even if he happened to be fully trusted by my mother and I fully trust my mother… But the trust of my trusted is still not necessarily trustworthy.

Surely if you know that you need these lines of code to gather a good random number of bits; and you comment them out, one would obviously know that the PRNG is going to be decreased in its efficacy (one may not know by how much, but one sure should look into it). This tells me the transparency and discussion didn’t do anything to stop the errant coder who knew what he was doing; for whatever reason he did it.

“What it’s doing is adding uninitialised numbers to the pool to create random numbers.
I’ve been thinking about commenting those out.
I’ve been told that using VALGRIND_MAKE_READABLE can be used to suppress those errors. So I’ve been pondering about building the library with that. I haven’t tried that this works yet though.
Martin, what do you think about this?
Kurt”

But then Ed already discussed the reasons that can be… “Transparency makes holes detectable but it doesn’t guarantee that they will be detected.” Nor does it mean that even when a backdoor is being introduced anyone in the discussion will realize it is a backdoor; it could be disguised as a fix to a bug. Yet the coder knew exactly what he was doing by his own words; he knew he was creating a shortfall in the randomness of the seed pool. This leaves me to strongly question his decision. Perhaps (the benefit of the doubt) the decision was made out of frustration at failing to find any other method that would not pop up warnings from the debugger; but he still knew what he did.

“At the very least, transparency lets us look back, years later, and figure out what caused the bug”
Sure, I agree.

“in this case, engineering error and not deliberate sabotage.”
And where exactly have you found evidence for this?

Let me restate this part of your post:
1. a guy has made all efforts to appear as “a real and trustworthy person” and has made “significant effort to reach the inner circle”
2. he introduced a clear security flaw with the obscure pretext of taking away a valgrind warning (if one uses valgrind it’s hard to believe he knows nothing about software – especially if one has done “significant effort” and maintains a critical security package – so it’s very unclear who would benefit from his “fix” and why was it even useful)
3. Given 1 and 2, you “figure out what caused the bug” and conclude it’s “not deliberate sabotage”

I was wondering the same thing. Though I think that being made Secretary of the Debian Project afterwards could be a sign that he works for one of the shadowy organisations and is being pushed further up the Debian chain of influence.

Back in 2003/2004 or so, I was working on embedding the OpenSSH/OpenSSL code, and ran valgrind on it, and ran into the exact same valgrind warning. I investigated it a bit and figured out what those functions were doing, so I knew not to comment them out. The change the Debian developer made was a mistake, pure and simple.

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.