If your intermediate certificate's signing keys are on the internet facing web servers you're doing it wrong. That intermediate signing key should be treated with the same level of security you would treat a root key with.

So as far as I can tell, his rant is essentially that people should not use custom allocators and instead rely on the general purpose one built into libc because they can add system wide tools there.

I can see the argument for most cases, that is kinda the point of a general purpose allocator, but encryption (esp if you are doing lots of it) really strikes me as a case where you can really benefit from having explicit control over the behavior. I have worked on a number of applications where custom allocators had significant (user facing, not just benchmarks) impacts on performance. Ironically it also meant we were able to do better checking then the general exploit detection kits since we could bake more specific knowledge into the validator.

As much as Theo can be an utter and insufferable prick, on this score he's right. This was an insanely trivial error which has exposed who knows how many systems to potential breaches. Right now I'm starting up a full audit of our systems. We use OpenVPN for our interoffice WAN, as well as for clients; many of them Windows, iOS and Android clients, not to mention reviewing all our *nix clients running SSH daemons. We're only a relatively small operation, and it's still a monumental pain in the ass.

So, it's always great fun bashing "obvious" bugs, especially when they have such an impact, but let it be noted that thousands of implementers used openssl to build systems taking the package at face value despite these now "obvious" deficiencies in development process. If you were that concerned about security, they would have done what Google did, and audit the code. There are of course many practical reasons why people can't do that, but regardless, the blame arrow here points both ways.

People will keep writing bad code, this is unavoidable, but what automated tests could be run to make sure to avoid the worst of it?

Automated testing for security problems doesn't really work. Oh, you can do fuzzing, but that's hit and miss, and general unit testing can catch a few things, but not much. Mostly, security code just requires very careful implementation and code review. Eyeballs -- smart, experienced eyeballs.

OpenSSL has terrified me for years. The code is very hard to read and understand, which is exactly the opposite of what's desired for easy review and validation of its security properties. It needs to be cleaned up and made as simple, straightforward and accessible as possible, or replaced with something else that is simple, straightforward and accessible. My theory on why it is the way it is -- and it's far from unusual in the crypto library world -- is that cryptographers tend not to be good software engineers, and software engineers tend not to have the cryptography skills needed.

I spend some (not very much, lately) of my time working on an open source crypto library called Keyczar that tries to address one part of this problem, by providing well-engineered crypto APIs that are easy to use and hard to misuse. That effort focuses on applying good engineering to the boundary between library and application code, which is another source of rampant problems, but Keyczar uses existing crypto libs to provide the actual implementations of the primitives (the C++ implementation uses openssl, actually). I've long wished we could find crypto libs that were well-engineered, both internally and in their APIs.

My understanding is Theo said: Developers on a security product made a conscious decision to turn off last line of defence security for all platforms in the interest of performance on some systems. That does not sound like and unfortunate error to me, it sounds outright irresponsible.

He said that, but is that what happened? Were OpenSSL's developers aware that malloc()/free() have special security concerns that OpenBSD's developers had specifically addressed (I assume that's what meant by "a conscious decision to turn off last-line-of-defense-security")

I understand Theo's point, to a certain degree I kinda understand it, but I'm more inclined to feel the problem is with OpenSSL's developers clearly not understanding the security concerns about malloc(). That is, if they were aware that OpenBSD's malloc() contained code to ensure against data leakage, it would seem to me to be highly probable they would have implemented the same deal in OpenSSL given, you know, their entire point is security. The fact they didn't makes me think they didn't know OpenBSD's malloc() had these measures in the first place.

Should they have done? And how should they have known? Genuine question, and finger pointing would be inappropriate right now: how do we make sure that certain security strategies and issues are as well known as, say, stack pointer issues are today.

Theo's point isn't that OpenBSD users would have been safe. It's that had OpenSSL crashed on OpenBSD (or any OS with similar mitigation in place) it would have surfaced the bug much sooner...perhaps before a worldwide release. Once found it would have been fixed and merged upstream to benefit all users.

This is really a specific case of the larger point behind avoiding monoculture, whether OS or hardware. OpenBSD continues to support older architectures in part because it forces them to work through issues that benefit other platforms, especially the one most of us use daily: x86.

In addition, the mitigation countermeasures also prevent memory debuggers like Valgrind from finding the problem (Valgrind find use-before-init for malloc'ed blocks, but not if there is a wrapper in between that re-uses blocks), and may also neutralize code-security scanners like Fortify.

I have to admit that while my original intuition was "screwup", this looks more and more like some parts of the OpenSSL team have been compromised and did things that make this kind of bug far more likely. Like their own insecure memory allocation. Like not requiring time-of-use boundary checks or having any secure coding guidelines in the first place. Like documenting everything badly so potential reviewers get turned away. Like not having working review for patched or a working fuzz-testing set-up (which would have found bug this easily).

After all, the NSA does not have to sabotage FOSS crypto software. They just have to make sure the quality is low enough. The bugs they can exploit will follow. And the current mess is just a plain classic. Do enough things wrong and eventually stuff breaks spectacularly.

First, make sure that code that must be secure is transparent. That means little (or no) optimizations, standard calls to OS functions, and clearly structured. It's clear that the OpenSSL developers made their code more opaque than was prudent and the many eyes of open source land could not see through the murk. Yes, clearer code would mean that it ran more slowly and some folks would need to run a few more servers, but the security problem might have been uncovered sooner (or not have happened) if someone hadn't thought that performance was a reason to make the code more complex.

Second, formal independent review would have helped. Most code (especially in volunteer-based open source projects) is only vetted by people directly on the development team. Any piece of software as ubiquitous and critical to the operation of today's internet as OpenSSL cannot have verification and validation mainly by its own developers. For software like this, where security is critical, you should have external review. Start an independent project that vets these things, folks.

Third, understand the limits of testing vs. design. More unit tests would not have caught this. Simple designs lead to simple and correct implementations. Complex designs (or no designs) lead to seas of unit tests that simply tells you the ways that the code happens not to be broken at the moment. Code like that in OpenSSL ideally should be simple enough to be formally proved correct.

I think we've known about why these sorts of things happen ever since I entered he field thirty years ago. We have ways to prevent them, but they usually take time, money, or lowered performance. That they are still happening because of performance zealotry, bad process, and "teh web-speed is everything" mentality is a black mark on our profession.

it is a generally well regarded and vetted package that supports a fairly rich set of cryptography tasks out of the box.

I would see that as a drawback for using it in webservers: if I am writing something internet-facing I want to use the smallest and simplest possible library that does the job, maybe it would be time to fork openssl into openssl-core / openssl-extras and have openssl-core have only the most minimal set of functionality related to securing connections and that's it? I would honestly also only support a few platforms for -core to simplify the code analysis even more (the more ifdefs, the more possible issues)

A lot of large performance-sensitive projects implement custom allocators in the form of arenas and freelists. Lots of platforms have a fast malloc implementation these days, but none of them will be as fast as this for the simple reason that the program knows more about its memory usage patterns than any general-purpose allocator ever could.

Not to say I can't understand Theo's point of view -- if he wants maximum security, then a program which bypasses one of his layers in the name of performance might not be the best for him.

On the flip side, the standards have no notion of such security layers and I feel it is perfectly reasonable for a team to not throw away performance in the interests of some platform-specific behavior. This was a bug, pure and simple. There's nothing wrong with using custom allocators. To say that "OpenSSL is not developed by a responsible team" is simply nonsense.

A lot of large performance-sensitive projects implement custom allocators in the form of arenas and freelists. Lots of platforms have a fast malloc implementation these days, but none of them will be as fast as this for the simple reason that the program knows more about its memory usage patterns than any general-purpose allocator ever could.

This is security software. You don't sacrifice the library's core functionality to make it run a bit faster on the old Celeron 300 running Windows 98.

Well, what you are pointing out is that a CA is a single point of failure -- Something actual security conscious engineers avoid like the plague. What you may not realize is that collectively the entire CA system is compromised by ANY ONE of those single points of failure because any CA can create a cert for ANY domain without the domain owner's permission. See also: The Diginotar Debacle. [wikipedia.org]

The thing is, nobody actually checks the the cert chain, AND there's really no way to do so. How do I know if my email provider switched from Verisign to DigiCert? I don't, and there's no way to find out that's not susceptible to the same MITM attack.

So, let's take a step back for a second. Symmetric stream ciphers need a key. If you have a password as the key then you need to transmit that key back and forth without anyone knowing what it is. You have to transmit the secret, and that's where Public Key Crypto comes in, however it doesn't authenticate the identity of the endpoints, that's what the CA system is supposed to do. Don't you see? All this CA PKI system is just moving the problem of sharing a secret from being the password, to being which cert the endpoint is using -- That becomes the essential "secret" you need to know, and it's far less entropy than a passphrase!

At this time I would like to point out that if we ONLY used public key crypto between an client and server to establish a shared secret upon account creation, then we could use a minor tweak to the existing HTTP Auth Hashed Message Authentication Code (HMAC) proof of knowledge protocol (whereby one endpoint provides a nonce, then the nonce is HMAC'd with the passphrase and the unique-per-session resultant hash provides proof that the endpoints know the same secret without revealing it) to secure all the connections quite simply: Server and client exchange Nonces & available protocols for negotiation, the nonces are concatenated and HMAC'd with the shared secret stored at both ends, then fed to your key-stretching / key expansion system AND THAT KEYS THE SYMMETRIC STREAM CIPHER SIMULTANEOUSLY AT BOTH ENDS so the connection proceeds immediately with the efficient symmetric encryption without any PKI CA system required.

PKI doesn't really authenticate the endpoint, it just obfuscates the fact that it doesn't by going through the motions and pretending to do so. It's a security theater. SSL/TLS and PKI are essentially the Emperor's New Secure Clothes. At least with the shared secret model I mention above, there's just that one-time small window of PK crypto for secret exchange at worst (failing to intercept account creation means no MITM) and at best you would actually have the CHANCE to go exchange your secret key out of band -- Visit your bank in person and exchange the passphrase, etc. then NO MITM could intercept the data. HTTP Auth asks for the password in a native browser dialog BEFORE showing you any page to login (and it could remember the PW in a list, or even generate them via hashing the domain name with a master PW and some salt so you could have one password for the entire Internet). That's how ALL security should work, it ALL relies on a shared secret, so you want the MOST entropic keyspace not the least entropic selection (which CA did they use). If you're typing a password into a form field on a web page, it's ALREADY game over.

Do this: Check the root certs in your browser. For Firefox > Preferences > Advanced > Certificates > View. See that CNNIC one? What about the Hong Kong Post? Those are Known bad actors that your country is probably at cyber war with, and THEY ARE TRUSTED ROOTS IN YOUR FUCKING BROWSER?! Not to mention all the other Russian ones or Turkish, etc. ones that are on the USA's official "enemy" list. Now, ANY of those can pretend to be whatever domain's CA they want, and if your traffic bounces through their neck of the woods they can MITM you and you'll be n

In this case though, general unit testing should have caught the bug: There's an option at compile time which, if used, caused the affected versions of OpenSSL to crash. (Because it disables the bug, and OpenSSL was relying on it in one location...) So, good unit testing would have helped.

Basically, unit testing should be able to tell you if you've implemented the algorithm competently. It doesn't say if the algorithm is any good, just that your version of it works to the spec.

but if your explicit reason for overriding the kernel is to do something as-good-or-better than the kernel, and you fail to achieve that (let alone validate that you've achieved it), that's a rather spectacular failure. mistakes are common, sure. but this is sort of like saying the seatbelts that came with your car suck (they very well might), so you replace them with something "better", that ends up being little more than a piece of string that you daintily lay across your lap. kind of "unreasonable", no...?

Should they have done? And how should they have known? Genuine question, and finger pointing would be inappropriate right now: how do we make sure that certain security strategies and issues are as well known as, say, stack pointer issues are today.

Hell yes they should have known, because the people responsible for one of the most important security applications in the entire world damn well ought to be experts!

put the CSR on a USB stick, plug it in, and sign it from that machine. In this way you never have to worry about external threats potentially gaining access to your private key

Ever heard of Stuxnet? Connecting via sneakernet is much like connecting via a normal network: you need the machine itself to be secure. Sure, limiting connectivity limits attack vectors, and that helps. But it won't help against an attacker who understands your architecture, and targets you specifically.

Theo has been a strong proponent of hard hats/steel helmets, not tinfoil hats. Most people thought that approach was overkill for general walking around, at least up until Snowden showed there were lots of national actors firing lots of rocks into the air.