This isn't really the end of the story. As far as your web app goes, HTTP cookies can be just as sensitive as your SSL keys, or more so, and they also slop around in your web server's memory. This is one reason why we run SSL/TLS in the first place, after all. In many cases we really use TLS as a way to ensure application-layer authentication. Confidentiality, in and of itself, is often not the primary concern. Do you care more about people accessing your Amazon account and buying things in your name, or about people seeing what you're buying? With your Amazon cookies, I can do the former.

So are we all going to jump back to pre-forked, multi-process Apache now, tack on a TLS slave daemon, and ignore gaping big holes in the application layer?

They are orthogonal issues. The point of separating private keys is to contain exposure. Heartbleed still would have happened, and all data could be exposed. But right now we not only have to deal with data leakage; even after we patch and fix the bug, we still have exposure because the private keys were potentially leaked. We then have to get new certs signed and incur all sorts of additional costs. If the private key was not leaked, then while we still have to deal with the security breach, we can at least avoid having to revoke and reissue all certs.

As long as forward secrecy is/was used then the impact on the individual user is more or less the same. Remember we're largely talking about active MITM.

In the short term your user is compromised whether it's a cookie, an AES key for the TLS session (which will presumably still have to be resident in the process sending you data), a credit card number in a POST request, or your certificate master key.

Anyone who can intercept my traffic in close to real time, and wishes to target me, is going to know I'm talking to amazon.com, IP x.y.z.f, and that that's where they should target their Heartbleed attack for a good stab at accessing my PHP session cookie or TLS session AES key.

There are some cases, like e-mail phishing, where this isn't the case of course... but then a redirection service would be sufficient to let me script an attack against many sites.

You're right, this doesn't solve 100% of the problem. If I could solve 100% I'd be creating a startup...

Cookies are remarkably sensitive, but they can be rotated far more easily. I can make sure that every cookie is rotated transparently every day or so and leave that running as a sensible background precaution. If we had infrastructure that let us renew our TLS keys every 24 hours or so, this wouldn't be such a big deal (it would still be a big deal, but not quite as bad as it is today). But TLS keys usually have expiries measured in years.
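That transparent daily rotation can be sketched with timestamped, HMAC-signed session cookies. This is a hand-rolled illustration (all names and the cookie format are made up, not any particular framework's API):

```python
import hashlib
import hmac
import os
import time

SECRET = os.urandom(32)          # server-side signing secret (hypothetical)
ROTATE_AFTER = 24 * 3600         # rotate cookies older than a day

def mint_cookie(session_id, now=None):
    """Return 'session_id|issued_at|mac' where the MAC binds both fields."""
    issued = int(now if now is not None else time.time())
    msg = f"{session_id}|{issued}".encode()
    mac = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{session_id}|{issued}|{mac}"

def check_and_rotate(cookie, now=None):
    """Validate a cookie; transparently reissue it if it has gone stale."""
    session_id, issued, mac = cookie.rsplit("|", 2)
    msg = f"{session_id}|{issued}".encode()
    expect = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expect):
        return None                          # forged or corrupted cookie
    now = now if now is not None else time.time()
    if now - int(issued) > ROTATE_AFTER:
        return mint_cookie(session_id, now)  # fresh cookie, same session
    return cookie                            # still fresh, keep as-is
```

Run at every request, this caps the useful lifetime of a leaked cookie at roughly a day, which is exactly the window argument being made for TLS keys.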

> If we had infrastructure that let us renew our TLS keys every 24 hours or so

The sad thing is... we do. 24 hours is a bit much, but why not have a different certificate for each server? The whole point of a certificate chain is to give us the flexibility to issue and revoke certificates from lower down in the tree... of course most of us serfs don't get the privilege of using our own intermediates.

Oh... and we're repeating some of the same mistakes in DNSSEC. Looking at deploying DNSSEC I kept reading that the general idea of the KSK was to function as a long-term key, and the ZSK as a short term key, but I have yet to see a method of managing things with the KSK offline that isn't like pulling teeth. The latest BIND requires that both the KSK and ZSK private keys be resident on your primary nameserver when you switch on the "auto-dnssec" magic.

The technique of not having the keys available to the process that's dealing in external bits works really well for DNSSEC. There's a program called OpenDNSSEC which takes care of keys, rotating them, and accessing them via PKCS#11. So you can use a hardware security module, or SoftHSM. Since it's OpenDNSSEC that's doing the key rotation, it can run as a different user than your DNS server, so the fact that SoftHSM runs as a shared library is less of an issue.

OpenDNSSEC unfortunately is a little... industrial strength. It takes some time and consideration to configure, unlike BIND's "gimme the keys and I'll just take care of it for you" approach.

Key management is a major issue across the board, not just for web servers. Even a theoretically unbreakable cipher will always have a weakness if the keys themselves are compromised. Stopping keys from being copied is a major challenge, though, because anything you can do to truly protect them involves major hassle.

Think of the problems credit card processors deal with: Hiding the keys themselves from their own employees, so that getting a root password is not enough to be able to just take all the credit card information. You don't want the key in any filesystem, and you don't want the key in an easy to retrieve memory location. You end up with servers that require multiple people to boot up, as the keys only really appear when multiple people provide their own piece of the secret.
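The "multiple people each provide their own piece" scheme can be sketched as an n-of-n XOR secret split: the key only ever materializes once every holder has contributed their share. This is a toy illustration of the idea (real deployments use an HSM or Shamir-style threshold sharing, not this):

```python
import os
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(key: bytes, holders: int) -> list:
    """n-of-n split: every single share is needed to rebuild the key."""
    shares = [os.urandom(len(key)) for _ in range(holders - 1)]
    # The final share is key XOR all the random shares, so XORing
    # everything back together cancels the randomness and yields the key.
    shares.append(reduce(xor_bytes, shares, key))
    return shares

def combine(shares: list) -> bytes:
    return reduce(xor_bytes, shares)
```

Any subset missing even one share sees only uniformly random bytes, which is why such a boot requires all the key holders to be present.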

Eventually, enough security leads to the risk of data loss, as an error can make the keys become unrecoverable.

This is why we have to add security-breach detection, and make recovering from a breach easy and low-consequence. Linus said that with enough eyeballs, all bugs are shallow. With enough attackers, all systems are insecure.

If I was running a bank, I'd hopefully use a proper HSM. You ask it to generate a private key, you then ask it for the public key, get it signed into a cert, and use that. The HSM promises to never give out the private key to anyone (including the administrator), usually in a tamper evident way (if someone did manage to extract the key, you'd notice). Even if you have root on a machine that has an HSM plugged into it, you can't get the private keys out.

However, my personal webserver isn't a bank. Not everyone can justify spending this much money on an HSM to get this level of assurance. What I'm proposing is a simpler solution that isn't robust against sophisticated attacks (e.g., when the attacker manages to get root), but is far more robust to some classes of the common attacks we see today (where the attacker can read any memory/file that the webserver has permissions to see).

Mac OS X has something similar to this "Software HSM": the Keychain. You can put private keys in your keychain, and apps can use them for signing or encrypting, but they can't extract them. It's quite nicely implemented; when an app tries to access a key the first time, a dialog will pop up saying something like "Mail is trying to use key xyz for decryption. Do you want to allow?".

Of course, this requires using Apple's APIs, which are poorly documented and a pain in the neck even compared to OpenSSL. It's also not suitable for servers.

That wouldn't help when there is a bug that lets an attacker read your server's memory; you'd still need to reissue your certificates as a preventative measure because you couldn't guarantee that the bit of memory used by the software HSM hadn't been compromised.

Of course. If you can run code as the user, then solutions intended to protect against arbitrary memory read bugs don't apply.

That doesn't mean that the solution is worthless. It simply means that it doesn't cover an unrelated class of bugs.

Migrating to hardware-based tokens, or Intel SGX-protected software tokens, would extend the solution to cover the case of arbitrary code execution. That doesn't eliminate the value of the software-only solution.

Perfect security doesn't exist; the goal is to reduce the area of the attack surface. The class of attacks you're talking about is different than the class of attacks an out-of-process keyring protects against.

This proposal is very similar to Plan 9's "factotum" scheme (see http://qedragon.livejournal.com/99938.html for a nice explanation with reference to Heartbleed; factotum is similar to a generic ssh-agent or gss-proxy), except proposing that the daemon run as a separate user, which is a reasonable extra layer of security that deals with some remote-code exploits.

Is everyone falling into the trap of over-securing last week's security problem? Isn't this just like banning water bottles on planes after a failed liquid bomb attack?

Be careful that in our haste to secure the private keys, we don't ignore easier attacks. The article seems to gloss over an attacker hacking the web server, when in fact that gives them such powers that going on to grab the private key might not even be attempted.

OpenSSL isn't last week's security problem: The code didn't magically get better in a week, and all signs indicate that there are likely more serious issues in the library.

Looking past OpenSSL, C didn't magically become a safe language in a week, either; this approach guards against a real problem in C that is not limited to a single bug in OpenSSL: over-reading off the end of a valid buffer.

How easy it is to hack the server itself really ranges from super simple to extremely hard. If your target is a home server exposed to the internet, admin'ed by someone who installed Apache by following online tutorials? Sure, you're better off just getting root access on the server. However, if your target is a server run by properly trained people who live and breathe security practices, your best bet is to use a subtle bug in their stack that escaped their notice, with an extra bonus if it leaves no audit trail (e.g., Heartbleed). I don't think this post is meant for an audience of the former group, but for those of the latter group.

I work at a pretty security conscious company (this might be an understatement, we're pretty big on security), and even as a developer on the inside I'd have to get pretty creative to get access to our production servers.

Yup. But when you have a successful attack you should consider what alternatives you have to make sure that never happens again. You might dismiss them since their cost:benefit might not be favourable. If this works, I doubt many people are going to deploy it by default, since the cost:benefit doesn't pay off for them. But it might pay off for some other people who are really pissed off right now.

Wouldn't it make sense to lower the exposure by having the server only have access to its own ephemeral private key?

So instead of having the key to the hard to change site certificate on many vulnerable front-line servers, it rolls up a key and on boot sends a certificate signing request to a hardened internal system?

This would be ideal. One of the problems with heartbleed has been that while you can revoke your cert and mint a new one, browsers don't check CRLs so they'll continue to trust the old compromised cert.

However, I don't think X.509 supports the concept of CA certs being limited to signing only subdomains (could be wrong), and you have a large industry that prefers the status quo of you having to pay them for each cert you mint.

This ends up with ridiculous things like tying payment to the lifetime of the certificate, which allows for things like "2 year certs", which are obviously less secure than 2×1 year certs.

But having your server roll its cert every 12 hours from a more secure cert elsewhere would be a very nice feature.

It would have to be time based, not boot based, unless you want to do key revocation for all the previous at-boot-time generated keys. But yeah, if you rotated keys once an hour or once a day, then if they got leaked the window for MITMing your customers would only be that long.

This is feasible in the current X.509 public CA system, thanks to name and path-length constraints. However, I don't know of any CAs which will issue suitably restricted certs for any sensible amount of money.

Probably because there's a small market for that? I've met technical people who don't even generate their own CSRs; they actually have their CA generate the private+public keypair so they don't have to figure it out.

Also, CAs make more money if they can issue each leaf cert themselves. Some CAs don't even allow you to get multiple private keys signed (only one active at a time) without paying more.

I think the change would simply require servers to always send a certificate chain (up to at least the cert's most-proximate global-issuer CA) instead of just a cert. Which is pretty much what every web-scale site does already, to short-circuit OCSP lookups on intermediate CAs.

DNSSEC needn't be involved; you aren't determining whether the CA owns the domain it's issuing certs for at runtime. Instead, the parent-CA who issued the CA's signing cert determined that when they issued the cert. As long as each certificate in the sent chain both 1. checks out as signed by its parent, and 2. has a subject hierarchically below its parent's subject, you can be sure each CA in the chain did whatever it considers diligence before issuing certs to its child-CAs.
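Those two checks can be sketched over a simplified chain, where a subject is just a domain string, "hierarchically below" means suffix matching, and the signature check (condition 1) is assumed to have already passed. The structure here is hypothetical, not real X.509 name-constraint parsing:

```python
def in_scope(subject: str, parent_subject: str) -> bool:
    """Is `subject` hierarchically below `parent_subject`?
    An empty parent subject stands in for an unconstrained root CA."""
    return (parent_subject == ""
            or subject == parent_subject
            or subject.endswith("." + parent_subject))

def chain_ok(chain) -> bool:
    """chain: list of dicts, root first, each {'subject': ..., 'issuer': ...}.
    Checks condition 2 from the comment above: every cert's subject must
    sit inside its issuer's scope (signatures assumed already verified)."""
    for parent, child in zip(chain, chain[1:]):
        if child["issuer"] != parent["subject"]:
            return False          # broken issuance link
        if not in_scope(child["subject"], parent["subject"]):
            return False          # child escapes its parent's scope
    return True
```

Under these rules an intermediate scoped to `example.com` can issue for `www.example.com` but a leaf for `evil.org` fails the walk, which is the delegation property being described.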

> (I'm not even sure you can load your openssh server key into ssh-agent, can you?)

Yes, actually, as of OpenSSH 6.3 you can. (I wrote most of the patch that added that feature.) However, even without doing that the OpenSSH server performs crypto operations in a separate process from the network-facing child process (unless you've disabled UsePrivilegeSeparation). The purpose of having the server talk to an ssh-agent was to allow keeping your host keys encrypted on-disk or loading them from a smart card.

It's also similar to an abstraction in Erlang. The crypto application is started, and all processes defer to the crypto application for operations. I don't know if the implementation is as secure as the article describes, but the abstraction is simple and straightforward.

This seems like a good idea but this fixation on PKCS#11 seems strange. Why use a whole API when Apache and Nginx can just add a simple daemon with their own internal API to do this?

The same amount of security can probably be obtained by just launching a process on server startup to do this with sufficient isolation from the parent process. I believe OpenSSH does something along these lines to run most of its code as an unprivileged user. It's probably even possible to do this seamlessly based on the existing SSL config directives in apache/nginx requiring no more intervention from the sysadmin than upgrading to a newer version.
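A minimal sketch of that fork-at-startup approach, with HMAC standing in for the real RSA private-key operation (all names here are made up): the child holds the key and answers signing requests over a pipe, so the network-facing parent never needs the key mapped in its own address space.

```python
import hashlib
import hmac
import os

def spawn_signer(key: bytes):
    """Fork a child that holds `key`; return a sign(data) -> digest function.
    The parent drops its reference to the key after forking."""
    req_r, req_w = os.pipe()
    resp_r, resp_w = os.pipe()
    pid = os.fork()
    if pid == 0:                               # child: the "software HSM"
        os.close(req_w)
        os.close(resp_r)
        while True:
            data = os.read(req_r, 4096)
            if not data:                       # parent closed the pipe
                os._exit(0)
            digest = hmac.new(key, data, hashlib.sha256).digest()
            os.write(resp_w, digest)
    # parent: keep only the request/response ends it needs
    os.close(req_r)
    os.close(resp_w)
    del key                                    # parent drops its reference

    def sign(data: bytes) -> bytes:
        os.write(req_w, data)
        return os.read(resp_r, 32)
    return sign
```

A Heartbleed-style over-read in the parent could then leak session data, but not the long-term key, which is exactly the containment argument from earlier in the thread. (A real version would also drop privileges in the child and use a framed protocol rather than single 4 KiB reads.)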

The major reason is that when your website becomes popular, and becomes more of a target, you can swap out the software HSM daemon for a more sophisticated hardware solution, if implemented properly, by just changing a pkcs11: URL[1] to point at the new HSM.

PKCS#11 has a few irritants, but it's a fairly sensible API, and it's already implemented by many things (browsers, gnome-keyring, ssh, ...). OpenSSL and GnuTLS at least both support it via one mechanism or another; my only real complaint from the webserver side is that the configuration knobs aren't really plumbed through.

That would be an argument for supporting both. I fear that otherwise what you will end up with is that a minority of security conscious people will have HSM (actual hardware or software) and most others will just configure their Apache/nginx software as quickly as possible and get on with it. Having the basic config be more secure by spawning the soft-hsm itself sounds like a win.

An external standard (de facto or not) is always better than an internal API when dealing with things like this. If I want to reuse that daemon for other things, I know what to expect and how to operate it independently.

I disagree it's always better. It's a tradeoff. Now to install apache securely you need to go and install another daemon under a different user and configure apache to use it. How many people will actually bother? If it's builtin and driven by apache itself you just need to upgrade apache. So if you want mass adoption of this solution in HTTP servers a builtin solution works best. If you want to be able to reuse this for your openvpn/xmpp/etc server a separate daemon is best.

The value, in theory, is that PKCS#11 already exists, is already supported by software and hardware vendors, and is already comprehensive enough and has seen sufficient review to cover the gamut of use cases that such a solution would need to support.

PKCS#11 is a little funny looking and has some small rough edges, but it's actually reasonably designed and easy to implement from scratch. That's not something I can say for many of the other PKCS standards.

It's apparently not supported by Apache/nginx nor does a suitable software-HSM exist to use it, so you're basically writing both ends of the communication. But if you do go with a separate daemon PKCS#11 may very well be a good solution. I just think forking off a process yourself is much cleaner for the use case of securing a web server.

If a server uses gnutls and passes the user-supplied filename directly to gnutls_certificate_set_x509_key_file2(), a PKCS#11 URL can be used directly without changes to the server.

> I just think forking off a process yourself is much cleaner for the use case of securing a web server.

It's something that everyone has to write for every server; people will get it wrong. Additionally, there's no support for hardware modules or plugging in new software security modules, so you'd be starting with a handicapped solution.

>Apache/nginx don't have to support pkcs11, they just need to support the use of existing crypto libraries that already support pkcs11

Fine, you get Apache->SSLLib->PKCS#11. Now you need to write a PKCS#11-compliant library to talk to your HSM, and a custom serialization protocol for that communication anyway.

>It's something that everyone has to write for every server; people will get it wrong. Additionally, there's no support for hardware modules or plugging in new software security modules, so you'd be starting with a handicapped solution.

If we're worried about http servers it's basically apache/nginx. As I mentioned in another comment if apache/nginx implement this directly most users will get it by default. If they implement it with a separately configured daemon only very security conscious people will do it. So if your objective is preventing the most dangerous bugs in the most exposed daemons (and HTTPS tends to be that) in the most number of cases doing this directly by default in those two servers seems like a better solution. That doesn't stop you from also doing the other option to support actual HSMs and other fancy 1% cases.

>HSM modules already have PKCS#11 drivers, because it's a standard, and that means they work readily with existing software and cover the requisite industry use-cases.

The OP isn't talking about using an actual HSM, but using a new software-based daemon to fill the HSM role, just so the crypto calculations (and the key) aren't in the webserver's address space. He confirms that he is indeed trying to write the PKCS#11 driver himself. Just using the existing crypto code in Apache/nginx but moving it into a separate process seems much cleaner to me, and has one feature this suggestion doesn't: it works with existing config files without modification, so it will be used much more widely. That's all I am saying.

The OP's proposal was for a daemon so it could be run as a different user. My suggestion was indeed to fork and figure out how to isolate the process (as forking may not be enough if you have permission to do things like ptrace processes running under the same user).

>Apache could ship a fall-back PKCS#11 driver implementation that did this, transparently.

What you are proposing then is something different. It's to make the only crypto code path the PKCS#11 one and then make the normal case the special case of that. So you are going Apache->Gnutls/OpenSSL->custom PKCS#11 driver->fork->Gnutls/OpenSSL(to actually do the crypto). Since apache already has working code for the first and last steps you could just do Apache->fork->Gnutls/OpenSSL and be done with it. Your suggestion adds more complexity but improves the support for other more exotic PKCS#11 providers.

After reading your responses and claims of complexity and architecture, I don't think you understand PKCS#11, the problem domain, or the architectural constraints to a level that is commensurate with your expressed level of certainty.

I say this as someone who works on PKCS#11 code: It's not really possible to have a productive conversation with someone that is missing key domain experience and knowledge, but is so certain of their correctness anyway.

More concretely, a forked daemon only needs to support RSA and other crypto operations without revealing their keying material. They don't need a full TLS/SSL stack.

That said, there's absolutely no additional complexity in having both Apache and the hypothetical daemon using a full TLS/SSL crypto library. Any __TEXT pages will be shared, and duplicated __DATA and base-line library allocations are essentially zero.

>After reading your responses and claims of complexity and architecture, I don't think you understand PKCS#11, the problem domain, or the architectural constraints to a level that is commensurate with your expressed level of certainty.

I'm happy to learn. But all I am saying is that you're adding a PKCS#11 step to the call stack when you can just fork and use the existing code. That's a simple assertion, is it wrong?

>More concretely, a forked daemon only needs to support RSA and other crypto operations without revealing their keying material. They don't need a full TLS/SSL stack.

I didn't say they did. I said that you had to implement some GnuTLS/OpenSSL code in apache to invoke the PKCS#11 operations, then implement your forking PKCS#11 driver that will then need to call GnuTLS/OpenSSL to do the crypto to return the PKCS#11 results.

>That said, there's absolutely no additional complexity in having both Apache and the hypothetical daemon using a full TLS/SSL crypto library. Any __TEXT pages will be shared, and duplicated __DATA and base-line library allocations are essentially zero.

The complexity I was referring to was in the code that you need to setup the call stack you are proposing. Obviously the gnutls/openssl .so will be shared.

I see where I've not explained myself properly. The existing code I'm referring to is the code that right now handles the TLS sessions in apache/nginx. That's the code I'm suggesting could be run from a forked process instead of in the main process. To need IPC to offload the RSA crypto you'd need to be doing Apache->TLS session code->fork->RSA operations. I'm saying you could do Apache->fork->TLS session code. Just run all your TLS sessions in a different process, using the normal non-PKCS#11 GnuTLS/OpenSSL code. Is that not feasible?

To do that Apache needs some form of internal IPC to communicate its TLS sessions to the forked process. Maybe that's more complex than forking and doing IPC at the PKCS#11 driver level? Don't know.

> To do that Apache needs some form of internal IPC to communicate its TLS sessions to the forked process. Maybe that's more complex than forking and doing IPC at the PKCS#11 driver level? Don't know.

Yes.

Also, bear in mind that you can't just fork and continue running in modern software.

> A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

You'd be forking at the start of the Apache launch before any connections so that shouldn't be much of an issue.

This construct has a much worse flaw. It separates TLS from the rest of Apache, so it protects against bugs in other parts of the server (HTTP parsing, for example), but it doesn't separate the TLS session code from the crypto primitives, so it wouldn't protect against Heartbleed. For that, forking at the PKCS#11 boundary would be much safer.

Thinking about it, a better OpenSSL patch than the Akamai one (protecting the key memory with a different allocator) would be to run the actual crypto in a different process with a well-defined IPC between it and the main library. That would give you much of the safety of a software HSM without any changes to Apache/nginx or any other TLS server.

Actually, thinking about it some more, forking at the PKCS#11 driver will not fix Heartbleed completely. It will stop the key being recovered, but will still allow you to recover passwords and cookies. To fix it completely you'd need forking at both ends, or just using Apache in forking mode instead of event mode.

There are several SoftHSMs, but they share the address space with your front-line daemon, which (IMHO) defeats the purpose.

While webservers' support for PKCS#11 is annoying, it's well supported by lots and lots of other stuff (usually client-side stuff like ssh, browsers, etc., though). You can get webservers to do PKCS#11 today; there are docs on how to do it. They usually start with "download the source, and run configure with this pile of options."

Isn't that just because PKCS#11 is an in-process API, so not really meant for calls over a socket? So wouldn't you need to actually write a PKCS#11-compliant library to plug into the server, a software HSM, and then some form of serialization protocol to talk between the two? Or is there a standard way to do PKCS#11 over a socket? A quick look at the spec made it look like a "here's how our structs are packed" kind of standard.

Yup, that pretty much sums it up. I'm currently trying to figure out if dbus could be that serialisation, since it takes care of a reasonable amount of the hard work for you. But I'm no expert on GObject, so slow going. (Also, I'm not sure that I'm the best person to be writing this... I don't really have that much security knowledge; I just spent a whole pile of time trying to figure out how to secure my (client) keys recently and wondered why we didn't do something sensible for server keys.)

In that case why not skip the middle man and just implement enough soft-HSM for whatever Apache/nginx needs with a simple serialization protocol just for that? Emulating all of PKCS#11 sounds like a chore for very little gain.

Having looked at PKCS#11, I'm not sure what bits you could get away with not implementing. It does have functions for things like "get random bytes", which I guess you might not want, but that's barely any code: (CK_RV C_GenerateRandom(CK_SESSION_HANDLE hSession, CK_BYTE_PTR pRandomData, CK_ULONG ulRandomLen) { return CKR_FUNCTION_NOT_SUPPORTED; }).

All the complexity in this proposal is the serialisation/deserialisation which is about the same amount of work if it's pkcs#11 or some custom thing.

Custom API:
Pro: Marginally simpler to implement.
Pro: If the webserver fork()'s it by default, then more users get the benefit for the case that you can read the webserver memory.
Con: Doesn't protect against attacks that can read files readable by the webserver.
Con: Becomes complicated when you want to move to a real HSM.
Con: Isn't reusable between webservers, let alone for your mail server, xmpp server, webbrowsers, ssh clients and so on.

Using PKCS#11:
Pro: Can start with a PKCS#11 SoftHSM running as a separate user today, and migrate to a hardware HSM with little change tomorrow.
Pro: Reusable across multiple webservers, already usable by browsers and ssh clients.
Pro: A well defined, maintained, open standard with a wide variety of implementations that already exist.
Con: Slightly more complex than a custom protocol, but I'd argue that the custom protocol would grow to cover at least what PKCS#11 supports. I'm currently investigating using dbus for the protocol, so serialisation/deserialisation is mostly taken care of.

How much protection does this really give? If you manage to hack the web server, then you can quickly feed the HSM/software daemon unlimited amounts of chosen plaintext to encrypt. Would this make it possible to recover the private keys?

https://www.lorier.net/docs/tpm are my notes from experimenting with the TPM in my T530. The trick is that the TPM will protect itself fairly aggressively, so before you start, turn off the laptop, unplug the power and battery (if possible), and on the FIRST boot after you put everything back together, go into the BIOS and clear the TPM.
If the menu option isn't there, then you probably have to power everything off :)

Your content often isn't owned by the webserver's user; it's often stored in a SQL or NoSQL database somewhere. Various access controls can be applied there. But you're right, unfortunately this isn't a 100% magic-pixie-dust solution to everything.

When you say "you can get new keys", which is true (although StartSSL appears to be the fly in this particular ointment), browsers don't validate CRLs, so the old keys are still just as valid as the new ones. Which makes getting new keys potentially worthless.

This is providing similar protections for your TLS keys to what your database server already applies.

I appreciate your contrarian position, but I don't think you've thought this out. The problem with "just getting new keys" is that there is no guarantee of detection of a key breach. So you might desperately need to get new keys, but not know it for months. Meanwhile, bad people have access to that all-important content.

Protecting content involves protecting keys. So to prioritize protecting content, you have to prioritize protecting keys.