Around the same time, researchers discovered that Dropbox stores user files “unencrypted.” Dozens (hundreds?) closed their accounts in protest. They’re my confidential files, they cried, why couldn’t you at least encrypt them?

Many, including some quite tech-savvy folks, were quick to point out that it would have been so easy to encrypt the data. Not encrypting the data proved Sony and Dropbox’s incompetence, they said.

In my opinion, it’s not quite that simple.

Encryption is easy, it’s true. You can download code that implements military-grade encryption in any programming language in a matter of seconds. So why can’t companies just encrypt the data they host and protect us from hackers?

The core problem is that, to be consumable by human users, data has to be decrypted. So the decryption key has to live somewhere between the data-store and the user’s eyeballs. For security purposes, you’d like the decryption key to be very far from the data-store and very close to the user’s eyeballs. Heck, you’d like the decryption key to be *inside* the user’s brain. That’s not (yet) possible. And, in fact, in most cases, it isn’t even practical to have the key all that far from the data-store.

encryption relocates the problem

Sony needs to be able to charge your credit card, which requires your billing address. They probably need to do that whether or not you’re online, since you’re not likely to appreciate being involved in your monthly renewal, each and every month. So, even if they encrypt your credit card number and address, they also need to store the decryption key somewhere on their servers. And since they probably want to serve you an “update your account” page with address pre-filled, that decryption key has to be available to decrypt the data as soon as you click “update my account.” So, if Sony’s web servers need to be able to decrypt your data, and hackers break into Sony’s servers, there’s only so much protection encryption provides.

Meanwhile, Dropbox wants to give you access to your files everywhere. Maybe they could keep your files encrypted on their servers, with encryption keys stored only on your desktop machine? Yes… until you want to access your files over the Web using a friend’s computer. And what if you want to share a file with a friend while they’re not online? Somehow you have to send them the decryption key. Dropbox must now ask its users to manage the sharing of these decryption keys (good luck explaining that to them), or must hold on to the decryption key and manage who gets the key…. which means storing the decryption keys on their servers. If you walk down the usability path far enough – in fact not all that far – it becomes clear that Dropbox probably needs to store the decryption key not too far from the encrypted files themselves. Encryption can’t protect you once you actually mean to decrypt the data.

The features users need often dictate where the decryption key is stored. The more useful the product, the closer the decryption key has to be to the encrypted data. Don’t think of encryption as a magic shield that miraculously distinguishes between good and bad guys. Instead, think of encryption as a mechanism for shrinking the size of the secret (one small encryption key can secure gigabytes of data), thus allowing the easy relocation of the secret to another location. That’s still quite useful, but it’s not nearly as magical as many imply it to be.

what about Firefox Sync, Apple Time Machine, SpiderOak, Helios, etc.?

But but but, you might be thinking, there are systems that store encrypted data and don’t store the decryption key. Firefox Sync. Apple’s Time Machine backup system. The SpiderOak online backup system. Heck, even my own Helios Voting System encrypts user votes in the browser with no decryption keys stored anywhere except the trustees’ own machines.

It’s true, in some very specific cases, you can build systems where the decryption key is stored only on a user’s desktop machine. Sometimes, you can even build a system where the key is stored nowhere durably; instead it is derived on the fly from the user’s password, used to encrypt/decrypt, then forgotten.
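To make the “derived on the fly” idea concrete, here is a minimal sketch in Python. It assumes PBKDF2 from the standard library as the derivation function; the function name `derive_key`, the iteration count, and the sample passphrase are illustrative choices of mine, not what any of the systems named above actually use.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a password. Nothing is written to disk."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is not secret; it can be stored next to the encrypted data.
salt = os.urandom(16)

# The key exists only in memory, for as long as this variable is alive.
key = derive_key("correct horse battery staple", salt)

# ... use `key` to encrypt or decrypt, then drop the reference.
del key
```

The same password and salt always produce the same key, which is exactly why nothing needs to be stored, and also why a forgotten password means there is no key to recover.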

But all of these systems have significant usability downsides (yes, even my voting system). If you only have one machine connected to Firefox Sync, and you lose it, you cannot get your bookmarks and web history back. If you forget your Time Machine or SpiderOak password, and your main hard drive crashes, you cannot recover your data from backup. If you lose your Helios Voting decryption key, you cannot tally your election.

And when I say “you cannot get your data back,” I mean you would need a mathematical breakthrough of significant proportions to get your data back. It’s not happening. Your data is lost. Keep in mind: that’s the whole point of not storing the decryption key. It’s not a bug, it’s a feature.

and then there’s sharing

I alluded to this issue in the Dropbox description above: what happens when users want to share data with others? If the servers don’t have the decryption key, that means users have to pass the decryption key to one another. Maybe you’re thinking you can use public-key encryption, where each user has a keypair, publishes the public encryption key, and keeps secret the private decryption key? Now we’re back to “you can’t get your data back” if the user loses their private key.

And what about features like Facebook’s newsfeed, where servers process, massage, aggregate, and filter data for users before they even see it? If the server can’t decrypt the data, then how can it help you process the data before you see it?

To be clear: if your web site has social features, it’s very unlikely you can successfully push the decryption keys down to the user. You’re going to need to read the data on your servers. And if your servers need to read the data, then a hacker who breaks into the servers can read the data, too.

so the cryptographer is telling me that encryption is useless?

No, far from it. I’m only saying that encryption with end-user-controlled keys has far fewer applications than most people think. Those applications need to be well-scoped, and they have to be accompanied by big bad disclaimers about what happens when you lose your key.

That said, encryption as a means of partitioning power and access on the server-side remains a very powerful tool. If you have to store credit card numbers, it’s best if you build a subsystem whose entire role is to store credit-card numbers encrypted, and process transactions from other parts of your system. If your entire system is compromised, then you’re no better off than if you hadn’t taken those precautions. But, if only part of your system is compromised, encryption may well stop an attacker from gaining access to the most sensitive parts of the system.

You can take this encryption-as-access-control idea very far. An MIT team just published CryptDB, a modified relational database that uses interesting encryption techniques to strongly enforce access control. Note that, if you have the password to log into the database, this encryption isn’t going to hide the data from you: the decryption key is on the server. Still, it’s a very good defense-in-depth approach.

what about this fully homomorphic encryption thing?

OK, so I lied a little bit when I talked about pre-processing data. There is a kind of encryption, called homomorphic encryption, that lets you perform operations on data while it remains encrypted. The last few years have seen epic progress in this field, and it’s quite exciting…. for a cryptographer. These techniques remain extremely impractical for most use cases today, with an overhead factor in the trillions, both for storage and computation time. And, even when they do become more practical, the central decryption key problem remains: forcing users to manage decryption keys is, for the most part, a usability nightmare.
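To give a feel for “operating on data while it remains encrypted,” here is a toy sketch in Python. It uses textbook RSA, which happens to be homomorphic for multiplication only: multiplying two ciphertexts yields a ciphertext of the product. This is an assumption-laden illustration, not a fully homomorphic scheme, and with tiny primes and no padding it is completely insecure. The function names are mine, and `pow(e, -1, m)` needs Python 3.8 or later.

```python
# Toy parameters: tiny primes, no padding. Insecure by design.
p, q = 61, 53
n = p * q                          # public modulus
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (modular inverse of e)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


def multiply_encrypted(c1: int, c2: int) -> int:
    """The server's job: combine two ciphertexts without seeing the plaintexts."""
    return (c1 * c2) % n


a, b = 7, 6
c_product = multiply_encrypted(encrypt(a), encrypt(b))

# Only the key holder can read the result.
print(decrypt(c_product))  # 42
```

Note that the result is only correct while the true product stays below `n`, and that someone still has to hold `d`, which is the key management problem all over again.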

That said, I must admit: homomorphic encryption is actually almost like magic.

the special case of passwords

Passwords are special because, once stored, you never need to read them back out, you only need to check if a password typed by a user matches the one stored on the server. That’s very different than a credit-card number, which does need to be read after it’s stored so the card can be charged every month. So for passwords, we have special techniques. It’s not encryption, because encryption is reversible, and the whole point is that we’d like the system to strongly disallow extraction of user passwords from the data-store. The special tool is a one-way function, such as bcrypt. Take the password, process it using the one-way function, and store only the output. The one-way function is built to be difficult to reverse: you have to try a password to see if it matches. That’s pretty cool stuff, but really it only applies to passwords.

So, if you’re storing passwords, you should absolutely be passing them through a one-way function. You could say you’re “hashing” them, that’s close enough. In fact you probably want to say you’re salting and hashing them. But whatever you do, you’re not “encrypting” your passwords. That’s just silly.
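For the curious, salting and hashing can be sketched in a few lines of Python. I mentioned bcrypt above, but it isn’t in Python’s standard library, so this sketch swaps in scrypt (`hashlib.scrypt`), another deliberately slow one-way function. The function names and cost parameters are illustrative assumptions, not a recommendation tuned for your hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative scrypt cost parameters (assumed, not tuned).
_N, _R, _P = 2**14, 8, 1


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Store both; never store the password itself."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=_N, r=_R, p=_P, dklen=32
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-run the one-way function on the typed password and compare."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=_N, r=_R, p=_P, dklen=32
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("s3cret passphrase")
print(verify_password("s3cret passphrase", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))        # False
```

Notice there is no `decrypt_password` function, and there is no key anywhere: the only way to check a guess is to run it through the same function.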

encryption is not a magic bullet

For the most part, encryption isn’t magic. Encryption lets you manage secrets more securely, but if users are involved in the key management, that almost certainly comes at the expense of usability and features. Web services should strongly consider encryption where possible to more strictly manage their internal access controls. But think carefully before embarking on a design that forces users to manage their keys. In many cases, users simply don’t understand that losing the key means losing the data. As my colleague Umesh Shankar says, if you design a car lock so secure that locking yourself out means crushing the car and buying a new one, you’re probably doing it wrong.

14 thoughts on “encryption is (mostly) not magic”

This is only a problem because users don’t want to type in their password every time they deal with data. This problem *is* easy to solve, it’s just that nobody wants the 10 seconds of inconvenience to type in a password / pass phrase every time they want to view / save / download / whatever their data. Companies, such as Sony, are similarly lazy. The reality is that we already know how to solve these sorts of problems, but users are lazy and want a “better” solution. Thus, instead of securing data or implementing secure access to data, implementers ignore the problem entirely.

Still, I think that it’s unfair to blame people implementing security measures as having to take on too tough of a job or being too lazy. The problem is that users are lazy and whiny, and this rubs off on the bottom line of security. There’s no harm in further research to better solve these problems, but let’s not pretend like we couldn’t solve them *right now* if users weren’t so adamant about not wanting to type a pass phrase every time they need to do something with their data.

Couldn’t we ask the user for his/her password when logging in and use that as a key (or input to a hash function which will yield a key) that will be stored in memory for the duration of the session on the server?

Of course this would prevent users from using automatic logins (a less secure workaround could be implemented using cookies…), but it could be a cheap price to pay for data security.

Another factor to weigh is human error by IT professionals. Data is the life-blood of businesses, and if removing a single key could accidentally lock services out of their customer data, it could end that business. Bam!

System architects have to be careful about where to put this key, about disaster recovery, and about whether encryption is worth the risk, feature by feature.

Sure, this could be “solved *right now*” by training all the users to memorize a 32-character passphrase and entering it on every request involving data. Oh, and require that the passphrase change every day. But, do you honestly think any living human who is not interested in security tech as a hobby—or whose literal survival depends on it daily—would ever put up with that?

Remember who you are working for: The users. You call me lazy and whiny—either literally to my face or through the demands your software places upon me—and I switch to your competitor’s more usable (and personally useful) product.

This isn’t a binary choice, and the propensities of “lazy and whiny” users (eg. actual human beings into whose lives you want your software incorporated) must be accounted for in any workable compromise.

Interesting writing. Thank you!
I wrote a complementary blog post focusing on online payment: http://wp.me/pSetq-4C and one idea of how payment could be made more secure. Without encryption, but with banks’ cooperation!

Keep in mind that not every device out there has a keyboard to ease the typing of a passphrase…

Just imagine entering a full passphrase with an on-screen keyboard on a game console (for example) every time you want to access your data: it is just impractical (plus, ideally we’d want the password to not be displayed on screen, replaced by stars, right? Good luck catching typing mistakes…).

I remember entering a WPA-key on a Wii one day, the interface was so horrible it just took me something like five whole minutes…

There are also disabled users that may have (for example) typing problems, needing much more time to enter a simple passphrase than “10 seconds”.

Plus, if you want to have something really secure, you need to have different passphrases for different services, making it even harder to manage and less usable…

First, we should go the extra mile and realize there is no such thing as a prototypical “user” that drives design… We should afford for different use patterns for different user types. For instance, we regard most users currently as “protectorate customers”, and as such assume that these folks are in need of simplicity and convenience for successful engagement. What we lack currently from a systems design perspective is the “sovereign user”, as I know you are aware. Sovereign users are willing and want more personal accountability for the structure of the system design they are enabling and using. That desire should be a focus, throughout the design of our Society’s technical systems, not a 2012 after-thought. Its existence may be leveraged on a network where identification matters, to the benefit of all participants, customers or sovereigns.