We store user email addresses in our database, like many other websites. While we do take pride in the security measures in place, sometimes "just enough" is just not enough.

We've begun looking into a solution which would let us store the email addresses in an encrypted format, and retrieve them in a readable format, but - and here's the catch - outside the source of our PHP code. That is, should someone ever breach our servers and retrieve our source code, our database passwords and everything to do with our business intelligence, we would like him to still have utterly useless data without the encryption algorithm.

What would be the best way to go about this, and are there any ready-made solutions? We were thinking of something along the lines of a PHP or Apache module, or even a little Go program: anything compiled which would execute very fast and would be unusable if stolen because of its compiled nature.

The decrypt function would do the actual decryption, but outside PHP. It could call a system command and execute a binary file, whatever. But whatever decrypt returns is the actual usable email.

This decrypted email can then be used as the recipient of a system-sent email, or as displayed contact information on a user's profile, etc, but if the user nicks the DB somehow, all he sees is the above gibberish.

Edit:
tdammers asked why this has to be outside PHP but callable from within PHP.
Because we use the actual values of the email addresses a couple thousand times per minute, different ones at that. So our web app needs fast access to the readable values at all times, but we need to be sure that should someone grab our source code somehow or the database itself, he won't have any usable values. This can be anyone from a hacker, to a part of the code becoming exposed due to a critical bug, or a disgruntled ex-employee 5 minutes after getting fired leaving us no time to revoke his access.

Sorry, I meant to write we store "email addresses", not "emails". Thanks for the heads up about the other sites. Should I wait for a moderator to move the question, or should I just re-post there?
– Swader, Dec 7 '11 at 0:06

Just because something is compiled doesn't mean it can't be decompiled and examined. Compiled DOES NOT equal secure.
– pipTheGeek, Dec 11 '11 at 9:50

I know, but it does equal "more secure than non-compiled", does it not?
– Swader, Dec 11 '11 at 13:37

a disgruntled ex-employee 5 minutes after getting fired leaving us no time to revoke his access -- your algorithm is wrong: first revoke access, then make the conversion to ex-employee
– bstpierre, Dec 12 '11 at 20:58

6 Answers
Coming up with your own home brew algorithm for encrypting email is literally the worst mistake you could possibly make in cryptography. I am not being hyperbolic, it is in fact the worst mistake. The whole point of modern cryptography is that the algorithms are public so that they can be peer reviewed. I would never consider using a private algorithm for any reason. The key is what is secret, and this makes the system work.

In your case, the database can be compromised and the messages can remain secure if each client uses their own asymmetric key pair. The email infrastructure just passes along the ciphertext without any means of deciphering it. This is how Enigmail works. In fact I highly recommend using Enigmail; why reinvent the wheel?

Oh, sorry, I meant we store email addresses, not emails. I'll update the main question. We would like the addresses to remain secure, the contact info of our users. We don't store actual emails in the database, only the contact information.
– Swader, Dec 7 '11 at 0:05

The same idea applies. The idea would be then to generate a key pair using an administrator's password (ideally, one really good password which all admins know from outside the system) for any time a user's details need to be viewed. If you're looking to use that information within the system, say for sending out auto-generated emails, then well that's a bit (lot) more complicated to keep secure, as you have to store both sides of the key (or the password that made the key) for the system to decode the message.
– Mike S, Dec 9 '11 at 14:03

There are two aspects to keep in mind: on one hand you encrypt the data so that it is protected "on disk" and on the other hand you must secure the access to this data.

DBs like Oracle (in the enterprise edition) have some addons to provide transparent data encryption. In effect you have your data encrypted on disk so that an attacker that gets access to the DB files has no cleartext.

The next step is to control access to this data. You can limit access by user account or IP address, but if an attacker is able to control your application he can read your data. You could additionally impose query limits so that your application cannot read more than X records per hour. Further, you should use logging and perhaps alert on query threshold values.

An alternative to transparent encryption would be to implement a kind of web service that acts as your personal secure data store. Only this service should be allowed to handle the encryption. An application that needs an email address has to query the service in order to get the cleartext. You should place such a service on a separate machine.

First of all, absolutely, absolutely, do not try to invent your own algorithm. You gain nothing and you lose everything. Algorithms are easy to reverse-engineer, even in binary format. And by building your own crypto, you introduce the very real likelihood that your system can be cracked without any inside knowledge at all, because your crypto was so terrible.

Second, putting your secrets in binary form instead of script form is like writing down your password backwards so that the attacker won't be able to read it. If the attacker is at all determined (and clearly he is if he's bothering to go this far) then you've done nothing but inconvenience him a little. Really, this sort of extraction work is the "hello world" of the counter-security world.

Finally, what you're asking for is fundamentally problematic. You want a system that can automatically decipher your encrypted content, but you want it to be built such that if someone knows everything that the system knows, they still won't be able to do the same. That's not an entirely reasonable expectation.

But the situation isn't hopeless either; security is a tradeoff against convenience, and you can almost always gain more security by eliminating some amount of convenience and autonomy.

About the best security you're going to get is where all the crypto parameters are stored in a password-protected file which is never decoded to disk. Instead, every time you start up the system, you manually type in a password which is used to extract the operating parameters for the database directly into working memory. Only you have that password, only you can boot or reboot the servers, and no employee ever has to have access to that sensitive information. This way, if your entire system is stolen or copied by a disgruntled tech, he won't be able to extract your secrets or even boot up your system.
Also, you're always the one answering the pager at 3am. (sorry, convenience tradeoff)
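One way to picture the password-protected file: the data key is stored on disk only in encrypted form, and the passphrase typed at startup decrypts it straight into memory. The Go sketch below is an illustration under assumptions; the names `sealKey` and `openKey`, the blob layout, and the iteration count are all hypothetical. `deriveKey` is single-block PBKDF2-HMAC-SHA256 written out by hand purely to keep the example within the standard library; in line with the first point of this answer, a real system should use a vetted KDF implementation (for example from `golang.org/x/crypto`) instead.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

const (
	saltLen    = 16
	iterations = 100000 // assumed work factor; tune for your hardware
)

// deriveKey stretches the passphrase into a 32-byte key
// (PBKDF2-HMAC-SHA256, one output block). Illustration only.
func deriveKey(passphrase string, salt []byte) []byte {
	mac := hmac.New(sha256.New, []byte(passphrase))
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1
	u := mac.Sum(nil)
	out := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range out {
			out[j] ^= u[j]
		}
	}
	return out
}

func gcmFor(passphrase string, salt []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(deriveKey(passphrase, salt))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealKey returns the blob that is safe to keep on disk:
// salt || nonce || AES-GCM ciphertext of the data key.
func sealKey(passphrase string, dataKey []byte) ([]byte, error) {
	salt := make([]byte, saltLen)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	gcm, err := gcmFor(passphrase, salt)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	blob := append(salt, nonce...)
	return gcm.Seal(blob, nonce, dataKey, nil), nil
}

// openKey recovers the data key into memory; nothing is written to disk.
func openKey(passphrase string, blob []byte) ([]byte, error) {
	if len(blob) < saltLen {
		return nil, errors.New("blob too short")
	}
	salt, rest := blob[:saltLen], blob[saltLen:]
	gcm, err := gcmFor(passphrase, salt)
	if err != nil {
		return nil, err
	}
	if len(rest) < gcm.NonceSize() {
		return nil, errors.New("blob too short")
	}
	nonce, ct := rest[:gcm.NonceSize()], rest[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		panic(err)
	}
	// The passphrase is hard-coded here only for the demo; at boot it would be typed in.
	blob, err := sealKey("correct horse battery staple", dataKey)
	if err != nil {
		panic(err)
	}
	inMemory, err := openKey("correct horse battery staple", blob)
	fmt.Println("key recovered:", err == nil && len(inMemory) == 32)
	_, err = openKey("wrong passphrase", blob)
	fmt.Println("wrong passphrase rejected:", err != nil)
}
```

Whoever copies the disk gets only the blob; without the passphrase it cannot be opened, which is exactly why someone has to be present to type it at every restart.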

Alternately, you can tie your encryption to a hardware token. Then the owner of the token owns the data. Again, no risk of copying, but also if someone steals that device, they've stolen your database.

hm. Since we have a cluster server setup in the US, and we're in Europe, and our cluster is managed remotely by a server admin from yet another country, this sort of setup makes it doubly complex. Add to the issue the fact that we now have a sysop level guy in our IT team who has SOME rebooting privileges (just for when something goes horribly awry and the original server admin isn't available instantly for some reason), the whole infrastructure seems to be working against us... FMI, though, how would one implement something like your solution above? The file that is never decoded to disk?
– Swader, Dec 12 '11 at 8:51

Swader: see also security.stackexchange.com/questions/9411/… -- When using PHP you could use some volatile storage for storing your password only in RAM (e.g. memcache). But take measures to protect this service.
– mdo, Dec 12 '11 at 15:01

Could TrueCrypt or a similar approach be a solution? Locate the MySQL database on such an encrypted volume; that will require no other changes (save for mounting the encrypted volume on startup and dismounting it after the MySQL daemon stops).

If an intruder gets access to a running server that is exchanging data with, say, a web application, they could still intercept the sensitive data, as I see it. However, if the server is shut down and stolen, TrueCrypt will be good enough protection, provided a good enough passphrase has been chosen.

No, that won't work. Our emails list is frequently accessed (couple thousand times per minute on average) and updated, and our servers are in the USA while we're based in Europe, there's no fear of physical thievery. What we want is the ability to drag every email requested through a separate encryption/decryption algorithm. More details in the OP
– Swader, Dec 8 '11 at 11:53

Encrypt the email addresses using a standard, well-vetted algorithm. (So, you are storing encrypted email addresses in the database.)

Store the decryption key in a file on your server. (Not in the PHP source code. The PHP source code can open this file, read it, store the decryption key in memory, and use it from there.)

In this way, a developer or hacker who makes off with a copy of your source code cannot decrypt the email addresses. Moreover, the decryption key will not be stored in the source code or in the source control repository.

Of course, a hacker who breaks into your server and steals the contents of the database and the contents of the file containing the decryption key can still decrypt the email addresses. But this is unavoidable. In the threat model where the attacker can steal the entire contents of your database, and has full access to the system running the web application, you are hosed. If the web application can decrypt, then so can the hacker. Ain't nothing you can do about that; it's just a fact of life.
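The approach above (a standard algorithm, with the key in a file outside the source tree) might look roughly like this. It is a sketch in Go rather than PHP, using AES-256-GCM from the standard library; the function names, the key file name `email.key`, and the base64 storage format are assumptions made for the example. The demo creates a throwaway key file in a temp directory; in practice the file would be provisioned once, with tight permissions, outside the web root and the repository.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// loadKey reads a raw 32-byte AES-256 key from a file kept outside the source tree.
func loadKey(path string) ([]byte, error) {
	key, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	if len(key) != 32 {
		return nil, errors.New("key file must contain exactly 32 bytes")
	}
	return key, nil
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptEmail returns base64(nonce || ciphertext), suitable for a text column.
func encryptEmail(key []byte, email string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(email), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// decryptEmail reverses encryptEmail and fails if the data was tampered with
// or the wrong key is used.
func decryptEmail(key []byte, stored string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(stored)
	if err != nil {
		return "", err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ct := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func check(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	// Demo only: generate a throwaway key file in a temp directory.
	dir, err := os.MkdirTemp("", "keydemo")
	check(err)
	defer os.RemoveAll(dir)
	keyPath := filepath.Join(dir, "email.key")
	fresh := make([]byte, 32)
	_, err = rand.Read(fresh)
	check(err)
	check(os.WriteFile(keyPath, fresh, 0o600))

	key, err := loadKey(keyPath)
	check(err)
	stored, err := encryptEmail(key, "user@example.com")
	check(err)
	fmt.Println("stored in DB:", stored)
	email, err := decryptEmail(key, stored)
	check(err)
	fmt.Println("decrypted:", email)
}
```

Because a fresh random nonce is used each time, the same address encrypts to a different value on every call, so the encrypted column cannot be searched or indexed by address without additional design.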

Right, that's not too shabby, yeah. I'll see what I can do in regards to that - since we access the DB extremely often to retrieve the emails, we might run into some literal speedbumps if we relied on PHP calling, parsing and using the file contents as a decryption key. Then again... if PHP would have to call an external script anyway, there might not be much difference. I'll definitely take a look at this, thanks.
– Swader, Dec 11 '11 at 13:39

@Swader, your PHP code only needs to open and read the keyfile once after boot (not on every HTTP request), and then it can cache the key in memory, so I would expect the performance impact to be minor.
– D.W., Dec 11 '11 at 19:00

I know I am three years late to the party, but I'd like to point out something I missed in the answers, for those finding this via Google.

Most "leaks" from hacked websites are the result of SQL injection, and far fewer come from a rooted server where the hacker has full access to the raw database files. SQL injection is an attack technique where the hacker can manipulate the queries performed by the database, and have the normal website code render the pages.

That last part is crucial: your PHP code will render the pages as usual. This means that your email address decryption code will execute normally and help the hacker decrypt the information. All the hacker has to do is manipulate which record is shown (e.g. WHERE id=123--).
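To make the `WHERE id=123--` example concrete, here is a small Go sketch that only builds query strings, so it runs without a database. The table and column names (`users`, `email`, `owner_id`) and both function names are made up for the illustration. The safe variant validates the parameter and returns a placeholder query with its bound argument, which is the form you would hand to a call such as `db.QueryRow(query, args...)` in `database/sql`.

```go
package main

import (
	"fmt"
	"strconv"
)

// unsafeQuery pastes the raw request parameter into the SQL text.
func unsafeQuery(idParam string) string {
	return "SELECT email FROM users WHERE id=" + idParam + " AND owner_id=42"
}

// safeQuery accepts only an integer id and keeps it out of the SQL text.
func safeQuery(idParam string) (string, []interface{}, error) {
	id, err := strconv.Atoi(idParam)
	if err != nil {
		return "", nil, fmt.Errorf("rejecting non-numeric id %q", idParam)
	}
	return "SELECT email FROM users WHERE id=? AND owner_id=42", []interface{}{id}, nil
}

func main() {
	attack := "123--"

	// The trailing "--" turns the ownership check into an SQL comment.
	fmt.Println(unsafeQuery(attack))

	if _, _, err := safeQuery(attack); err != nil {
		fmt.Println("safe variant:", err)
	}

	query, args, err := safeQuery("123")
	if err != nil {
		panic(err)
	}
	fmt.Println(query, args)
}
```

In the unsafe version the application would go on to decrypt and display whichever record the attacker selected, which is the point being made here: encryption at rest does not help once the query itself is under the attacker's control.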

Should the hacker have full access to your files, then he most probably also has access to the PHP code and key in memory.