MySQL 5.6.17 – now with better encryption

Joro wrote recently about MySQL 5.6.17‘s new support for AES-256 encryption, and it’s a great improvement for people need to encrypt their data at rest. The new session block_encryption_mode variable controls what variant of AES is used, and for backwards compatibility, the default behavior remains 128-bit key length with ECB block cipher mode. If you’re happy with that level of encryption, nothing changes – your existing code will work the same on 5.6.17 as it has on earlier versions (note that users of statement-based replication will experience new warnings). There are good examples of how to use the new functionality in the updated public test cases, but this blog post is intended to help orient users to the new options as well.

block_encryption_mode

The block_encryption_mode session variable defines which which cipher, key length and block cipher mode will be used in the encryption/decryption routine. The cipher in the current implementation is always “aes”, and the key length can be 128, 192 or 256. The block cipher mode is well-described in Joro’s post, and the documentation lists the possible values, which vary based on underlying SSL library used:

Content encrypted with a 128-bit key cannot be decrypted when the block_encryption_mode is set to an incompatible key length (or block cipher mode). The following demonstrates this, using the 256-bit key encrypted value in trying to decrypt while block_encryption_mode is set to use 128-bit keys, and vice-versa:

That’s important in that it implies you’ll need to be intentional about how data is encrypted in your system. You may want to convert existing AES128-encrypted data to AES256 before changing the block_encryption_mode.

Block cipher mode

Joro does a good job explaining block cipher mode options. All but ECB mode require an additional argument in AES_ENCRYPT() and AES_DECRYPT() – the initialization vector. Using AES_ENCRYPT() or AES_DECRYPT() with block_encryption_mode set to a block cipher other than ECB will produce an error if the IV is not provided:

Practically, this means you’ll likely want to add a column to tables to store the IV used for each row.

Converting existing data

If you have existing data encrypted using AES128, you may want to convert it to use AES256. That’s can be a little tricky, since you have to decrypt using aes-128-ecb and re-encrypt using a different block_encryption_mode (and, potentially, an IV). Because the block_encryption_mode control is set at the session level, you’ll need some tools to do an in-place conversion (and nobody wants to decrypt their sensitive data to a temporary staging area). The following SQL stored function makes this possible:

Using these functions to toggle the block_encryption_mode variable, we can decrypt data that was encrypted using some other block_encryption_mode setting, or encrypt it with something other than the current session variable state:

That error means that the block_encryption_mode session variable has been set to a mode requiring an IV as the third argument. See the section above under the “Block cipher mode” header, where this exact condition is shown.

Good question – you’re basically asking whether the IV should be considered secret material like the key. It serves a different purpose, and it’s not necessary to keep the IV secret. If the key is kept secret, knowing just the IV won’t help attackers decrypt the encrypted content. So what’s the value of the IV? Used properly, it establishes semantic security (quoting Wikipedia):

… knowledge of the ciphertext (and length) of some unknown message does not reveal any additional information on the message that can be feasibly extracted.

What does this look like in practice? Here’s an example – with the same (or no) IV, encrypted values can be compared to see if they are the same:

Using the same (or no) IV, I can compare ciphertext values and deduce which plaintexts are the same. This is the principle which drives rainbow tables. Although use of rainbow tables to find the plaintext equivalents of ciphertext is made impractical by use of an encryption key, when the comparing ciphertext reveals which row values are equal, information about the plaintext values is leaked.

Say you are encrypting web account passwords and using no IV, and I have access to the encrypted values. I observe that 25 unique ciphertexts are used by 30% of your web users. I have no reason to believe your web users are a unique population, so I Google “most common passwords” and get a nice list of the top 25 passwords for 2014. Yes, many of these stupid passwords would be quickly broken in any brute-force attempt, but the point remains: People with access to just the ciphertext (and IV) should not be able to infer anything about the corresponding plaintext. Using an IV helps prevent that. Here is an example where the same input as above produces two very different ciphertexts because the IV is random: