AEAD Encryption Concepts in Standard SQL

This topic explains the concepts behind AEAD encryption in BigQuery.
For a description of the different AEAD encryption functions that
BigQuery supports, see
AEAD encryption functions.

Purpose of AEAD Encryption

BigQuery keeps your data safe by using
encryption at rest. BigQuery also provides support for customer-managed
encryption keys (CMEKs), which enables you to encrypt tables using specific
encryption keys. In some cases, however, you may want to encrypt individual
values within a table.

For example, you might want to keep data for all of your own customers in a
common table and encrypt each customer's data using a different key. Or you
might have data spread across multiple tables that you want to be able to
"crypto-delete". Crypto-deletion, or crypto-shredding, is the process of
deleting an encryption key to render unreadable any data encrypted using that
key.

AEAD encryption functions allow you to create keysets that contain keys for
encryption and decryption, use these keys to encrypt and decrypt individual
values in a table, and rotate keys within a keyset.

Keysets

A keyset is a collection of cryptographic keys, one of which is the primary
cryptographic key and the rest of which, if any, are secondary cryptographic
keys. Each key encodes an
algorithm for encryption or decryption; whether the key
is enabled, disabled, or destroyed; and, for non-destroyed keys, the key bytes
themselves. The primary cryptographic key determines how to encrypt input
plaintext. The primary cryptographic key can never be in a disabled state.
Secondary cryptographic keys are only for decryption and can be either in an
enabled or disabled state. A keyset can be used to decrypt any data that it was
used to encrypt.
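As a rough illustration of the structure described above, the following Python sketch models a keyset as a dictionary. The field names (`primaryKeyId`, `key`, `keyId`, `status`, `keyMaterial`) are loosely patterned on Tink's JSON keyset layout, and the helper `check_keyset` and the placeholder key material are hypothetical. None of this is output from BigQuery.

```python
# Hypothetical model of a keyset; field names are assumptions for illustration.
KEYSET = {
    "primaryKeyId": 569259624,
    "key": [
        {"keyId": 569259624, "status": "ENABLED", "keyMaterial": "<placeholder>"},
        {"keyId": 852264701, "status": "DISABLED", "keyMaterial": "<placeholder>"},
        # A destroyed key no longer carries key bytes.
        {"keyId": 237910588, "status": "DESTROYED", "keyMaterial": None},
    ],
}


def check_keyset(keyset):
    """Return a list of rule violations for the keyset (empty if none)."""
    problems = []
    by_id = {k["keyId"]: k for k in keyset["key"]}
    primary = by_id.get(keyset["primaryKeyId"])
    if primary is None:
        problems.append("primary key ID not found in keyset")
    elif primary["status"] != "ENABLED":
        # The primary key can never be disabled (or destroyed).
        problems.append("primary key must be enabled")
    for k in keyset["key"]:
        if k["status"] == "DESTROYED" and k["keyMaterial"] is not None:
            problems.append(f"destroyed key {k['keyId']} still has key bytes")
    return problems
```

Calling `check_keyset(KEYSET)` on this model returns an empty list, because the primary key is enabled and the destroyed key holds no key bytes.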

For example, consider a keyset in which the primary cryptographic key has an ID
of 569259624 and is the first key listed in the keyset's JSON representation.
There are two secondary cryptographic keys, one with ID 852264701 in a disabled
state, and another with ID 237910588 in a destroyed state. When an AEAD
encryption function uses this keyset for encryption, the resulting ciphertext
encodes the primary cryptographic key's ID of 569259624.

When an AEAD function uses this keyset for decryption, the function chooses the
appropriate key for decryption based on the key ID encoded in the ciphertext.
With this keyset, attempting to decrypt ciphertext that encodes either key ID
852264701 or key ID 237910588 would result in an error, because key ID
852264701 is disabled and key ID 237910588 is destroyed. Restoring key ID
852264701 to an enabled state would render it usable for decryption.
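The key selection described above can be sketched in Python as follows. This is a simplified model, not BigQuery's implementation: it assumes a five-byte ciphertext prefix made of a version byte followed by a four-byte big-endian key ID, and the names `KEY_STATUS`, `key_id_from_ciphertext`, and `select_key` are hypothetical.

```python
import struct

# Key statuses from the example keyset discussed above.
KEY_STATUS = {
    569259624: "ENABLED",
    852264701: "DISABLED",
    237910588: "DESTROYED",
}


def key_id_from_ciphertext(ciphertext: bytes) -> int:
    """Read the key ID from an assumed prefix: version byte + big-endian uint32."""
    if len(ciphertext) < 5:
        raise ValueError("ciphertext too short to contain a key ID prefix")
    _version, key_id = struct.unpack(">BI", ciphertext[:5])
    return key_id


def select_key(ciphertext: bytes, key_status: dict) -> int:
    """Return the ID of the key to decrypt with, or raise if it is unusable."""
    key_id = key_id_from_ciphertext(ciphertext)
    status = key_status.get(key_id)
    if status is None:
        raise KeyError(f"key {key_id} is not in the keyset")
    if status != "ENABLED":
        raise ValueError(f"key {key_id} is {status} and cannot decrypt")
    return key_id
```

In this model, a ciphertext whose prefix encodes 569259624 selects the primary key, while one that encodes 852264701 or 237910588 raises an error until the key is enabled again.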

Encrypting plaintext more than once using the same keyset will generally return
different ciphertext values due to different
initialization vectors (IVs), which are chosen using the
pseudo-random number generator provided by OpenSSL.

Note: If you pass a keyset as plaintext in the query text or as a query
parameter, the query text and query parameters may be logged, and the
plaintext keyset with them.

Advanced Encryption Standard (AES)

AEAD encryption functions use
Advanced Encryption Standard (AES) encryption.
AES encryption takes plaintext as input, along with a cryptographic key, and
returns an encrypted sequence of bytes as output. This
sequence of bytes can later be decrypted using the same key as was used to
encrypt it. AES uses a block size of 16 bytes, meaning that the plaintext is
treated as a sequence of 16-byte blocks. The ciphertext will contain a
Tink-specific prefix indicating the key used to perform the encryption. AES
encryption supports multiple block cipher modes.

Block cipher modes

Two block cipher modes supported by AEAD encryption functions are GCM and CBC.

GCM

Galois/Counter Mode (GCM)
is a mode for AES encryption. The function numbers blocks sequentially, and then
combines this block number with an initialization vector (IV). An initialization
vector is a random or pseudo-random value that forms the basis of the
randomization of the plaintext data. Next, the function encrypts the combined
block number and IV using AES. The function then performs a bitwise
logical exclusive or (XOR) operation on the result of the encryption and the
plaintext to produce the ciphertext. GCM uses a cryptographic key of
128 or 256 bits in length.
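The counter-mode steps above can be sketched in Python. This is a teaching sketch only: `block_encrypt` is a hash-based stand-in for AES (it is not AES and is not secure), the 12-byte IV with a 4-byte block number is an assumed layout, numbering starts at 1 for simplicity, and the authentication tag that GCM also computes is omitted.

```python
import hashlib


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES on one 16-byte block (NOT AES; illustration only)."""
    return hashlib.sha256(key + block).digest()[:16]


def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR data with a keystream built from encrypted (IV, block number) pairs."""
    out = bytearray()
    for offset in range(0, len(data), 16):
        block_number = offset // 16 + 1
        # Combine the IV with the sequential block number.
        counter_block = iv + block_number.to_bytes(4, "big")
        # Encrypt the combined value, then XOR it with the plaintext block.
        keystream = block_encrypt(key, counter_block)
        chunk = data[offset:offset + 16]
        out.extend(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)
```

Because XOR is its own inverse, calling `ctr_xor` again with the same key and IV recovers the plaintext, and a different IV yields a different ciphertext for the same plaintext.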

CBC

CBC “chains” blocks by XORing each block of plaintext with the previous block
of ciphertext prior to encrypting it. CBC mode uses a cryptographic key of
either 128, 192, or 256 bits in length. CBC uses a 16-byte initialization
vector as the initial block and XORs this block with the first plaintext block.
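The chaining can be illustrated with a short Python sketch. The block cipher here is a toy four-round Feistel construction standing in for AES (an assumption made only so the example runs with the standard library; it is not secure), padding is omitted, and all function names are hypothetical.

```python
import hashlib

BLOCK = 16


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, r: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([r]) + half).digest()[:8]


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy invertible 16-byte block cipher (stand-in for AES)."""
    left, right = block[:8], block[8:]
    for r in range(4):
        left, right = right, _xor(left, _round(key, r, right))
    return left + right


def block_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for r in reversed(range(4)):
        left, right = _xor(right, _round(key, r, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK != 0:
        raise ValueError("plaintext must be a multiple of 16 bytes (padding omitted)")
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        # XOR with the previous ciphertext block (the IV for the first block).
        prev = block_encrypt(key, _xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(block_decrypt(key, block), prev)
        prev = block
    return out
```

Because each block depends on the previous ciphertext block, two identical plaintext blocks encrypt to different ciphertext blocks.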

Additional data

AEAD encryption functions support the use of an additional_data argument,
also known as associated data (AD) or additional authenticated data.
Unlike the keyset, this additional data does not enable decryption of the
resulting ciphertext by itself. This additional data ensures the authenticity
and integrity of the encrypted data, but not its secrecy.

For example, additional_data could be the output of
CAST(customer_id AS STRING) when encrypting data for a particular customer.
This ensures that when the data is decrypted, it was previously encrypted using
the expected customer_id. The same additional_data value is required for
decryption. For more information, see
RFC 5116.
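One way to picture how additional data is authenticated without being encrypted is the Python sketch below. It uses HMAC-SHA256 as a stand-in for the authentication step (an assumption for illustration; GCM computes its tag differently), and `make_tag` and `verify` are hypothetical names.

```python
import hashlib
import hmac


def make_tag(key: bytes, ciphertext: bytes, additional_data: bytes) -> bytes:
    """Compute a tag that covers both the ciphertext and the additional data."""
    # Length-prefix the additional data so the (AD, ciphertext) pair is unambiguous.
    message = len(additional_data).to_bytes(8, "big") + additional_data + ciphertext
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, ciphertext: bytes, additional_data: bytes, tag: bytes) -> bool:
    """Return True only if the tag matches this ciphertext and additional data."""
    expected = make_tag(key, ciphertext, additional_data)
    return hmac.compare_digest(expected, tag)
```

A tag produced with one customer ID as additional data fails verification when a different customer ID is supplied, which mirrors why decryption requires the same additional_data value.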

Decryption

The output of AEAD.ENCRYPT is
ciphertext BYTES. The
AEAD.DECRYPT_STRING or
AEAD.DECRYPT_BYTES functions can decrypt this
ciphertext. These functions must use a keyset that
contains the key that was used for encryption. That key must be in an
'ENABLED' state. They must also use the same additional_data as was used in
encryption.

When the keyset is used for decryption, the appropriate key is chosen for
decryption based on the key ID encoded in the ciphertext.

The output of AEAD.DECRYPT_STRING is a plaintext
STRING, whereas the output of AEAD.DECRYPT_BYTES is
plaintext BYTES. AEAD.DECRYPT_STRING can decrypt
ciphertext that encodes a STRING value;
AEAD.DECRYPT_BYTES can decrypt ciphertext that encodes a
BYTES value. Using one of these functions to
decrypt a ciphertext that encodes the wrong data type, such as using
AEAD.DECRYPT_STRING to decrypt ciphertext that encodes a
BYTES value, causes undefined behavior and may
result in an error.

Key Rotation

The primary purpose of rotating encryption keys is to reduce the amount of
data encrypted with any particular key, so that a potentially compromised key
would allow an attacker access to less data.

Keyset rotation involves:

Creating a new primary cryptographic key within every keyset.

Decrypting and re-encrypting all encrypted data.

The KEYS.ROTATE_KEYSET
function performs the first step, by adding a new primary cryptographic key to a
keyset and changing the old primary cryptographic key to a secondary
cryptographic key.
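The effect of the first rotation step on a keyset can be modeled in Python as follows. This is a sketch of the concept rather than of KEYS.ROTATE_KEYSET itself: the dictionary layout, the field names, the random 32-bit key IDs, and the helper `rotate_keyset` are all assumptions for illustration.

```python
import copy
import secrets


def rotate_keyset(keyset: dict) -> dict:
    """Return a copy of keyset with a fresh primary key.

    The old primary key stays in the keyset as an enabled secondary key,
    so data encrypted before the rotation can still be decrypted.
    """
    rotated = copy.deepcopy(keyset)
    existing = {k["keyId"] for k in rotated["key"]}
    new_id = secrets.randbits(32)
    while new_id == 0 or new_id in existing:
        new_id = secrets.randbits(32)
    rotated["key"].append(
        {"keyId": new_id, "status": "ENABLED", "keyMaterial": secrets.token_bytes(32)}
    )
    # Only the primary key is used for new encryptions.
    rotated["primaryKeyId"] = new_id
    return rotated
```

After rotation, new values are encrypted with the new primary key. The second step, decrypting and re-encrypting existing data, is still needed before the old key can be retired.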