Free space and used space look exactly the same to someone who only sees one version of the ciphertext.

First, the basic idea of a secure block cipher is that you learn nothing about the plaintext block simply by observing the ciphertext block. You may be able to learn something about the plaintext from the surrounding context, such as by collecting more ciphertext blocks, but nothing about the ciphertext block itself yields information about the plaintext block.

TrueCrypt uses block ciphers in XTS mode. Popular modern block ciphers are, for a fixed key, just a deterministic mapping from a 16-byte input to a 16-byte output, but XTS mode lets a normal block cipher take an additional "tweak" input that affects the encryption. The tweak encodes the position of the block within the volume, so each block's tweak is unique. Since the tweak affects the final ciphertext, a plaintext $P$ encrypts differently at position $i$ (tweaked by value $i$) than at position $j$ (tweaked by value $j$). So unused space encrypts to a different ciphertext for every block, even if every plaintext block has the same value, such as 0. No pattern of unused space emerges in the ciphertext since, for all practical purposes, you are encrypting a different plaintext at every single spot. XTS also supports random access, as each ciphertext block stands alone. (However, for performance reasons, ciphertext is probably always read and decrypted in full sectors. True random access is not very efficient at the disk block level.)
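To make the effect of the tweak concrete, here is a minimal sketch. It is not XTS-AES: the block cipher is a toy 4-round Feistel network built from HMAC-SHA256 (a stand-in for AES), and the mask is derived by simply encrypting the block position under a second key, whereas real XTS uses the sector number plus a GF(2^128) multiplication per block. All names and keys are made up for illustration.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per cipher block, the same size as an AES block


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    # Feistel round function: HMAC-SHA256 truncated to half a block.
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation standing in for AES. Not for real use."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def tweak_mask(tweak_key: bytes, position: int) -> bytes:
    # Assumption: the mask is just the encrypted block position.
    return toy_encrypt(tweak_key, position.to_bytes(BLOCK, "little"))


def encrypt_at(data_key: bytes, tweak_key: bytes, position: int, plaintext: bytes) -> bytes:
    # XEX-style tweaking: mask the input, encrypt, mask the output.
    mask = tweak_mask(tweak_key, position)
    return _xor(toy_encrypt(data_key, _xor(plaintext, mask)), mask)


def decrypt_at(data_key: bytes, tweak_key: bytes, position: int, ciphertext: bytes) -> bytes:
    mask = tweak_mask(tweak_key, position)
    return _xor(toy_decrypt(data_key, _xor(ciphertext, mask)), mask)


data_key, tweak_key = b"hypothetical data key", b"hypothetical tweak key"
zero = bytes(BLOCK)  # an "unused" all-zero plaintext block

# Eight zero blocks without a tweak versus with a position-dependent tweak.
untweaked = [toy_encrypt(data_key, zero) for _ in range(8)]
tweaked = [encrypt_at(data_key, tweak_key, i, zero) for i in range(8)]

print(len(set(untweaked)), "distinct ciphertext block(s) without a tweak")
print(len(set(tweaked)), "distinct ciphertext block(s) with a tweak")

# Random access: block 5 decrypts on its own, given only its position.
recovered = decrypt_at(data_key, tweak_key, 5, tweaked[5])
```

Without the tweak, identical plaintext blocks collapse to a single repeated ciphertext block, which is exactly the pattern that would give unused space away. With the tweak, every position yields a different ciphertext.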

XTS isn't unique in this regard, but it is the most popular mode of operation in this sort of context right now. Other modes of operation, like CBC, can provide similar security. In CBC mode, the plaintext is divided into sectors (usually 512 bytes to 4K in size, about the size of a disk sector), and each sector is encrypted using a unique IV as input to the chaining process that you are familiar with. The chaining ensures that each plaintext block is encrypted differently in different contexts, and the unique IV starts each chaining sequence differently. CBC doesn't have true random access: on reads you must also read the previous ciphertext block in the sector for use in decryption, and on writes you must re-encrypt everything in the sector from the modified block onward. But since sectors are small, it's "close enough" to random access.
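The per-sector CBC scheme can be sketched roughly as follows. The block cipher is again a toy Feistel network standing in for AES, the sector is shrunk to 64 bytes to keep the output small, and `sector_iv` is a hypothetical IV derivation (a keyed hash of the sector number), not what any particular product does.

```python
import hashlib
import hmac

BLOCK = 16   # bytes per cipher block
SECTOR = 64  # toy sector size; real sectors are 512 bytes to 4K


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation standing in for AES. Not for real use."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _round(key, rnd, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _round(key, rnd, left)), left
    return left + right


def sector_iv(iv_key: bytes, sector_number: int) -> bytes:
    # Hypothetical derivation: unique per sector, unpredictable without iv_key.
    msg = sector_number.to_bytes(8, "little")
    return hmac.new(iv_key, msg, hashlib.sha256).digest()[:BLOCK]


def cbc_encrypt_sector(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # Each block is XORed with the previous ciphertext block before encryption.
        prev = toy_encrypt(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_read_block(key: bytes, iv: bytes, ciphertext: bytes, index: int) -> bytes:
    """Decrypt one block using only that ciphertext block and the one before it."""
    current = ciphertext[index * BLOCK:(index + 1) * BLOCK]
    prev = iv if index == 0 else ciphertext[(index - 1) * BLOCK:index * BLOCK]
    return _xor(toy_decrypt(key, current), prev)


data_key, iv_key = b"hypothetical data key", b"hypothetical iv key"
plaintext = bytes(SECTOR)  # an all-zero sector, e.g. unused space

iv0, iv1 = sector_iv(iv_key, 0), sector_iv(iv_key, 1)
ct0 = cbc_encrypt_sector(data_key, iv0, plaintext)
ct1 = cbc_encrypt_sector(data_key, iv1, plaintext)

print("sectors encrypt differently:", ct0 != ct1)
print("block 2 read on its own:", cbc_read_block(data_key, iv0, ct0, 2) == plaintext[32:48])
```

Note that `cbc_read_block` only reads the previous ciphertext block; it never decrypts it. A write has no such shortcut, since changing one block changes every ciphertext block after it in the sector.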

(That is all a gross simplification, designed primarily to give an intuitive notion as to how used and unused space is treated.)

Ciphertext corruption is a minimal concern. In XTS mode nothing is chained, so only the corrupted blocks become unusable. In chaining modes, generally only one other block is affected. For example, in CBC any given ciphertext block feeds into at most one other block's encryption, so a damaged ciphertext block only breaks its own plaintext block and the plaintext block that follows it in the chain. (Stare at the CBC Wikipedia diagram until that makes sense.)

That describes the cryptographic solution, but there is a semi-complementary ad-hoc solution as well. It takes advantage of a detail of the filesystem. All blocks of the plaintext are encrypted, regardless of whether or not the filesystem stores any data in that block, and the filesystem doesn't care about the contents of unused blocks. So if each unused block has random, uncorrelated plaintext in it, then their ciphertexts will be uncorrelated. The attacker wouldn't know if those blocks had legitimate plaintext in them or just random unused data. This method should not be required for security, but it does add a nice buffer layer of security to mitigate a potential weakness in the algorithm or failure in the implementation.

TrueCrypt uses that technique too. When you create a volume, it gives you the option of performing a "full" format at the start. This full format writes random data to every single sector of the encrypted volume. As you use the volume, the filesystem simply overwrites the random data in whichever sectors it needs and leaves the rest untouched.

However, things change when an attacker can take multiple looks at the volume at different points in time. If they can take a snapshot of the volume ciphertext, then later come back and take another snapshot, they could compare the two and might be able to estimate what changed and what space is used or unused, depending on the activity on the volume in between. But that becomes highly dependent on what the attacker knows about the structure of the plaintext, how much of it changes, and so on, and it is not relevant to an attacker who only gets to see the ciphertext at one point in time.
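The comparison itself requires no cryptanalysis, only a sector-by-sector diff. In the sketch below, random bytes stand in for the volume ciphertext, and the 512-byte sector size and the helper name `changed_sectors` are assumptions for illustration.

```python
import os

SECTOR_SIZE = 512  # assumed sector size


def changed_sectors(snapshot_a: bytes, snapshot_b: bytes, sector_size: int = SECTOR_SIZE) -> list:
    """Return the indices of sectors whose ciphertext differs between two snapshots."""
    if len(snapshot_a) != len(snapshot_b):
        raise ValueError("snapshots must cover the same volume size")
    return [
        offset // sector_size
        for offset in range(0, len(snapshot_a), sector_size)
        if snapshot_a[offset:offset + sector_size] != snapshot_b[offset:offset + sector_size]
    ]


# First snapshot: an 8-sector volume that looks uniformly random.
before = os.urandom(8 * SECTOR_SIZE)

# Between snapshots the owner writes to sectors 2 and 5, so their
# ciphertext is replaced by new random-looking ciphertext.
after = bytearray(before)
for sector in (2, 5):
    start = sector * SECTOR_SIZE
    after[start:start + SECTOR_SIZE] = os.urandom(SECTOR_SIZE)
after = bytes(after)

print("sectors that changed:", changed_sectors(before, after))
```

Either snapshot on its own is indistinguishable from random data. Only the difference between the two reveals which sectors are in active use.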

"However, things change when an attacker can take multiple looks at the volume at different points in time." That's why the TrueCrypt authors discourage backing up a volume by copying the container.
– CodesInChaos, May 8 '12 at 8:48

Actually, for CBC decryption, it is enough to decrypt one block before the one you want to access, not the whole sector. (For write access, you would need to re-encrypt the whole rest of the sector after the changed block, though.)
– Paŭlo Ebermann♦, May 8 '12 at 19:32

@Paŭlo Ebermann: Thanks, good catch. And we don't even need to decrypt the previous block, we just need to read it. I've edited the post to correct it.
– B-Con, May 9 '12 at 0:17