Block Tech

The recent revelations about NSA spying have sparked renewed interest in data encryption. Encrypting at the file level is quick and easy, but if you're looking for an extra dose of protection, try encrypting the whole block device.

The two principal options for encrypting data are hardware based and software based. You can also use both options in combination, but that can be a little overkill – although, in the current climate, perhaps not.

Hardware-based encryption solutions require specialized hardware (see the box "Self-Encrypting Drive"). Software-based approaches, on the other hand, have three options for encrypting your data on a Linux system: (1) encrypting a single file, (2) encrypting a directory (with or without a virtual disk) or filesystem, and (3) encrypting a physical block device.

Self-Encrypting Drive

One popular option for hardware-based encryption is a Self-Encrypting Drive (SED). The concept of an SED is simple: Take an ordinary drive, add an encryption/decryption processor to it, add authentication to the firmware, and you have an SED. This approach has several benefits:

The encryption is always on, so it will affect data at rest (i.e., stored on the drive).

Authentication is independent of the operating system (OS).

There are no encryption keys to manage (vendors use standard interfaces such as the BIOS or a software-based component that happens before the OS boots).

The encryption keys never have to leave the drive.

An SED relative to a non-SED shows no loss in performance.

Typically, the keys are 128- or 256-bit Advanced Encryption Standard (AES) keys, which are generally considered to be fairly strong encryption algorithms.

Managing SEDs can be a little more difficult than non-SEDs, because when the system boots, you need to authenticate so that the drives can then be used. In the case of a large distributed system, this can be a little cumbersome if the systems restart or boot regularly. Some vendors will keep the keys on an out-of-band device so that the drives can contact the device for them, but then you have to ensure the keys on the device are encrypted as well.

SEDs have some vulnerabilities, but most involve having physical access to the drives.

Encrypting files is fairly straightforward, and several tools are available for doing so, such as bcrypt, NCrypt, and 7-Zip, which can compress and encrypt files using 256-bit AES. The most popular tool is probably GnuPG, which comes with just about every Linux distribution.

Encryption at the filesystem level usually occurs through a stacked filesystem, such as EncFS or eCryptfs, which provides an encryption layer "stacked" above a lower level conventional filesystem.

Encrypting at the file or filesystem level provides maximum flexibility; however, encrypting an entire block device also has benefits. Block encryption essentially rolls all the data into a single encrypted entity. After the encryption is configured and in place, you use the block device as you would any unencrypted block device, including building a filesystem on it. Because the block device is encrypted, the data in the filesystem will be encrypted as well.

One of the biggest advantages of block encryption is that it covers all data on the device, so you only have to configure it once for all users or all data. Block encryption also means that not just file data is encrypted; the file names, directories, free space, and metadata are all encrypted as well.

In this article, I examine some tools for implementing the powerful privacy technique known as block encryption in Linux. However, if you decide to try these tools, just remember that if the key or passphrase is forgotten, all of the data on the block device will be lost.

DMCrypt

The device mapper (dm) is one of the most powerful tools in Linux. It is a framework for mapping one block device onto another. DMCrypt [1] is part of the dm infrastructure, and it uses the cryptographic routines that are included in the kernel's Crypto API [2]. DMCrypt is a dm target that can be stacked on top of other dm transformations. A DMCrypt target appears as a block device, giving you lots of flexibility within Linux. For example, you can implement DMCrypt under LVM or on top of LVM. This flexibility allows you to encrypt entire disks, partitions, software RAID volumes (dm can also do software RAID), and even files.

Using DMCrypt, you can use dmsetup to configure the mapping. An example is demonstrated in the "Configuration with dmsetup tool" section on the DMCrypt Google Code page [1]. It's not easy to do by hand and requires some pretty good dm skills. Many people use something called the cryptsetup tool, which basically uses DMCrypt with LUKS (Linux Unified Key Setup) and a management tool, oddly called cryptsetup. LUKS [3] is the standard for Linux drive encryption. It provides a standard on-disk format that greatly eases key management on Linux. All necessary setup information is in the partition header, which means the data travels with the physical media.

Before using DMCrypt, realize that you are going to encrypt an entire block device. If you have anything on that block device, you will lose it. You've been warned. To test encrypting, I used an Intel X25-E SLC SSD (64GB) with a single partition, /dev/sdb1 (Listing 1).

For the configuration of DMCrypt, I highly recommend cryptsetup. The first step is to configure the LUKS partition (Listing 2). This command initializes the volume for you and also takes your passphrase. Articles on the web give tips for choosing passphrases. For example, the Wikipedia article [4] lists the following:

For better security, any easily memorable encoding at your own level can be applied

Not reused between sites, applications, and other different sources

Given some of the recent security issues, you should definitely think about a good passphrase and practice it before using it in production. Practicing is important because once you type the passphrase for the encrypted block device, you can't recover it. The only thing you can do is reinitialize the volume with a new passphrase, meaning you lose the data.

The next step is to create the mapping for the volume and look at the device mapper (Listing 3). Notice that when creating the mapping, you have to enter the passphrase (repetition, repetition, repetition). I chose to name the mapper data, but you can use anything you want. I don't recommend necessarily calling it encrypted, because that could be a tip-off to someone interested in your data.

The second command (ls -l /dev/mapper/data) shows the mapping, which in this case is to dm-0. The third command (cryptsetup -v status data) outputs the status of the mapped volume, including the cipher used, the key size, the device, the number of sectors, and the mode for the volume. This command is very useful during this configuration stage.

You can also "dump" the LUKS headers with the cryptsetup command (Listing 4), which gives you a bit more information than the cryptsetup status option.

Now you're ready to create the filesystem; however, you first have to initialize the volume (Listing 5) by filling it with zeros, which, because the volume is encrypted, will fill the volume with random data. Once this is done, you can build the filesystem. I'll use ext4 for this example (Listing 6), but it could be any supported filesystem.

After mounting the filesystem at /data (Listing 7), I'll create a subdirectory for a user, /data/laytonjb, chown it to user laytonjb, and create some files as that user (Listing 8). The encrypted filesystem behaves just like an unencrypted filesystem: You can create files, cat their content, and use symlinks.

If you try mount /dev/sdb1 /data, it will complain that it's an unknown filesystem type. Although this provides a little bit of security, it also tells the person that the filesystem is encrypted using DMCrypt, which provides more information than was known before.

TrueCrypt

TrueCrypt [5] is a very popular data encryption tool for both Linux and Windows that allows you to encrypt a block device, including partitions, and encrypts files on the fly so they are encrypted just before they are written to disk.

TrueCrypt definitely has some interesting features (see the box "TrueCrypt Features"), and the plausible deniability feature is nice if you are asked to decrypt your drive. To demonstrate, I'll set up an encrypted partition (drive) on the same drive used for DMCrypt (/dev/sdb1).

TrueCrypt Features

TrueCrypt supports a number of features, including:

Uses hardware-based encryption tools such as Intel's AES-NI features to accelerate encryption

Uses a number of encryption algorithms, as well as combinations of algorithms (cascaded together)

Runs in parallel (uses the cores on the server, not cores on other servers)

Pipelined write/read operations (better performance)

Has an open source license (according to the Wikipedia entry for TrueCrypt [6]) but hasn't been officially approved by the Open Source Initiative [7]; however, there is a comment that it might meet the criteria

Has plausible deniability – If you are forced to reveal your passphrase, TrueCrypt provides and supports two kinds of plausible deniability:

+ Hidden volumes and hidden operating systems

+ No real signature that the volume is encrypted (see the Wikipedia entry on this)

TrueCrypt is available in two versions: graphical and console. I tested both versions, and they are both very easy to use, but I prefer to use the console version in case of problems with X. Once the console version is downloaded and un-tarred, you see the following:

Notice in the last few lines that a few files are installed. The primary binary is usr/bin/truecrypt, which I'll use for the rest of the steps, the first of which is to create the mapping and volume (Listing 10).

A number of steps happened in this single command: selecting the encryption and hash algorithms, selecting the particular filesystem you will be encrypting (in this particular version I used ext4), inputting the passphrase (remember the passphrase guidelines), and creating the keyfile [8].

A keyfile is optional (I chose none), but if you want, you can enter a keyfile path. Regardless, the installation script will make you to enter 320 random characters, which takes much more time than you think.

Once the key has been entered, TrueCrypt then creates the filesystem, which you then have to mount using the truecrypt command. In Listing 11, I mount it on /data.

Notice that TrueCrypt handles the mapping and mounting steps in one command, as compared with DMCrypt, which needs two commands. This difference is not necessarily a big deal, but you do have to remember two commands instead of one.

As I did for DMCrypt, to illustrate that the encrypted filesystem works as expected, I create a subdirectory for a user, chown it to user laytonjb, and create some files as that user (Listing 12). Again, the filesystem allows the normal functions, and the data appears unencrypted. To unmount a TrueCrypt-encrypted filesystem, you have to use the truecrypt command again (Listing 13).

Parting Comments

Both DMCrypt and TrueCrypt make it easy to encrypt a block device. Any data written to it is automatically encrypted. If this block device contains a filesystem for something like /home that has a large number of users, then all data for all users using /home will be encrypted automatically. When the filesystem is mounted, if someone has access to the system, they can read and write to the devices. Alternatively, someone can copy the data from the encrypted to an unencrypted filesystem, which removes any data protection/encryption. However, if the drives are removed or the system is halted and then restarted, you have to know the passphrase to mount the device. This is the protection that block device encryption provides.

Both DMCrypt and TrueCrypt transparently encrypt data, but they do this only while the block devices are mounted. If someone has access to the system while the block devices are in use, they can potentially access the data – the encryption doesn't really provide any protection. If users can access data to which they have no rights, however, you have other security problems. When the system is powered off or the drives are removed from the system, the data is encrypted and protected.

Every time the filesystem is mounted, an administrator has to type in the passphrase on the system that owns the device. You also have to find an effective and secure way to share the passphrases with other administrators on the system. If you are using several devices as part of a filesystem, you will have to type in the passphrase for each device: Consider the case of a large HPC filesystem with 1,000 drives. Can you imagine typing in 1,000 passphrases? You might be tempted to use the same passphrase for each device, but that isn't a really secure approach. Again, you need to find some way of automating things in a secure manner.

Other administration tasks should work the same as on an unencrypted filesystem, but you have to be careful about a couple of things. In general, performing a filesystem check (fsck) or taking a snapshot should behave in the same way with no changes. The same commands will work just fine unless you need to unmount the filesystem or remount it as part of the process. Backups should also work fine, but remember that when the data is written to tape, it will be gibberish, indicating it's encrypted. You will have to do a data restore from a backup to the same device; otherwise, the data will be useless. The last fundamental thing you need to remember about block encryption is that unmounting the filesystem is a bit different from the usual technique of just entering umount – either involving more than one step or using a different command.

At this point, you might ask yourself which encryption methods you should use. Should you use SEDs, or an encrypted block device, or an encrypted directory, or perhaps encrypted files, or all of the above? The paranoid among us might be tempted to use all four methods, but that is a very large number of passphrases to remember, and if you forget one, you can't access your data.

This is one of the dangers of encryption: If you forget the passphrase, you can't recover the data. Even if you made a copy or a backup, you still can't recover the data because the copy or backup is encrypted, and you need the passphrase to decrypt it. The only way you can recover the data is to remember the passphrase. Even taking the disk to a data recovery service will only recover the encrypted data – you still have to remember the passphrase to decrypt the data.

Buy Linux Magazine

Related content

Modern installers offer the option of encryption with just a few clicks, but you might want to take control of the process. We show how to encrypt your partitions safely without sacrificing convenience.

When a computer is lost, your data falling into the wrong hands is often more serious than the loss of hardware. In this article, we explain how to use LUKS and ZFS to encrypt a system so you can keep your privacy when you lose your laptop.