File It

Every Linux computer needs a filesystem, and users often choose a filesystem by habit or by default. But if you're seeking stability, versatility, or a small performance advantage, it pays to take a closer look.

Most people would rather remember names than numbers. Computer filesystems evolved as a means for computers to interface with the idiosyncrasies of human memory. A filesystem deals with names, which are easier to recall than the underlying inode numbers the system uses to identify chunks of stored data.

Furthermore, a filesystem allows the user to attach special attributes to a file. For instance, a filesystem records the file owner, the access rights, and the time of the last modification – regardless of whether the storage medium is a network device, a hard disk, or a flash drive. A filesystem also hides all the physical properties and conditions of the media.
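To make the mapping from names to inode numbers and the per-file attributes concrete, the following minimal Python sketch creates a temporary file and reads back what the filesystem records about it. The helper name describe_file is made up for this illustration; os.stat and its st_* fields are part of the Python standard library.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return the metadata the filesystem keeps for a file name (illustrative helper)."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                         # the number behind the name
        "owner_uid": info.st_uid,                     # numeric ID of the file owner
        "permissions": stat.filemode(info.st_mode),   # access rights in ls -l style
        "modified": time.ctime(info.st_mtime),        # time of the last modification
        "size_bytes": info.st_size,
    }


if __name__ == "__main__":
    # Create a throwaway file so the example does not depend on existing paths.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"hello filesystem\n")
        path = handle.name
    try:
        for key, value in describe_file(path).items():
            print(f"{key}: {value}")
    finally:
        os.remove(path)
```

The same call works whether the path lives on a local disk, a flash drive, or a network mount, which is the abstraction at work: the application only ever supplies a name.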

Put more generally, a filesystem creates an abstraction layer, which allows all layers above (e.g., applications) to work with names while the layers below (e.g., device drivers) work with physical addresses like inodes or block/sector numbers.

Filesystems come in all shapes and designs. Beyond the general-purpose filesystems discussed in this article, you will find read-only filesystems like cramfs or squashfs, as well as shared filesystems (sometimes also called cluster filesystems), which are simultaneously mounted on multiple servers. A filesystem that spreads its data across multiple storage nodes is called a parallel filesystem. GFS, Lustre, Ceph, and GlusterFS are examples from these categories.

When you set up a new Linux computer or install a new storage disk in your system, you will be asked to choose a filesystem. This article compares some popular Linux filesystem options, including ext3, ext4, and Btrfs.

Ext3

The development of ext3 [1] started around 1998, when Stephen Tweedie published a paper called "Journaling the Linux ext2fs Filesystem." At first, ext3 was only an extension to the robust ext2 filesystem. The ext3 filesystem was merged into the mainline Linux kernel with version 2.4.15 in November 2001. The main advantage ext3 has over ext2 is journaling. Journaling eliminates the need for a filesystem check after an unclean shutdown (unexpected power failure, system crash), except for certain rare hardware failure cases (e.g., hard drive failures). Additionally, ext3 is often faster than ext2 because ext3's journaling optimizes hard drive head motion. Ext3's reliability benefits further from its relative simplicity and a large number of installations, and because ext3 evolved from ext2, it offers a smooth migration path for data on ext2 disks. Ext3 is one of the most common filesystems in Linux, and it is still the default for many Linux distributions.

Ext3 also has several disadvantages. The filesystem check can be extremely slow for large filesystems. Hard scalability limits of 2TB for file size and 16TB for the size of the filesystem make ext3 unsuitable for many contemporary data center scenarios. The number of subdirectories is limited to 32000. Ext3 allocates space block by block, unlike some modern filesystems that allocate space by extents (larger contiguous areas of storage). Also, ext3 cannot manage delayed allocation, which helps to reduce disk fragmentation.
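The 2TB and 16TB limits can be reproduced with simple arithmetic. The sketch below assumes the commonly cited explanation, which the text above does not spell out: block numbers are 32 bits wide, the block size is 4KB, and the per-file size counter is a 32-bit count of 512-byte sectors. The constant and function names are chosen for this illustration only.

```python
BLOCK_SIZE = 4096      # bytes per block (assumed 4KB block size)
SECTOR_SIZE = 512      # bytes per sector, the unit of the per-file counter (assumption)
ADDRESS_BITS = 32      # width of block numbers and of the sector counter (assumption)

TIB = 2 ** 40          # bytes in one binary terabyte


def max_filesystem_bytes(bits=ADDRESS_BITS, block_size=BLOCK_SIZE):
    """Largest filesystem addressable with `bits`-wide block numbers."""
    return (2 ** bits) * block_size


def max_file_bytes(bits=ADDRESS_BITS, sector_size=SECTOR_SIZE):
    """Largest file countable with a `bits`-wide sector counter."""
    return (2 ** bits) * sector_size


if __name__ == "__main__":
    print("max filesystem size:", max_filesystem_bytes() // TIB, "TB")
    print("max file size:      ", max_file_bytes() // TIB, "TB")
```

With smaller block sizes, the same formula yields proportionally smaller filesystem limits, which is why the figures quoted for ext3 usually assume 4KB blocks.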

Ext4

Further development of ext3 has led to another entry in the ext series known as ext4 [2]. Ext4 was first developed as a series of backward-compatible ext3 extensions. However, the kernel developers eventually started refusing additional ext3 extensions for stability reasons. Instead, they encouraged a fork, and development continued in a new branch under a new name: ext4. Theodore Ts'o, the ext3 maintainer, announced the launch of ext4 in 2006. A first stable version was merged into the Linux 2.6.28 kernel source in December 2008.

Ext4 replaces the traditional block mapping with extents (up to 128MB of contiguous space). Allocating space using extents improves large-file performance. If a file contains more than four extents, the additional extents are indexed in a tree structure. Ext4 also supports delayed allocation: The system postpones allocating blocks until the data is actually written to disk, which minimizes fragmentation because the allocation can be based on the actual file size.
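A rough calculation shows why extents help with large files. The following sketch compares how many mapping entries a file needs when every block is tracked individually versus when it is described by extents of up to 128MB. It assumes 4KB blocks and a perfectly contiguous file, so the extent count is a best case; the function names are invented for this example.

```python
BLOCK_SIZE = 4 * 1024             # bytes per block (assumed 4KB)
MAX_EXTENT = 128 * 1024 * 1024    # bytes per extent, the 128MB upper bound from the text


def ceil_div(numerator, denominator):
    """Integer division that rounds up."""
    return -(-numerator // denominator)


def block_map_entries(file_size):
    """Entries needed when each block is mapped individually (ext3 style)."""
    return ceil_div(file_size, BLOCK_SIZE)


def min_extents(file_size):
    """Fewest extents needed if the file is laid out contiguously (best case)."""
    return ceil_div(file_size, MAX_EXTENT)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    print("block entries for 1GB:", block_map_entries(one_gib))
    print("extents for 1GB:      ", min_extents(one_gib))
```

For a contiguous 1GB file, the per-block scheme tracks 262,144 entries, whereas eight extents suffice. A fragmented file needs more extents, which is one reason delayed allocation matters.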

Ext4 does not limit the number of subdirectories, and the ext4 journal uses checksums, which improves reliability. The filesystem check in ext4 is able to skip unallocated blocks, which improves performance. Ext4 is backward-compatible with both ext3 and ext2, which allows easy migration, but Ted Ts'o doesn't see a major step forward. In his eyes, the ext filesystem family is "1970s technology" – he would prefer Btrfs.

Btrfs

Btrfs [3] (pronounced "Butter FS") was originally written by Chris Mason (formerly of Oracle). Btrfs is a new copy-on-write (CoW) filesystem that includes many advanced features. The Btrfs filesystem is often regarded as a counterpart to Oracle's ZFS. Btrfs 1.0 was accepted into the mainline kernel in the summer of 2009. Since the summer of 2012, Btrfs has been included as a stable and supported component in some popular Linux distributions, such as SLES 11 and Oracle Enterprise Linux 5 and 6.

Btrfs has many useful features. You will find a built-in volume manager and RAID support for RAID 0/1/5/6/10 volumes. Btrfs also automatically creates checksums for data integrity and ensures, through its copy-on-write feature, that all or nothing is written to disk.

Btrfs can find errors and automatically fix them with redundant copies (a feature known as self-healing). Btrfs can make its own snapshots or clone volumes. It comes with its own compression and allows user space tools to use its very fast internal tree search algorithm. It is able to manage quotas (per subvolume) and out-of-band data deduplication (with userspace tools).
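The idea behind checksum-based self-healing can be illustrated with a toy model. The sketch below is not how Btrfs is implemented; it only assumes one data block stored as two redundant copies plus a checksum recorded at write time, and it uses the standard library's zlib.crc32 as a stand-in checksum. The class name MirroredBlock is hypothetical.

```python
import zlib


class MirroredBlock:
    """Toy model: one data block kept as two copies with a stored checksum."""

    def __init__(self, data):
        self.copies = [bytearray(data), bytearray(data)]
        self.checksum = zlib.crc32(data)   # recorded when the data is written

    def _is_valid(self, copy):
        return zlib.crc32(copy) == self.checksum

    def read(self):
        """Return verified data and overwrite any corrupt copy with a good one."""
        for candidate in self.copies:
            if self._is_valid(candidate):
                good = bytes(candidate)
                for index, other in enumerate(self.copies):
                    if not self._is_valid(other):
                        self.copies[index] = bytearray(good)   # self-healing step
                return good
        raise IOError("all copies failed checksum verification")


if __name__ == "__main__":
    block = MirroredBlock(b"important data")
    block.copies[0][0] ^= 0xFF             # simulate silent corruption of one copy
    print(block.read())                    # served from the intact copy
    print(bytes(block.copies[0]))          # the damaged copy has been repaired
```

Without a redundant copy, the checksum can still detect the damage, but there is nothing to repair it from; in this model, that case ends in an error.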

Btrfs is also SSD aware: The block discard feature supports wear leveling with TRIM (a way to inform an SSD device which blocks of data are no longer considered in use and can be wiped). At the same time, the copy-on-write design lets Btrfs avoid features that don't get along very well with wear leveling, such as a journal.

The Btrfs developers have announced a lot of astonishing features for the near and more distant future: in-band deduplication, improved online and offline filesystem checks, encryption, the ability to handle swap partitions, and incremental backups. The main drawback of Btrfs is its youth: Compared with its competitors, Btrfs is quite new, which means it might be a little less stable than the well-tested alternatives. Also, the high complexity of Btrfs might lead to some minor performance penalties.

Although most Linux distributions today have simple-to-use graphical interfaces for setting up and managing filesystems, knowing how to perform those tasks from the command line is a valuable skill. We’ll show you how to configure and manage filesystems with mkfs, df, du, and fsck.