
Mac OS X 10.7 Lion: the Ars Technica review

Lion is no shrinking violet.

The state of the file system

The file system implementation is not something most Mac users think about—nor should they. But like any other part of an operating system, there's some expectation that it will improve over time. And like any piece of technology, there comes a point where incremental improvements are no longer sufficient and a fresh start is required.

Hopes were high for a new file system back in 2006 when Apple publicly declared its interest in a port of Sun's innovative ZFS file system. The next year, Sun's CEO announced that ZFS would be part of Mac OS X 10.5 Leopard—obviously without consulting Apple first.

In the meantime, HFS+ has certainly been incrementally improved. Apple has added support for metadata journaling, case sensitivity, access control lists, and arbitrarily extensible metadata. None of these additions changed the basic design of the file system, however. HFS+ is thirteen years old, and is itself an extension of the HFS file system which is more than twenty-five years old. The state of the art in file system design has advanced a lot since 1985.

But again, most people don't spend much time thinking about the file system. They think about files and folders, sure, but not the software that manages how the individual bytes are arranged on the storage device. My longstanding preoccupation with the nitty-gritty of file storage has often been met with indifference or even derision. "Who cares about a new file system?" ask the scoffers. "HFS+ works fine. It stores and retrieves my files just fine. What's the problem?"

In response to this sentiment, I'd like to offer some concrete reasons why HFS+ is long overdue for replacement. I believe that Apple understands these problems better than anyone, but that a series of unfortunate events has resulted in its next-generation operating system being hamstrung with a previous-generation file system for the past decade. Before discussing whether or not Lion makes any progress in this area, let's take a hard look at our old friend, HFS+.

What's wrong with HFS+

Software is written with certain target hardware in mind. When HFS was created, the top-of-the-line Macintosh came with an 800K floppy drive, the "high-end" storage offered by Apple was a 20MB hard drive the size of a lunchbox, and the CPU was from the Motorola 68000 family. Thirteen years later, HFS+ replaced HFS, the floppy disks were 1.44MB, and Apple's hard drives topped out around 6GB. Keep this context in mind as we consider the following details of HFS+'s implementation.

When searching for unused nodes in a b-tree file, Apple's HFS+ implementation processes the data 16 bits at a time. Why? Presumably because Motorola's 68000 processor natively supports 16-bit operations. Modern Mac CPUs have registers that are up to 256 bits wide.
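To make the word-size point concrete, here is a minimal Python sketch of scanning an allocation bitmap for the first clear bit, one fixed-width word at a time. This is an illustration, not Apple's code: the function name `first_free_bit`, the bit ordering (most significant bit first within each big-endian word), and the sample bitmap are all assumptions. The only point is that a wider word covers the same bytes in fewer steps.

```python
def first_free_bit(bitmap: bytes, word_bits: int = 16):
    """Return (index of first 0 bit, words examined), or (None, words examined).

    Hypothetical helper: bits are numbered MSB-first within each
    big-endian word, which is an assumption made for this sketch.
    """
    word_bytes = word_bits // 8
    if len(bitmap) % word_bytes:
        raise ValueError("bitmap length must be a multiple of the word size")
    full = (1 << word_bits) - 1          # a word with every bit set (all in use)
    words_examined = 0
    for offset in range(0, len(bitmap), word_bytes):
        word = int.from_bytes(bitmap[offset:offset + word_bytes], "big")
        words_examined += 1
        if word != full:                 # at least one clear bit in this word
            free = ~word & full          # set bits now mark free slots
            bit_in_word = word_bits - free.bit_length()
            return offset * 8 + bit_in_word, words_examined
    return None, words_examined


# 64 fully allocated bytes, then 0xFB (bit 5 clear, MSB-first), padded to 72 bytes.
bitmap = b"\xff" * 64 + b"\xfb" + b"\xff" * 7
narrow = first_free_bit(bitmap, 16)      # scan 16 bits at a time
wide = first_free_bit(bitmap, 64)        # scan 64 bits at a time
```

Both calls find the same free bit, but the 16-bit scan examines 33 words to get there while the 64-bit scan examines 9.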

All HFS+ file system metadata read from the disk must be byte swapped because it's stored in big-endian form. The Intel CPUs that Macs use today are little-endian; Motorola 68K and PowerPC processors are big-endian. (The performance cost of this is negligible; it's mostly just silly.)
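As a rough illustration of what that swap involves, the sketch below decodes two big-endian 32-bit fields with Python's `struct` module, then decodes the same bytes as little-endian to show what skipping the swap would produce. The 8-byte record is made up for the example; it is not the real HFS+ volume header layout.

```python
import struct

# Hypothetical 8-byte on-disk record: a 32-bit block size followed by a
# 32-bit block count, both big-endian. This is NOT the real HFS+ volume
# header layout, just two fields chosen for illustration.
raw = bytes([0x00, 0x00, 0x10, 0x00, 0x00, 0x0F, 0x42, 0x40])

# ">" tells struct to decode big-endian regardless of the host CPU.
block_size, total_blocks = struct.unpack(">II", raw)

# Reading the same bytes as little-endian shows the result of forgetting the swap.
wrong_block_size, wrong_total_blocks = struct.unpack("<II", raw)
```

Decoded correctly, the fields are 4096 and 1,000,000. Read without the swap, the block size comes out as 1,048,576.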

The time resolution for HFS+ file dates is only one second. That may have been sufficient a few decades ago when computers and disks were slower, but today, many thousands of file system operations (and many billions of CPU cycles) can be executed in a second. Modern file systems have up to nanosecond precision on their file dates.
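A small sketch of why the resolution matters: two modifications a fraction of a millisecond apart are distinct at nanosecond precision but collapse to the same value once truncated to whole seconds. The timestamps and the helper name `to_whole_seconds` are invented for illustration.

```python
# Two modification times 250 microseconds apart, in nanoseconds since the epoch.
# The values are made up for illustration.
first_write_ns = 1_311_000_000_123_456_789
second_write_ns = first_write_ns + 250_000


def to_whole_seconds(timestamp_ns: int) -> int:
    """Truncate a nanosecond timestamp to one-second resolution."""
    return timestamp_ns // 1_000_000_000


# With one-second dates, the two writes cannot be told apart or ordered.
same_at_one_second = to_whole_seconds(first_write_ns) == to_whole_seconds(second_write_ns)
distinct_at_nanosecond = first_write_ns != second_write_ns
```

On a real system, Python exposes the stored value as `os.stat(path).st_mtime_ns`; how many of those digits are meaningful depends on the file system underneath.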

File system metadata structures in HFS+ have global locks. Only one process can update the file system at a time. This is an embarrassment in an age of preemptive multitasking and 16-core CPUs. Modern file systems like ZFS allow multiple simultaneous updates, even to files that are in the same directory.

The total number of blocks in an HFS+ volume is stored in a 32-bit value. With 4KB blocks, this allows for a maximum disk size of 17TB. That may sound huge to you now, but consider that it's only a sixfold increase over what we have today, and today's largest hard drives are, in turn, a sixfold increase over what we had in 2005. (Apple can, of course, increase the block size from 4KB to, say, 8KB, but you can only play that game so long.)
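The arithmetic behind that ceiling is short enough to spell out. A 32-bit block count with 4KB blocks gives 2^44 bytes, which is 16TB in binary units or roughly 17.6TB in the decimal units that drive makers use. The constant names below are mine, not anything from Apple's headers.

```python
BLOCK_COUNT_BITS = 32           # width of the block-count field, per the text
BLOCK_SIZE = 4 * 1024           # 4KB allocation blocks

max_blocks = 2 ** BLOCK_COUNT_BITS
max_bytes = max_blocks * BLOCK_SIZE

max_tib = max_bytes / 2 ** 40   # binary terabytes
max_tb = max_bytes / 10 ** 12   # decimal terabytes, as drive makers count them

# Doubling the block size to 8KB only doubles the ceiling.
max_bytes_8k = max_blocks * 8 * 1024
```

Each doubling of the block size buys exactly one doubling of the maximum volume size, which is why that game has a limit.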

HFS+ lacks sparse file support, which allows space to be allocated only as needed in large files. Think about an application that creates a 1GB database file, then writes a few bytes at the start as a header and a few bytes at the end as a footer. On HFS+, slightly less than a gigabyte of zeros would have to be written to disk to make that happen. On a modern file system with sparse file support, only a few bytes would be written to disk.
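The header-and-footer scenario can be sketched in a few lines of Python. The sketch uses a 16MB file as a stand-in for the 1GB database, and the file name is invented. What it reports depends entirely on the file system it runs on, so treat it as a way to observe the behavior rather than a guaranteed result.

```python
import os
import tempfile

LOGICAL_SIZE = 16 * 1024 * 1024   # 16MB stand-in for the 1GB file in the text

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "database.bin")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"HEADER")                 # a few bytes at the start
        f.seek(LOGICAL_SIZE - 6)           # jump over the unwritten middle
        f.write(b"FOOTER")                 # a few bytes at the end

    info = os.stat(path)
    logical_size = info.st_size
    # st_blocks counts 512-byte units on POSIX systems; it is absent on Windows.
    allocated = getattr(info, "st_blocks", None)
    allocated_bytes = None if allocated is None else allocated * 512
```

On a file system with sparse file support, `allocated_bytes` should be a small fraction of `logical_size`. On one without it, the two should be roughly equal, because the gap had to be filled with zeros.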

Concurrency, metadata written in the correct byte order, sub-second date precision, support for massive volume sizes, and sparse file support are all common features of Unix file systems. Mac OS X, of course, is built on a Unix foundation. When HFS+ was ported from classic Mac OS to Mac OS X, it needed to be extended to support some minimum set of features that are expected from Unix file systems.

Some of those features were an easy fit, but others were very difficult to add to the file system without breaking backwards compatibility. One particularly scary example is the implementation of hard links on HFS+. To keep track of hard links, HFS+ creates a separate file for each hard link inside a hidden directory at the root level of the volume. Hidden directories are kind of creepy to begin with, but the real scare comes when you remember that Time Machine is implemented using hard links to avoid unnecessary data duplication.

Listing the contents of this hidden directory (named "HFS+ Private Data", but with a bunch of non-printing characters preceding the "H") on my Time Machine backup volume reveals that it contains 573,127 files. B-trees or no b-trees, over half a million files in a single directory makes me nervous.
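For readers who haven't run into hard links directly, here is a minimal sketch of what they look like from the Unix side: two directory entries that name the same underlying file, with no second copy of the data. The file names are invented, and this shows only the behavior visible through the standard API, not how HFS+ stores links internally. It assumes the temporary directory lives on a file system that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "photo.jpg")                   # hypothetical names
    backup_entry = os.path.join(tmp, "photo-in-later-backup.jpg")

    with open(original, "wb") as f:
        f.write(b"unchanged file contents")

    os.link(original, backup_entry)        # second name, no second copy of the data

    same_file = os.path.samefile(original, backup_entry)
    link_count = os.stat(original).st_nlink

    # Removing one name leaves the data reachable through the other.
    os.remove(original)
    with open(backup_entry, "rb") as f:
        surviving = f.read()
```

This is the property that lets a backup tool present an unchanged file in many snapshots while storing its contents once.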

That feeling is compounded by the most glaring omission in HFS+—and, to be fair, many other file systems as well. HFS+ does not concern itself with data integrity. The underlying hardware is trusted implicitly. If a few bits or bytes get flipped one way or the other by the hardware, HFS+ won't notice. This applies to both metadata and the file data itself.

Data corruption in file system metadata structures can render a directory or an entire disk unreadable. (For a double-whammy, think about corruption that affects the "HFS+ Private Data" directory where every single hard link file on a Time Machine volume is stored.) Corruption in file data is arguably worse because it's much more likely to go undetected. Over time, it can propagate into all your backups. When it's finally discovered, perhaps years later when looking at old baby pictures, it's too late to do anything about it.
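To show what "noticing" would even mean, here is a minimal sketch of checksum-on-write, verify-on-read. It uses CRC-32 from Python's `zlib` module purely as a stand-in; real file systems choose their own checksum algorithms and store them in their own ways. The `store` and `verify` helpers and the sample data are hypothetical.

```python
import zlib


def store(block: bytes) -> tuple:
    """Return the block together with a CRC-32 computed at write time."""
    return block, zlib.crc32(block)


def verify(block: bytes, stored_checksum: int) -> bool:
    """Recompute the checksum at read time and compare it to the stored one."""
    return zlib.crc32(block) == stored_checksum


data, checksum = store(b"baby-picture pixel data" * 100)

# Simulate the hardware silently flipping a single bit.
damaged = bytearray(data)
damaged[1000] ^= 0x01
corrupted = bytes(damaged)

clean_ok = verify(data, checksum)
corruption_detected = not verify(corrupted, checksum)
```

A file system that keeps no checksum has nothing to compare against, so the corrupted block is handed back to the application as if nothing had happened.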

In a recent study of 1.53 million disk drives over 41 months, Bairavasundaram et al. show that more than 400,000 blocks had checksum mismatches, 8 percent of which were discovered during RAID reconstruction, creating the possibility of real data loss. They also found that nearline disks develop checksum mismatches an order of magnitude more often than enterprise class disk drives.

Most of these studies concern themselves with enterprise-scale deployments, but personal storage use today is where enterprise storage was only a few years ago (in terms of capacity, if not throughput). And keep in mind that all of these issues only get worse as the data volume goes up—which it inevitably does, year after year.

It's rapidly becoming inexcusable for the storage systems we entrust with some of our most precious possessions (something we're actively encouraged to do by Apple itself) to take such a cavalier approach to data integrity. The worst part is that there's little a user can do to make up for this technological gap; backups only serve to silently spread data corruption.

I'll stop here, but do note that I haven't even gotten to many of the other headliner features of modern file systems: constant-time snapshots, transactional updates, data deduplication, and on and on. HFS+ has served Apple well, and probably for far longer than its designers ever imagined it would. But like all the other Apple-related products and technologies that fit this description (e.g., classic Mac OS, Carbon, PowerPC), there comes a time when things once treasured must pass from this world.


John Siracusa
John Siracusa has a B.S. in Computer Engineering from Boston University. He has been a Mac user since 1984, a Unix geek since 1993, and is a professional web developer and freelance technology writer. Email: siracusa@arstechnica.com // Twitter: @siracusa