
Mac OS X 10.6 Snow Leopard: the Ars Technica review

No new features.

Installation

Apple claims that Snow Leopard's installation process is "up to 45% faster." Installation times vary wildly depending on the speed, contents, and fragmentation of the target disk, the speed of the optical drive, and so on. Installation also only happens once, and it's not really an interesting process unless something goes terribly wrong. Still, if Apple's going to make such a claim, it's worth checking out.

To eliminate as many variables as possible, I installed both Leopard and Snow Leopard from one hard disk onto another (empty) one. It should be noted that this change negates some of Snow Leopard's most important installation optimizations, which are focused on reducing random data access from the optical disc.

Even with this disadvantage, the Snow Leopard installation took about 20% less time than the Leopard installation. That's well short of Apple's "up to 45%" claim, but see above (and don't forget the "up to" weasel words). Both versions installed in less than 30 minutes.

What is striking about Snow Leopard's installation is how quickly the initial Spotlight indexing process completed. Here, Snow Leopard was 74% faster in my testing. Again, the times are small (5:49 vs. 3:20) and again, new installations on empty disks are not the norm. But the shorter wait for Spotlight indexing is worth noting because it's the first indication most users will get that Snow Leopard means business when it comes to performance.
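For anyone checking my math, the percentage falls straight out of the two times quoted above, with "faster" read as a ratio of rates. Here is a minimal sketch in Python; the helper names are mine, made up for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert a "m:ss" string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_faster(old: str, new: str) -> float:
    """Speedup expressed as a rate ratio minus one."""
    return (to_seconds(old) / to_seconds(new) - 1) * 100


def percent_less_time(old: str, new: str) -> float:
    """Reduction in wall-clock time."""
    return (1 - to_seconds(new) / to_seconds(old)) * 100


leopard, snow_leopard = "5:49", "3:20"
print(f"{percent_faster(leopard, snow_leopard):.1f}% faster")
print(f"{percent_less_time(leopard, snow_leopard):.1f}% less time")
```

The same pair of times works out to roughly 74% faster or about 43% less time, depending on which way you frame it. Keep that distinction in mind when comparing against the "less time" figure I gave for the installation itself.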

Another notable thing about installation is what's not installed by default: Rosetta, the facility that allows PowerPC binaries to run on Intel Macs. Okay Apple, we get it. PowerPC is a stiff, bereft of life. It rests in peace. It's rung down the curtain and joined the choir invisible. As far as Apple is concerned, PowerPC is an ex-ISA.

But not installing Rosetta by default? That seems a little harsh, even foolhardy. What's going to happen when all those users upgrade to Snow Leopard and then double-click what they've probably long since forgotten is a PowerPC application? Perhaps surprisingly, this is what happens:

Rosetta: auto-installed for your convenience

That's what I saw when I tried to launch Disk Inventory X on Snow Leopard, an application that, yes, I had long since forgotten was PowerPC-only. After I clicked the "Install" button, I actually expected to be prompted to insert the installer DVD. Instead, Snow Leopard reached out over the network, pulled down Rosetta from an Apple server, and installed it.

No reboot was required, and Disk Inventory X launched successfully after the Rosetta installation completed. Mac OS X has not historically made much use of the install-on-demand approach to system software components, but the facility used to install Rosetta appears quite robust. Upon clicking "Install," an XML property list containing a vast catalog of available Mac OS X packages was downloaded. Snow Leopard uses the same facility to download and install printer drivers on demand, saving another trip to the installer DVD. I hope this technique gains even wider use in the future.

Installation footprint

Rosetta aside, Snow Leopard simply puts fewer bits on your disk. Apple claims it "takes up less than half the disk space of the previous version," and that's no lie. A clean, default install (including fully-generated Spotlight indexes) is 16.8 GB for Leopard and 5.9 GB for Snow Leopard. (Incidentally, these numbers are both powers-of-two measurements; see sidebar.)

A gigabyte by any other name

Snow Leopard has another trick up its sleeve when it comes to disk usage. The Snow Leopard Finder considers 1 GB to be equal to 10^9 (1,000,000,000) bytes, whereas the Leopard Finder (and, it should be noted, every version of the Finder before it) equates 1 GB to 2^30 (1,073,741,824) bytes. This has the effect of making your hard disk suddenly appear larger after installing Snow Leopard. For example, my "1 TB" hard drive shows up in the Leopard Finder as having a capacity of 931.19 GB. In Snow Leopard, it's 999.86 GB. As you might have guessed, hard disk manufacturers use the powers-of-ten system. It's all quite a mess, really. Though I come down pretty firmly on the powers-of-two side of the fence, I can't blame Apple too much for wanting to match up nicely with the long-established (but still dumb, mind you) hard disk vendors' capacity measurement standard.
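The two capacity figures for my drive are the same byte count divided by two different definitions of a gigabyte. A quick sketch, with the byte count backed out from the Snow Leopard figure (so it is rounded, not the drive's exact size):

```python
GB_DECIMAL = 10**9  # Snow Leopard Finder
GB_BINARY = 2**30   # Leopard Finder and earlier


def capacity_in_gb(num_bytes: int, unit: int) -> float:
    """Express a byte count in gigabytes of the given size."""
    return num_bytes / unit


# Approximate byte count, reconstructed from the 999.86 GB figure.
drive_bytes = round(999.86 * GB_DECIMAL)

print(f"{capacity_in_gb(drive_bytes, GB_DECIMAL):.2f} GB (powers of ten)")
print(f"{capacity_in_gb(drive_bytes, GB_BINARY):.2f} GB (powers of two)")
```

Nothing about the disk changes between the two lines of output. Only the divisor does.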

Snow Leopard has several weight loss secrets. The first is obvious: no PowerPC support means no PowerPC code in executables. Recall the maximum possible binary payload in a Leopard executable: 32-bit PowerPC, 64-bit PowerPC, x86, and x86_64. Now cross half of those architectures off the list. Granted, very few applications in Leopard included 64-bit code of any kind, but it's a 50% reduction in size for executables no matter how you slice it.
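To make the four-way payload concrete, here is a rough sketch that reads the header of a universal ("fat") binary and lists its architecture slices. The header below is synthetic, and the equal slice sizes are an assumption chosen to mirror the 50% figure; real slices are rarely the same size.

```python
import struct

FAT_MAGIC = 0xCAFEBABE       # universal binary magic, stored big-endian
CPU_ARCH_ABI64 = 0x01000000  # flag OR-ed into the CPU type for 64-bit
CPU_NAMES = {
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
}


def fat_slices(header: bytes) -> dict[str, int]:
    """Map architecture name to slice size in bytes."""
    magic, count = struct.unpack_from(">II", header, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal binary")
    slices = {}
    for i in range(count):
        # Each entry: cputype, cpusubtype, offset, size, align.
        cputype, _sub, _offset, size, _align = struct.unpack_from(
            ">5I", header, 8 + i * 20)
        slices[CPU_NAMES.get(cputype, hex(cputype))] = size
    return slices


def make_header(sizes: dict[int, int]) -> bytes:
    """Build a synthetic fat header (illustration only)."""
    out = struct.pack(">II", FAT_MAGIC, len(sizes))
    offset = 4096
    for cputype, size in sizes.items():
        out += struct.pack(">5I", cputype, 0, offset, size, 12)
        offset += size
    return out


# Made-up sizes: four equal slices, as in the worst case described above.
leopard_style = make_header({
    18: 1000, 18 | CPU_ARCH_ABI64: 1000,
    7: 1000, 7 | CPU_ARCH_ABI64: 1000,
})
slices = fat_slices(leopard_style)
intel_only = {k: v for k, v in slices.items() if not k.startswith("ppc")}
print(slices)
print(sum(intel_only.values()) / sum(slices.values()))
```

With equal slices, dropping the two PowerPC entries leaves exactly half the payload. On a real binary, the ratio depends on how large each architecture's code turns out to be.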

Of course, not all the files in the operating system are executables. There are data files, images, audio files, even a little video. But most of those non-executable files have one thing in common: they're usually stored in compressed file formats. Images are PNGs or JPEGs, audio is AAC, video is MPEG-4, even preference files and other property lists now default to a compact binary format rather than XML.
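The property list point is easy to see for yourself. This sketch uses Python's standard plistlib module to serialize the same data both ways; the preference keys and values are invented for the example.

```python
import plistlib

# Hypothetical preference values, made up for this comparison.
prefs = {
    "ShowToolbar": True,
    "WindowWidth": 1024,
    "WindowHeight": 768,
    "RecentFiles": ["/tmp/a.txt", "/tmp/b.txt", "/tmp/c.txt"],
}

xml_bytes = plistlib.dumps(prefs, fmt=plistlib.FMT_XML)
binary_bytes = plistlib.dumps(prefs, fmt=plistlib.FMT_BINARY)

print(len(xml_bytes), "bytes as XML")
print(len(binary_bytes), "bytes as binary")

# Both encodings decode back to the same data.
assert plistlib.loads(xml_bytes) == plistlib.loads(binary_bytes) == prefs
```

The exact sizes depend on the contents, but for typical small preference files the binary form avoids the XML header and the repeated tag names.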

In Snow Leopard, other kinds of files climb on board the compression bandwagon. To give just one example, ninety-seven percent of the executable files in Snow Leopard are compressed. How compressed? Let's look:

Yikes! What's going on here? Well, what I didn't tell you is that the commands shown above were run from a Leopard system looking at a Snow Leopard disk. In fact, all compressed Snow Leopard files appear to contain zero bytes when viewed from a pre-Snow Leopard version of Mac OS X. (They look and act perfectly normal when booted into Snow Leopard, of course.)

So, where's the data? The little "@" at the end of the permissions string in the ls output above (a feature introduced in Leopard) provides a clue. Though the Mail executable has a zero file size, it does have some extended attributes:

Ah, there's all the data. But wait, it's in the resource fork? Weren't those deprecated about eight years ago? Indeed they were. What you're witnessing here is yet another addition to Apple's favorite file system hobbyhorse, HFS+.

The presence of the com.apple.decmpfs attribute is the first hint that this file is compressed. This attribute is actually hidden from the xattr command when booted into Snow Leopard. But from a Leopard system, which has no knowledge of its special significance, it shows up as plain as day.

Even more information is revealed with the help of Mac OS X Internals guru Amit Singh's hfsdebug program, which has quietly been updated for Snow Leopard.

And sure enough, as we saw, the resource fork does indeed contain the compressed data. Still, why the resource fork? It's all part of Apple's usual, clever backward-compatibility gymnastics. A recent example is the way that hard links to directories show up—and function—as aliases when viewed from a pre-Leopard version of Mac OS X.

In the case of HFS+ compression, Apple was (understandably) unable to make pre-Snow Leopard systems read and interpret the compressed data, which is stored in ways that did not exist at the time those earlier operating systems were written. But rather than letting applications (and users) running on pre-10.6 systems choke on (or worse, corrupt through modification) the unexpectedly compressed file contents, Apple has chosen to hide the compressed data instead.

And where can the complete contents of a potentially large file be hidden in such a way that pre-Snow Leopard systems can still copy that file without the loss of data? Why, in the resource fork, of course. The Finder has always correctly preserved Mac-specific metadata and both the resource and data forks when moving or duplicating files. In Leopard, even the lowly cp and rsync commands will do the same. So while it may be a little bit spooky to see all those "empty" 0 KB files when looking at a Snow Leopard disk from a pre-Snow Leopard OS, the chance of data loss is small, even if you move or copy one of the files.

The resource fork isn't the only place where Apple has decided to smuggle compressed data. For smaller files, hfsdebug shows the following:

That's right, an entire file's contents stored uncompressed in an extended attribute. In the case of a standard PkgInfo file like this one, those contents are the four-byte classic Mac OS type and creator codes.

There's still the same "fpmc..." preamble seen in all the earlier examples of the com.apple.decmpfs attribute, but at the end of the value, the expected data appears as plain as day: type code "APPL" (application) and creator code "emal" (for the Mail application—cute, as per classic Mac OS tradition).
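Here is a rough sketch of pulling such an attribute value apart. The field layout is an assumption on my part, based on publicly circulated descriptions of the decmpfs header rather than anything Apple documents here: a 4-byte magic, a 4-byte little-endian compression type, an 8-byte little-endian uncompressed size, and then a marker byte of 0xFF before data that is stored raw. The compression type value used below (3) is likewise assumed.

```python
import struct


def parse_decmpfs(value: bytes):
    """Split a com.apple.decmpfs value into (type, size, payload).

    Assumed layout, not confirmed by the article: 4-byte magic,
    4-byte compression type, 8-byte uncompressed size, payload.
    """
    magic, ctype, size = struct.unpack_from("<4sIQ", value, 0)
    if magic != b"fpmc":
        raise ValueError("missing fpmc magic")
    payload = value[16:]
    # Assumption: a leading 0xFF byte marks the payload as uncompressed.
    if payload[:1] == b"\xff":
        payload = payload[1:]
    return ctype, size, payload


# Synthetic attribute value modeled on the PkgInfo example.
sample = b"fpmc" + struct.pack("<IQ", 3, 8) + b"\xff" + b"APPLemal"

ctype, size, payload = parse_decmpfs(sample)
print(len(sample), ctype, size, payload)
print("type:", payload[:4], "creator:", payload[4:])
```

Under that assumed layout, the 16-byte header plus the marker byte make up 17 bytes ahead of the eight bytes of type and creator codes.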

You may be wondering, if this is all about data compression, how does storing eight uncompressed bytes plus a 17-byte preamble in an extended attribute save any disk space? The answer to that lies in how HFS+ allocates disk space. When storing information in a data or resource fork, HFS+ allocates space in multiples of the file system's allocation block size (4 KB, by default). So those eight bytes will take up a minimum of 4,096 bytes if stored in the traditional way. When allocating disk space for extended attributes, however, the allocation block size is not a factor; the data is packed in much more tightly. In the end, the actual space saved by storing those 25 bytes of data in an extended attribute is over 4,000 bytes.
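The arithmetic is simple enough to sketch. The function name is mine, and the model deliberately ignores whatever per-record overhead the Attributes File adds.

```python
import math

ALLOCATION_BLOCK = 4096  # HFS+ default allocation block size, per the text


def fork_bytes_on_disk(payload: int, block: int = ALLOCATION_BLOCK) -> int:
    """Space consumed when the payload lives in a data or resource fork."""
    return math.ceil(payload / block) * block


payload = 8    # type and creator codes
preamble = 17  # decmpfs preamble
in_fork = fork_bytes_on_disk(payload)
in_xattr = payload + preamble  # ignores B-tree record overhead

print(in_fork, in_xattr, in_fork - in_xattr)
```

Rounding eight bytes up to a full block costs 4,096 bytes, against 25 bytes in the attribute, for a saving of 4,071 bytes on this one tiny file.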

But compression isn't just about saving disk space. It's also a classic example of trading CPU cycles for decreased I/O latency and bandwidth. Over the past few decades, CPU performance has gotten better (and computing resources more plentiful—more on that later) at a much faster rate than disk performance has increased. Modern hard disk seek times and rotational delays are still measured in milliseconds. In one millisecond, a 2 GHz CPU goes through two million cycles. And then, of course, there's still the actual data transfer time to consider.
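To put numbers on that, here is a tiny sketch. The 10 ms delay is a hypothetical round figure for a combined seek and rotational delay, not a measurement.

```python
def cycles_during(delay_seconds: float, clock_hz: float) -> int:
    """CPU cycles that elapse while waiting for the given delay."""
    return round(delay_seconds * clock_hz)


CLOCK_HZ = 2e9  # the 2 GHz CPU from the text

print(cycles_during(1e-3, CLOCK_HZ))   # one millisecond
print(cycles_during(10e-3, CLOCK_HZ))  # hypothetical 10 ms disk delay
```

Every millisecond spent waiting on the disk is two million cycles the CPU could have spent on something else, such as decompression.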

Granted, several levels of caching throughout the OS and hardware work mightily to hide these delays. But those bits have to come off the disk at some point to fill those caches. Compression means that fewer bits have to be transferred. Given the almost comical glut of CPU resources on a modern multi-core Mac under normal use, the total time needed to transfer a compressed payload from the disk and use the CPU to decompress its contents into memory will still usually be far less than the time it'd take to transfer the data in uncompressed form.
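A toy model shows the shape of the trade. Everything here is an assumption made for illustration: zlib stands in for whatever algorithm HFS+ compression actually uses, the input is artificially repetitive, and the disk throughput is a made-up round number instead of a benchmark.

```python
import time
import zlib

# Repetitive stand-in data; real executables compress less predictably.
original = b"Snow Leopard compresses most of its executables. " * 20000
compressed = zlib.compress(original)

# Time the decompression step on this machine.
start = time.perf_counter()
restored = zlib.decompress(compressed)
decompress_seconds = time.perf_counter() - start

DISK_BYTES_PER_SECOND = 50e6  # assumed sustained throughput, not measured


def transfer_seconds(num_bytes: int) -> float:
    """Modeled time to stream the given number of bytes off the disk."""
    return num_bytes / DISK_BYTES_PER_SECOND


plain_time = transfer_seconds(len(original))
packed_time = transfer_seconds(len(compressed)) + decompress_seconds

print(f"uncompressed: {len(original)} bytes, {plain_time * 1000:.2f} ms")
print(f"compressed:   {len(compressed)} bytes, {packed_time * 1000:.2f} ms")
```

The model leaves out seek time and caching entirely, so treat the output as an illustration of the argument, not as evidence for it. The more compressible the data and the slower the disk, the bigger the gap.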

That explains the potential performance benefits of transferring less data, but the use of extended attributes to store file contents can actually make things faster, as well. It all has to do with data locality.

If there's one thing that slows down a hard disk more than transferring a large amount of data, it's moving its heads from one part of the disk to another. Every move means time for the head to start moving, then stop, then ensure that it's correctly positioned over the desired location, then wait for the spinning disk to put the desired bits beneath it. These are all real, physical, moving parts, and it's amazing that they do their dance as quickly and efficiently as they do, but physics has its limits. These motions are the real performance killers for rotational storage like hard disks.

The HFS+ volume format stores all its information about files—metadata—in two primary locations on disk: the Catalog File, which stores file dates, permissions, ownership, and a host of other things, and the Attributes File, which stores "named forks."

Extended attributes in HFS+ are implemented as named forks in the Attributes File. But unlike resource forks, which can be very large (up to the maximum file size supported by the file system), extended attributes in HFS+ are stored "inline" in the Attributes File. In practice, this means a limit of about 128 bytes per attribute. But it also means that the disk head doesn't need to take a trip to another part of the disk to get the actual data.

As you can imagine, the disk blocks that make up the Catalog and Attributes files are frequently accessed, and therefore more likely than most to be in a cache somewhere. All of this conspires to make the complete storage of a file, including both its metadata and its data, within the B-tree-structured Catalog and Attributes files an overall performance win. Even an eight-byte payload that balloons to 25 bytes is not a concern, as long as it's still less than the allocation block size for normal data storage, and as long as it all fits within a B-tree node in the Attributes File that the OS has to read in its entirety anyway.

There are other significant contributions to Snow Leopard's reduced disk footprint (e.g., the removal of unnecessary localizations and "designable.nib" files) but HFS+ compression is by far the most technically interesting.

Installer intelligence

Apple makes two other interesting promises about the installation process:

Snow Leopard checks your applications to make sure they're compatible and sets aside any programs known to be incompatible.

In case a power outage interrupts your installation, it can start again without losing any data.

The setting aside of "known incompatible" applications is undoubtedly a response to the "blue screen" problems some users encountered when upgrading from Tiger to Leopard two years ago, which were caused by the presence of incompatible (and some would say "illicit") third-party system extensions. I have a decidedly pragmatic view of such software, and I'm glad to see Apple taking a similarly practical approach to minimizing its impact on users.

Apple can't be expected to detect and disable all potentially incompatible software, of course. I suspect only the most popular or highest profile risky software is detected. If you're a developer, this installer feature may be a good way to find out if you're on Apple's sh*t list.

As for continuing an installation after a power failure, I didn't have the guts to test this feature. (I also have a UPS.) For long-running processes like installation, this kind of added robustness is welcome, especially on battery-powered devices like laptops.

I mention these two details of the installation process mostly because they highlight the kinds of things that are possible when developers at Apple are given time to polish their respective components of the OS. You might think that the installer team would be hard-pressed to come up with enough to do during a nearly two-year development cycle. That's clearly not the case, and customers will reap the benefits.


John Siracusa
John Siracusa has a B.S. in Computer Engineering from Boston University. He has been a Mac user since 1984, a Unix geek since 1993, and is a professional web developer and freelance technology writer. Email: siracusa@arstechnica.com // Twitter: @siracusa