So, Why?

Why a new file system? Apple’s documentation and session do a good job of describing their motivation: the old HFS+ design was introduced back in 1998, and was based in part on the original HFS, which was introduced back in 1985. (Historical note: the prior Mac file system, MFS, was a flat file system. It did not have directories; folders were a Finder-only display mechanism. The H in the acronym stands for Hierarchical.) The iMac of 1998 had a 233 MHz G3 processor, 32 megs of RAM, and an incredibly huge 4 gigs of disk space. We’ve progressed a long way since then in terms of processor and memory capability.

Apple has done amazing things with HFS+ since its introduction, extending it to work with much larger disk volumes and much larger files than we used to manipulate. They also added journaling to give resilience in the face of sudden power loss and the occasional system crash.

But now it’s time to move on to something a bit more modern and flexible.

Dominic Giampaolo

When I first heard of APFS, I wondered if Dominic Giampaolo was involved. And sure enough, there he was, presenting at the APFS session.

Who is Dominic? He wrote the BeOS file system back in the ’90s, which at the time was incredible, doing things unheard of in a PC operating system: high performance, journaling, and a clever metadata system that could be queried like a database. He later went to Apple, and soon thereafter HFS+ got extended attributes and journaling. And then after that we got Spotlight.

He also wrote a book on the BeOS file system, which is available as a PDF as well. It’s a fun read, even if file systems aren’t your primary technical love.

Obviously APFS is the work of a big team of talented engineers, but it’s nice knowing some of the humans behind the software.

Playing with APFS

The first version of APFS is available in the macOS Sierra WWDC beta. There are some limitations on what you can do with it—mainly you can’t boot from it. Also, Apple is definitely saying that they do not guarantee that an Apple File System volume created today will be readable in future releases. To play with it now, their guidance is to create a disk image and mount it.

Creating a volume

It’s very easy to create an empty volume. I figure if you’re excited enough about a new file system to have read this far, you’re comfortable in the terminal.

Use the hdiutil command to create and manipulate disk images:

% hdiutil create -size 100m -fs APFS -volname "APFS" new-filesystem.dmg
WARNING: You are using a pre-release version of the Apple File System called
APFS which is meant for evaluation and development purposes only. Files
stored on this volume may not be accessible in future releases of OS X.
You should back up all of your data before using APFS and regularly back up
data while using APFS, including before upgrading to future releases of OS X.
Continue? [y/N]

You can tell by the wall of text that Apple is really serious that this is sharp-edged pre-release stuff.

The hdiutil command-line arguments are pretty self-explanatory. This creates a 100 meg disk image with the APFS file system, with the volume name also being APFS. new-filesystem.dmg is ready for mounting. You can double-click it in the Finder, or use hdiutil mount:

% hdiutil mount new-filesystem.dmg

Unfortunately, you can’t use the srcfolder argument to hdiutil to pre-populate a disk image from a directory. You get an “Operation not permitted” error - rdar://26822248. Also, be aware that creating a very large disk image could take a fair amount of time and seem to lock up your machine.

COWabunga

Copy On Write (a.k.a. COW) seems to be making a resurgence these days. It’s been used forever by the virtual memory system when you fork a new process: rather than duplicating all the memory pages for the new process, they’re shared between parent and child, and a page is only copied when one of them writes to it.

COW is also used to implement Swift value types. They’re often a small structure holding a reference to heavier, dynamically allocated storage, and that storage is only copied when someone attempts to modify it.
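Here’s a minimal sketch of that pattern using nothing but the standard library. The `Storage` and `IntList` names are invented for illustration; the interesting bit is `isKnownUniquelyReferenced`, which tells you whether a class instance has exactly one strong reference:

```swift
// A minimal copy-on-write sketch in the style of Swift's standard
// library collections. Storage and IntList are made-up names.
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct IntList {
    private var storage: Storage
    init(_ values: [Int]) { storage = Storage(values) }

    var values: [Int] { storage.values }

    mutating func append(_ value: Int) {
        // Copy the underlying storage only if someone else is
        // also referencing it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
        }
        storage.values.append(value)
    }
}

var a = IntList([1, 2, 3])
var b = a          // cheap: both share one Storage instance
b.append(4)        // forces the copy; a is untouched
print(a.values)    // [1, 2, 3]
print(b.values)    // [1, 2, 3, 4]
```

Copying the struct is just a retain; the expensive duplication is deferred until a mutation actually happens.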

APFS’s “Cloning” feature is COW for the file system. When you duplicate a file in the Finder (or via NSFileManager), the data isn’t duplicated; the new file is just a bunch of references to the same blocks on disk. These blocks are only copied if one of the files gets modified. There’s more detail in the APFS WWDC session.
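The nice part is that you get cloning for free from code: a plain FileManager copy on an APFS volume produces a clone, and the calling code is identical on any file system. A sketch (the file names here are made up):

```swift
import Foundation

// Duplicate a file with FileManager. On an APFS volume this copy is
// a clone -- no data blocks are duplicated until one of the files is
// modified. On HFS+ it's an ordinary full copy; the code is the same.
let dir = NSTemporaryDirectory()
let original = dir + "/original.txt"    // hypothetical paths
let duplicate = dir + "/duplicate.txt"

do {
    try "hello, clones".write(toFile: original, atomically: true, encoding: .utf8)
    try? FileManager.default.removeItem(atPath: duplicate)
    try FileManager.default.copyItem(atPath: original, toPath: duplicate)
    // Both files read back identically; on APFS they share blocks.
    print(try String(contentsOfFile: duplicate, encoding: .utf8))
} catch {
    print("copy failed: \(error)")
}
```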

You can prove to yourself that APFS is doing COW by using the diskutil utility. The diskutil manpage has a section describing the different moving parts of APFS.

What I did was create an empty disk image and add a small text file and some cat photos.

diskutil APFS list will show you some interesting information:

% diskutil APFS list
WARNING: You are using a pre-release version of the Apple File System called
APFS which is meant for evaluation and development purposes only.
Files stored on APFS volumes may not be accessible in future releases
of OS X. You should back up all of your data before using APFS and
regularly back up data while using APFS, including before upgrading
to future releases of OS X.
You can pass the "-IHaveBeenWarnedThatAPFSIsPreReleaseAndThatIMayLoseData"
option between the "APFS" verb and the APFS sub-verb to bypass this message.

I love the sense of humor the low-level engineers have, as demonstrated by the -IHaveBeenWarnedThatAPFSIsPreReleaseAndThatIMayLoseData option.

Just to prove that it’s not miscalculating, I edited several of the cat pictures, re-ran the command, and got:

Space-Sharing Current Volume Size = 9.0 MB (8986624 Bytes)

So only the edited files consume additional space.

What’s In It For Us

For the most part, this new file system is just an interesting aside. It won’t have much of an impact on most people’s day-to-day work. Once it’s deployed, we’ll notice our disk space being used more efficiently and many I/O operations getting faster. If you program for the Mac, the sparse file support will make working with core dump files more tractable. Some kinds of operations, such as calculating the on-disk footprint of a directory, will need different implementations on APFS, because a simple “walk the files and sum them up” will over-report disk usage in the face of cloning.
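That naive walk looks something like this sketch (`naiveFootprint` is a made-up name). On APFS, two clones of the same file each contribute their full size to the total, even though they share blocks on disk:

```swift
import Foundation

// The naive directory-footprint calculation: walk every file and sum
// the logical sizes. On APFS this over-reports actual disk usage,
// because clones are counted at full size despite sharing blocks.
// naiveFootprint is a made-up name for illustration.
func naiveFootprint(of directory: String) -> Int {
    let fm = FileManager.default
    var total = 0
    let enumerator = fm.enumerator(atPath: directory)
    while let entry = enumerator?.nextObject() as? String {
        let path = directory + "/" + entry
        if let attrs = try? fm.attributesOfItem(atPath: path),
           attrs[.type] as? FileAttributeType == .typeRegular,
           let size = attrs[.size] as? Int {
            total += size
        }
    }
    return total
}
```

An accurate count would have to ask the file system about physical allocation rather than summing each file’s logical size.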

But me, I like low-level things like file systems. It’s interesting to see the kinds of thought involved and the tradeoffs made (such as APFS optimizing for latency over raw throughput), and it’s good nerdy fun.