Understanding File System Metadata

Metadata is data about data. More specifically, metadata is information used to describe content. The most basic forms of file and folder metadata employed by nearly every operating system are names, paths, modification dates, and permissions. These metadata objects are not part of the item’s content, yet they are necessary to describe the item in the file system. Lion uses several types of additional file system metadata for a variety of technologies that ultimately lead to a richer user experience.

Mac OS Extended Metadata

Resource forks, dating back to the original Mac OS, are the legacy metadata technology in the Macintosh operating system. To simplify the user experience, Apple created a forked file system to make complex items, such as applications, appear as a single icon. Forked file systems, like Mac OS Extended, allow multiple pieces of data to appear as a single item in the file system. In this case, a file will appear as a single item, but it is actually composed of two separate pieces, a data fork and a resource fork. This also allows the Mac OS to support standard file types in the data fork, while the extra Mac-specific information resides in the resource fork. For many years the Mac OS has relied on forked files for storing both data and associated metadata.

Lion not only continues but also expands the use of metadata, even going so far as to allow developers to take advantage of an arbitrary number of additional metadata items. This enables Apple, and other developers, to implement unique file system solutions without having to modify the existing file system. For instance, Mac OS X v10.6 introduced compressed application code, in which the executable program files are compressed to save space and then automatically decompressed on the fly when needed. To prevent previous versions of Mac OS X or older applications from mishandling these compressed executables, Apple chose to hide the compressed bits in additional metadata locations.

The downside to legacy resource forks, and other types of additional file system metadata, is that some third-party file systems, like FAT, do not know how to properly store this additional data. The AppleDouble file format, covered later in this chapter, addresses this issue.

File Flags and Extended Attributes

Lion also uses file system flags and extended attributes to implement a variety of file system features. In general, file system flags are holdovers from the original Mac OS and are primarily used to control user access. Examples of file system flags include the locked flag covered in Chapter 3, “File Systems,” and the hidden flag covered previously in this chapter.

With Mac OS X, Apple needed to expand the range of possible attributes associated with any file or folder, which is where so-called extended attributes come into play. Any process or application can add an arbitrary number of custom attributes to a file or folder. Again, this allows developers to create new forms of metadata without having to modify the existing file system. The Mac OS Extended file system will store any additional attributes as another fork associated with the file.

The Finder uses extended attributes for several general file features, including setting an item’s color label, stationery pad option, hide extension option, and Spotlight comments. All of these items can be accessed from the Finder’s Get Info window.

Metadata via Terminal

From Terminal’s command line, you can verify that an item has additional file system metadata present using the ls command with both the long list option, -l, and the -@ option. In the following example, Michelle uses the ls command to view the file system metadata associated with an alias file and the file shown in the previous Get Info window screenshot.

Note the @ symbol at the end of the permissions string, which indicates the item has additional metadata. This symbol is shown any time you perform a long listing. For simplicity, ls -l@ shows both resource fork and extended attribute data in a single listing. The indented lines below the primary listing show the additional metadata that the Finder has added. In the case of the alias file, it’s clear from the file sizes that the resource fork is used to store the alias data.
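As a minimal sketch of how a listing of this shape can be read programmatically, the following Python function groups the indented metadata lines under each primary entry and records whether the permissions string ends with @. The sample text is hypothetical: the file names, owner, sizes, and dates are invented for illustration, and the function name parse_ls_extended is not part of any Apple tool. It assumes each indented line holds an attribute name followed by its size in bytes.

```python
def parse_ls_extended(listing):
    """Group `ls -l@`-style output into entries with their indented metadata lines."""
    entries = []
    for line in listing.splitlines():
        if not line.strip() or line.startswith("total "):
            continue  # skip blank lines and the directory summary line
        if line[0].isspace():
            # Indented line: "<attribute name> <size>" for the entry above it.
            if entries:
                name, size = line.strip().rsplit(None, 1)
                entries[-1]["attributes"][name] = int(size)
        else:
            # Primary line: mode, links, owner, group, size, month, day, time, name.
            fields = line.split(None, 8)
            entries.append({
                "name": fields[-1],
                "has_metadata": fields[0].endswith("@"),
                "attributes": {},
            })
    return entries


# Hypothetical sample shaped like the output described above.
SAMPLE = (
    "-rw-r--r--@ 1 michelle  staff      0 Aug 20 15:49 AliasFile\n"
    "\tcom.apple.FinderInfo\t   32\n"
    "\tcom.apple.ResourceFork\t43316\n"
    "-rw-r--r--  1 michelle  staff    120 Aug 20 15:50 Plain.txt\n"
)

if __name__ == "__main__":
    for entry in parse_ls_extended(SAMPLE):
        print(entry["name"], entry["has_metadata"], entry["attributes"])
```

In the invented sample, the alias has an empty data fork and a much larger com.apple.ResourceFork entry, which mirrors the observation that the alias data lives in the resource fork.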

Bundles and Packages

Sometimes forked files aren’t the most efficient solution for hiding data, especially if you have a lot of related files that you need to hide. So instead of creating a new container technology, Apple simply modified an existing file system container, the common folder. Bundles and packages are nothing more than common folders that happen to contain related software and resources. This allows software developers to easily organize all the resources needed for a complicated product into a single bundle or package, while discouraging normal users from interfering with the resources.

Bundles and packages use the same technique of combining resources inside special folders. The difference is that the Finder treats packages as opaque objects that, by default, users cannot navigate into. For example, where a user sees only a single icon in the Finder representing an application, in reality it is a folder potentially filled with thousands of resources. The word “package” is also used to describe the archive files used by the installer application to install software—that is, installer packages. This is appropriate, though, as users cannot, by default, navigate into the contents of a legacy installer package because the Finder again displays it as a single opaque object. Starting with Mac OS X v10.5, Apple allowed the creation of fully opaque installation packages wherein the entire contents are inside a single file, further preventing users from accidentally revealing installation content.

The anatomy of an installer package is quite simple; it usually contains only a compressed archive of the software to be installed and a few configuration files used by the installer application. Other software bundles and packages, on the other hand, are often much more complex as they contain all the resources necessary for the application or software.

Software bundles or packages often include:

Executable code for multiple platforms

Document description files

Media resources such as images and sounds

User interface description files

Text resources

Resource forks

Resources localized for specific languages

Private software libraries and frameworks

Plug-ins or other software to expand capability

Although the Finder hides the contents of a package by default, you can still view them from the Finder. To access a package’s contents in the Finder, simply right-click or Control-click on the item you wish to explore, and then choose Show Package Contents from the shortcut menu. (You may recall this technique is used in Chapter 1, “Installation and Configuration,” to reveal the installation disk image inside the Install Mac OS X Lion application.)
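Because a package is an ordinary folder underneath, any tool that walks directories can list its contents without changing them. The following Python sketch illustrates this with a stand-in folder. The names Example.app, list_package_contents, and build_sample_package are hypothetical, and the Contents/MacOS and Contents/Resources layout is assumed here as the conventional application bundle structure rather than taken from this chapter.

```python
import os
import tempfile


def list_package_contents(package_path):
    """Return the relative path of every file inside a package folder (read-only)."""
    contents = []
    for root, _dirs, files in os.walk(package_path):
        for filename in files:
            full_path = os.path.join(root, filename)
            contents.append(os.path.relpath(full_path, package_path))
    return sorted(contents)


def build_sample_package(parent_dir):
    """Create a hypothetical, empty stand-in for an application package."""
    package = os.path.join(parent_dir, "Example.app")
    sample_files = [
        os.path.join("Contents", "Info.plist"),
        os.path.join("Contents", "MacOS", "Example"),
        os.path.join("Contents", "Resources", "en.lproj", "Localizable.strings"),
    ]
    for relative in sample_files:
        target = os.path.join(package, relative)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "w"):
            pass  # placeholder file with no content
    return package


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        for item in list_package_contents(build_sample_package(parent)):
            print(item)
```

The function only reads the folder tree, in keeping with the advice to leave an original package untouched.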

Nevertheless, you should be very careful when exploring this content. Modifying the content of a bundle or package can easily leave the item unstable or unusable. If you can’t resist the desire to tinker with a bundle or package, you should always do so from a copy and leave the original safely intact.

More Info

Tools for creating and modifying bundles and packages are included with the optional Xcode Developer Tools package, which can be found in the Mac App Store.

AppleDouble File Format

While file system metadata helps make the user’s experience on Lion richer, compatibility with third-party file systems can be an issue. Only volumes formatted with the Mac OS Extended file system fully support Mac OS X–style resource forks, data forks, file flags, and extended attributes. Third-party software has been developed for Windows-based operating systems to allow them to access the extended metadata features of Mac OS Extended. More often, though, users will use the compatibility software built into Lion to help other file systems cope with these metadata items.

For most non-Mac OS volumes, Lion stores the file system metadata in a separate hidden data file. This technique is commonly referred to as AppleDouble. For example, if you copy a file containing metadata named “My Document.docx” to a FAT32 volume, Lion will automatically split the file and write it as two discrete pieces on the FAT32 volume. The file’s internal data would be written with the same name as the original, but the metadata would end up in a file named ._My Document.docx that would remain hidden from the Finder. This works out pretty well for most files because Windows applications only care about the contents of the data fork. However, some files do not take well to being split up, and all the extra dot-underscore files create a bit of a mess on other file systems.
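The naming rule described above is simple enough to express in a few lines. The following Python sketch derives the name of the hidden metadata file from the name of the data file and recognizes such files by their prefix. The function names are hypothetical, and the sketch covers the naming convention only, not the contents of the metadata file.

```python
import os

APPLEDOUBLE_PREFIX = "._"  # the "dot-underscore" prefix described above


def appledouble_companion(path):
    """Return the path of the hidden metadata file that accompanies `path`."""
    directory, name = os.path.split(path)
    return os.path.join(directory, APPLEDOUBLE_PREFIX + name)


def is_appledouble_file(path):
    """Return True if the file name carries the dot-underscore prefix."""
    return os.path.basename(path).startswith(APPLEDOUBLE_PREFIX)


if __name__ == "__main__":
    print(appledouble_companion("My Document.docx"))
    print(is_appledouble_file("._My Document.docx"))
```

A check like is_appledouble_file can help when tidying a shared volume, although a name match alone does not prove that a file was written by a Mac.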

NOTE

Windows systems default to automatically hiding these “period-underscore” files. In fact, to acquire the prior Windows XP screenshot, showing hidden files had to be manually enabled.

NOTE

Because bundles and packages are really just special folders, these items simply copy over to non–Mac OS volumes as regular folders. The Finder will continue to recognize the items as bundles or packages even when they reside on a third-party volume.

Mac OS X v10.5 introduced an improved method for handling metadata on SMB network shares from NTFS volumes that doesn’t require the AppleDouble format. The native file system for modern Windows-based computers, NTFS, supports something similar to file forking known as alternate data streams. The Mac’s file system will write the metadata to the alternate data stream so the file will appear as a single item on both Windows and Mac systems.

NOTE

Lion will always revert to using “dot underscore” files when writing to FAT and UFS volumes or older NFS shares.

AppleDouble Files via Terminal

Historically, UNIX operating systems have not used file systems with extensive metadata. As a result, many UNIX commands do not properly support this additional metadata. These commands can manipulate the data fork just fine, but they often ignore the additional metadata, leaving files damaged and possibly unusable. Fortunately, Apple has made some modifications to the most common file management commands, thus allowing them to properly work with all Mac files and support the AppleDouble format when necessary. Metadata-friendly commands in Lion include cp, mv, and rm.

In the following example, Michelle will use the metadata-aware cp command to copy a file on her desktop called ForkedDocument.tiff to a FAT32 volume. Note that the file is a single item on her desktop, but on the FAT32 volume it’s in the dual-file AppleDouble format. The metadata part is named with a preceding period-underscore. Finally, Michelle will remove the file using the metadata-aware rm command. Note that both the data and the metadata part are removed from the FAT32 volume.
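To make the visible effect of a metadata-aware removal concrete, the following Python sketch deletes a data file together with its period-underscore companion, using a temporary folder as a stand-in for the FAT32 volume. It only imitates the outcome described above and is not how Apple's rm command is implemented. The function name remove_with_metadata is hypothetical, and the two files created in the demonstration are empty placeholders.

```python
import os
import tempfile


def remove_with_metadata(path):
    """Delete a data file and, if present, its period-underscore companion."""
    directory, name = os.path.split(path)
    companion = os.path.join(directory, "._" + name)
    removed = []
    for target in (path, companion):
        if os.path.exists(target):
            os.remove(target)
            removed.append(os.path.basename(target))
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as volume:
        # Stand-ins for the two AppleDouble parts on a non-Mac OS volume.
        for name in ("ForkedDocument.tiff", "._ForkedDocument.tiff"):
            with open(os.path.join(volume, name), "w"):
                pass
        print(remove_with_metadata(os.path.join(volume, "ForkedDocument.tiff")))
        print(os.listdir(volume))  # both parts are gone
```

On a Mac, the system's own cp, mv, and rm remain the better choice because they handle the metadata directly.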