Metadata, the Mac, and you

The legacy of file name extensions

Let's look at the consequences of the fateful development of what we know today as file name extensions. The removal of a dedicated storage area for file type metadata and its subsequent integration into the file name means that it is now possible to change an immutable piece of metadata (file type) by changing an independent piece of metadata (file name).

It may be difficult to see this as anything other than a good (or at least benign) situation if you are accustomed to it and simply accept it as "the way of things." Here's an example that may help you see this design decision in a more objective light. Instead of file type, imagine that a different piece of immutable metadata was encoded in the file name instead of being stored separately: file size.

The DOS-era reality would be files like "REPORT.294" or "EDIT.559." Years later, mixed case, long file names, reduced length restrictions, and improved representation would give us "Book Report.50K" or "Microsoft Word.15MB."

You may be reading this now and thinking that it's ridiculous--a straw man example that is in no way analogous to file name extensions. Again, I ask you to let go of your preconceived ideas about "the way things work" and examine this example with respect to the fundamentals of metadata discussed earlier.

Regardless of whether file type or file size is encoded in the file name, the situation is the same: a piece of immutable metadata is encoded in the file name. As immutable metadata, both a file's type and a file's size should never change without an accompanying change to the data itself. And yet this is exactly the action made possible by encoding either of these pieces of immutable metadata in the file name.
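To make the point concrete, here is a minimal sketch of what happens when file type is inferred from the name. The extension table and the helper names (`EXTENSION_TYPES`, `inferred_type`, `rename_demo`) are invented for illustration and do not describe any particular operating system's lookup. The demo renames a file without touching its contents and reports the type before and after.

```python
import tempfile
from pathlib import Path

# Hypothetical extension-to-type table, for illustration only.
EXTENSION_TYPES = {".txt": "Plain Text", ".html": "HTML Document"}


def inferred_type(path: Path) -> str:
    """Guess a file's type the way an extension-based system does: from its name."""
    return EXTENSION_TYPES.get(path.suffix.lower(), "Unknown")


def rename_demo():
    """Rename a file and return (type, bytes) pairs from before and after."""
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "notes.txt"
        original.write_bytes(b"milk, eggs, bread\n")
        before = (inferred_type(original), original.read_bytes())

        # Only the name changes here; the data is never rewritten.
        renamed = original.with_suffix(".html")
        original.rename(renamed)
        after = (inferred_type(renamed), renamed.read_bytes())
    return before, after


if __name__ == "__main__":
    before, after = rename_demo()
    print("before:", before[0], "after:", after[0])
    print("contents identical:", before[1] == after[1])
```

The bytes are identical in both cases, yet the reported type differs. That is the flaw described above: supposedly immutable metadata changed without any change to the data.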

With the explosion of personal computers in the 1980s and onward, mechanisms were introduced to combat this design flaw. The interface to file name editing was eventually restricted to prevent or deter the user from changing file type metadata when editing the file name. With the advent of the dominant graphical user interface in Windows, file name extensions were hidden entirely by default.

The choice to encode file type metadata in the file name had other effects beyond its impact on the user experience. Most significantly, it eventually resulted in the disappearance of a dedicated storage area for file type metadata in file systems throughout the industry.

The end result is the computing environment that millions of people use today in the form of Windows, most without thinking very hard about how it came to be, or whether there might be a better way.

The Mac Way

Early in the personal computer revolution, Apple was, in fact, thinking about how it could be done better. The work that culminated in the Macintosh in 1984 brought the graphical user interface into the mainstream, where it was followed by DOS's migration to Windows on the soon-to-be dominant PC platform.

But the PC platform could not follow all of Apple's leads. It could not, for example, immediately incorporate a linear memory address space simply because the Mac had it. There was an existing investment in a particular CPU which dictated the memory architecture on the PC, at least in the short term. Similarly, when adopting the GUI ideas pioneered by the Mac, Windows could not adopt the accompanying file metadata system Apple had developed, because that system was incompatible with the PC's substantial installed base. But what, exactly, had Apple done?

If you think back to the fundamentals portion of this article, Apple's decisions regarding file metadata are obvious to the point of being boring. They decided what metadata they wanted to store, and they put each piece of metadata into a dedicated location in the metadata structures of the file system. They stored all of the items listed earlier (location, name, size, type, dates, permissions) in one form or another. It was a very straightforward implementation.

Unfortunately, by 1984, a straightforward implementation of metadata flew in the face of the status quo. By then, file type metadata had been essentially removed from the subset of file metadata shared across all platforms. The list of truly common metadata was reduced to a file's name (modulo length restrictions), size, and one or more dates. Any other piece of metadata was not assured a storage location on a "foreign" platform.

File type, as essential as it is to the user experience, was dropped from that list as a result of the fateful decision to encode file type metadata in the file name. Had that decision not been made, file type would most assuredly still have an independent storage location on virtually every platform, just as file name, size, and a small set of associated dates do today.

Into this world came Apple with file type information that was not encoded in the file name. This vastly improved the user experience on the Mac platform, and became a hallmark of what was known as "The Mac Way." Mac users reveled in their ability to give files logical names without regard to file type. Confusing identical names within the same location were not permitted. When file type metadata was displayed, it appeared as a verbose human-readable string like "Microsoft Word Document", which allowed the native storage format to remain a seemingly restrictive 32-bit value for decades without fear of obfuscation.
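As a rough illustration of how a 32-bit value can sit behind a friendly label, the sketch below packs a four-character code into a single integer and looks up a display string for it. Writing type codes as four characters is the usual Mac convention. The display-name table and the function names are assumptions made for this example, not Apple's actual data or API.

```python
import struct

# Hypothetical mapping from stored type code to the string shown to the user.
DISPLAY_NAMES = {"TEXT": "Plain Text Document"}


def pack_type_code(code: str) -> int:
    """Pack a four-character code such as 'TEXT' into one 32-bit integer."""
    raw = code.encode("mac_roman")
    if len(raw) != 4:
        raise ValueError("a type code must be exactly four characters")
    return struct.unpack(">I", raw)[0]


def unpack_type_code(value: int) -> str:
    """Recover the four-character code from its 32-bit stored form."""
    return struct.pack(">I", value).decode("mac_roman")


def display_name(value: int) -> str:
    """Return the human-readable label for a stored 32-bit type value."""
    return DISPLAY_NAMES.get(unpack_type_code(value), "Document")


if __name__ == "__main__":
    stored = pack_type_code("TEXT")
    print(hex(stored), unpack_type_code(stored), display_name(stored))
```

The stored value never has to be legible to the user, because only the display string is shown.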

Apple further refined the user experience by including many more pieces of metadata beyond those found on other platforms. The most influential was the storage of metadata that indicated the application that created the file. Application binding on the Mac used the file's creator metadata to choose an application (falling back to the file's type if necessary). This meant that two files with the same type may open in two different applications. One text file containing a grocery list may open in a simple text editor, while another containing HTML code may open in a GUI HTML editor or a web browser. This application binding process was completely independent of the file's name.
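The binding rule described above (creator first, type as a fallback) can be sketched in a few lines. This is only a model of the idea, not Apple's implementation. The `FileInfo` record, the lookup tables, and the creator code `HTed` are hypothetical. The codes `TEXT` and `ttxt` are used merely as familiar-looking examples.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileInfo:
    name: str          # free-form; plays no part in binding
    type_code: str     # what kind of data the file holds
    creator_code: str  # which application created the file


# Hypothetical tables of installed applications, for illustration only.
APPS_BY_CREATOR: Dict[str, str] = {
    "ttxt": "Simple Text Editor",
    "HTed": "GUI HTML Editor",
}
APPS_BY_TYPE: Dict[str, str] = {"TEXT": "Simple Text Editor"}


def choose_application(file: FileInfo) -> Optional[str]:
    """Pick an application by creator, falling back to the file's type."""
    app = APPS_BY_CREATOR.get(file.creator_code)
    if app is not None:
        return app
    return APPS_BY_TYPE.get(file.type_code)


if __name__ == "__main__":
    grocery = FileInfo("Grocery List", "TEXT", "ttxt")
    page = FileInfo("Home Page", "TEXT", "HTed")
    orphan = FileInfo("Read Me", "TEXT", "xxxx")  # creator not installed
    for f in (grocery, page, orphan):
        print(f.name, "->", choose_application(f))
```

All three files share the same type, yet the first two open in different applications. The third falls back to the type-based default. None of the decisions consult the file name.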

Apple touted the user experience provided by type/creator application binding and full user ownership of the file name, listing these features in its promotional literature, and, most famously, ridiculing the introduction of Windows 95 with an ad that read:

C:\ONGRTLNS.W95

To quote industry observer Geoff Duncan at the time, "Perhaps the saddest part about this particular Apple ad is that people understand it."

But no Mac is an island, and the spread of pervasive networking in the form of the Internet brought the metadata sins of the past into the garden of sanity created by the Mac.

John Siracusa / John Siracusa has a B.S. in Computer Engineering from Boston University. He has been a Mac user since 1984, a Unix geek since 1993, and is a professional web developer and freelance technology writer.