Meddling about with the OS X system

OS X Overview

Mac OS X is a UNIX-based operating system with modern GUI and application support frameworks layered on top. The lowest layer, Darwin, includes the kernel, device drivers and driver support frameworks, a BSD personality layer, and various libraries and command-line utilities.

The Darwin layers of Mac OS X are open source. This serves two main purposes: to provide a resource for other open source development efforts (such as Linux and BSD variants) to make their software available on Mac hardware and to provide source code to aid developers writing device drivers and other low-level technologies for Mac OS X.

In addition to being part of Mac OS X, Darwin is a standalone, BSD-based operating system. BSD stands for Berkeley Software Distribution. It was originally known as the Berkeley version of Unix and is now simply called BSD, so calling it Unix is not strictly accurate. For all practical purposes, however, the term is synonymous with “a version of Unix.”

In discussing the layers of Mac OS X, what determines whether something is considered to be at a higher or lower level? As a general rule, a component at a lower level is used by all higher-level layers, but the converse is not necessarily true. Thus, an application, whether it is Carbon or Cocoa, uses the core Darwin technology, but Darwin can function without any additional application layer.

More In-depth Specifics About the OS X kernel:

The Mac OS X kernel is called XNU. It comprises the following components:

Mach:
XNU contains code based on Mach, the legendary architecture that originated as a research project at Carnegie Mellon University in the mid 1980s (Mach itself traces its philosophy to the Accent operating system, also developed at CMU), and has been part of many important systems. Early versions of Mach had monolithic kernels, with much of BSD’s code in the kernel. Mach 3.0 was the first microkernel implementation.

BSD:
XNU’s BSD component uses FreeBSD as the primary reference codebase (although some code might be traced to other BSDs). Darwin 7.x (Mac OS X 10.3.x) uses FreeBSD 5.x. As mentioned before, BSD runs not as an external (or user-level) server, but is part of the kernel itself. Some aspects that BSD is responsible for include:

I/O Kit:
I/O Kit, the object-oriented device driver framework of the XNU kernel, is radically different from the driver frameworks of traditional systems.

I/O Kit uses a restricted subset of C++ (based on Embedded C++) as its programming language. This system is implemented by the libkern library. Features of C++ that are not allowed in this subset include:

I/O Kit’s implementation consists of three C++ libraries that are present in the kernel and available to loadable drivers: IOKit.framework, Kernel/libkern and Kernel/IOKit. The I/O Kit includes a modular, layered run-time architecture that presents an abstraction of the underlying hardware by capturing the dynamic relationships between the various hardware/software components (involved in an I/O connection).

Platform Expert:

The Platform Expert is an object (one can think of it as a driver) that knows the type of platform that the system is running on. I/O Kit registers a nub (see below) for the Platform Expert. This nub then loads the correct platform specific driver, which further discovers the buses present on the system, registering a nub for each bus found. The I/O Kit loads a matching driver for each bus nub, which discovers the devices connected to the bus, and so on. Thus, the Platform Expert is responsible for actions such as:

* Building the device tree (as described above)
* Parsing certain boot arguments
* Identifying the machine (including processor and bus clock speeds)
* Initializing a “user interface” to be used in case of kernel panics

In the context of the I/O Kit, a “nub” is an object that defines an access point and communication channel for a device (a bus, a disk drive or partition, a graphics card, …) or logical service (arbitration, driver matching, power management, …).
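
The discovery process described above (the Platform Expert nub leads to bus nubs, which lead to device nubs) can be sketched as a small recursive model. This is only an illustration in Python, not I/O Kit code: the `Nub` class, the `ATTACHED` table, and all device names are hypothetical, and real drivers probe hardware rather than consult a table.

```python
from dataclasses import dataclass, field


@dataclass
class Nub:
    """An access point for a device or bus (hypothetical toy model)."""
    name: str
    kind: str                      # what a driver would match on, e.g. "pci-bus"
    children: list = field(default_factory=list)


# Hypothetical hardware description: what the driver for each kind of nub
# finds when it probes. A real system discovers this from the hardware.
ATTACHED = {
    "platform": [("pci0", "pci-bus"), ("usb0", "usb-bus")],
    "pci-bus": [("display", "graphics-card"), ("disk0", "disk")],
    "usb-bus": [("keyboard", "hid-device")],
}


def match_and_probe(nub: Nub) -> None:
    """'Load' the driver matching this nub and register a nub per device found."""
    for name, kind in ATTACHED.get(nub.kind, []):
        child = Nub(name, kind)
        nub.children.append(child)
        match_and_probe(child)     # each new nub triggers its own matching


def render(nub: Nub, depth: int = 0) -> list:
    """Return the resulting device tree as indented lines."""
    lines = ["  " * depth + f"{nub.name} ({nub.kind})"]
    for child in nub.children:
        lines.extend(render(child, depth + 1))
    return lines


root = Nub("platform-expert", "platform")
match_and_probe(root)
print("\n".join(render(root)))
```

The point of the sketch is that no component needs a global view of the machine: each driver only knows how to enumerate what hangs off its own nub, and the tree emerges from repeated matching.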

libkern and libsa:

As described earlier, the I/O Kit uses a restricted subset of C++. This system, implemented by libkern, provides features such as:

Core Services

The Core Services layer can be visualized as sitting atop the kernel. This layer’s most important sub-components are CoreFoundation.framework and CoreServices.framework. It contains various critical non-GUI system services (including APIs for managing threads and processes, resources, virtual memory, and filesystem interaction):

* CarbonCore: Core parts of Carbon, such as various Carbon managers. Carbon has traditionally been a very critical Mac OS API family, and is so in Mac OS X as well.
* CFNetwork: An API for user-level networking that includes several protocols such as FTP, HTTP, LDAP, SMTP, …
* OSServices: A framework that includes various system APIs (accessing disk partitions, the system keychain, Open Transport, sound, power, etc.)
* SearchKit: A framework for indexing and searching text in multiple languages.
* WebServicesCore: APIs for using Web Services via SOAP and XML-RPC.

CoreFoundation also includes a large number of other services. For example, it provides ways so that applications can access URLs, parse XML, maintain property lists, etc. In the directory /System/Library/Frameworks/, refer to the directory CoreFoundation.framework/Headers/ for headers belonging to CoreFoundation.
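
To give a feel for what a property list is, here is a minimal sketch that writes and re-reads one in its XML form. It uses Python's standard `plistlib` module rather than the CoreFoundation API itself, and the setting names are made up for illustration.

```python
import plistlib

# Hypothetical application settings; the keys are invented for this example.
settings = {
    "WindowTitle": "Notes",
    "FontSize": 12,
    "RecentFiles": ["a.txt", "b.txt"],
}

# Serialize to the XML property-list format, then parse it back.
xml_bytes = plistlib.dumps(settings, fmt=plistlib.FMT_XML)
restored = plistlib.loads(xml_bytes)

print(xml_bytes.decode("utf-8"))
```

A property list is simply a typed tree of dictionaries, arrays, strings, numbers, dates, and raw data, which is why it round-trips cleanly through XML.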

Application Services

This layer can be visualized as being on top of Core Services. It includes services that make up the graphics and windowing environment of Mac OS X.

The core of the windowing environment is called Quartz. Quartz broadly consists of two entities:

* Quartz Compositor: consists of the window server (the WindowServer program) and some private libraries. Quartz implements a layered compositing engine, in which every pixel on a screen can be shared between different windows in real time.
* Quartz 2D: a 2D graphics rendering library.
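
The idea of a pixel being "shared" between windows can be illustrated with the standard "source over" blending rule. The sketch below is a generic model of layered compositing, assuming straight (non-premultiplied) alpha; it is not a description of how the Quartz Compositor is actually implemented.

```python
def over(src, dst):
    """Composite one RGBA pixel over another ('source over' blending).

    Channels are floats in [0, 1] with straight (non-premultiplied) alpha.
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


def composite(layers):
    """Blend a back-to-front stack of pixels into the final screen pixel."""
    result = (0.0, 0.0, 0.0, 0.0)
    for pixel in layers:
        result = over(pixel, result)
    return result


desktop = (0.0, 0.0, 1.0, 1.0)   # opaque blue background
window = (1.0, 0.0, 0.0, 0.5)    # half-transparent red window on top
print(composite([desktop, window]))   # (0.5, 0.0, 0.5, 1.0)
```

Both layers contribute to the final color, which is what makes effects such as translucent windows and drop shadows possible.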

While NeXT used “Display PostScript” (DPS) for the imaging model in NEXTSTEP, Quartz uses PDF for its drawing model (or as its native format). This makes possible some useful features, such as automatic generation of PDF files (you can save a screenshot “directly” to PDF), import of PDF data into native applications, rasterization of PDF data (including PostScript and EPS conversion), etc. There are Python bindings to the Quartz PDF engine. Note, however, that Quartz’s PDF support is not a replacement for, say, Adobe’s professional-level PDF tools.

Quartz also has an integrated hardware acceleration layer called Quartz Extreme that automatically becomes active on supported hardware.

The graphics environment also has other rendering libraries, for example, OpenGL (2D and 3D), QuickDraw (2D) and QuickTime.

The Application Services layer also includes various other component frameworks:

QuickTime is both a graphics environment and an application environment. It has excellent features for interactive multimedia that allow for manipulating, streaming, storing and enhancing video, sound, animation, graphics, text, music, and VR.

Application Environments

There are multiple execution environments on Mac OS X within which respective applications execute:

* BSD: This application environment is similar to a traditional *BSD system and provides a BSD-based POSIX API. It consists of a BSD runtime and execution environment. Mac OS X uses FreeBSD as a reference code base for its BSD derivations (Panther derives from FreeBSD 5.0). The libraries and headers for this environment reside in their traditional location (/usr/lib and /usr/include, respectively).

* Carbon: This is a set of procedural C-based APIs for Mac OS X that are based on the “old” Mac OS 9 APIs. Note that Carbon does not include all the old APIs – a subset of the old APIs has been modified to work with OS X. Some APIs have been dropped as they are not applicable any more because of the radical differences between Mac OS X and Mac OS 9.

The fact that Mac OS X includes APIs and abstractions from so many different systems (Mach, *BSD, Mac OS 9, etc.) makes things rather confusing and messy sometimes. Consider that Mach uses tasks (that contain one or more threads), FreeBSD uses processes (with a proc structure, pid, etc.) while Carbon uses its own notion of processes in the Carbon Process Manager, with process serial numbers (PSNs) which are not the same as a BSD pid! If a process is running under the Classic virtualizer, then multiple Carbon Process Manager processes inside Classic are using one BSD process. Consider the following excerpt from the output of the ps command:

Safari is linked against both the Carbon and Cocoa frameworks, among others. The above output means that Unix process id 345 maps to Carbon Process Manager PSN 917505.

You can use the Carbon function GetProcessForPID(pid_t, ProcessSerialNumber *) to obtain the PSN for a process given its Unix pid (note that not all processes will have both).

* Classic: This is a compatibility environment so that Mac OS 9 applications can be run on Mac OS X. The Classic application is technically a virtualizer that runs in a protected memory environment, with multiple processes in Mac OS 9 layered on top of one BSD process.
* Cocoa: This is an object-oriented API for developing applications written in Objective-C and Java. Cocoa is an important inheritance from NEXTSTEP (a fact attested by the various NS* names in its API). It is very well supported by Apple’s rapid development tools, and is the preferred way of doing things on Mac OS X if what you want to do can be done through Cocoa. There are many parts of Mac OS X that have not “converted” to Cocoa completely, or at all. A Cocoa application can call the Carbon API. Cocoa is largely based on the OpenStep frameworks, and consists primarily of two parts: the Foundation (fundamental classes) and the Application Kit (classes for GUI elements).
* Java: This environment consists of a JDK, both command-line and integrated with Apple’s IDE, a runtime (Hotspot VM, JIT), and various Java classes (AWT, Swing, …).

Cocoa includes Java packages that let you create a Cocoa application using Java as the programming language. Moreover, Java programs can call Carbon and other frameworks via JNI.

Finally, although Java is considered an Application Environment, the Java subsystem can itself be represented as different layers above the kernel. The Java Virtual Machine along with core JDK packages is analogous to the Core Services layer, and so on (see picture).

OS X Filesystem Hierarchy

Although Mac OS X has many directories similar to a traditional *nix system, such as /etc (a symbolic link to /private/etc), /usr, /tmp (a symbolic link to /private/tmp), etc., it has many others that are unique to it, for example:

Like most modern-day operating system implementations, Mac OS X uses an object-oriented vnode layer. XNU’s VFS layer is based on FreeBSD’s, although there are numerous minor differences (for example, while FreeBSD uses mutexes, XNU uses simple locks; XNU’s unified buffer cache is integrated with Mach’s virtual memory layer, and so on).

HFS

HFS (Hierarchical File System) was the primary filesystem format used on the Macintosh Plus and later models, until Mac OS 8.1, when HFS was replaced by HFS Plus.

This section briefly describes the various filesystems supported by “stock” Mac OS X.

HFS+

HFS+ is the preferred filesystem on Mac OS X. It supports journaling, quotas, byte-range locking, Finder information in metadata, multiple encodings, hard and symbolic links, aliases, support for hiding file extensions on a per-file basis, etc. HFS+ uses B-Trees heavily for many of its internals.

Like most current journaling filesystems, HFS+ only journals metadata. Journaling support was retrofitted into HFS+ via a simple VFS journaling layer in XNU that’s actually filesystem independent. The journal files on an HFS+ volume are called .journal and .journal_info_block (type jrnl and creator code hfs+). HFS+, although not a cutting-edge filesystem, supports some unique features and has worked well for Apple.
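
The general idea behind metadata journaling can be sketched with a toy write-ahead log. This is a generic, in-memory illustration, not the XNU journaling layer: the `ToyJournal` class, its method names, and the block contents are all hypothetical.

```python
class ToyJournal:
    """Write-ahead journal for metadata blocks (hypothetical, in-memory)."""

    def __init__(self):
        self.disk = {}        # block number -> bytes: the "real" metadata blocks
        self.journal = []     # committed transactions not yet checkpointed
        self._pending = None  # the transaction currently being built

    def begin(self):
        self._pending = {}

    def modify(self, block_no, data):
        # Changes are recorded in the transaction, not written in place.
        self._pending[block_no] = data

    def commit(self):
        # The transaction becomes durable once it is in the journal.
        self.journal.append(self._pending)
        self._pending = None

    def crash(self):
        # An uncommitted transaction is simply lost.
        self._pending = None

    def replay(self):
        # After a crash, re-apply every committed transaction. Replaying is
        # idempotent, so it is safe even if some blocks were already written.
        for txn in self.journal:
            self.disk.update(txn)


fs = ToyJournal()
fs.begin()
fs.modify(1, b"catalog: add file")
fs.modify(2, b"bitmap: mark block used")
fs.commit()

fs.begin()
fs.modify(1, b"catalog: half-finished rename")
fs.crash()                    # power loss before commit

fs.replay()
print(fs.disk)
```

After replay, the volume structures reflect the first transaction in full and nothing of the second, which is the consistency guarantee a metadata journal aims for. File contents are not protected in this scheme, only the structures that describe them.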

Similar to HFS

HFS+ is architecturally similar to HFS, with several important improvements such as:

* 32 bits used for allocation blocks (instead of 16). HFS divides the disk space on a partition into equal-sized allocation blocks. Since 16 bits are used to refer to an allocation block, there can be at most 2^16 allocation blocks on an HFS filesystem, so the blocks on a large volume must themselves be large. Using 32 bits to identify allocation blocks allows much smaller blocks, which results in much less wasted space (and more files).
* Long file names up to 255 characters
* Unicode based file name encoding
* File/Directory attributes can be extended in future (as opposed to being fixed size)
* In addition to a System Folder ID (for starting Apple operating systems), a dedicated startup file is also supported so that non-Apple systems can boot from an HFS+ filesystem. The file can easily be found during startup because its location and size are stored in the volume header at a fixed location
* Largest file size is 2^63 bytes
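
The effect of the wider allocation block identifiers can be worked out with a little arithmetic. The sketch below assumes that allocation blocks are multiples of 512 bytes and computes only the theoretical minimum block size; the function names are made up for illustration, and real volumes may be formatted with larger blocks.

```python
SECTOR = 512  # assumption: allocation blocks are multiples of 512 bytes


def min_block_size(volume_bytes: int, id_bits: int) -> int:
    """Smallest allocation block that lets id_bits address the whole volume."""
    max_blocks = 2 ** id_bits
    size = -(-volume_bytes // max_blocks)            # ceiling division
    return max(SECTOR, -(-size // SECTOR) * SECTOR)  # round up to a sector multiple


def space_used(file_bytes: int, block_size: int) -> int:
    """Disk space consumed by a file, since space is handed out in whole blocks."""
    return -(-file_bytes // block_size) * block_size


volume = 4 * 2**30                       # a 4 GiB volume
hfs_block = min_block_size(volume, 16)   # 65536 bytes
hfs_plus_block = min_block_size(volume, 32)  # 512 bytes

print(space_used(1024, hfs_block))       # 65536: a 1 KiB file occupies 64 KiB
print(space_used(1024, hfs_plus_block))  # 1024
```

On this 4 GiB example, a 16-bit identifier forces 64 KiB blocks, so every small file wastes most of a block, while a 32-bit identifier removes that constraint.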

ISO9660

ISO9660 is a system-independent file system for read-only data CDs. Apple has its own set of ISO9660 extensions. Moreover, you would likely run into Mac HFS/ISO9660 hybrid discs that contain both a valid HFS and a valid ISO9660 filesystem. Both filesystems can be read on a Mac, while on “other” systems, you would typically read the ISO9660 data. Note that this doesn’t mean there has to be redundant data on the disc: usually the data that needs to be accessed from both Macs and PCs is kept on the ISO9660 volume, and is aliased on the HFS volume.

MSDOS

Mac OS X includes support for the MSDOS filesystem (FAT12, FAT16 and FAT32).

NTFS

Mac OS X includes read-only support for NTFS.

UDF

UDF (Universal Disk Format) is the filesystem used by DVD-ROM (including DVD-video and DVD-audio) discs, and by many CD-R/RW packet-writing programs. Note that at the time of this writing, Mac OS X “Panther” only supports UDF 1.5, and not UDF 2.0.

UFS

Darwin’s implementation of UFS is similar to that on *BSD, as was NEXTSTEP’s, but they are not really compatible. Currently, only NetBSD supports it. Apple’s UFS is big endian (as was NeXT’s) – even on x86 hardware. It includes the new Directory Allocation Algorithm for FFS (DirPref). The author of the algorithm offers more details, including some test results, on his site.
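
What "big endian even on x86 hardware" means in practice can be shown with a single 32-bit field. This is a minimal sketch using Python's `struct` module; the field value is arbitrary and does not correspond to any actual UFS structure.

```python
import struct

# A hypothetical 32-bit on-disk field; the value is arbitrary.
value = 0x12345678

on_disk = struct.pack(">I", value)   # big-endian, as it would be stored on disk

# Reading with an explicit byte order gives the same answer on any CPU.
correct = struct.unpack(">I", on_disk)[0]

# Interpreting the same bytes as little-endian (the native order on x86)
# yields a different number, so code on such a CPU must byte-swap.
misread = struct.unpack("<I", on_disk)[0]

print(on_disk.hex(), hex(correct), hex(misread))
# 12345678 0x12345678 0x78563412
```

A filesystem implementation that fixes its on-disk byte order has to perform this conversion on every multi-byte field whenever the host CPU uses the opposite order.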