When it comes to dealing with storage, Solaris 10 provides admins with more choices than any other operating system. Right out of the box, it offers two filesystems, two volume managers, an iSCSI target and initiator, and, naturally, an NFS server. Add a couple of Sun packages and you have volume replication, a cluster filesystem, and a hierarchical storage manager. Trust your data to the still-in-development features found in OpenSolaris, and you can have a Fibre Channel target and an in-kernel CIFS server, among other things. True, some of these features can be found in any enterprise-ready UNIX OS. But Solaris 10 integrates all of them into one well-tested package. Editor's note: This is the first of our published submissions for the 2008 Article Contest.

Andrew Morton has famously called ZFS a "rampant layering violation" because it combines the functionality of a filesystem, volume manager, and RAID controller. I suppose it depends what the meaning of the word "violate" is. While designing ZFS we observed that the standard layering of the storage stack induces a surprising amount of unnecessary complexity and duplicated logic. We found that by refactoring the problem a bit -- that is, changing where the boundaries are between layers -- we could make the whole thing much simpler.

....

The ZFS architecture eliminates an entire layer of translation -- and along with it, an entire class of metadata (volume LBAs). It also eliminates the need for hardware RAID controllers. At the same time, it provides a useful new interface -- object storage -- that was previously inaccessible because it was buried inside a monolithic filesystem.

Well, a filesystem is not a volume manager is not a RAID device. They're successive containers for each other.

It also eliminates the need for hardware RAID controllers.

There must be a shared font of Sun anti-freeze somewhere, because if Sun believes that it eliminates the need for hardware RAID controllers then they don't know what hardware RAID controllers are used for - which is worrying if they're writing a new storage system.

A hardware RAID is used for exactly the same purposes as a software implementation, except that the hardware one consumes less CPU time. On the other hand, it's much more difficult to improve its capabilities without having to buy a new RAID card.

Well, a filesystem is not a volume manager is not a RAID device. They're successive containers for each other.

So, kind of like: storage devices -> raidz vdevs -> zpool -> zfs filesystems

Stop thinking of ZFS as just a filesystem, and start thinking of it as a storage management system, and you'll start to see that it's not all that different from what you wrote (filesystem on top of volume manager on top of raid on top of storage device).
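For the curious, that stack can be sketched with the standard zpool/zfs commands. A hedged sketch only: the pool name and the c1t*d0 device IDs below are made-up illustrative Solaris names, not anything from the thread.

```shell
# Hypothetical disk names (c1t0d0 etc. are illustrative Solaris device IDs).
# Create a pool whose single vdev is a raidz group of three disks:
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0

# Carve a filesystem out of the pool; it is mounted automatically:
zfs create tank/home

# Inspect the resulting stack: disks -> raidz vdev -> pool -> dataset.
zpool status tank
zfs list tank/home
```

Note there's no separate mkfs/format/mount step: the "layers" are all managed through the one zpool/zfs interface.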

There must be a shared font of Sun anti-freeze somewhere, because if Sun believes that it eliminates the need for hardware RAID controllers then they don't know what hardware RAID controllers are used for

Other than continuous background verify, faster rebuilds, hot-swappable hardware support, and lower CPU use, there's really not a lot that hardware RAID gives you that software RAID doesn't. And it's a lot easier to move a disk with a software RAID volume on it between systems if hardware dies. Even if moving between non-identical systems.
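The "move a disk between systems" point is exactly what ZFS's export/import is for. A minimal sketch, assuming a pool named tank (a hypothetical name):

```shell
# On the old system: cleanly detach the pool from the host.
zpool export tank

# Physically move the disks to the new machine, then:
zpool import        # scans attached disks and lists importable pools
zpool import tank   # imports the pool, even on non-identical hardware
```

No controller-specific metadata format is involved, which is why this works across dissimilar systems in a way hardware RAID generally can't.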

which is worrying if they're writing a new storage system.

So which is it, filesystem or storage system? You just contradicted yourself.

As one of the ZFS engineers explains it, regarding the "rampant layering violation":

From my understanding, it's a misconception and poor naming that led to the "layering violation" complaints. If Sun had called it ZSMS (Zettabyte Storage Management System) then no one would bat an eye. But, since they called it ZFS, everyone has got their panties in a knot.

If you look at ZFS, it's more than just a filesystem. It's a storage management system that lets you add/remove storage devices from a storage pool, create RAID volumes using space in that pool, create storage volumes using space in that pool, and format those volumes using a 128-bit filesystem with a bunch of nifty features. But you can also put UFS on top of the volumes (and probably others), and export the volumes out using network storage protocols (NFS, and I think CIFS).
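Each of those capabilities maps to one or two commands. A sketch under assumptions: the pool name "tank", the volume size, and the dataset names are all hypothetical.

```shell
# Create a 10 GB block volume (a "zvol") out of the pool's space:
zfs create -V 10g tank/vol1

# Put UFS on top of it, exactly as with any other block device:
newfs /dev/zvol/rdsk/tank/vol1

# Or take a native ZFS dataset and export it over NFS with one property:
zfs create tank/export
zfs set sharenfs=on tank/export
```

The sharenfs property replaces hand-editing the NFS export tables; the setting travels with the dataset if the pool is moved.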

ZSMS gives you a nice, unified way to do the same kinds of things as the cluster-* that is MD, device-mapper, LVM, mkfs.*, *mnt, mount.*, FUSE, and who knows what other crap you'd have to use to get the same featureset. What really irks me about the Linux tools is how non-standardised they are (why are some tools mount.fs while others are fsmnt?), how out of whack with each other they are, and how obviously un-designed to work together they are.

Now, if you want a nicely layered approach to storage, have a look at FreeBSD's GEOM: separate RAID modules (0, 1, 3, 5), encryption, journalling, remote block device, volume management, and more, that can all be neatly stacked in any order, using standard FS tools. All with normalised command-line options, all with normalised command names, all with normalised module names, all designed to work together.
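To make the stacking concrete, here is a hedged sketch of two GEOM layers composed: a mirror with encryption on top. The disk names ada1/ada2 and the label gm0 are made up for illustration; geli init will prompt for a passphrase.

```shell
# Mirror two disks with the gmirror class:
gmirror label -v gm0 /dev/ada1 /dev/ada2

# Stack the geli encryption class on top of the mirror provider:
geli init /dev/mirror/gm0
geli attach /dev/mirror/gm0

# Each layer just exposes another provider, so the top one
# takes a filesystem like any plain disk:
newfs /dev/mirror/gm0.eli
```

The order could equally be reversed (encrypt each disk, then mirror the .eli providers), which is the "stacked in any order" point above.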

Kind of like ZFS (all that's missing is a pooled storage module, 128-bitness, and checksumming).