ZFS: How its design seems to be more trouble than it’s worth.

Now, let me say this first: ZFS seems like a wonderful thing. In fact, it is wonderful except for a couple of things, which make it totally undeployable for our new server. Actually, let’s put this another way. One thing makes it impossible, because the ZFS way of doing things is mutually exclusive with the way our system (and probably a huge number of other legacy systems) works.

The main bugbear is what the ZFS development team laughably call quotas. They aren’t quotas; they are merely filesystem size constraints. To get around this the developers use the “let them eat cake” mantra — “creating filesystems is easy” — so you create a new filesystem for each user, with a “quota” on it. This is the ZFS way.
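For concreteness, the per-user-dataset scheme amounts to something like the following (the pool name “tank” and the user “alice” are made-up examples; these are the standard zfs(1M) commands):

```shell
# One dataset per user -- the "ZFS way"
zfs create tank/home/alice

# The "quota" is really just a cap on the dataset's size;
# there is no soft limit or grace period as with UFS quotas
zfs set quota=10G tank/home/alice

# Verify the property
zfs get quota tank/home/alice
```

Note that the limit applies to the dataset as a whole, not to any particular user’s files within it — which is exactly why it isn’t a quota in the traditional sense.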

Unfortunately, this causes a number of problems (beyond the fact that there’s no soft quota). Firstly, instead of having only a few filesystems mounted you have “system mounts + number of users” mounted filesystems, which makes df a pain to use. Secondly, there’s no way of having a shared directory structure with individual users having separate file quotas within it. But finally, and this is the critical problem, each user’s home directory is now a separate NFS share.

At first look that final point doesn’t seem to be much of a worry, until you consider the implications it brings. To cope with a distributed system with a large number of users, the only manageable way of handling NFS mounts is via an automounter; the only alternative would be an fstab/vfstab file holding every filesystem any user might want. In the past this has been no problem at all: for all the user home directories on a server you could just export the parent directory holding them, put in a map line such as “users -rw,intr myserver:/disks/users”, and it would work happily.
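A minimal sketch of that old-style setup, assuming an indirect map called auto_disks and the server/path from the line quoted above (the map name and mount point are illustrative):

```
# /etc/auto_master -- hand /disks over to an indirect map
/disks   auto_disks

# auto_disks map -- one entry covers every user, because all the
# home directories live inside a single exported filesystem
users    -rw,intr    myserver:/disks/users
```

One map entry, one NFS export, and every user’s home directory appears under /disks/users automatically. It is precisely this one-entry-covers-everyone property that the one-filesystem-per-user model destroys.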

Now, with each user having a separate filesystem, this breaks. The automounter will mount the parent filesystem as before, but all you will see are the stub directories ready for the ZFS daughter filesystems to mount onto. There’s no way of consolidating the ZFS filesystem tree into one NFS share, nor any rule in automount map files that allows sub-directory mounting.

Of course, the ZFS developers would argue that you should change the layout of your automounted filesystems to fit with the new scheme. This would mean that users’ home directories would appear directly below /home, say.

The problem here is one of legacy code, which you’ll find throughout the academic, and probably commercial, world. Basically, there’s a lot of user-generated code with hard-coded paths, so any new system has to replicate what has gone before. (The current system here has automount map entries which map new disks to the names of old disks on machines long gone, e.g. /home/eeyore_data/.)

The ZFS developers don’t seem to see real-world problems, or maybe they don’t WANT to see them, as it would make their lives more complicated. It’s far easier to be arrogant and use the “let them eat cake” approach than to engineer a real solution to the problem, such as actually programming a true quota system.

As it is, it seems that for our new fileserver I’m going to have to back off from ZFS and use the old software device concatenation with UFS on top, which is a right pain and not very resilient.

This is not an uncommonly expressed lament. It’s on the ZFS team’s radar (see 6501037), but there are enough issues of higher priority that it may be some time before it’s available. If enough customers make enough noise, however…

The problem is that the customers will probably never complain directly. They’ll merely go back to Offline: DiskSuite’s descendant and give up on ZFS for a generation.

I’m guessing that the higher priority issues are probably to do with performance (as in speed) and making the filesystem bootable.

As for the issues I raise in my piece, the fact that there’s no real workaround to a fundamental operational problem means that, in my opinion, ZFS is not actually ready for real-world use other than in a very restricted manner, such as a mailstore or some other use where a monolithic, high-reliability, growing filesystem is required.

It also doesn’t help when some of the evangelists for the filesystem use language which makes the customers feel as though they couldn’t care less about their problem. That’s where the “let them eat cake” reference came in.

P.S. The main reason customers will neglect to give feedback is that there seems to be no channel for doing so other than raising a software support call with Sun Support. However, seeing as this sort of problem is a “feature” rather than a bug, I don’t see that it would get through the filters.

On a related front, the seeming lack of understanding of real-world operational realities is probably due to the developers being isolated in a bubble, not realising how people use systems and what problems they have to deal with. I’ve seen this happen so often with talented coders: they’re so dazzled by the beauty and devious cleverness of the thing they’ve produced that they can’t see why anyone would want to do things any other way than the way they’ve envisioned.

Unfortunately, reality sucks. It’s not beautiful or perfect, and we at the coal face have to deal with it. Maybe the dev team members should be seconded to sysadmin work for a while. 🙂

I think you are just not in a good mood, that’s all.
ZFS developers are on zfs-discuss@opensolaris.org and they get a lot of feedback from users… feedback that is often converted to bug IDs on opensolaris.org.
There are a lot of ways to send feedback (for example on blogs.sun.com),
and the issues you’ve faced have got their attention, as well as that of other users (like me).
selim