On Dec. 23 I shared the following with Dru in #PCBSD. It was written by
my co-worker on Oct. 10. She requested that he post it here. Since I
follow this list and he doesn't, I am going to post it here for him.
Thanks.
Lars
FreeBSD needs work
last modified on 2010-10-10 at 19:33 - keywords: freebsd zfs iscsi whining
Warning: This is not a post whining about Adobe Flash or how hard it is
to get my playstation controller to work in Quake.
This is, instead, a post whining about how hard it is to use ZFS and
iSCSI on FreeBSD. Recently I was tasked with building some
storage-related projects. The tools of choice involved FreeBSD sitting
on some hardware with a large collection of high-volume drives. These
drives were presented JBOD-style to the host OS, and a couple of them
were SSDs. The idea was to make one massive raidz2 zpool with a mirrored
ZIL on the SSDs. Then we could spin off zvols and pitch them via iSCSI
to our application and database servers.
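For anyone who wants to reproduce the layout, here is a minimal sketch of the commands involved. It only builds the zpool and zfs command lines and prints them; it does not run anything. The pool name "tank", the device names da0..da5 and ada0/ada1, and the zvol name and size are all made up for illustration and are not the ones from this project.

```python
def zpool_create_cmd(pool, data_disks, log_disks):
    """Build the argv for one raidz2 vdev plus a mirrored log (ZIL) device."""
    if len(log_disks) < 2:
        raise ValueError("a mirrored log needs at least two devices")
    return ["zpool", "create", pool, "raidz2", *data_disks,
            "log", "mirror", *log_disks]


def zvol_create_cmd(pool, name, size):
    """Build the argv for a zvol of the given size, e.g. '500G'."""
    return ["zfs", "create", "-V", size, f"{pool}/{name}"]


if __name__ == "__main__":
    # Hypothetical names: pool "tank", data disks da0..da5, SSDs ada0 and ada1.
    disks = [f"da{i}" for i in range(6)]
    print(" ".join(zpool_create_cmd("tank", disks, ["ada0", "ada1"])))
    print(" ".join(zvol_create_cmd("tank", "db01", "500G")))
```

Each zvol created this way would then be exported as an iSCSI LUN by whatever target software is in use; that part is left out here because it depends on the target.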
Sound simple? It isn't. To get it working, we had to switch to
OpenSolaris. Which is a bit of a damn problem.
There were two really bad problems in FreeBSD that prevented us from
deploying our OS of choice to these systems.
Problem the first:
ZFS on FreeBSD (at zpool version 13 as of this project) is only like 80%
there. What I mean by that is ZFS is friggin fantastic for your massive
desktop nerdmachine, or your creepy 8TB basement "media archive", or any
other low-traffic consumer-grade project you might want to undertake.
But when you start putting some load on it...
The biggest problem I had seemed to be caused by multiple read
operations during snapshot creation. If you take a snapshot while lots
of iSCSI (or NFS, or local) access is going on, the zfs process gets
stuck in a wait state. What it boils down to is that you have to be
super careful to disable access while you're creating (or, in bad cases,
sending) snapshots off -- which completely wrecks the point of the damn
snapshot functionality.
Oh, and it would crash BSD too. Sometimes it would just eat all the RAM
and the machine would fall down.
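A hang like this is easier to report if it can be triggered on demand. The following is a rough sketch, not the procedure used on this project: it starts a command, waits up to a deadline, and reports whether the command finished. The helper names and the dataset name are hypothetical. A process that misses the deadline is deliberately left alone so it can be inspected afterwards.

```python
import subprocess
import time


def snapshot_cmd(dataset, label):
    """Build the argv for 'zfs snapshot dataset@label'."""
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def timed_run(argv, timeout_s):
    """Start argv and wait up to timeout_s seconds.

    Returns (proc, finished, elapsed_seconds). If finished is False the
    process is still running; it is not killed, since a stuck process is
    the evidence we want to look at.
    """
    start = time.monotonic()
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout_s)
        finished = True
    except subprocess.TimeoutExpired:
        finished = False
    return proc, finished, time.monotonic() - start
```

With read load running against the pool, something like timed_run(snapshot_cmd("tank/db01", "hangtest"), 30) would show whether the snapshot completes within 30 seconds. If it does not, the PID in proc.pid is the one to collect a kernel stack for (on FreeBSD, procstat -kk should do it) before filing the bug.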
Problem the second:
iSCSI support in FreeBSD is abominable. I'm given to understand this is
because the main iSCSI dev in FreeBSD-land is possessed of insufficient
hardware to model high-performance workloads. If that's the case, we
need to get that man some damn hardware. Or convince Coraid to update
their AoE stuff, since AoE is nicer anyway.
If the iSCSI traffic crossed subnets, performance would tank. If the
iSCSI targets were accessed from a non-FreeBSD initiator, performance
would tank. If a FreeBSD initiator accessed a non-FreeBSD target,
performance would tank. Are you seeing a pattern here? Best-practices
objections aside, it's clear that the dev has a handful of machines on a
dumb switch, and that's the test platform. As soon as you instantiate
some sophistication, the whole thing falls down. Again.
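"Performance would tank" is more useful to a developer with a number attached. Below is a minimal sketch of a sequential read test that could be run against the same LUN from each initiator and network path and then compared. The function name and defaults are my own invention, and no results from this project are implied. Caching on either end will inflate the figure, so treat it as a rough comparison only.

```python
import time


def read_throughput_mib_s(path, block_size=1 << 20, max_bytes=1 << 30):
    """Read up to max_bytes sequentially from path.

    Returns (bytes_read, MiB_per_second). path can be a regular file or
    a disk device node backed by an iSCSI LUN.
    """
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(block_size, max_bytes - total))
            if not chunk:
                break  # end of file or device
            total += len(chunk)
    elapsed = time.monotonic() - start
    rate = (total / (1 << 20)) / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Running it once over the same subnet and once across a router, with the device path of the attached LUN as the argument, gives two numbers that can go straight into a bug report.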
How do we fix this?
I suppose I could devote some time to mastering the implementation
details of iSCSI and ZFS, then fix the stuff myself. That's not really
the best use of my time, however, and I'm not in a position to get paid
to do that sort of thing. But there are a few things anyone (myself
included) can do:
* test.
* report bugs.
* provide stack traces and failure scenarios.
* whine constantly.
* provide testing hardware.
All of these are helpful, especially the constant whining. The problem
needs to be front-and-center, or it'll get de-prioritized in favor of
other (and, in my opinion, less important) problems. I don't give a crap
that your video card doesn't push hard enough to run EVE in FreeBSD. I
want the ZFS functionality that now only FreeBSD actively provides, and
I want the iSCSI functionality ZFS was designed to enable.