Assuming that free space advice for ZEVO will not differ from advice for other modern implementations of ZFS …

Question

Please, what percentages or amounts of free space are advisable for hard disk drives of the following sizes?

640 GB

2 TB

Thoughts

A standard answer for modern implementations of ZFS might be "no more than 96 percent full". However, if I apply that to (say) a single-disk 640 GB dataset where some of the most commonly used files (by VirtualBox) are larger than 15 GB each, then I guess that blocks for those files will become suboptimally spread across the platters with around 26 GB free.
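For concreteness, the free space implied by a given fullness threshold is simple arithmetic. This sketch tabulates it for the two drive sizes in the question (the threshold values are the rules of thumb I have seen quoted; decimal gigabytes assumed, pool and filesystem overhead ignored):

```python
# Free space remaining at various fullness thresholds, for the two
# drive sizes in the question. Assumes decimal units (1 TB = 1000 GB)
# and ignores pool/filesystem overhead.

SIZES_GB = {"640 GB": 640, "2 TB": 2000}
THRESHOLDS = (0.80, 0.85, 0.96)  # commonly quoted rules of thumb

def free_at(size_gb, full_fraction):
    """Gigabytes still free when the pool is full_fraction full."""
    return size_gb * (1.0 - full_fraction)

for label, size in SIZES_GB.items():
    for t in THRESHOLDS:
        print(f"{label} at {t:.0%} full: {free_at(size, t):.0f} GB free")
```

At 96 percent full, the 640 GB disk has roughly 26 GB free, which is the figure above.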

I read that in most cases, fragmentation and defragmentation should not be a concern with ZFS. Still, I like the mental picture of most fragments of a large .vdi being in reasonably close proximity to each other. (Do features of ZFS make that wish for proximity too old-fashioned?)

Side note: the question of how to optimise performance (for massive files in a dataset with relatively little free space) after a threshold has been crossed might arise. If it does, I'll keep it separate.

Background

In the past, on a 640 GB StoreJet Transcend (product ID 0x2329), I probably went beyond an advisable threshold. Currently the largest file is around 17 GB –

– and I doubt that any .vdi or other file on this disk will grow beyond 40 GB. (Ignore the purple masses, those are bundles of 8 MB band files.)

Since HFS Plus is not involved: the thresholds of twenty, ten and five percent that I associate with the Mobile Time Machine file system need not apply.

I currently use ZEVO Community Edition 1.1.1 with Mountain Lion, OS X 10.8.2, but I'd like answers to be not too version-specific.

References, chronological order

… So to solve this problem, what went in 2010/Q1 software release is
multifold. The most important thing is: we increased the threshold at
which we switched from 'first fit' (go fast) to 'best fit' (pack
tight) from 70% full to 96% full. With TB drives, each slab is at
least 5GB, and 4% of that is still 200MB: plenty of space and no need to do
anything radical before that. This gave us the biggest bang. Second,
instead of trying to reuse the same primary slabs until it failed an
allocation we decided to stop giving the primary slab this
preferential treatment as soon as the biggest allocation that could
be satisfied by a slab was down to 128K
(metaslab_df_alloc_threshold). At that point we were ready to switch
to another slab that had more free space. We also decided to reduce
the SMO bonus. Before, a slab that was 50% empty was preferred over
slabs that had never been used. In order to foster more write
aggregation, we reduced the threshold to 33% empty. This means that a
random write workload now spread to more slabs where each one will
have larger amount of free space leading to more write aggregation.
Finally we also saw that slab loading was contributing to lower
performance and implemented a slab prefetch mechanism to reduce down
time associated with that operation.

The conjunction of all these changes led to 50% improved OLTP and 70%
reduced variability from run to run …
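The 'first fit' versus 'best fit' switch described in that passage can be illustrated with a toy allocator. This is only a sketch of the policy idea, not ZFS's actual metaslab code; the slab sizes and the function name are mine:

```python
# Toy illustration of the 'first fit' vs 'best fit' switch described
# above. NOT the real ZFS metaslab allocator; just the policy idea.

SWITCH_THRESHOLD = 0.96  # pool fullness at which allocation packs tight

def pick_slab(free_per_slab, request, pool_fullness):
    """Return the index of the slab to allocate from, or None.

    free_per_slab -- free bytes in each slab
    request       -- bytes to allocate
    pool_fullness -- fraction of the pool already used
    """
    candidates = [i for i, free in enumerate(free_per_slab) if free >= request]
    if not candidates:
        return None
    if pool_fullness < SWITCH_THRESHOLD:
        # First fit: go fast, take the first slab with enough room.
        return candidates[0]
    # Best fit: pack tight, take the slab that leaves the least slack.
    return min(candidates, key=lambda i: free_per_slab[i] - request)

slabs = [300, 150, 900, 120]
print(pick_slab(slabs, 100, 0.50))  # first fit -> 0
print(pick_slab(slabs, 100, 0.97))  # best fit -> 3 (leaves only 20 slack)
```

Below the threshold the allocator optimises for speed; above it, it optimises for packing, which is why performance degrades as the pool approaches full.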

… As a rule of thumb, don't let your pool become more full than about
80% of its capacity. Once it reaches that point, you should start
adding more disks so ZFS has enough free blocks to choose from in
sequential write order.
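That rule of thumb reduces to a one-line check. A minimal sketch (the 0.80 cut-off is the figure quoted above; the helper and its name are hypothetical, not part of any ZFS tool):

```python
# Hypothetical helper applying the ~80% rule of thumb quoted above.

RULE_OF_THUMB = 0.80  # start adding disks once the pool is this full

def should_expand(used_bytes, capacity_bytes, limit=RULE_OF_THUMB):
    """True once the pool is fuller than the rule-of-thumb limit."""
    return used_bytes / capacity_bytes > limit

# e.g. a 2 TB pool with 1.7 TB used is over the line:
print(should_expand(1_700, 2_000))  # -> True
print(should_expand(1_500, 2_000))  # -> False
```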

Around eighty-five percent full (fifteen percent free)

… four percent free? … That seems … a little close to the edge. We try to aim for about eighty-five percent full before we start thinking about expanding capacity or doing something to relieve that pressure … we're pretty conservative …

Yeah, if you tried to do this on a system that was ninety-six percent full, you'd probably run out of space before you got done whatever you were doing … because the space would accumulate; and having that snapshot present would hold on to data that would otherwise be freed back to the pool from the normal activity …

… And performance would suck. Because ZFS works on a slab allocator … if you get really full, you start having to spend extra time finding places to fit different sizes of things, and it gets really slow.