ZFS Disk Space Accounting

ZFS is based on the concept of pooled storage. Unlike typical file
systems, which are mapped to physical storage, all ZFS file systems in a
pool share the available storage in the pool. So, the available disk space
reported by utilities such as df might change even when the file system is
inactive, as other file systems in the pool consume or release disk
space.

The maximum size of a file system can be limited by using quotas. For
information about quotas, see Setting Quotas on ZFS File Systems. A specified
amount of disk space can be guaranteed to a file system by using reservations.
For information about reservations, see Setting Reservations on ZFS File Systems.
This model is very similar to the NFS model, where multiple directories are
mounted from the same file system (consider /home).

All metadata in ZFS is allocated dynamically. Most other file systems preallocate much
of their metadata. As a result, at file system creation time, an immediate
space cost for this metadata is required. This behavior also means that the
total number of files supported by the file system is predetermined. Because ZFS
allocates its metadata as it needs it, no initial space cost is required,
and the number of files is limited only by the available disk
space. The output from the df -g command must be interpreted differently for ZFS
than for other file systems. The total number of files reported is only an
estimate based on the amount of storage that is available in the pool.
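
As a rough illustration of the counters that df reports, the following Python
sketch reads them through os.statvfs. The helper name space_summary and the
dictionary keys are hypothetical and chosen only for this example. On a ZFS
file system, the file counts returned here should be read as the estimate
described above, not as a fixed limit.

```python
import os


def space_summary(path):
    """Return block and file counters for the file system holding path.

    Hypothetical helper for illustration; it only repackages os.statvfs().
    """
    st = os.statvfs(path)
    return {
        # Block counts are expressed in units of f_frsize bytes.
        "total_bytes": st.f_blocks * st.f_frsize,
        "available_bytes": st.f_bavail * st.f_frsize,
        # On ZFS these file counts are estimates derived from pool space.
        "total_files": st.f_files,
        "free_files": st.f_ffree,
    }


if __name__ == "__main__":
    for key, value in space_summary(".").items():
        print(f"{key}: {value}")
```

Calling the helper twice on an idle ZFS file system can return different
available_bytes values if other file systems in the same pool are active.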

ZFS is a transactional file system. Most file system modifications are bundled into
transaction groups and committed to disk asynchronously. Until these modifications are committed to
disk, they are called pending changes. The amount of disk space used, available,
and referenced by a file or file system does not take pending changes into
account. Pending changes are generally accounted for within a few seconds. Even
committing a change to disk by using fsync(3C) or O_SYNC does not necessarily
guarantee that the disk space usage information is updated immediately.

On a UFS file system, the du command reports the size of the
data blocks within the file. On a ZFS file system, du reports
the actual size of the file as stored on disk. This size
includes metadata and reflects any compression. This reporting helps answer the
question, "How much more space will I get if I remove this file?"
So, even when compression is off, you will still see different results between
ZFS and UFS.
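
The difference between a file's logical length and its on-disk footprint can be
inspected from a program as well. The following Python sketch is a minimal
example that assumes a POSIX platform, where st_blocks is counted in 512-byte
units. The function name apparent_and_allocated is hypothetical.

```python
import os


def apparent_and_allocated(path):
    """Return (logical size, allocated size) of a file, both in bytes.

    Hypothetical helper for illustration. The allocated size is derived
    from st_blocks, which POSIX defines in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        logical, allocated = apparent_and_allocated(name)
        print(f"{name}: logical={logical} allocated={allocated}")
```

Because pending changes are not reflected immediately, the allocated value for
a freshly written file might lag behind for a few seconds.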

When you compare the space consumption that is reported by the df
command with the zfs list command, consider that df reports the pool size and
not just file system sizes. In addition, df does not understand descendant file
systems or whether snapshots exist. If any ZFS properties, such as compression and
quotas, are set on file systems, reconciling the space consumption that is reported
by df might be difficult.

Consider the following scenarios that might also impact reported space consumption:

For files that are larger than recordsize, the last block of the file is, on average, about half full. With the default recordsize set to 128 KB, approximately 64 KB is wasted per file, which might have a large impact. The integration of RFE 6812608 would resolve this scenario. You can work around this by enabling compression. Even if your data is already compressed, the unused portion of the last block is zero-filled and compresses very well.

On a RAIDZ-2 pool, every block consumes at least 2 sectors (512-byte chunks) of parity information. The space consumed by the parity information is not reported. Because this overhead varies and is a much larger percentage for small blocks, it can affect space reporting. The impact is most extreme with recordsize set to 512 bytes, where each 512-byte logical block consumes 1.5 KB (one data sector plus two parity sectors, or 3 times the space). Regardless of the data being stored, if space efficiency is your primary concern, you should leave the recordsize at the default (128 KB) and enable compression (to the default of lzjb).
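
The arithmetic behind the two scenarios above can be sketched in Python. This
is a simplified model and not a description of the actual ZFS allocator: the
function names are hypothetical, the sector size is assumed to be 512 bytes,
and the RAIDZ-2 figure uses only the minimum of two parity sectors per block
stated above, so it is a lower bound.

```python
KIB = 1024


def tail_waste(file_size, recordsize=128 * KIB):
    """Unused bytes in the last block of a file larger than recordsize.

    Hypothetical helper. Only the case described in the text is modeled,
    so smaller files are rejected instead of guessed at.
    """
    if file_size <= recordsize:
        raise ValueError("only files larger than recordsize are modeled")
    return (-file_size) % recordsize


def raidz2_min_footprint(block_size, sector=512, parity_sectors=2):
    """Lower bound on bytes consumed by one block on a RAIDZ-2 pool.

    Assumes the minimum of two parity sectors per block; real layouts
    can consume more.
    """
    data_sectors = -(-block_size // sector)  # ceiling division
    return (data_sectors + parity_sectors) * sector


if __name__ == "__main__":
    print(tail_waste(200 * KIB))            # 57344 bytes (56 KB) unused
    print(raidz2_min_footprint(512))        # 1536 bytes, 3 times the data
    print(raidz2_min_footprint(128 * KIB))  # 132096 bytes, under 1% overhead
```

Under these assumptions, the parity overhead drops from 200 percent for
512-byte blocks to less than 1 percent for 128-KB blocks, which is why the
default recordsize is the more space-efficient choice.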

The presence of snapshots can cause some unexpected behavior when you attempt to
free disk space. Typically, given appropriate permissions, you can remove a file from
a full file system, and this action results in more disk space becoming
available in the file system. However, if the file to be removed exists
in a snapshot of the file system, then no disk space is gained
from the file deletion. The blocks used by the file continue to
be referenced from the snapshot.

As a result, the file deletion can consume more disk space because
a new version of the directory needs to be created to reflect the
new state of the namespace. This behavior means that you can receive an
unexpected ENOSPC or EDQUOT error when attempting to remove a file.
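
A program that deletes files to reclaim space might therefore need to treat
these two errors as a distinct outcome. The following Python sketch shows one
possible way to do so. The function name try_remove and its return values are
hypothetical.

```python
import errno
import os


def try_remove(path):
    """Remove path and report whether the pool or quota was exhausted.

    Hypothetical helper for illustration. Returns "removed" on success and
    "no-space" if the removal failed with ENOSPC or EDQUOT. Any other
    error is re-raised unchanged.
    """
    try:
        os.remove(path)
        return "removed"
    except OSError as exc:
        if exc.errno in (errno.ENOSPC, errno.EDQUOT):
            return "no-space"
        raise
```

A "no-space" result signals that the deletion itself needed disk space that
was not available, not that the file was protected or missing.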