Data Deduplication with Linux

Shifting focus back to lessfs preparation, note
that the options for a lessfs volume can be defined by the user at
mount time. For instance, you can set the big_write,
max_read and max_write options. The big_write option improves throughput
when the volume is used for backup purposes, and both max_read and max_write
must be defined in order to use it. The max_read and max_write values
always must equal one another, and they define the block size for lessfs
to use: 4, 8, 16, 32, 64 or 128KB.

The definition of a block size can be used to tune the
filesystem. For example, a larger block size, such as 128KB (131072),
offers faster performance but, unfortunately, at the cost of
less deduplication (remember from earlier that lessfs uses block-level
deduplication). All other options are FUSE-generic options defined in
the FUSE documentation. An example of the use of supported mount options
can be found in the lessfs man page:

$ man 1 lessfs
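To see why a larger block size tends to reduce deduplication, consider a toy Python sketch (this is an illustration of the general principle, not lessfs's actual implementation): split the data into fixed-size blocks, hash each block, and count how many are unique. The fewer unique blocks, the more space deduplication saves.

```python
import hashlib

def dedup_ratio(data: bytes, block_size: int) -> float:
    # Split the data into fixed-size blocks and hash each one;
    # identical blocks would be stored only once by a
    # block-level deduplicating filesystem.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(unique) / len(blocks)  # lower ratio = better dedup

# Highly repetitive sample data: 64 repeats of an 8KB pattern.
data = (b"A" * 4096 + b"B" * 4096) * 64

print(dedup_ratio(data, 4096))    # 0.015625 - small blocks find many duplicates
print(dedup_ratio(data, 131072))  # 0.25     - large blocks find fewer
```

With 4KB blocks, only two unique blocks exist among 128; with 128KB blocks, duplicate matches are coarser and the savings shrink, which mirrors the performance-versus-deduplication trade-off described above.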

The following example mounts lessfs with a 128KB
block size:
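A command along these lines should do it (the exact option spellings vary between lessfs releases, so verify them against your man page; note that FUSE itself spells the option big_writes):

```
$ lessfs /etc/lessfs.cfg /mnt/lessfs \
    -o big_writes,max_read=131072,max_write=131072
```

Here /etc/lessfs.cfg is the configuration file copied earlier, /mnt/lessfs is the mountpoint, and 131072 bytes equals the 128KB block size.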

Additional configurable options for the database exist in your
lessfs.cfg file (the same file you copied over to the /etc directory
path earlier). The block size can be defined here as well as even the method
of additional data compression to use on the deduplicated data and
more. Below is an excerpt of what the configuration file contains. To
define a new value for an option, simply uncomment the desired line and
comment out the alternatives:
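An excerpt along these lines would be typical (the exact key names may differ between lessfs versions, so check the lessfs.cfg shipped with your release):

```
#COMPRESSION=none
COMPRESSION=qlz
#COMPRESSION=lzo
#BLKSIZE=4096
BLKSIZE=131072
COMMIT_INTERVAL=30
```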

This excerpt defines the default block size to 128KB and the default
compression method to QuickLZ. If the defaults are not to your liking,
in this file you also can define the commit to disk intervals (default is
30 seconds) or a new path for your databases, but make sure to initialize
the databases before use; otherwise, you'll get an error when you try
to mount the lessfs filesystem.

Summary

Now, Linux is not limited to a single data deduplication solution. There
also is SDFS, a file-level deduplication filesystem that also runs
on the FUSE module. SDFS is a freely available cross-platform solution
(Linux and Windows) made available by the Opendedup Project. On
its official Web site, the project highlights the filesystem's
scalability (it can dedup a petabyte or more of data); speed, performing
deduplication/reduplication at a line speed of 290MB/s and higher; support
for VMware while also mentioning its usage in Xen and KVM; flexibility
in storage, as deduplicated data can be stored locally, on the network
across multiple nodes (NFS/CIFS and iSCSI), or in the cloud; inline
and batch mode deduplication (a method of post-process deduplication);
and file and folder snapshot support. The project seems to be pushing
itself as an enterprise-class solution, and with features like these,
Opendedup means business.

It is also not surprising that since 2008, data deduplication has
been a requested feature for Btrfs, the next-generation Linux filesystem,
perhaps in response to Sun Microsystems'
(now Oracle's) development of data deduplication for its advanced ZFS
filesystem. Unfortunately, at this point in time, it is unknown if
and when Btrfs will introduce data deduplication support, although it
already contains support for various types of
data compression (such as zlib and LZO).

Currently, the lessfs2 release is under development, and it is supposed
to introduce snapshot support, fast inode cloning, new databases
(including hamsterdb and possibly BerkeleyDB) apart from tokyocabinet,
self-healing RAID (to repair corrupted chunks) and more.

As you can see, with a little time and effort, it is relatively simple
to take advantage of the recent trend of data deduplication and reduce
the total capacity consumed on a storage volume by removing redundant
copies of data. I recommend its usage not only in server administration
but also for personal use, primarily because with implementations such
as lessfs, even if there isn't much redundant data, the additional
data compression helps reduce the total size of a file when it is
eventually written to disk. It is also worth mentioning that a
lessfs-enabled volume does not need to remain local to the host system;
it can be exported across a network via NFS or even iSCSI and used
by other devices within that same network, providing a more flexible
solution.

Petros Koutoupis is a full-time Linux kernel, device-driver and
application developer for embedded and server platforms. He has been
working in the data storage industry for more than six years and enjoys discussing the same technologies.

Comments

Nice article. I work in an academic lab where we crunch massive amounts of data, and storage is always a huge headache for us. In the past we've had access to HSM storage management solutions, but the slowest tier has always been tape. It turns out that getting your data back from tape takes longer in some cases than just recomputing it, which already takes weeks on HPCs. It seems to me that if you could create an HSM-type solution with a fast parallel file system, like Lustre, as the fastest storage tier and a compressed, deduplicated file system on slower, cheaper magnetic disks, you might have a more reasonable, cost-effective storage system for HPC. (I have not run any numbers though, and I'm not sure whether you could build a system like this with OTS software/hardware.)

If you want to take advantage of de-duplication in your basement or development lab for your virtual machines you could consider using SmartOS as the underlying hypervisor platform. It comes with KVM as the hypervisor and ZFS as the filesystem. To enable de-dupe in ZFS it is simply: "zfs set dedup=on pool/filesystem", plus all the other awesome features of ZFS. Instant snapshots, clones, compression, etc. Then you can run your favorite GNU/Linux platform on top of it with de-duplication happening under the hypervisor. This ZFS de-duplication is all open-source and hails from the Illumos kernel.
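The ZFS workflow the commenter describes boils down to a few commands; a rough sketch (the pool name and device names here are placeholders for your own) might be:

```
# Create a mirrored pool, then enable dedup on a filesystem within it
zpool create tank mirror disk0 disk1
zfs create tank/vmimages
zfs set dedup=on tank/vmimages

# Inspect the dedup setting and the pool's dedup ratio
zfs get dedup tank/vmimages
zpool list tank
```

Keep in mind that ZFS deduplication keeps its dedup table in RAM, so memory requirements grow with the amount of unique data in the pool.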