GlusterFS – Configuration, Performance, and Redundancy

GlusterFS has been a growing distributed filesystem, and is now part of Red Hat Storage Server. GlusterFS, like Ceph (not covered in this article), is a software-based filesystem with no centralized metadata server. Because the hash algorithm is distributed to the clients, there is no metadata-lookup bottleneck for every file we attempt to read. This also allows for easy replication, even to nodes outside of the current infrastructure. And since GlusterFS runs in user space, there is no need to worry about kernel updates or changes the way there is with older distributed filesystems.

GlusterFS is open source, performs as fast as or faster than most enterprise solutions, is easy to work with, and offers a cloud-style storage approach that runs on commodity hardware. That makes it appealing to startups, non-profits, or anyone running a mostly cloud-based infrastructure. Another bonus is that files are stored on standard EXT3/4 filesystems, so there are no data migrations when moving to new versions of Gluster, and if Gluster is not working out for you, you can simply strip it away without converting the underlying filesystems.

Below is a quick tutorial on a small, two-node GlusterFS filesystem that provides redundancy for two load-balanced Apache servers. We will be running Ubuntu Server 14.04. The Gluster pair will act as a mirrored (RAID 1) NAS device.

Clients: websrv01, websrv02

GlusterFS: Gluster01, Gluster02

We are going to install the Gluster PPA (personal package archive) on all four of our Linux servers, which will allow us to pull down the GlusterFS packages.
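A sketch of that installation, assuming the glusterfs-3.5 PPA (substitute whichever release the Gluster project currently recommends). The storage nodes need the server package, while the web servers only need the client package:

    # On all four servers: add the Gluster PPA and refresh the package lists
    sudo apt-get install -y software-properties-common
    sudo add-apt-repository -y ppa:gluster/glusterfs-3.5
    sudo apt-get update

    # On Gluster01 and Gluster02: install the server package
    sudo apt-get install -y glusterfs-server

    # On websrv01 and websrv02: the FUSE client is all that is needed
    sudo apt-get install -y glusterfs-client

With the packages in place, the two storage nodes need to trust each other, which takes a single peer probe from Gluster01:

    # Add Gluster02 to the trusted storage pool (run once, from Gluster01)
    sudo gluster peer probe Gluster02

    # Confirm the peer shows up as connected
    sudo gluster peer status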

Now that both nodes are part of the trusted pool, we can begin creating our storage volume. (The export directories on each node are known as bricks.) As we are using GlusterFS to create a redundant volume for our web servers, we will use the replica option, not stripe. Here we specify the volume name, the number of servers we are mirroring across, the transport method (TCP), and the brick paths where the volume's data will be stored.
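A minimal sketch of the volume creation, run from Gluster01. The volume name gv0 and the brick path /gluster/brick are placeholders of my own, not values from this setup:

    # On both storage nodes: create the brick directory
    sudo mkdir -p /gluster/brick

    # From Gluster01: create a two-way replicated volume over TCP
    # (append 'force' if the brick directory sits on the root partition)
    sudo gluster volume create gv0 replica 2 transport tcp \
        Gluster01:/gluster/brick Gluster02:/gluster/brick

    # Start the volume and verify its layout
    sudo gluster volume start gv0
    sudo gluster volume info gv0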

Since the hash algorithm is distributed to the clients, if we lose Gluster02 our client will simply reach out to Gluster01 for the file, regardless of which server the volume was mounted from. I will add some test data on websrv01 and check that those files appear on websrv02.
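A sketch of the client side, reusing the placeholder volume name gv0 and assuming /var/www as the mount point for the web content:

    # On websrv01 and websrv02: mount the Gluster volume over FUSE
    sudo mkdir -p /var/www
    sudo mount -t glusterfs Gluster01:/gv0 /var/www

    # On websrv01: drop in some test files
    sudo touch /var/www/test{1..5}.html

    # On websrv02: the same files should be visible right away
    ls -l /var/www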

Already, in a matter of minutes, we have a redundant, fast, and flexible storage pool for our web servers. Now let us make sure no other machines can connect to this volume, as the current setup allows any server to mount it. We only need to do this from one storage node, as the setting replicates to the other. Note: make sure you have functional DNS, add the entries to /etc/hosts manually, or use IP addresses. I ended up adding the hosts to /etc/hosts after my DNS decided to stop resolving for no reason.
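A sketch of locking the volume down with auth.allow, run from either storage node. The option takes a comma-separated list of addresses; the 192.168.0.x values below are placeholders for the web servers' real IPs:

    # Only allow the two web servers to mount the volume
    sudo gluster volume set gv0 auth.allow 192.168.0.101,192.168.0.102

    # Confirm the option took effect
    sudo gluster volume info gv0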

Gluster has tried to keep things as simple as possible, and I would be hard pressed to find another solution that makes it this easy. Now, playing devil's advocate, let's say we need to account for added traffic, or we have added a new database to this filesystem, and we need to quickly add another node to the cluster to support the new load. We spin up a new VM, install the GlusterFS server package, and begin to scale the cluster.
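A sketch of that scale-out, using a hypothetical third node named Gluster03 and keeping the volume fully replicated by raising the replica count:

    # From an existing node: bring Gluster03 into the trusted pool
    sudo gluster peer probe Gluster03

    # Grow the volume to a three-way replica by adding the new brick
    sudo gluster volume add-brick gv0 replica 3 Gluster03:/gluster/brick

    # Kick off a full self-heal so the new brick catches up on existing data
    sudo gluster volume heal gv0 full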

With Gluster communicating over TCP, we can build a disaster recovery solution just as simply as we built this cluster. We can move private clouds to public clouds or create redundancy between AWS sites.
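One way to do this is Gluster's geo-replication, which asynchronously mirrors a volume to a remote cluster over SSH. A rough sketch, assuming a remote node named GlusterDR already holds an empty volume gv0-dr and passwordless root SSH has been set up:

    # Generate and distribute the keys geo-replication uses
    sudo gluster system:: execute gsec_create

    # Create and start the geo-replication session to the DR site
    sudo gluster volume geo-replication gv0 GlusterDR::gv0-dr create push-pem
    sudo gluster volume geo-replication gv0 GlusterDR::gv0-dr start

    # Watch the session state as the initial sync runs
    sudo gluster volume geo-replication gv0 GlusterDR::gv0-dr status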

Final thoughts: It's fun to see the next generation of filesystems in action. The ability to stand up this type of solution on commodity hardware is a big plus for startups or anyone looking to cut costs. We are already seeing GlusterFS at major companies like Pandora, which runs solely on a GlusterFS environment.

The last workload figures I saw for Pandora were 75 million users listening to 13 million audio files, scaling to petabytes, with network throughput of 50 GB/s. Other companies report running GlusterFS environments into the double-digit petabytes. Currently, these solutions appear to be mostly focused on cloud and business continuity applications. As it stands, GlusterFS is battling Ceph (Canonical/Inktank) in the hash-distributed filesystem department. Time will tell who comes out on top as the fight to deeply integrate with OpenStack continues. It also seems that these types of filesystems take a hit on small-file performance from network and I/O latency.

In a future blog article I will cover load balancing our Apache servers and wrap up a completely redundant small web application.