Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it’s not without flaws. Ironically, one of Hadoop’s biggest shortcomings now is also one of its biggest strengths going forward: the Hadoop Distributed File System.

Within the Apache Software Foundation, HDFS is always improving in terms of performance and availability. Honestly, it’s probably fine for the majority of Hadoop workloads that are running in pilot projects, skunkworks projects or generally non-demanding environments. And technologies such as HBase that are built atop HDFS speak to its versatility as a storage system even for non-MapReduce applications.

But if the growing number of options for replacing HDFS signifies anything, it’s that HDFS isn’t quite where it needs to be. Some Hadoop users have strict demands around performance, availability and enterprise-grade features, while others aren’t keen on its direct-attached storage (DAS) architecture. Concerns around availability might be especially valid for anyone (read “almost everyone”) who’s using an older version of Hadoop without the High Availability NameNode. Here are eight products and projects whose backers argue they can deliver what HDFS can’t:

Cassandra (DataStax)

Not a file system at all but an open source, NoSQL key-value store, Cassandra has become a viable alternative to HDFS for web applications that rely on fast data access. DataStax, a startup commercializing the Cassandra database, has fused Hadoop atop Cassandra to provide web applications fast access to data processed by Hadoop, and Hadoop fast access to data streaming into Cassandra from web users.

Ceph

Ceph is an open source, multi-pronged storage system that was recently commercialized by a startup called Inktank. Among its features is a high-performance parallel file system that some think makes it a candidate for replacing HDFS (and then some) in Hadoop environments. Indeed, some researchers started looking at this possibility as far back as 2010.

Dispersed Storage Network (Cleversafe)

Cleversafe got into the HDFS-replacement business on Monday, announcing a product that will fuse Hadoop MapReduce with the company’s Dispersed Storage Network system. By fully distributing metadata across the cluster (instead of relying on a single NameNode) and not relying on replication, Cleversafe says it’s much faster, more reliable and scalable than HDFS.

GPFS (IBM)

IBM has been selling its General Parallel File System to high-performance computing customers for years (including within some of the world’s fastest supercomputers), and in 2010 it tuned GPFS for Hadoop. IBM claims the GPFS-SNC (Shared Nothing Cluster) edition is much faster than HDFS, in part because it runs at the kernel level as opposed to atop the OS like HDFS.

Isilon (EMC)

EMC has offered its own Hadoop distributions for more than a year, but in January 2012 it unveiled a new method for making HDFS enterprise-class — replace it with EMC Isilon’s OneFS file system. Technically, as EMC’s Chuck Hollis explained at the time, because Isilon can read NFS, CIFS and HDFS protocols, a single Isilon NAS system can serve to intake, process and analyze data.

Lustre

Lustre is an open source high-performance file system that some claim can make for an HDFS alternative where performance is a major concern. Truth be told, I haven’t heard of this combination running anywhere in the wild, but HPC storage provider Xyratex wrote a paper on the combination in 2011, claiming a Lustre-based cluster (even with InfiniBand) will be faster and cheaper than an HDFS-based cluster.

MapR File System

The MapR File System is probably the best-known HDFS alternative, as it’s the basis of MapR’s increasingly popular — and well-funded — Hadoop distribution. Not only does MapR claim its file system is two to five times faster than HDFS on average (although, really, up to 20 times faster), but it has features such as mirroring, snapshots and high availability that enterprise customers love.

NetApp Open Solution for Hadoop

OK, the NetApp Open Solution for Hadoop isn’t so much an HDFS replacement as it is an HDFS improvement, according to NetApp and early partner Cloudera. The offering still relies on HDFS, but it reenvisions the physical Hadoop architecture by putting HDFS on a RAID array. This, NetApp claims, means faster, more reliable and more secure Hadoop jobs.

This might be a good place to say rest in peace to two other HDFS alternatives that are effectively no longer with us — KosmosFS (aka CloudStore) and Appistry CloudIQ Storage. The former was created by Kosmix (since bought by @WalmartLabs) and released to the open source world in 2007, but no longer has an active community. The latter was an attempt by Appistry in 2010 to get a piece of the Hadoop pie with its computational storage technology, but the company has since switched its focus from selling the technology to providing high-performance computing services based on it.

Internet Memory supplies a service to browse archived Web pages, including multimedia content. We use Hadoop, HDFS and HBase for storing and indexing our data, and associate this storage with a Web server that lets users navigate through the archive and retrieve documents. In the present post, we focus on videos and detail the solution adopted to serve true streaming from HDFS storage.

Basics

Many video formats are found on the Web, including Windows Media (.wmv), RealMedia (.rm), QuickTime (.mov), MPEG, Adobe Flash (.flv), etc. In order to display a video, we need a player, which can be incorporated in the Web browser. The player depends on the specific video format, but most browsers are able to detect the format and choose the appropriate player. Firefox, for instance, comes with a lot of plugins, which can be quickly integrated in the presence of a specific video to display its content.

There are basically two ways to play a video. The simplest one is a two-step process: first the whole file is downloaded from the Web server to the user’s computer, and then it is displayed by the player running on the local copy. It has the disadvantage that the download step may take a while if the file is big (hundreds of megabytes are not uncommon). The second one uses (true) streaming: the video file is split into fragments which are sent from the Web server to the player, giving the illusion of a continuous stream. From the user’s point of view, it looks as if a window is swept over the video content, which avoids a full initial download of the whole file.
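To make the idea of "splitting a file into fragments" concrete, here is a minimal Python sketch. It is only an illustration: the function name iter_fragments and the 64 KB fragment size are assumptions of ours, not part of any player or server mentioned in this post.

```python
import io
from typing import BinaryIO, Iterator


def iter_fragments(stream: BinaryIO, fragment_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive fragments of at most fragment_size bytes.

    The fragment size is an arbitrary illustrative value; a real
    streaming server picks it according to format and bandwidth.
    """
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    while True:
        fragment = stream.read(fragment_size)
        if not fragment:
            # An empty read means the end of the file was reached.
            break
        yield fragment


if __name__ == "__main__":
    # Stand-in for a video file: 2560 bytes of dummy data.
    video = io.BytesIO(bytes(range(256)) * 10)
    for number, fragment in enumerate(iter_fragments(video, 1000)):
        print(number, len(fragment))
```

The server never needs the whole file in memory: each fragment can be sent to the player as soon as it has been read.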

Obviously, streaming is a more involved method because it requires a strong coordination between the components involved in the process, namely the player, the Web server, and the file system from which the video is retrieved. We examine this technical issue in the context of a Hadoop system where files are stored in HDFS, a file system dedicated to large distributed storage.

File seeking with HDFS

As explained above, streaming requires a strong coordination between the Web server and the file system. The former produces requests to access chunks of the video file (think of what happens when the user suddenly jumps to a specific part of the video), whereas the latter must be able to seek in the file to position the cursor at a specific location. When using HDFS, enabling such a close cooperation turns out to be a problem because HDFS can in principle only be accessed through a Hadoop client, which the standard Apache server is not. We investigated two possible solutions: Hoop, the Hadoop web server, and Apache/FUSE.
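The seek requirement can be sketched as follows. We assume here that the player expresses a jump as a single-range HTTP Range header (for instance "bytes=2-5"), which is a common convention but not something every player does. The helper names parse_byte_range and read_range are hypothetical.

```python
import io
import re
from typing import BinaryIO, Tuple

# Matches single-range headers such as "bytes=100-199", "bytes=100-" or "bytes=-50".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header: str, file_size: int) -> Tuple[int, int]:
    """Return inclusive (start, end) offsets for a single-range header."""
    match = _RANGE_RE.match(header.strip())
    if not match:
        raise ValueError("unsupported Range header: %r" % header)
    first, last = match.groups()
    if first == "" and last == "":
        raise ValueError("empty range")
    if first == "":
        # Suffix form: the last N bytes of the file.
        length = int(last)
        start = max(file_size - length, 0)
        end = file_size - 1
    else:
        start = int(first)
        end = int(last) if last else file_size - 1
        end = min(end, file_size - 1)
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


def read_range(stream: BinaryIO, start: int, end: int) -> bytes:
    """Position the cursor at start and read up to and including end."""
    stream.seek(start)
    return stream.read(end - start + 1)


if __name__ == "__main__":
    content = io.BytesIO(b"0123456789")
    start, end = parse_byte_range("bytes=2-5", 10)
    print(read_range(content, start, end))
```

The important call is stream.seek: whatever sits between the Web server and HDFS has to support it, otherwise a jump to an unbuffered position cannot be served.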

Hoop (see http://cloudera.github.com/hoop/) is an HTTP-HDFS connector. It allows the HDFS file system to be accessed via HTTP. We developed a working local prototype using JW Player and a large video file. Streaming works, but seeking to an unbuffered part stops the playback. It seems that the Hoop API does not support seeking in a file, so we had to give up this approach.

The second solution is based on HDFS/FUSE. FUSE (File System in User Space) is an API that captures the file system operations and makes it possible to implement them with ad-hoc functions running in the user’s process space (thereby avoiding the need to change the operating system kernel, a tricky and dangerous option). FUSE is provided in Hadoop as a component named “Mountable HDFS” (see http://wiki.apache.org/hadoop/MountableHDFS). It lets a standard file system user or program see the HDFS name space as a locally mounted directory. All file system operations, including directory browsing, file opening and content access, are enabled over HDFS content through the FUSE interface.
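Once HDFS is mounted, a program needs nothing Hadoop-specific to read from it. The sketch below uses only ordinary Python file operations. The mount point is passed as a parameter, since its actual location depends on the installation, and the function name read_from_mount is hypothetical.

```python
import os


def read_from_mount(mount_root: str, relative_path: str,
                    offset: int = 0, length: int = 1024) -> bytes:
    """Read length bytes at offset from a file below mount_root.

    mount_root stands for the directory where HDFS is mounted through
    FUSE; any local directory behaves the same way for this sketch.
    """
    root = os.path.realpath(mount_root)
    target = os.path.realpath(os.path.join(root, relative_path))
    # Refuse paths such as "../../etc/passwd" that leave the mount point.
    if os.path.commonpath([root, target]) != root:
        raise ValueError("path escapes the mount point")
    with open(target, "rb") as handle:
        # A plain seek: FUSE forwards it to the underlying file system.
        handle.seek(offset)
        return handle.read(length)
```

With a mount point such as /mnt/hdfs (an example value), read_from_mount("/mnt/hdfs", "videos/clip.mp4", 1000000, 65536) would return 64 KB starting one megabyte into the file, exactly as for a local file.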

Apache server configuration

It remained to configure Apache to access the mounted FUSE system and load content from video files. How this is done depends on the video format. So far, we have tested and validated .mp4 files and Flash video files. For the first format we use the H264 Streaming Module (see http://h264.code-shop.com/trac), an Apache plugin which enables adaptive streaming. For FLV we use a pseudo-streaming module for Apache named “mod_flv”. Both behave nicely and work with the mountable HDFS without problems.
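For readers unfamiliar with pseudo-streaming, the following Python sketch shows the general principle as we understand it: the player asks for the file from a given byte offset, and the server answers with a fresh FLV header followed by the bytes from that offset onwards. This is not the code of mod_flv, whose exact behaviour may differ. The function name, the header flags (audio and video both present) and the assumption that the offset falls on a tag boundary are ours.

```python
import io
from typing import BinaryIO, Iterator

# 9-byte FLV header (signature, version 1, flags 0x05 = audio + video,
# header length 9) followed by the 4-byte "previous tag size" of zero.
# The flags value is an assumption; a real server would copy it from the file.
FLV_HEADER = b"FLV" + bytes([1, 0x05]) + (9).to_bytes(4, "big") + (0).to_bytes(4, "big")


def pseudo_stream(stream: BinaryIO, start: int = 0,
                  fragment_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file from byte offset start, re-adding a header if needed.

    start is assumed to fall on an FLV tag boundary; players usually
    obtain valid offsets from the keyframe index in the file metadata.
    """
    if start < 0:
        raise ValueError("start must not be negative")
    if start > 0:
        # The original header is skipped by the seek, so send a new one.
        yield FLV_HEADER
    stream.seek(start)
    while True:
        fragment = stream.read(fragment_size)
        if not fragment:
            break
        yield fragment


if __name__ == "__main__":
    fake_flv = io.BytesIO(FLV_HEADER + b"A" * 20 + b"B" * 20)
    body = b"".join(pseudo_stream(fake_flv, start=len(FLV_HEADER) + 20))
    print(len(body))
```

Here again the only demand on the storage layer is a working seek, which is why this kind of module runs unchanged on top of the FUSE mount.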

Conclusion

The solution based on Apache + Mountable HDFS (FUSE) turned out to be reliable, functionally adequate (seeking is well supported) and efficient. The architecture is simple and easy to set up, and makes it possible to combine the benefits of HDFS for very large repositories with standard Web server streaming solutions. Although we chose to adopt Apache plugins in our current service, nothing keeps you from using a more powerful streaming server, since the FUSE approach (virtually) moves all the HDFS content into the scope of the standard file system.

Hoop remains a potential option for the future, but it did not appear mature enough when we tested it, at least for the complex operations (seeking to a specific offset in a file) required by video streaming.