FUSE-DFS

In order to allow existing software to access the Hadoop Distributed File System (HDFS) without modification, I have compiled and installed FUSE-DFS on my cluster. FUSE-DFS uses FUSE (Filesystem in Userspace) to mount HDFS as a local filesystem. Software can then access the contents of HDFS in the same way that it accesses files on the local filesystem.

Since I am using the standard version of Hadoop (from hadoop.apache.org), rather than a distribution from Cloudera or another company, I had to compile and configure the filesystem myself. I ran into several issues along the way, so I thought that I should share my solutions to some of the more difficult problems.

I began by reading a wiki page about Mountable HDFS. I had already downloaded the source for Hadoop 2.4.1, so I tried to compile the version of fuse_dfs that was included with the download. When I tried to follow the directions for compiling fuse_dfs, I found that the directory structure in the instructions differed from the directory structure of the source tarball that I had downloaded. After spending some time trying to adapt the instructions to my source tree, I decided to compile the code manually. If I had more knowledge of cmake, I probably would have been able to use cmake to build it, but I don't know very much about cmake yet.

The source for fuse_dfs was located at hadoop-2.4.1-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs. I created a build directory in hadoop-2.4.1-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/ and then compiled all of the source files with...

where /usr/local/hadoop/lib/native/ is the location of libhdfs.so and /usr/lib/jvm/java-7-oracle/jre/lib/amd64/server/ is the location of libjvm.so. You may also need to make a link to Hadoop's "config.h" in the fuse-dfs directory or do something else so that the preprocessor can locate config.h.

When I first attempted this, the version of libhdfs.so installed on my system was apparently a 32-bit build, so it could not be linked with fuse_dfs. I compiled libhdfs.so manually as well.
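
A quick way to confirm this kind of 32-bit/64-bit mismatch before linking is to look at the library's ELF header. The sketch below is only an illustration, not a command from my original build; the function name elf_class is hypothetical. It relies on the fact that the fifth byte of an ELF file (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.

```shell
# Hypothetical helper: report whether an ELF file is 32-bit or 64-bit.
# The body runs in a subshell so its variables do not leak to the caller.
elf_class() (
    # The first four bytes of any ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        exit 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
)
```

For example, elf_class /usr/local/hadoop/lib/native/libhdfs.so should print 64-bit on a library that can be linked into a 64-bit fuse_dfs. Where the file utility is installed, it reports the same information.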

Once this was all finished, I installed fuse_dfs and fuse_dfs_wrapper.sh in /usr/local/hadoop/bin/, where all of the other Hadoop-related executables are located. When I tried to mount my HDFS, I encountered errors telling me that certain .jar files could not be found and that CLASSPATH was not defined. The command

$ hadoop classpath

prints the relevant CLASSPATH, but the CLASSPATH that is actually needed is an explicit listing of all of the .jar files, not just the list of directories (note that the wildcard, *, is not expanded when fuse_dfs starts its JVM through libhdfs). In order to make the list of .jar files, I built a command with awk, sed, ls, and sh and then set the CLASSPATH environment variable to the result of that command. This can probably be done with a shorter command, but this works:

This command ignores one path: the directory containing Hadoop's configuration .xml files, which is /usr/local/hadoop/etc/hadoop/ in my case. So I add this directory as follows:

export CLASSPATH=/usr/local/hadoop/etc/hadoop/:$CLASSPATH
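
The classpath steps above can also be sketched as a small shell function. This is not the awk/sed/ls pipeline I actually used; it is an alternative sketch, and the function name expand_classpath is hypothetical. It assumes that wildcard entries have the form dir/* and, unlike my original command, it keeps non-wildcard entries such as the configuration directory.

```shell
# Hypothetical helper: turn a colon-separated classpath containing dir/*
# entries into an explicit list of .jar files.
# The body runs in a subshell so its variables do not leak to the caller.
expand_classpath() (
    rest="$1:"      # trailing colon so every entry is followed by one
    result=""
    while [ -n "$rest" ]; do
        entry="${rest%%:*}"     # text before the first colon
        rest="${rest#*:}"       # text after the first colon
        case "$entry" in
            "") ;;              # skip empty entries
            */\*)
                # Wildcard entry: list the .jar files in that directory.
                dir=${entry%'/*'}
                for jar in "$dir"/*.jar; do
                    if [ -e "$jar" ]; then
                        result="${result:+$result:}$jar"
                    fi
                done
                ;;
            *)
                # Plain directory or file: keep it unchanged.
                result="${result:+$result:}$entry"
                ;;
        esac
    done
    printf '%s\n' "$result"
)
```

With this function defined, the variable could be set with export CLASSPATH="$(expand_classpath "$(hadoop classpath)")". Because plain entries are kept, the configuration directory would already be included if hadoop classpath prints it.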

This CLASSPATH definition is inserted into my .bashrc file on all of the nodes. At this point, I was still unable to mount the drive because I did not have the proper privileges, so I added myself to the fuse group:

$ sudo adduser $USER fuse
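
A new group membership only applies to sessions started after the change, so it can be worth checking whether the current shell has picked it up. A minimal sketch follows; the function name session_in_group is hypothetical.

```shell
# Hypothetical helper: succeed if the current session belongs to group $1.
# "id -nG" with no user argument lists the groups of the running process,
# which is what matters when mounting from this shell.
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}
```

For example, session_in_group fuse || echo "log out and back in first" prints the reminder when the fuse group is not yet active in the current shell.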

Then, I had to uncomment the following line in /etc/fuse.conf:

user_allow_other
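
This setting lets non-root users pass the allow_other mount option. The edit can also be scripted; the sketch below assumes the line is present in commented-out form (for example "#user_allow_other"), and the function name enable_user_allow_other is hypothetical.

```shell
# Hypothetical helper: uncomment user_allow_other in a fuse.conf-style file.
# $1: path to the file (defaults to /etc/fuse.conf).
# A backup of the original is left next to it with a .bak suffix.
enable_user_allow_other() {
    conf=${1:-/etc/fuse.conf}
    # Match only a line that consists of an optional comment marker
    # followed by user_allow_other, so other comments stay untouched.
    sed -i.bak 's/^[[:space:]]*#[[:space:]]*user_allow_other[[:space:]]*$/user_allow_other/' "$conf"
}
```

Since /etc/fuse.conf is normally writable only by root, the same sed expression would have to be run with sudo on the real file.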

Finally, I was able to mount the filesystem:

$ fuse_dfs_wrapper.sh -d dfs://foam:8020 dfsmount/

Here, "foam" is the hostname of the NameNode and dfsmount is the mount point.
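
To confirm that the mount succeeded, the kernel's mount table can be checked. The sketch below assumes Linux's /proc/mounts and that FUSE mounts are listed with a filesystem type beginning with "fuse"; the function name is_fuse_mounted is hypothetical.

```shell
# Hypothetical helper: succeed if a FUSE filesystem is mounted at $1.
# $1: absolute path of the mount point
# $2: mounts table to read (defaults to /proc/mounts)
is_fuse_mounted() {
    # Fields in the table: device, mount point, filesystem type, options, ...
    awk -v mnt="$1" '
        $2 == mnt && $3 ~ /^fuse/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For example, is_fuse_mounted "$PWD/dfsmount" && ls dfsmount/ lists the top level of HDFS only if the mount is actually in place.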

This entry was posted on Thursday, August 14th, 2014 at 4:08 pm and is filed under regular update.