Hadoop 3.0 Installation

This blog, Apache Hadoop 3.0 Installation, will assist you install and verify your pseudo-distributed, single-node & distributed instance in UNIX box (RHLE or Ubuntu). Hadoop 3.0 needs java 1.8 and higher, so this article assumes that you already aware of Hadoop 3.0 features and enhancement and minimum jdk requirement for newer version of Hadoop. Distributed & cluster Hadoop 3.0 installation need password less ssh communication between cluster nodes and I will also cover them in detail.

Lets look at the below diagram which depicts 2 scenarios, one is pseudo-distributed, single-node and other is distributed instance. You can skip the ssh step if you are trying to install it in single node machine.

Hadoop 3.0 Installation Cluster Hardware

Enable SSH in all nodes in cluster for Hadoop 3.0 Installation

Each node in cluster should be communicate to each other without seeking for authentication and to make it happens, passwordless SSH remote login is required.

Start Hadoop 3.0 Installation Services

NOTE: This command execution will only be done on your first hadoop installation. Do not perform this on existing hadoop installation, else it will permanently erase all your data from HDFS file system.

The first step to starting up your Hadoop 3.0 installation is formatting the Hadoop filesystem followed by hdfs service as shown below

#only perform after fresh and first installation
bin/hdfs namenode -format
#Now start the hdfs service using dfs command
sbin/start-dfs.sh
#it may give an error at the time of startup and then use following command
echo "ssh" | sudo tee /etc/pdsh/rcmd_default
#now start the yarn services
sbin/start-yarn.sh
#Once it is started successfully, check how many daemons are running
jps
2961 ResourceManager
2482 DataNode
3077 NodeManager
2366 NameNode
2686 SecondaryNameNode
3199 Jps

Congratulation, your installation is completed and you are ready to run MapReduce programs.

Hadoop 3.0 Installation Health Check

Previous version of Hadoop 2.x, web UI port is 50070 and it has been moved to 9870 in Hadoop 3.0. It can be accessed via web UI from localhost:9870

Hadoop 3.0 Downstream Compatibility

Following are the version compatibility matrix sheet indication the version of different Apache projects and their unit test status including basic functionality testing. This was done as part of Hadoop 3.0 Beta 1 release in Oct 2017.