One of the most attractive features of the Hadoop framework is its use of commodity hardware. However, commodity machines fail more often, so DataNode crashes are frequent in a Hadoop cluster. Another striking feature of Hadoop is how easily it scales as data volume grows. For these two reasons, one of the most common tasks of a Hadoop administrator is to commission (add) and decommission (remove) DataNodes in a Hadoop cluster.

Commissioning and Decommissioning Nodes in a Hadoop Cluster:

The diagram above shows the step-by-step process for decommissioning a DataNode in the cluster.

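The first task is to update the ‘exclude’ files for both HDFS and MapReduce. Each exclude file is a separate file whose path is set in the corresponding configuration file: the dfs.hosts.exclude property in hdfs-site.xml for HDFS, and the mapred.hosts.exclude property in mapred-site.xml for MapReduce.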

The ‘exclude’ file serves a slightly different purpose for each daemon:

For the JobTracker, it contains the list of hosts that the JobTracker should exclude. If the value is empty, no hosts are excluded.

For the NameNode, it contains the list of hosts that are not permitted to connect to the NameNode.
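The exclude file itself is plain text, conventionally with one hostname per line. As a minimal sketch, the script below adds a host to such a file without creating duplicate entries. The path /tmp/excludes.demo and the hostname datanode3.example.com are placeholders chosen for illustration, not values from this article; on a real cluster, use the path that your exclude property points to.

```shell
#!/bin/sh
# Sketch: add a host to an exclude file (one hostname per line).
# EXCLUDE_FILE and HOST_TO_REMOVE are hypothetical defaults for illustration;
# override them with the real exclude file path and the node to decommission.
EXCLUDE_FILE="${EXCLUDE_FILE:-/tmp/excludes.demo}"
HOST_TO_REMOVE="${HOST_TO_REMOVE:-datanode3.example.com}"

# Make sure the file exists before searching it.
touch "$EXCLUDE_FILE"

# Append the host only if it is not already listed.
# -x matches the whole line, -F treats the hostname as a fixed string.
if ! grep -qxF "$HOST_TO_REMOVE" "$EXCLUDE_FILE"; then
    echo "$HOST_TO_REMOVE" >> "$EXCLUDE_FILE"
fi

# Show the resulting list of excluded hosts.
cat "$EXCLUDE_FILE"
```

Running the script twice leaves a single entry for the host, so it is safe to repeat. Editing the file alone does not decommission anything: the daemons have to be told to re-read it.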

Here is a sample configuration of the exclude file in hdfs-site.xml and mapred-site.xml: