Technology

Enabling rack awareness for a Hadoop cluster

In this article, we will learn how to enable rack awareness in Hadoop clusters. Assume that the cluster has a large number of nodes and that the nodes are placed in more than one rack. With rack awareness enabled, HDFS does not store all replicas of a block in a single rack, so at least one replica of the block remains available for data processing if a rack fails.
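To make the idea concrete, here is a simplified sketch of rack-aware replica placement. It follows the commonly documented HDFS default (first replica on the writer's node, second on a different rack, third on the same rack as the second), but it is only an illustration and not the actual HDFS placement code. The node names other than datanode3 and the function name `place_replicas` are made up for this example.

```python
import random


def place_replicas(writer, topology, replication=3, rng=None):
    """Pick target nodes for one block.

    topology maps node name -> rack name (hypothetical input format).
    Returns a list of distinct node names, starting with the writer.
    """
    rng = rng or random.Random()
    chosen = [writer]

    def pick(candidates):
        # Choose one not-yet-used node from candidates, if any exist.
        pool = sorted(n for n in candidates if n not in chosen)
        if not pool:
            return False
        chosen.append(rng.choice(pool))
        return True

    if replication >= 2:
        # Second replica: prefer a node on a different rack than the writer.
        remote = [n for n, r in topology.items() if r != topology[writer]]
        if not pick(remote):
            pick(topology)  # single-rack cluster: fall back to any node

    if replication >= 3 and len(chosen) == 2:
        # Third replica: prefer another node on the second replica's rack.
        second_rack = topology[chosen[1]]
        same_rack = [n for n, r in topology.items() if r == second_rack]
        if not pick(same_rack):
            pick(topology)

    # Any further replicas go to arbitrary remaining nodes.
    while len(chosen) < replication and pick(topology):
        pass
    return chosen


# Five-node cluster with one node moved to rack-1 (names are illustrative).
topology = {
    "datanode1": "default-rack",
    "datanode2": "default-rack",
    "datanode3": "rack-1",
    "datanode4": "default-rack",
    "datanode5": "default-rack",
}
targets = place_replicas("datanode1", topology)
print(targets, {topology[n] for n in targets})
```

With two racks defined, the three replicas always span both racks, which is the property that protects a block against the loss of one rack.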

2) Enabling rack awareness using Apache Ambari
Now we will see how to enable rack awareness using Apache Ambari. We have a five-node cluster, and by default all nodes are in default-rack.

Now we will change the rack for datanode3. In Ambari, go to Hosts --> click the host whose rack you want to change --> go to Host Actions --> click Set Rack.

Change the rack name to rack-1 and click OK.

Go back to the Hosts page in Ambari to see that the rack name for datanode3 has changed.

You can see that the nodes are now placed in two different racks: default-rack and rack-1.
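The same change can probably be scripted instead of clicked through. The sketch below only builds the HTTP request and does not send it. It assumes the Ambari REST API accepts a PUT on the host resource with a `Hosts/rack_info` field and an `X-Requested-By` header; please verify this against the API documentation for your Ambari version. The server URL, cluster name, credentials, and the leading slash in the rack value are all placeholders or assumptions, not values from this cluster.

```python
import base64
import json
import urllib.request


def build_set_rack_request(base_url, cluster, host, rack, user, password):
    """Build (but do not send) a request that sets the rack of one host.

    The endpoint path and the "rack_info" field are assumptions about
    the Ambari REST API; check them for your Ambari version.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/hosts/{host}"
    body = json.dumps({"Hosts": {"rack_info": rack}}).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="PUT")

    # HTTP basic authentication header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    # Assumed to be required by Ambari for modifying requests.
    req.add_header("X-Requested-By", "ambari")
    return req


# All argument values below are hypothetical placeholders.
req = build_set_rack_request(
    "http://ambari.example.com:8080", "mycluster", "datanode3",
    "/rack-1", "admin", "secret",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually apply the change, the request would be passed to `urllib.request.urlopen(req)`. As with the Set Rack action in the UI, it is worth checking the Hosts page afterwards to confirm the new rack name.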

3) Confirm rack awareness is enabled
We can confirm this with the hdfs fsck command and also with the hdfs dfsadmin -report command.

The picture below shows the output of the command hdfs fsck /, which reports that the number of racks is 2.
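If you want to check this from a script rather than by reading the output, a small parser is enough. The sketch below looks for the "Number of racks" line in the fsck summary. The sample text is an illustrative fragment written for this example, not output captured from the cluster, and the helper names are made up.

```python
import re
import subprocess


def racks_from_fsck(fsck_output):
    """Return the rack count from fsck summary text, or None if absent."""
    match = re.search(r"Number of racks:\s*(\d+)", fsck_output)
    return int(match.group(1)) if match else None


def run_fsck(path="/"):
    """Run `hdfs fsck <path>` and return its standard output as text.

    Requires the hdfs command on PATH; not called in this example.
    """
    result = subprocess.run(
        ["hdfs", "fsck", path], capture_output=True, text=True, check=False
    )
    return result.stdout


# Illustrative fragment only; real fsck output has many more lines.
SAMPLE = " Number of data-nodes:\t\t5\n Number of racks:\t\t2\n"
print(racks_from_fsck(SAMPLE))
```

On a cluster node, `racks_from_fsck(run_fsck("/"))` would give the same number that appears in the fsck summary.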