Configure node labels on YARN

In this post, we will see how to configure node labels on YARN. I work for Hortonworks, so obviously we will configure them for HDP 😉

Before we get to the configuration part, let’s understand what a node label is in YARN.

Node labels allow us to divide our cluster into different parts and use those parts individually as per our requirements. More specifically, we can create a group of NodeManagers using node labels, for example a group of NodeManagers with a large amount of RAM, and use them to process only critical production jobs! This is cool, isn’t it? So let’s see how we can configure node labels on YARN.

Types of node labels:

Exclusive – With this type of node label, only the associated/mapped queues can access the resources of the node label.

Non-exclusive (sharable) – If the resources under this node label are not in use, they can be shared with other applications running in the cluster.
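As a rough sketch, assuming node labels are already enabled on the cluster and using the label names ‘x’ (exclusive) and ‘y’ (non-exclusive) from later in this post, both types can be registered with the ResourceManager in one command. Run it as the YARN admin user. The guard around the call is my own addition so that the snippet does nothing on a machine without the YARN client.

```shell
# Label definitions: x is exclusive, y is non-exclusive (sharable).
LABELS="x(exclusive=true),y(exclusive=false)"

if command -v yarn >/dev/null 2>&1; then
  # Register both labels with the ResourceManager.
  yarn rmadmin -addToClusterNodeLabels "$LABELS"
  # List the labels the cluster now knows about.
  yarn cluster --list-node-labels
else
  echo "yarn client not found; would run: yarn rmadmin -addToClusterNodeLabels \"$LABELS\""
fi
```

If the exclusive flag is left out, YARN treats the label as exclusive, so only the sharable label really needs exclusive=false spelled out.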

Note – You don’t need to worry about the port if you have only one NodeManager running per host.
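To make the port remark concrete, here is a minimal sketch of assigning labels to nodes. The host names are hypothetical placeholders, and 45454 is only an example NodeManager port, so substitute the values from your own cluster.

```shell
# Hypothetical hosts: node1 gets label x, node2 gets label y (no port needed
# when each host runs a single NodeManager).
NODE_TO_LABELS="node1.example.com=x node2.example.com=y"
# With several NodeManagers on one host, add the port instead, for example:
#   NODE_TO_LABELS="node1.example.com:45454=x"

if command -v yarn >/dev/null 2>&1; then
  # Replace whatever labels the listed nodes currently carry.
  yarn rmadmin -replaceLabelsOnNode "$NODE_TO_LABELS"
else
  echo "yarn client not found; would run: yarn rmadmin -replaceLabelsOnNode \"$NODE_TO_LABELS\""
fi
```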

Step 6: Map node labels to the queues:

I have created 2 queues, ‘a’ and ‘b’, in such a way that queue ‘a’ can access nodes with labels ‘x’ and ‘y’, whereas queue ‘b’ can only access nodes with label ‘y’. By default, all queues can access nodes with the ‘default’ label.
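The mapping described above could look roughly like the following Capacity Scheduler properties. This is a sketch, not my exact configuration: the capacity percentages are assumptions chosen so that the child queues add up to 100 per label, and the scratch file name is hypothetical. The snippet only writes the properties to a file so you can review them and paste them into the Capacity Scheduler section in Ambari.

```shell
# Hypothetical scratch file holding the label-related queue properties.
OUT="${TMPDIR:-/tmp}/node-label-queues.properties"

cat > "$OUT" <<'EOF'
yarn.scheduler.capacity.root.accessible-node-labels=x,y
yarn.scheduler.capacity.root.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.y.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels=x,y
yarn.scheduler.capacity.root.a.accessible-node-labels.x.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels.y.capacity=50
yarn.scheduler.capacity.root.b.accessible-node-labels=y
yarn.scheduler.capacity.root.b.accessible-node-labels.y.capacity=50
EOF

# Show the fragment for review.
cat "$OUT"
```

Queue ‘a’ gets all of label ‘x’ because queue ‘b’ has no access to it, while label ‘y’ is split evenly between the two queues. After saving the configuration, the queues need to be refreshed (Ambari offers this, or use yarn rmadmin -refreshQueues) for the mapping to take effect.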

How does a non-exclusive node label work?

Node label ‘y’ is non-exclusive, so let’s keep the resources under node label ‘y’ idle, submit a job to the default node label, and see whether it borrows resources from ‘y’. Interesting, isn’t it? Well, that’s how we learn.

State before job submission:

Note that memory used and running containers are both 0 in the screenshot below, which shows that the resources are idle.
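The same idle state can also be checked from the command line. A minimal sketch, assuming a hypothetical NodeId for a node labelled ‘y’ (take the real one from yarn node -list):

```shell
# Hypothetical NodeId (host:port) of a NodeManager carrying label y.
NODE_ID="node2.example.com:45454"

if command -v yarn >/dev/null 2>&1; then
  # The node report includes fields such as Containers, Memory-Used and Node-Labels.
  yarn node -status "$NODE_ID"
else
  echo "yarn client not found; would run: yarn node -status $NODE_ID"
fi
```

If the node is idle, the report should show no running containers and no memory in use, matching what the ResourceManager UI displays.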