Details

Description

We had a balancer that had not made any progress for a long time. It turned out that it was repeatedly asking the NameNode for a partial block list of one datanode, which had become unavailable while the balancer was running.

NameNode should notify Balancer that the datanode is not available and Balancer should stop asking for the datanode's block list.


Mit Desai
added a comment - 10/Jun/14 15:57
Attaching the patch. Unfortunately, I do not have a way to reproduce the issue, so I am unable to provide a test that verifies the change.
Here is an explanation of the part of the Balancer code that makes it hang forever.
In the following while loop in Balancer.java, when the Balancer determines that it should fetch more blocks, it calls getBlockList() and decrements blocksToReceive by the value returned. The continue statement then sends it back to the top of the loop.
    while (!isTimeUp && getScheduledSize() > 0 &&
        (!srcBlockList.isEmpty() || blocksToReceive > 0)) {
      // ## SOME LINES OMITTED ##
      filterMovedBlocks(); // filter already moved blocks
      if (shouldFetchMoreBlocks()) {
        // fetch new blocks
        try {
          blocksToReceive -= getBlockList();
          continue;
        } catch (IOException e) {
          // ## SOME LINES OMITTED ##
      // check if time is up or not
      if (Time.now() - startTime > MAX_ITERATION_TIME) {
        isTimeUp = true;
        continue;
      }
      // ## SOME LINES OMITTED ##
    }
The problem is that if the datanode has been decommissioned, getBlockList() returns nothing, so blocksToReceive does not change. Because blocksToReceive stays greater than 0 and the continue statement skips the rest of the loop body, the time check is never reached and isTimeUp is never set to true, so the loop repeats indefinitely. The submitted patch moves the time-up check to the top of the loop, so each iteration first checks whether the time limit has been exceeded and proceeds only if it has not.
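
The effect of moving the check can be illustrated with a small standalone sketch. This is not the actual patch or the real Balancer code: the class BalancerLoopSketch, the method runLoop, the 1000 ms value of MAX_ITERATION_TIME, and the injected clock and fetch suppliers are all hypothetical stand-ins chosen so that the loop can run without a cluster. The fetch supplier plays the role of getBlockList() and the clock supplier plays the role of Time.now().

```java
import java.util.function.LongSupplier;

public class BalancerLoopSketch {
    /** Hypothetical time budget in milliseconds; the real constant is defined in Balancer.java. */
    static final long MAX_ITERATION_TIME = 1_000L;

    /**
     * Runs a simplified dispatch loop and returns the number of fetch attempts made.
     * 'fetch' stands in for getBlockList() and 'clock' for Time.now().
     */
    static int runLoop(long blocksToReceive, LongSupplier fetch, LongSupplier clock) {
        long startTime = clock.getAsLong();
        boolean isTimeUp = false;
        int attempts = 0;
        while (!isTimeUp && blocksToReceive > 0) {
            // Time check placed first, so no later 'continue' can skip it.
            if (clock.getAsLong() - startTime > MAX_ITERATION_TIME) {
                isTimeUp = true;
                continue;
            }
            attempts++;
            // A decommissioned datanode is modelled by a fetch that returns 0,
            // which leaves blocksToReceive unchanged.
            blocksToReceive -= fetch.getAsLong();
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Fake clock that advances 100 ms on every reading, to keep the run deterministic.
        long[] now = {0L};
        LongSupplier clock = () -> now[0] += 100L;
        int attempts = runLoop(10L, () -> 0L, clock);
        System.out.println("fetch attempts before time-up: " + attempts);
    }
}
```

With a fetch that always returns 0, the loop still terminates once the fake clock passes the budget. If the time check were placed after the fetch-and-continue step, as in the excerpt above, the same input would never reach it.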