We have a beefy Hadoop cluster with 12 worker nodes, each with 32 cores. We have been running Map/Reduce jobs on this cluster, and we noticed that if we configure the total Map/Reduce capacity to be less than the number of available processors in the cluster (32 x 12 = 384), say 216 map slots and 144 reduce slots (360 total), the jobs run fine. But if we configure the total Map/Reduce capacity to be more than 384, jobs sometimes run unusually long. The symptom is that certain tasks (usually map tasks) are stuck in the "initializing" stage for a long time on certain nodes before they get processed. The nodes exhibiting this behavior are random and not tied to specific boxes. Isn't the general rule of thumb to configure M/R capacity at twice the number of processors in the cluster? What do people usually do to maximize the use of cluster resources in terms of capacity configuration? I'd appreciate any responses.
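For reference, the numbers in the question can be broken down per node. This is only a small sketch of the arithmetic; it assumes the cluster-wide totals are spread evenly across the 12 workers (as they would be if slots are set per node, MRv1 style), which the post does not state explicitly. The function names are made up for illustration.

```python
# Figures taken from the post; everything else is derived from them.
NODES = 12
CORES_PER_NODE = 32
MAP_SLOTS = 216
REDUCE_SLOTS = 144


def per_node(total_slots, nodes=NODES):
    """Slots each worker gets if the cluster total is spread evenly (assumption)."""
    slots, remainder = divmod(total_slots, nodes)
    if remainder:
        raise ValueError("total does not divide evenly across nodes")
    return slots


def subscription_ratio(map_slots, reduce_slots, nodes=NODES, cores=CORES_PER_NODE):
    """Configured slots divided by available cores (1.0 means one slot per core)."""
    return (map_slots + reduce_slots) / (nodes * cores)


if __name__ == "__main__":
    print("map slots per node:   ", per_node(MAP_SLOTS))
    print("reduce slots per node:", per_node(REDUCE_SLOTS))
    print("slots per core:       ", subscription_ratio(MAP_SLOTS, REDUCE_SLOTS))
```

So the configuration that runs fine is slightly under one slot per core, while the "twice the processors" rule of thumb would correspond to a ratio of 2.0.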

Number of CPU cores is just one of several hardware constraints on the number of tasks that can be run efficiently at the same time. Other constraints:

- Usually 1 to 2 map tasks per physical disk.
- Leave half the memory of the machine for the buffer cache and other things, and note that the task memory might be twice the maximum heap size. I'd say 4 GB/core is minimum, 8-12 GB/core would be better.
- With 32 cores you need at least 10 GbE networking.
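As a rough illustration of how these constraints combine, here is a minimal sketch that takes the smallest of the CPU, disk, and memory limits for one node. The disk count, RAM size, and heap size in the example are assumptions, since the original post gives only the core count; the function name and parameters are hypothetical, and networking is left out because the rule above is not expressed per task.

```python
def max_task_slots(cores, disks, ram_gb, heap_gb_per_task,
                   tasks_per_disk=2, ram_fraction_for_tasks=0.5,
                   task_mem_factor=2.0):
    """Estimate concurrent task slots for one node from the rules of thumb above.

    tasks_per_disk:         1 to 2 tasks per physical disk (upper end used here)
    ram_fraction_for_tasks: half the RAM is left for the buffer cache and other things
    task_mem_factor:        a task may use up to twice its maximum heap size
    """
    limits = {
        "cpu": cores,
        "disk": disks * tasks_per_disk,
        "memory": int((ram_gb * ram_fraction_for_tasks)
                      // (heap_gb_per_task * task_mem_factor)),
    }
    # The tightest constraint decides how many tasks can run efficiently.
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck, limits


if __name__ == "__main__":
    # Hypothetical node: 32 cores (from the post), 12 disks, 128 GB RAM, 1 GB heap per task.
    slots, bottleneck, limits = max_task_slots(cores=32, disks=12, ram_gb=128,
                                               heap_gb_per_task=1.0)
    print(f"slots: {slots}, limited by: {bottleneck}, all limits: {limits}")
```

With those assumed figures the disks, not the cores, would be the limiting factor, which is one way a node with 32 cores could struggle well before reaching 64 slots.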

Jeff

----- Original Message -----

From: "Guang Yang" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: "Peter Sheridan" <[EMAIL PROTECTED]>, "Jim Brooks" <[EMAIL PROTECTED]>
Sent: Friday, December 14, 2012 3:03:54 PM
Subject: question of how to take full advantage of cluster resources

Hi,

We have a beefy Hadoop cluster with 12 worker nodes and each one with 32 cores. We have been running Map/reduce jobs on this cluster and we noticed that if we configure the Map/Reduce capacity in the cluster to be less than the available processors in the cluster (32 x 12 = 384), say 216 map slots and 144 reduce slots (360 total), the jobs run okay. But if we configure the total Map/Reduce capacity to be more than 384, we observe that sometimes job runs unusual long and the symptom is that certain tasks (usually map tasks) are stuck in "initializing" stage for a long time on certain nodes, before get processed. The nodes exhibiting this behavior are random and not tied to specific boxes. Isn't the general rule of thumb of configuring M/R capacity to be twice the number of processors in the cluster? What do people usually do to try to maximize the usage of the cluster resources in term of cluster capacity configuration? I'd appreciate any responses.

Thanks,
Guang Yang

On Sat, Dec 15, 2012 at 4:33 AM, Guang Yang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We have a beefy Hadoop cluster with 12 worker nodes and each one with 32
> cores. We have been running Map/reduce jobs on this cluster and we noticed
> that if we configure the Map/Reduce capacity in the cluster to be less than
> the available processors in the cluster (32 x 12 = 384), say 216 map slots
> and 144 reduce slots (360 total), the jobs run okay. But if we configure the
> total Map/Reduce capacity to be more than 384, we observe that sometimes job
> runs unusual long and the symptom is that certain tasks (usually map tasks)
> are stuck in "initializing" stage for a long time on certain nodes, before
> get processed. The nodes exhibiting this behavior are random and not tied to
> specific boxes. Isn't the general rule of thumb of configuring M/R capacity
> to be twice the number of processors in the cluster? What do people usually
> do to try to maximize the usage of the cluster resources in term of cluster
> capacity configuration? I'd appreciate any responses.
>
> Thanks,
> Guang Yang

-- Harsh J


Harsh J 2012-12-14, 23:12
