18845 Internet Services
Alexander Loria
Piyush Sharma
MapReduce on a Heterogeneous Environment
In a heterogeneous environment, a MapReduce job must be load balanced
appropriately among the physical nodes. Nodes that have fewer available
resources should be given less work than those that have more free
resources. In our implementation of MapReduce, we intend to promote
this load-balancing scheme: the master will assign data and tasks to
physical nodes in proportion to each node's free memory, disk space,
and other relevant resources.
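As a rough illustration of proportional assignment, the sketch below
splits a fixed number of input chunks across nodes according to a
single capacity score per node. The function names, the equal weighting
of memory and disk, and the largest-remainder rounding are all
assumptions made for this example; the proposal does not specify how
the resources are combined or how fractional shares are rounded.

```python
def capacity_score(free_mem_mb, free_disk_mb, mem_weight=0.5, disk_weight=0.5):
    """Combine free resources into one score (the weights are assumed)."""
    return mem_weight * free_mem_mb + disk_weight * free_disk_mb


def proportional_split(total_chunks, scores):
    """Split total_chunks among nodes in proportion to their scores.

    scores maps a node name to a non-negative capacity score. Fractional
    shares are rounded with the largest-remainder method so that the
    per-node counts always sum to total_chunks.
    """
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one node must have free capacity")

    # Ideal (possibly fractional) share for each node.
    exact = {node: total_chunks * s / total for node, s in scores.items()}
    # Start from the integer part of each share.
    shares = {node: int(q) for node, q in exact.items()}
    # Hand the leftover chunks to the nodes with the largest remainders.
    leftover = total_chunks - sum(shares.values())
    by_remainder = sorted(scores, key=lambda n: exact[n] - shares[n], reverse=True)
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares
```

For example, a node that reports three times the capacity score of
another would receive three times as many chunks, up to rounding.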
The availability of resources on any particular node is likely to
fluctuate over time. We account for this by having each node frequently
update the master about its resource utilization. The master can then
make new load-balancing decisions for the job. For example, if a node
to which data was previously assigned becomes overloaded, the master
can reassign that data to other nodes. It can also ask the other nodes
to spawn new tasks if needed.
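The reporting and reassignment loop described above might look roughly
like the following sketch. Everything here is hypothetical: the Master
class, the 0.9 overload threshold, the single utilization number in
[0, 1], and the policy of moving half of an overloaded node's chunks
are assumptions chosen to keep the example small, not details taken
from the proposal. Spawning new tasks on the receiving nodes is not
modeled.

```python
class Master:
    """Toy master that tracks node reports and reassigns chunks."""

    def __init__(self, overload_threshold=0.9):
        self.overload_threshold = overload_threshold  # assumed cutoff
        self.utilization = {}  # node -> latest reported utilization in [0, 1]
        self.assignments = {}  # node -> list of chunk ids

    def assign(self, node, chunks):
        """Record that the given chunks are assigned to node."""
        self.assignments.setdefault(node, []).extend(chunks)
        self.utilization.setdefault(node, 0.0)

    def report(self, node, utilization):
        """Called when a node sends its periodic utilization update."""
        self.utilization[node] = utilization
        self.assignments.setdefault(node, [])

    def rebalance(self):
        """Move half of each overloaded node's chunks to healthy nodes.

        Returns a list of (chunk, source, destination) moves.
        """
        overloaded = [n for n, u in self.utilization.items()
                      if u >= self.overload_threshold]
        healthy = [n for n in self.utilization if n not in overloaded]
        moves = []
        if not healthy:
            return moves  # nowhere to move work to
        for src in overloaded:
            to_move = (len(self.assignments[src]) + 1) // 2
            for _ in range(to_move):
                chunk = self.assignments[src].pop()
                # Prefer the least utilized node, then the one holding
                # the fewest chunks, then break ties by name.
                dst = min(healthy, key=lambda n: (self.utilization[n],
                                                  len(self.assignments[n]), n))
                self.assignments[dst].append(chunk)
                moves.append((chunk, src, dst))
        return moves
```

In this sketch the master would call rebalance() after processing a
round of reports; a real system would also need to handle nodes that
stop reporting altogether.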