Analysis of AllToAllV in cluster-of-clusters environments (Networking) [in progress]
Description: The context of this work is the total exchange of data between processes
running in different clusters linked by a wide-area link (backbone). We
can experiment with such an environment using several sites of the Grid5000
testbed.
Building on the work done on the AllToAll operation, we have
extended the study to the AllToAllV operation. In MPI terminology, AllToAll
is a total exchange of pieces of data of the same size
between the participating processes. In AllToAllV, each piece of data
may have a different size. Unlike with AllToAll, it is difficult to choose
a good routing algorithm (e.g., binomial tree, Bruck algorithm) based on
the data size for AllToAllV, since the processes do not know the amount of data
sent by the others. Current MPI implementations (OpenMPI, MPICH-2, GridMPI)
use a very simple implementation of AllToAllV, in which every
process asynchronously sends its data to, and receives data from, all other processes
and then waits for all communications to complete.
Although simple, this strategy performs better than any optimized routing scheme
used in the other collective operations.
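The difference between the two operations can be illustrated with a small single-process model of the AllToAllV semantics. This is only a sketch for illustration, not the code of any MPI implementation: the function name `alltoallv` and the list-based buffers are assumptions made here, and a real implementation would post nonblocking point-to-point operations (such as `MPI_Isend` and `MPI_Irecv`) and then wait for all of them.

```python
def alltoallv(send_bufs, send_counts):
    """Model AllToAllV for n processes inside one Python process.

    send_bufs[i]      : flat list held by process i, ordered by destination
    send_counts[i][j] : number of items process i sends to process j
    Returns recv_bufs, where recv_bufs[j] holds the pieces sent to j,
    concatenated in source order.
    """
    n = len(send_bufs)
    recv_bufs = []
    for dst in range(n):
        received = []
        for src in range(n):
            # Offset of the piece for dst inside src's buffer
            # (prefix sum of the counts for lower-ranked destinations).
            start = sum(send_counts[src][:dst])
            received.extend(send_bufs[src][start:start + send_counts[src][dst]])
        recv_bufs.append(received)
    return recv_bufs


# Three processes; counts[i][j] is what i sends to j, and the sizes differ
# per pair, which is what distinguishes AllToAllV from AllToAll.
counts = [[1, 2, 0],
          [0, 1, 3],
          [2, 0, 1]]
bufs = [[f"{i}->{j}" for j in range(3) for _ in range(counts[i][j])]
        for i in range(3)]
recv = alltoallv(bufs, counts)
```

Note that each process only knows its own row of `counts`; the receive sizes in each column come from the other processes, which is why a size-dependent choice of routing algorithm is hard to make locally.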
We have tried more sophisticated approaches based on message aggregation,
congestion regulation and load balancing. In this kind of strategy, we select
forwarder processes in each cluster, whose task is to aggregate individual
messages into a larger one that is sent over the backbone. The number of forwarders
lets us control how many TCP streams compete simultaneously for the backbone.
We can balance the sizes of the aggregated messages with an extra step at the
beginning, in which the processes first exchange information about how much data
they will send.
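One possible way to balance the aggregated messages across forwarders is sketched below. The text does not say which balancing rule is used, so the greedy heuristic here (largest message first, assigned to the least-loaded forwarder) is purely an assumption, as are the function name `assign_to_forwarders` and the sample sizes. The sizes stand for the amounts announced in the initial exchange step, and the number of forwarders corresponds to the number of streams competing for the backbone.

```python
def assign_to_forwarders(message_sizes, num_forwarders):
    """Greedy balancing sketch (an assumption, not the authors' method).

    message_sizes  : sizes of the individual messages bound for the backbone
    num_forwarders : number of forwarder processes in the cluster
    Returns (assignment, loads): assignment[f] lists the message indices
    aggregated by forwarder f, loads[f] is the aggregated size.
    """
    loads = [0] * num_forwarders
    assignment = [[] for _ in range(num_forwarders)]
    # Handle the largest messages first so small ones can even out the loads.
    order = sorted(range(len(message_sizes)),
                   key=lambda m: message_sizes[m], reverse=True)
    for m in order:
        # Pick the forwarder with the smallest aggregated size so far.
        f = min(range(num_forwarders), key=lambda k: loads[k])
        assignment[f].append(m)
        loads[f] += message_sizes[m]
    return assignment, loads


# Hypothetical message sizes (arbitrary units) and two forwarders.
sizes = [70, 10, 40, 30, 20, 50]
assignment, loads = assign_to_forwarders(sizes, 2)
```

With these sample sizes, each of the two forwarders ends up with an aggregated message of 110 units.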
However, our proposal has so far only matched the original AllToAllV
implementation, bringing no significant improvement. One of the
key findings of this work is that message aggregation does not improve the
overall throughput of the streams over the wide-area link.
Work is currently under way to model the behavior of the various routing
strategies in the PLogP model.
Results: