I would like to ask about a topic that has already come up in many
questions, but that I am not very familiar with. I want to understand the
behaviour of an application that sends many messages smaller than 64 KB
(eager mode) over a TCP network. I need to understand this in order to
simulate the application.
For example, there may be one MPI_Send of 1200 bytes after some
computation, then two messages of the same size, then more computation,
and so on. However, according to the measurements and the profiling, the
cost of the communication is less than the latency of the network. I can
understand that the cost of MPI_Send itself is the copy into the buffer,
but delivering the message to the destination should cost at least the
latency. So are the messages buffered at the sender and then sent to the
receiver as a single packet? My TCP window is 4 MB and I use the same
value for snd_buff and rcv_buff. If they are buffered at the sender, what
is the criterion/algorithm? I mean, if I have one message, then some
computation, and then another message, can these two messages be
buffered on the sender side, or does this happen only at the receiver?
If there is any document/paper that I can read about this, I would
appreciate a link.
A simple example: if I have a loop in which rank 0 sends two messages
to rank 1, the first message takes longer than the second. If I increase
the loop to 10 or 20 messages, all the messages cost a lot less than the
first one, and also less than what SkaMPI measures. So I am sure that it
must be a buffering issue (or something else that I have not thought
of).