> No route to host at -e line 1.
>
> This is wrong, all the nodes are visible from all the other nodes on a
> private subnet. For example:

OK, fixed this. It turns out we have IPoIB running, and one adapter
needed to be brought down and back up. Now the TCP version appears to
be running, though I do get strange hangs after a random (never the
same) number of iterations.

The hangs are random and don't happen at the same time step, but they
do occur at a similar place in the code, which suggests to me that
something may be amiss in the MPI_Waitsome function. Possibly a
completion was posted and, due to buffer sizes, fell off the scoreboard.
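
One way to test that suspicion would be to record which request indices
MPI_Waitsome has actually reported and see whether one is still missing
when the hang occurs. The sketch below is only the bookkeeping, in plain
C with no MPI calls so it compiles on its own. The names (tracker_t,
tracker_record, tracker_missing, MAX_REQS) are hypothetical and not from
the application; it assumes non-persistent requests, so an index should
be reported at most once per batch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_REQS 64  /* assumed upper bound on outstanding requests */

/* Hypothetical tracker: one flag per posted request. */
typedef struct {
    int n;               /* number of requests posted in this batch */
    int done[MAX_REQS];  /* 1 once a completion has been reported */
    int ndone;           /* count of distinct completions seen */
} tracker_t;

void tracker_init(tracker_t *t, int n) {
    assert(n >= 0 && n <= MAX_REQS);
    memset(t, 0, sizeof *t);
    t->n = n;
}

/* Feed in the outcount/indices pair returned by one wait call.
 * Returns the number of suspicious entries (out of range or repeated). */
int tracker_record(tracker_t *t, int outcount, const int *indices) {
    int bad = 0;
    for (int i = 0; i < outcount; i++) {
        int idx = indices[i];
        if (idx < 0 || idx >= t->n || t->done[idx]) {
            fprintf(stderr, "suspicious completion index %d\n", idx);
            bad++;
        } else {
            t->done[idx] = 1;
            t->ndone++;
        }
    }
    return bad;
}

/* Writes the indices never reported complete into out; returns how many. */
int tracker_missing(const tracker_t *t, int *out) {
    int m = 0;
    for (int i = 0; i < t->n; i++) {
        if (!t->done[i]) {
            out[m++] = i;
        }
    }
    return m;
}
```

If the wait were temporarily switched to a polling MPI_Testsome loop for
debugging, tracker_missing could be printed once an iteration has
stalled past some timeout. A single index that never shows up would
point at a lost completion rather than at the application logic.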