Unfortunately, "uninitialized value" warnings from valgrind are to be expected when using the OFED stack. Specifically, a bunch of memory in an OMPI process comes directly from OS-bypass mechanisms, which effectively means valgrind-bypass, too. Hence, even though the memory *has* been initialized, valgrind never "saw" it get initialized, so it complains. :-\

Running with TCP should give much more predictable valgrind results, but there are still some benign valgrind warnings that we don't care about. Specifically, when we write a struct down a file descriptor, sometimes there's an alignment "hole" (e.g., a 2-byte short followed by a 2-byte hole followed by a 4-byte int) that wasn't initialized. We don't care if such holes are uninitialized.
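
To make the "hole" concrete, here is a minimal sketch in C. The struct and function names (msg_hdr, hole_size, init_hdr) are made up for illustration and are not actual Open MPI types; the exact hole size depends on your compiler and ABI. Zeroing the whole struct before filling in the fields is one common way to keep valgrind quiet about the padding in practice, although strictly speaking the C standard leaves padding contents unspecified after a member store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wire header, NOT an actual Open MPI type. */
struct msg_hdr {
    short tag;     /* typically 2 bytes */
    /* the compiler usually inserts padding here so that
       'length' lands on its natural alignment */
    int   length;  /* typically 4 bytes */
};

/* Number of padding bytes between 'tag' and 'length' on this platform. */
static size_t hole_size(void)
{
    return offsetof(struct msg_hdr, length)
         - (offsetof(struct msg_hdr, tag) + sizeof(short));
}

/* Zero the whole struct first, padding included, then set the fields.
   If only the fields were assigned, the hole would stay uninitialized
   and valgrind would flag it when the struct is passed to write(). */
static void init_hdr(struct msg_hdr *h, short tag, int length)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
    h->tag = tag;
    h->length = length;
}
```

Calling init_hdr(&h, 7, 1024) and then writing sizeof(h) bytes of h to a file descriptor sends the hole along with the real fields, which is exactly the case described above.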

You said that the program runs correctly with TCP but not with openib. That could well be explained by a subtle memory bug somewhere; the underlying openib and TCP drivers are quite different from each other. It is very possible that openib interacts with the real bug in a way that makes it fatal, while TCP interacts with it in a different way that does not.