>> $ mpirun -np 4 hwloc-bind socket:0.core:0-3 ./test
>>
>> 1. Does hwloc-bind map the processes *sequentially* on *successive* cores of the socket?
>
> No. Each hwloc-bind command in the mpirun above doesn't know that there are other hwloc-bind instances on the same machine. All of them bind their process to all cores in the first socket.

To further underscore this point, mpirun launched 4 copies of:

hwloc-bind socket:0.core:0-3 ./test

This means that all 4 processes were bound to exactly the same set of cores (cores 0-3 of socket 0), not to one core each.

If you want each process to bind to a *different* set of PUs, then you have two choices:

1. See Open MPI 1.5.1's mpirun(1) man page. There are new affinity options in the OMPI 1.5 series, such as --bind-to-core and --bind-to-socket. We wrote them up in the FAQ, too.

2. Write a wrapper script that looks at one of the Open MPI environment variables OMPI_COMM_WORLD_RANK, OMPI_COMM_WORLD_LOCAL_RANK, or OMPI_COMM_WORLD_NODE_RANK and decides how to invoke hwloc-bind. For example, something like this:
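
Here is a minimal sketch of such a wrapper, to make option 2 concrete. It is only an illustration under a few assumptions: the script name (bind-by-rank.sh) and the helper function location_for_rank are made up, it assumes you want one core per process on socket 0, and it assumes you never launch more processes per node than that socket has cores. Adjust the mapping to whatever layout you actually want.

```shell
#!/bin/sh
# Hypothetical wrapper script (call it bind-by-rank.sh).
# Assumption: one process per core on socket 0, and no more
# processes per node than socket 0 has cores.

# Map a node-local rank to an hwloc location string,
# e.g. rank 2 -> socket:0.core:2
location_for_rank() {
    echo "socket:0.core:$1"
}

# Open MPI sets OMPI_COMM_WORLD_LOCAL_RANK in each launched process.
# Fall back to 0 so the script can also be run outside of mpirun.
local_rank="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"

# If a program was given, replace this shell with hwloc-bind, which
# binds to the chosen core and then runs the program with its arguments.
if [ "$#" -gt 0 ]; then
    exec hwloc-bind "$(location_for_rank "$local_rank")" "$@"
fi
```

You would then launch it as "mpirun -np 4 ./bind-by-rank.sh ./test", so that local rank 0 ends up on core 0 of socket 0, local rank 1 on core 1, and so on. Using the local rank rather than OMPI_COMM_WORLD_RANK keeps the mapping sensible when the job spans more than one node.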