> I can successfully run the MPI testcode via OpenMPI 1.3.3 on less than 87 slots w/both the btl_tcp_if_exclude and btl_tcp_if_include switches
> passed to mpirun.
>
> SGE always allocates the qsub jobs from the 24 slot nodes first -- up to the 96 slots that these 4 nodes have available (on the largeMem.q). The rest of the 602 slots are allocated
> from 2 slot nodes (all.q). All requests of up to 96 slots are serviced by the largeMem.q nodes (which have 24 slots apiece). Anything over 96 slots is serviced first by the largeMem.q
> nodes then by the all.q nodes.

Did you set up a JSV for this? PEs have no sequence numbers, so when a PE is requested the queue selection can't be steered by sequence number alone.

Both of the above can be set to NONE when you compile Open MPI with SGE integration (--with-sge).

NB: what is defined in rsh_daemon/rsh_command in `qconf -sconf`?
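For reference, a tight SGE integration usually shows something like the following for these entries (the values here are illustrative; `builtin` is the default in recent SGE releases):

```
$ qconf -sconf | grep -E 'rsh_command|rsh_daemon'
rsh_command                  builtin
rsh_daemon                   builtin
```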

> allocation_rule $fill_up

Here you specify to fill one machine completely before taking slots from the next. You can change this to $round_robin to take one slot from each node before taking a second slot from any machine. If you prefer a fixed allocation, you can also put an integer here.
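For comparison, a complete PE definition might look like the sketch below (the PE name `orte` and the slot count are assumptions for illustration, not taken from your setup):

```
$ qconf -sp orte
pe_name            orte
slots              698
allocation_rule    $fill_up      # or $round_robin, or a fixed integer such as 4
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE
```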

> control_slaves TRUE
> job_is_first_task FALSE
> urgency_slots min
> accounting_summary TRUE
>
> Wouldn't the -bynode allocation be really inefficient? Does the -bynode switch imply only one slot is used on each node before it moves on to the next?

Do I understand correctly: within the slots granted by SGE, you want the allocation inside Open MPI to follow a specific pattern, i.e. to control which rank runs where?
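To illustrate the difference (hypothetical example, assuming two granted nodes with 4 slots each): -byslot fills each node's slots before moving on, while -bynode cycles across nodes when assigning ranks.

```
$ mpirun -np 8 -byslot ./a.out   # ranks 0-3 on node1, ranks 4-7 on node2
$ mpirun -np 8 -bynode ./a.out   # ranks 0,2,4,6 on node1; ranks 1,3,5,7 on node2
```

Note that -bynode is not inefficient in the sense of your question: it still uses all slots granted on each node, it only changes the order in which ranks are mapped to nodes, not how many slots are consumed.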