On Oct 21, 2010, at 11:58 AM, Ken Nielson wrote:
> We would like to know what kind of GPU hardware is deployed in your data
> centers so we can better understand what we need to do to provide GPU
> support in TORQUE.
>
> If you currently have GPUs deployed would you please reply and let us
> know the details of the hardware and any other information you think is
> relevant.
We have multiple flavors of GPUs with anywhere from 1 to 4 GPUs per node.
We manage the GPUs with resource tags on the nodes. Our users run parallel
jobs within and between boxes.
I guess TORQUE could do better with the ppn, nodes, vnodes, tasks, etc.,
but that's a pretty big undertaking. We have our nodes set up for exclusive
access and let users figure out the topology for their MPI jobs. Many use
OpenMPI, which does not have a direct mapping to a $PBS_NODEFILE format
for GPU or noGPU jobs. I don't know how TORQUE could better support
different MPI topology layouts. The specification to TORQUE would be
just as difficult as doing it yourself with a script.
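
To illustrate the "do it yourself with a script" route, here is a minimal
sketch, not what we actually run. It assumes $PBS_NODEFILE lists one hostname
per allocated slot and writes Open MPI hostfile lines of the form
"host slots=N". The function name and the ranks_per_node cap (for example,
one MPI rank per GPU) are hypothetical and only there for illustration.

```python
import os
from collections import OrderedDict


def nodefile_to_hostfile(nodefile_lines, ranks_per_node=None):
    """Collapse PBS nodefile lines into Open MPI hostfile lines.

    nodefile_lines: iterable of hostnames, one per allocated slot.
    ranks_per_node: hypothetical cap, e.g. the number of GPUs per node.
    """
    counts = OrderedDict()  # keeps hosts in first-seen order
    for line in nodefile_lines:
        host = line.strip()
        if not host:
            continue  # skip blank lines
        counts[host] = counts.get(host, 0) + 1

    hostfile = []
    for host, n in counts.items():
        slots = n if ranks_per_node is None else min(n, ranks_per_node)
        hostfile.append("{} slots={}".format(host, slots))
    return hostfile


if __name__ == "__main__" and "PBS_NODEFILE" in os.environ:
    # Inside a job, print a hostfile capped at one rank per node (assumed).
    with open(os.environ["PBS_NODEFILE"]) as f:
        print("\n".join(nodefile_to_hostfile(f, ranks_per_node=1)))
```

The output can be saved to a file and passed to mpirun with --hostfile.
Anything beyond a per-node cap (mixed GPU counts, rank placement within a
box) would still have to be worked out by the user for each job.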
-mb
--
+-----------------------------------------------
| Michael Barnes
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------