I have 32 GPUs in 8 servers and I'm trying to generate a few benchmarks. I'm running a crack using 16 GPUs against a WPA handshake that I had lying around, but it appears that my GPUs are being underutilized:

no, those speeds are not normal for a 7970. a single 7970 should be able to pull about 130 kh/s. but, those speeds are probably to be expected with the amount of work you've given it. rockyou + base64 is hardly any work, especially for that many devices.

it would also be helpful if you told us about your cluster architecture, broker node specs, network topology, etc.

(05-06-2013, 03:10 AM)epixoip Wrote: status doesn't work, you just have to live with it.

The broker is a dedicated server with a 3 GHz Xeon E5450, 16 GB of RAM, and a quad-port Intel gigabit Ethernet card. Each port is connected to a separate gigabit switch, and each switch has two compute nodes attached, with four GPUs per node.

I'm hoping it's not a network bandwidth issue... Infiniband isn't an easy option with the motherboard in the compute nodes.
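For reference, here is a rough back-of-the-envelope calculation of how much of each broker port every GPU gets in the layout above. It assumes the link is shared evenly between the GPUs behind it and ignores Ethernet/TCP overhead, so treat the result as an upper bound rather than a measurement. The constant names are just for illustration.

```python
# Topology as described above: 4 broker ports, one switch per port,
# 2 compute nodes per switch, 4 GPUs per node.
PORTS = 4
NODES_PER_SWITCH = 2
GPUS_PER_NODE = 4
PORT_BITS_PER_SEC = 1_000_000_000  # gigabit Ethernet, nominal line rate

gpus_per_port = NODES_PER_SWITCH * GPUS_PER_NODE
total_gpus = PORTS * gpus_per_port

# Assumption: even sharing and no protocol overhead.
per_gpu_mbytes = PORT_BITS_PER_SEC / 8 / gpus_per_port / 1e6

print(f"{total_gpus} GPUs total, {gpus_per_port} per broker port")
print(f"at most {per_gpu_mbytes:.3f} MB/s of broker bandwidth per GPU")
```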

broker specs are solid, although more ram may be needed if you ever decide to use all 32 GPUs. you won't be able to use very high -n values with that "little" ram.

you're on ethernet, so you're guaranteed to be network-bound. but your network issues should mostly be latency related, not bandwidth related. infiniband latencies are 1/100th of ethernet latencies.

this should be easy to test: just throw a ton of work at the cards and monitor the bandwidth. if your pipes are saturated and you're not achieving full acceleration, then bandwidth is the problem. if your pipes aren't saturated but you're still not achieving full acceleration, then latency is the problem.
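a minimal sketch of that test, assuming the broker runs linux and exposes interface counters in /proc/net/dev. the interface name "eth0", the 90% thresholds, and the function names are all placeholders i made up for illustration, not anything from the cluster software. the "speedup" you feed in would be your measured aggregate speed divided by (number of GPUs x single-card speed).

```python
import time


def parse_proc_net_dev(text):
    """Return {iface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # field 0 is received bytes, field 8 is transmitted bytes
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def link_utilization(before, after, seconds, link_bits_per_sec=1e9):
    """Fraction of the link used by the busier direction over the interval."""
    rx = (after[0] - before[0]) * 8 / seconds
    tx = (after[1] - before[1]) * 8 / seconds
    return max(rx, tx) / link_bits_per_sec


def diagnose(utilization, speedup_fraction, saturated_at=0.9, full_at=0.9):
    """Apply the rule above; both thresholds are arbitrary assumptions."""
    if speedup_fraction >= full_at:
        return "ok"
    return "bandwidth" if utilization >= saturated_at else "latency"


if __name__ == "__main__":
    iface, interval = "eth0", 5.0  # hypothetical interface name
    with open("/proc/net/dev") as f:
        before = parse_proc_net_dev(f.read())[iface]
    time.sleep(interval)
    with open("/proc/net/dev") as f:
        after = parse_proc_net_dev(f.read())[iface]
    used = link_utilization(before, after, interval)
    print(f"{iface}: {used:.1%} of a gigabit link")
```

run it on the broker once per port while the job is going, then pass the utilization and your measured speedup to diagnose().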

test with different algorithms, different attack modes, etc. you will get different results for various combinations of each.