November 8, 2011

Mark managed to push some packets down our 100G link this morning using a test set. He was sending at 19 Gbit/s and we didn't observe any packet loss.


Files produced with dd from /dev/zero have some interesting properties when stored on the SSDs. It would appear that writing zeros to an SSD gives artificially good results. We haven't discovered the cause of this, but it definitely has an impact. Since switching to files produced from /dev/urandom, our disk-to-disk performance has deteriorated significantly:
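The zero-file effect is easy to reproduce. A minimal sketch of the two ways of generating test files (sizes scaled down to 1 MB here; the real tests used multi-GB files):

```shell
# Create a 1 MB all-zero file and a 1 MB random file with dd.
# (Scaled down for illustration; the actual tests used GB-scale files.)
dd if=/dev/zero    of=/tmp/zero.dat bs=1M count=1 2>/dev/null
dd if=/dev/urandom of=/tmp/rand.dat bs=1M count=1 2>/dev/null
# One plausible explanation (an assumption, not confirmed in this log):
# some SSD controllers compress or deduplicate data internally, so an
# all-zero stream is "written" far faster than incompressible data.
```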

In an effort to get this back to something more reasonable, we pulled two drives out of scdemo08 and placed them in scdemo06 and scdemo07. This improved performance by about 12 percent. (We might be able to add a few more drives.)

We did some more investigation and discovered that we get different results doing a pure read of all-zero files versus random files. For example:
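A simple way to see the read-side difference is to time a sequential dd read of each file type. This is a sketch with sizes scaled down; on the real systems the page cache would be dropped between runs (as root: sync; echo 3 > /proc/sys/vm/drop_caches) so the reads actually hit the disk:

```shell
# Generate the two file types (4 MB here; real tests were GB-scale).
dd if=/dev/zero    of=zero.dat bs=1M count=4 2>/dev/null
dd if=/dev/urandom of=rand.dat bs=1M count=4 2>/dev/null
# Pure sequential read of each; dd reports elapsed time and
# throughput on stderr, which is what we compare.
dd if=zero.dat of=/dev/null bs=1M
dd if=rand.dat of=/dev/null bs=1M
```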

November 6, 2011

Figure 1: Preliminary results from testing on Sunday morning, Nov 6. In summary, simultaneous iperf is good from all hosts (9.9+ Gbit/s). Reading from disk and writing to memory is reasonable at 7.5 to 8.5 Gbit/s, but I think it could do with some improvement. Disk-to-disk performance is only 5.0 Gbit/s and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10 GB files written with 'dd'. Graph pulled from Cacti for the Brocade.

Now attempting to improve disk performance by removing LVM and changing the RAID controller to write-through. Also note that scdemo00 and scdemo01 had their RAID stripe size set at 64 kB.
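The write-policy change depends on the controller's CLI. As a sketch, assuming an LSI MegaRAID controller managed with MegaCli (the log does not actually say which controller these nodes have), switching all logical drives to write-through would look something like:

```shell
# Hypothetical sketch: set write-through (WT) on all logical drives of
# adapter 0. Assumes an LSI MegaRAID controller, which this log does
# not confirm; other controllers use a different tool and syntax.
MegaCli -LDSetProp WT -LAll -a0
# Verify the resulting cache policy:
MegaCli -LDGetProp -Cache -LAll -a0
```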

Figure 2: After changing the RAID configuration to write-through and using large 10 GB files created with 'dd', we see much improved disk-to-disk throughput (as shown in the two FDT outputs immediately above). Strangely, one direction is nearly 0.8 Gbit/s faster than the other; I don't understand the reason for this yet.