The "<file-lists-directory>" contains one or more files, each listing the paths of the ATLAS files to be copied. The name of each file within this directory is the fully qualified host name of the destination host; e.g., the list of ATLAS files contained in scdemo06.heprc.uvic.ca will be copied via FDT to scdemo06.
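
A minimal sketch of how this convention could be driven, assuming the standard FDT jar with its -c (destination host), -d (destination directory) and -fl (file list) options; the directory path, jar location, and /ssd destination are illustrative assumptions:

    #!/usr/bin/env python
    # Dispatch one FDT push per file in the <file-lists-directory>.
    import os
    import subprocess

    FILE_LISTS_DIR = "/path/to/file-lists"   # the <file-lists-directory> (assumed path)
    DEST_DIR = "/ssd"                        # assumed destination filesystem

    for host in os.listdir(FILE_LISTS_DIR):
        # Each file name is the fully qualified destination host,
        # e.g. scdemo06.heprc.uvic.ca; its contents list the ATLAS files to push.
        file_list = os.path.join(FILE_LISTS_DIR, host)
        subprocess.check_call([
            "java", "-jar", "fdt.jar",
            "-c", host,        # push to this destination host
            "-d", DEST_DIR,    # destination directory on the remote side
            "-fl", file_list,  # local paths of the ATLAS files to copy
        ])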

November 11, 2011

Test nodes consisted of scdemo06 and scdemo07. In all tests, scdemo06 was the FDT client, pushing 89 x 11 GB ATLAS ROOT files from its /ssd filesystem to the /ssd filesystem on scdemo07, which ran the FDT server. Both client and server ran under non-privileged accounts, the I/O scheduling algorithm was set to "noop", and the /ssd filesystems on both systems were XFS on hardware RAID0 of six OCZ Deneva 2 drives, with a 1 MB stripe size and write-through cache.
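
For reference, a minimal sketch of forcing the "noop" scheduler on the RAID volume behind /ssd (run as root; the device name sdb is an assumption, and the sysfs path is the standard Linux interface):

    #!/usr/bin/env python
    # Set the I/O scheduler for the hardware RAID0 volume backing /ssd.
    RAID_DEVICE = "sdb"   # assumed device name of the RAID0 volume

    path = "/sys/block/%s/queue/scheduler" % RAID_DEVICE
    with open(path, "w") as f:
        f.write("noop")           # select the noop elevator
    with open(path) as f:
        print(f.read().strip())   # active scheduler is shown in [brackets]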

The first test following PDClear was terminated prematurely because the performance was atrocious (Avg: 3.575 Gb/s after 07.10%). The server had been running as root and the default scheduling algorithm had been in effect.

Corrected the scheduling algorithm and the UID of the server, then ran the second test to completion. The result was poor (Avg: 4.655 Gb/s after 100.00%), although the start had been very good (Avg: 6.539 Gb/s after 01.68%).

A third test was conducted to see whether a previously used disk (after the PDClear) performed better. The result was the previously expected level of performance: Avg: 5.661 Gb/s after 100.00%.

November 8, 2011

Mark managed to dump some packets down our 100G link this morning using a test set. We didn't observe any packet loss. He was sending at 19 Gbit/s.

Files produced via dd and /dev/zero have some interesting properties when stored on the SSDs. It would appear that writing zeros to an SSD gives artificially good results. We haven't discovered the cause of this, but it definitely has an impact. We switched to using files produced from /dev/urandom, and our disk-to-disk performance has deteriorated significantly.
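
A minimal sketch of generating an incompressible test file, the equivalent of dd'ing from /dev/urandom instead of /dev/zero (the file name and ~10 GB size are assumptions):

    #!/usr/bin/env python
    # Write ~10 GB of random bytes so the SSD controller cannot
    # shortcut all-zero blocks.
    import os

    CHUNK = 1024 * 1024    # 1 MiB per write
    N_CHUNKS = 10 * 1024   # ~10 GiB total

    with open("/ssd/random-test.dat", "wb") as out:
        for _ in range(N_CHUNKS):
            out.write(os.urandom(CHUNK))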

In an effort to get this back to something more reasonable, we pulled two drives out of scdemo08 and placed them in scdemo06 and scdemo07. This improved performance by about 12 percent. (We might be able to add a few more drives.)

We did some more investigation and discovered that we get different results doing a pure read of all-zero files versus random files. For example:
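
A minimal sketch of how such a pure-read comparison could be reproduced (the file paths are assumptions; caches should be dropped between runs, e.g. via /proc/sys/vm/drop_caches, for the numbers to be meaningful):

    #!/usr/bin/env python
    # Time a sequential read of a zero-filled file versus a random-filled one.
    import time

    def read_gbps(path, chunk=1024 * 1024):
        start, nbytes = time.time(), 0
        with open(path, "rb") as f:
            while True:
                buf = f.read(chunk)
                if not buf:
                    break
                nbytes += len(buf)
        return nbytes * 8 / (time.time() - start) / 1e9   # Gbit/s

    for path in ("/ssd/zero-test.dat", "/ssd/random-test.dat"):
        print("%s: %.2f Gbit/s" % (path, read_gbps(path)))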

November 6, 2011

Figure 1: This shows preliminary results from testing on Sunday morning, Nov 6. The summary is that simultaneous iperf is good from all hosts (9.9+ Gbps). Reading from disk and writing to memory is reasonable at 7.5-8.5 Gbit/s, but I think it could do with some improvement. Disk-to-disk performance is only 5.0 Gbps and not really adequate for the test. We are using 2000 ATLAS files of around 500 MB on each node for the transfers. We had better performance with 10G files written with 'dd'. The graph was pulled from Cacti for the Brocade.
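
A minimal sketch of the kind of simultaneous memory-to-memory iperf check described above (the peer host names and durations are assumptions, and an iperf server is assumed to be listening on each peer):

    #!/usr/bin/env python
    # Launch iperf clients against several peers at once and wait for all.
    import subprocess

    PEERS = ["scdemo00.heprc.uvic.ca", "scdemo01.heprc.uvic.ca"]   # assumed hosts

    procs = [subprocess.Popen(["iperf", "-c", host, "-t", "60", "-i", "10"])
             for host in PEERS]
    for p in procs:
        p.wait()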

Now attempting to improve disk performance by removing LVM and changing the RAID controller to write-through. Also note that scdemo00 and scdemo01 had their RAID stripe size set at 64 KB.

Figure 2: After changing the RAID configuration to write-through and using large 10G files created with 'dd', we see much improved disk-to-disk throughput (as shown in the two FDT outputs immediately above). Strangely, we see that one direction is nearly 0.8 Gbps faster than the other. I don't understand the reason for this yet.