Measuring Disk IO on Linux using dd

Ever since we started developing SensePlace2, our Twitter analytics system, we have run into numerous IO issues when reading from and writing to its PostgreSQL database backend. We started out with the database on a RAID 10 array of 4×7.2K 1TB Seagate Constellation drives, which was not the best option, but we were constrained by our budget.

This year we upgraded to a mix of 8x600GB 15K Seagate Cheetah drives, on which we initialized 2 x RAID 10 arrays for data; 4x146GB 15K Seagate Cheetah drives as a single RAID 10 array for indexes; and 6x146GB 15K 3Gbps Fujitsu drives as a single RAID 10 array for the PostgreSQL WAL and our Lucene indexes (all drives are 6Gbps SAS unless specified). Not surprisingly, we noticed significant performance improvements (a factor of 100 would be an understatement), which prompted me to run a quick IO performance test.

These tests were run on a dual hexa-core Xeon X5650 @ 2.67GHz in a Dell 810 chassis with a Dell PERC 700 RAID card. In addition, a Dell PowerVault MD1200 is connected to the 810 chassis through a PERC 800 RAID adapter. I use the Linux command dd for all the tests below.

To measure sequential write performance, the following command writes 512 blocks of 1MiB each (512MiB in total) of zeros to a file on the target volume (here /r6).

dd if=/dev/zero of=/r6/outfile count=512 bs=1024k

The following command measures the sequential read rate by reading that file back and discarding the data.

dd if=/r6/outfile of=/dev/null bs=4096k
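
One caveat with plain dd runs like the two above: they go through the Linux page cache, so the reported rates can partly reflect memory speed rather than the disks. Below is a minimal sketch of a variant that tries to limit this. It is not the procedure used for the numbers in this post. TARGET_DIR, COUNT and DROP_CACHES are hypothetical variables introduced for illustration, the default of 16 blocks is only there to keep the example quick, and conv=fdatasync assumes GNU dd.

```shell
#!/bin/sh
# Sketch only: sequential write/read test that tries to limit page-cache effects.
# TARGET_DIR, COUNT and DROP_CACHES are illustrative names, not from the original post.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"   # point this at the array under test, e.g. /r6
OUTFILE="$TARGET_DIR/dd_outfile"
COUNT="${COUNT:-16}"                       # number of 1MiB blocks; use far more for a real test

# Write test: conv=fdatasync makes dd flush the file data to disk before it
# reports a rate, so the elapsed time includes the actual write-out.
dd if=/dev/zero of="$OUTFILE" bs=1024k count="$COUNT" conv=fdatasync

# Optionally drop the page cache so the read below has to hit the disks.
# This needs root, so it only runs when DROP_CACHES=1 is set explicitly.
if [ "${DROP_CACHES:-0}" = "1" ]; then
    sync
    echo 3 > /proc/sys/vm/drop_caches
fi

# Read test: same form as the original command, reading the file back and discarding it.
dd if="$OUTFILE" of=/dev/null bs=4096k

# Record the file size as a sanity check, then clean up the test file.
WRITTEN_BYTES=$(wc -c < "$OUTFILE")
echo "wrote and read back $WRITTEN_BYTES bytes"
rm -f "$OUTFILE"
```

For a meaningful measurement, a test file larger than the machine's RAM (or an explicit cache drop as root) is the safer choice, since otherwise the read pass may be served largely from memory.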

Figure: IO read/write performance

The figure above illustrates the IO read/write performance on all disks. The 4×7.2K RAID 10 array has the lowest performance; its write rates are particularly poor (about 400MB/s). The 15K Cheetahs performed exceptionally in all cases (>1GB/s write and 5GB/s read). The only outlier was the 6x15K Fujitsu RAID 10 array; these drives were scavenged from one of our older servers and have a 3Gbps SAS adapter, so their performance was slightly lower than that of the other 15K drives.

These tests cover sequential reads and writes only, and I agree that a random read/write test would also make sense. Based on these numbers, however, I expect the 15K drives to blow away the 7.2K drives in random IO tests as well, since the random read latency of the 15K drives is about half that of the 7.2K drives. Buying hard drives is therefore a difficult tradeoff between storage size, required IO performance, and cost (of both the hard drives and the RAID hardware).
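
The "about half" latency figure can be sanity-checked with simple arithmetic on spindle speed. On average the platter has to turn half a revolution before the requested sector passes under the head, so the average rotational latency is 0.5 × 60000 / RPM milliseconds. The sketch below computes only this rotational component. Seek time adds to it and cannot be derived from RPM, so this is generic arithmetic, not a measurement of the drives above. The function name rot_latency_ms is made up for this example.

```shell
#!/bin/sh
# Average rotational latency in milliseconds: half a revolution = 0.5 * 60000 / RPM.
# rot_latency_ms is a hypothetical helper introduced for this illustration.
rot_latency_ms() {
    LC_ALL=C awk -v rpm="$1" 'BEGIN { printf "%.2f", 0.5 * 60000 / rpm }'
}

LAT_7200=$(rot_latency_ms 7200)
LAT_15000=$(rot_latency_ms 15000)

echo "7.2K rpm: $LAT_7200 ms average rotational latency"
echo "15K rpm:  $LAT_15000 ms average rotational latency"
```

This gives roughly 4.17 ms for a 7.2K drive against 2.00 ms for a 15K drive, which is consistent with the factor of about two mentioned above.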