This article addresses the sometimes incorrect use of the “dd” program to measure disk write performance on a VPS by some visitors of the lowendbox.com website. It is originally based on this question and my answer to it.

Initially published on 2010-11-29, and still really useful

Q: What is the difference between the following?

dd bs=1M count=128 if=/dev/zero of=test

dd bs=1M count=128 if=/dev/zero of=test; sync

dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync

dd bs=1M count=128 if=/dev/zero of=test oflag=dsync

A: The difference is in the handling of the write cache in RAM:

dd bs=1M count=128 if=/dev/zero of=test
The default behaviour of dd is to not “sync” (i.e. not ask the OS to completely write the data to disk before dd exits). The above command will just commit your 128 MB of data into a RAM buffer (write cache). This will be really fast, and it will show you a hugely inflated benchmark result right away. However, the server is still busy in the background, continuing to write out data from the RAM cache to disk.

dd bs=1M count=128 if=/dev/zero of=test; sync
Absolutely identical to the previous case. Anyone who understands how a *nix shell works should know that adding ; sync does not affect the operation of the previous command in any way, because it is executed independently, after the first command completes. So your (wrong) MB/sec value is already printed on screen while that sync is only preparing to be executed.
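
If you want to keep the plain dd followed by sync and still have the flush counted, the timing has to wrap both commands. Here is a minimal sketch, assuming GNU date (for %N) and GNU dd. The file name ddtest.tmp is a hypothetical scratch file, and the MiB/s figure is computed by the script itself, not reported by dd:

```shell
# Sketch: time dd and sync together so the flush is part of the measurement.
# Assumes GNU date (%N = nanoseconds); ddtest.tmp is a hypothetical scratch file.
testfile=ddtest.tmp
count=128                                  # number of 1 MiB blocks, as in the article

start=$(date +%s%N)
# dd's own (cache-inflated) statistics go to stderr; they are discarded here.
dd bs=1M count="$count" if=/dev/zero of="$testfile" 2>/dev/null
sync                                       # on Linux, waits until cached data is written out
end=$(date +%s%N)

elapsed_ms=$(( (end - start) / 1000000 ))
[ "$elapsed_ms" -gt 0 ] || elapsed_ms=1    # avoid dividing by zero on very fast storage
written=$(wc -c < "$testfile")
mbps=$(( written / 1048576 * 1000 / elapsed_ms ))
echo "wrote $written bytes in ${elapsed_ms} ms (~${mbps} MiB/s including sync)"
rm -f "$testfile"
```

Note that sync flushes everything the system has cached, not just this one file, so other activity on the machine can inflate the measured time.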

dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync
This tells dd to require a complete “sync” once, right before it exits. So it commits the whole 128 MB of data, then tells the operating system: “OK, now ensure this is completely on disk”, only then measures the total time it took to do all that and calculates the benchmark result.

dd bs=1M count=128 if=/dev/zero of=test oflag=dsync
Here dd will ask for completely synchronous output to disk, i.e. ensure that its write requests don’t even return until the submitted data is on disk. In the above example, this will mean sync’ing once per megabyte, or 128 times in total. It will probably be the slowest mode, as the write cache is basically not used at all in this case.
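
Since every write waits for the disk in this mode, the number of waits follows count= rather than the total size. A small sketch, assuming GNU dd and hypothetical scratch file names, writes the same 16 MiB both ways so the two speeds dd reports can be compared:

```shell
# Same amount of data, different number of synchronous writes.
# dsync_small.tmp and dsync_big.tmp are hypothetical scratch file names.
dd bs=1M  count=16 if=/dev/zero of=dsync_small.tmp oflag=dsync   # 16 blocks, each written synchronously
dd bs=16M count=1  if=/dev/zero of=dsync_big.tmp   oflag=dsync   # one large block, written synchronously

size_small=$(wc -c < dsync_small.tmp)
size_big=$(wc -c < dsync_big.tmp)
echo "file sizes: $size_small and $size_big bytes"
rm -f dsync_small.tmp dsync_big.tmp
```

On most storage the first command should report the lower speed, though how large the gap is depends on the disk and on the virtualization layer.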

Which one do you suggest?

dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync

This behaviour is perhaps the closest to the way real-world tasks behave. If your server or VPS is really fast and the above test completes in a second or less, try increasing the count= number to 1024 or so, to get a more accurate averaged result.
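
To make the count easy to vary and to clean up the test file afterwards, the recommended command can be wrapped in a small function. This is only a sketch: bench_write and ddbench.tmp are hypothetical names, not part of dd, and GNU dd is assumed.

```shell
# bench_write is a hypothetical helper, not part of dd.
# Usage: bench_write COUNT   (COUNT = number of 1 MiB blocks; defaults to 128)
bench_write() {
    count=${1:-128}
    testfile=ddbench.tmp    # hypothetical scratch file in the current directory
    # dd prints its statistics on stderr; the last line is the summary with the speed.
    summary=$(dd bs=1M count="$count" if=/dev/zero of="$testfile" conv=fdatasync 2>&1 | tail -n 1)
    rm -f "$testfile"
    echo "$summary"
}
```

Running bench_write 1024 then writes about 1 GiB, so make sure that much free space is available in the current directory first.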

dd needs root (directly or via sudo) only when its input or output is something an ordinary user cannot access, such as a raw disk device. Writing a test file in a directory you own, as in the commands above, works as a normal user. Either way, once given permission to run, dd has no qualms about doing whatever the hell it’s told to do, including erasing your MBR or partition tables. dd will also (politely) not tell you that you are being dumb. So, unless the user is extremely careful and knows exactly what they’re doing, dd is extremely dangerous. I’d say a big blinking warning is more than deserved!
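
One cheap safeguard is to check the of= target before dd ever runs. The sketch below uses a hypothetical helper, safe_dd_target, and a hypothetical scratch file ddguard.tmp. It only catches block devices such as /dev/sda, so it is not a complete safety net:

```shell
# safe_dd_target is a hypothetical helper: it refuses targets that are block
# devices, which is how a mistyped of= ends up overwriting a disk.
safe_dd_target() {
    if [ -b "$1" ]; then
        echo "refusing to write to block device: $1" >&2
        return 1
    fi
    return 0
}

target=ddguard.tmp          # hypothetical scratch file; a regular file passes the check
if safe_dd_target "$target"; then
    dd bs=1M count=8 if=/dev/zero of="$target" conv=fdatasync
    target_size=$(wc -c < "$target")
    rm -f "$target"
fi
```

The check still lets dd overwrite an existing regular file, so choosing a scratch file name that cannot collide with real data remains up to you.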

Also, regarding sudo versus root:
root is exactly what you think: complete access and control. sudo is a program that grants specific users root-level rights under certain, somewhat broad conditions. The thing is, sudo is typically set up to allow all root commands, making the distinction moot. Ubuntu does require explicit paths for certain utilities when used with sudo, but this is about all the “obfuscating” they do AFAIK.