Thick vs Thin Disks and All Flash Arrays

Which type of disk is better?

There has long been a debate about whether thin or thick disks are better suited for high performance IO workloads. Eager Zeroed Thick disks provide the best performance, but they occupy their entire allocated capacity, including space the overlaid file system never uses, and are therefore inefficient in space utilization. Thin disks consume only the space actually used by the overlaid operating system or application, but raise performance concerns for high IO workloads.

Requirements such as eager zeroed thick disks for high IO workloads have made the job of both virtualization and storage administrators harder. Deviating from the norm of thin disks for traditional virtual machines, and separating out LUNs with special formatting, takes extra effort and coordination, and these special requirements are operationally difficult to implement and maintain.

Flash storage has improved performance many fold, and all flash storage arrays hold the promise of consistently high performance. Can the use of all flash arrays obviate the need for separate storage LUNs and special disk formats for high IO workloads?

Common VMDK disk types:

Eager Zeroed Thick: All space is allocated at creation time and wiped clean of any previous contents on the physical media. Such disks can take longer to create than other disk formats. The entire disk space is reserved and unavailable for use by other virtual machines.

Lazy Zeroed Thick (Thick): All space is allocated at creation time, but it may still contain stale data on the physical media. Each block must be zeroed before it is first written, which adds an extra zeroing operation on first writes compared to eager zeroed disks. The entire disk space is reserved and unavailable for use by other virtual machines.

Thin: Space for a thin-provisioned virtual disk is allocated and zeroed on demand as it is used. Unused space remains available for use by other virtual machines.
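
For readers automating this, the provisioning type is simply a property of the virtual disk backing when a disk is added through the vSphere API. The sketch below (Python with pyVmomi; the helper name make_disk_spec and the controller values are illustrative, not something used in this article) shows how the three types map onto the thinProvisioned and eagerlyScrub flags.

```python
from pyVmomi import vim

def make_disk_spec(size_gb, disk_type, controller_key, unit_number):
    """Build a VirtualDeviceSpec that adds a new VMDK of the given type.

    disk_type is one of 'thin', 'lazyzeroedthick', 'eagerzeroedthick'.
    (Illustrative helper; not part of the article's test setup.)
    """
    backing = vim.vm.device.VirtualDisk.FlatVer2BackingInfo()
    backing.diskMode = 'persistent'
    # The provisioning type is expressed through two booleans on the backing:
    #   thin               -> thinProvisioned=True
    #   lazy zeroed thick  -> thinProvisioned=False, eagerlyScrub=False
    #   eager zeroed thick -> thinProvisioned=False, eagerlyScrub=True
    backing.thinProvisioned = (disk_type == 'thin')
    backing.eagerlyScrub = (disk_type == 'eagerzeroedthick')

    disk = vim.vm.device.VirtualDisk()
    disk.backing = backing
    disk.capacityInKB = size_gb * 1024 * 1024
    disk.controllerKey = controller_key
    disk.unitNumber = unit_number

    spec = vim.vm.device.VirtualDeviceSpec()
    spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
    spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.create
    spec.device = disk
    return spec

# Example: a 500 GB eager zeroed thick data disk (as used in the tests below),
# attached to an existing SCSI controller (key 1000) at unit 1.
spec = make_disk_spec(500, 'eagerzeroedthick', controller_key=1000, unit_number=1)
# vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=[spec])) would apply it.
```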

Setup:

The following components were used for testing VAAI Block Zero performance on an all Flash storage array:

The Tests:

The test virtual machine was a Windows 2008 R2 based virtual machine with 500 GB data disks of each type (Thin, Lazy Zeroed Thick, Eager Zeroed Thick) added and removed after each set of tests. IOMETER was used to profile the different disk types. The tests ran 100% write workloads with 4K, 64K and 256K block sizes in both 100% sequential and 100% random scenarios. The virtual disks were attached to Windows in raw mode and left unformatted for the IOMETER testing. The results are tabulated in the Results section below.

Since the disks were presented raw and created fresh for each test, the first-write penalty is fully exposed: random write results would be better if the blocks had been written to earlier. The tests therefore represent the worst case for thin and lazy zeroed thick disks.
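
IOMETER produced the numbers below, but as a rough illustration of the access specifications used (100% writes at a fixed block size, either sequential or random), a sketch along these lines reproduces a similar pattern. It is only a stand-in: the function name write_workload is made up, and it issues ordinary buffered writes rather than the raw, unbuffered IO that IOMETER drives, so the numbers it reports will not match the table.

```python
import os
import random
import time

def write_workload(path, size_bytes, block_size, pattern="sequential", duration_s=10):
    """Issue a 100% write workload and report IOPS, MBps and average latency (ms).

    Rough analogue of the IOMETER access specs used in the article:
    4K/64K/256K blocks, 100% sequential or 100% random writes.
    Note: plain buffered writes, so this only illustrates the access
    pattern, not the measured array performance.
    """
    buf = os.urandom(block_size)
    n_blocks = size_bytes // block_size
    fd = os.open(path, os.O_WRONLY)
    latencies = []
    i = 0
    start = time.monotonic()
    try:
        while time.monotonic() - start < duration_s:
            block = random.randrange(n_blocks) if pattern == "random" else i % n_blocks
            i += 1
            t0 = time.monotonic()
            os.pwrite(fd, buf, block * block_size)  # positioned write at the chosen offset
            latencies.append(time.monotonic() - t0)
    finally:
        os.close(fd)
    elapsed = time.monotonic() - start
    iops = len(latencies) / elapsed
    mbps = iops * block_size / (1024 * 1024)
    avg_ms = 1000 * sum(latencies) / len(latencies)
    return iops, mbps, avg_ms

# Example: 64K, 100% random writes against a 500 GB raw test target for 30 seconds.
# print(write_workload("/dev/sdb", 500 * 1024**3, 64 * 1024, "random", 30))
```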

Results:

The table below shows the results of the IOMETER tests.

| Block Size | Target Name | Write IOps | Write MBps | Average Response Time (ms) |
| --- | --- | --- | --- | --- |
| 4K | Thin | 3105.31 | 12.13 | 0.32 |
| 4K | Thin Random | 421.65 | 1.65 | 2.37 |
| 4K | Lazy Thick | 3097.94 | 12.10 | 0.32 |
| 4K | Lazy Thick Random | 421.65 | 1.65 | 2.37 |
| 4K | Eager Thick | 3298.12 | 12.88 | 0.30 |
| 4K | Eager Thick Random | 3112.70 | 12.16 | 0.32 |
| 64K | Thin | 1070.54 | 66.91 | 0.93 |
| 64K | Thin Random | 410.51 | 25.66 | 2.43 |
| 64K | Lazy Thick | 1088.20 | 68.01 | 0.92 |
| 64K | Lazy Thick Random | 408.46 | 25.53 | 2.45 |
| 64K | Eager Thick | 1211.65 | 75.73 | 0.82 |
| 64K | Eager Thick Random | 1141.34 | 71.33 | 0.87 |
| 256K | Thin | 566.34 | 141.58 | 1.76 |
| 256K | Thin Random | 341.37 | 85.34 | 2.93 |
| 256K | Lazy Thick | 567.09 | 141.77 | 1.76 |
| 256K | Lazy Thick Random | 342.75 | 85.69 | 2.92 |
| 256K | Eager Thick | 648.77 | 162.19 | 1.54 |
| 256K | Eager Thick Random | 668.88 | 167.22 | 1.49 |

For sequential workloads, latencies are below a millisecond for all block sizes except 256K, and the type of disk has minimal impact. For random workloads, Thin and Lazy Zeroed Thick disks show a marked drop in throughput and increase in latency, while Eager Zeroed Thick disks are unaffected.
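
As a quick sanity check, the IOPS and throughput columns in the table are consistent with each other: Write MBps is just Write IOps multiplied by the block size (with 1 MB = 1024 x 1024 bytes). A few rows verified in Python:

```python
# Sanity check: Write MBps = Write IOps * block size (1 MB = 1024 * 1024 bytes).
# Values are taken directly from the results table above.
rows = [
    # (block size in bytes, target name, write IOps, reported MBps)
    (4 * 1024,   "Thin",               3105.31, 12.13),
    (4 * 1024,   "Eager Thick Random", 3112.70, 12.16),
    (64 * 1024,  "Eager Thick",        1211.65, 75.73),
    (256 * 1024, "Thin Random",         341.37, 85.34),
]

for block_size, name, iops, reported_mbps in rows:
    computed = iops * block_size / (1024 * 1024)
    print(f"{name:>20}: computed {computed:6.2f} MBps vs reported {reported_mbps:6.2f}")
```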

All flash arrays like Pure Storage store data in a compressed and optimized format, so thin or thick provisioning at the VMware level has minimal impact on the space actually consumed. With VAAI enabled, creating eager zeroed thick VMDKs does not take long either: we were able to create a 500 GB eager zeroed thick disk in less than a minute, so provisioning time is not a concern. The following graphic shows the latencies in milliseconds for the different disks at a 64K block size.

Latencies for all tests stayed below 5 ms. The higher latencies seen in the random IO tests were due to additional IO delays at the OS layer, not at the storage array; latencies measured at the array itself were always below 1 ms. The following graphic shows the write IOPS for the different disks at a 64K block size.

The following graphic shows the write MBps for the different disks at a 64K block size.

Looking at the performance numbers, Eager Zeroed Thick disks perform best in all tests. Thin and Lazy Zeroed Thick disks also perform well for sequential workloads, staying within 1 ms latencies.

Conclusion:

Based on this testing, there is no significant difference in sequential performance between the different types of virtual disks. As the sub-5 ms latencies across all workloads show, most common IO workloads on all flash arrays will not be noticeably affected by the type of underlying disk used.

Some of the disadvantages of using Eager Zeroed Thick disks are avoided on all flash arrays:

Provisioning time: on all flash arrays like Pure Storage, disk creation leveraging VAAI is very quick, completing in a few minutes even for large datastores.

Storage utilization: inefficiency is avoided because the storage actually consumed is optimized for both thin and thick disks by the built-in compression and data reduction capabilities of the storage platform.
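
To make that second point concrete, here is a back-of-the-envelope sketch (the 3:1 data reduction ratio and 120 GB of written data are assumed numbers, purely for illustration): on an array that reduces data inline and does not store zeroed-but-unwritten space, a thin disk and an eager zeroed thick disk of the same size consume roughly the same physical capacity.

```python
def physical_usage_gb(provisioned_gb, written_gb, reduction_ratio):
    """Approximate physical capacity consumed on a data-reducing all flash array.

    Zeroed-but-unwritten space costs (almost) nothing on such arrays, so only
    the written data divided by the data reduction ratio counts. Illustrative only.
    """
    written = min(written_gb, provisioned_gb)
    return written / reduction_ratio

# Hypothetical example: a 500 GB disk holding 120 GB of guest data at a 3:1
# data reduction ratio consumes about the same array space whether it is
# provisioned thin or eager zeroed thick.
for disk_type in ("thin", "eager zeroed thick"):
    print(disk_type, physical_usage_gb(500, 120, 3.0), "GB on the array")
```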

For high IO workloads, we can conclude that eager zeroed thick disks still provide the best performance, even on all flash arrays, across all types of write workloads. If the storage array is VAAI compliant and has built-in efficiencies to compress and store only unique data, most of the disadvantages of eager zeroed provisioning are avoided.

Comments

Bowie

That’s what I was going to say. I could get that kind of performance with about a dozen traditional SAS drives… WTF.

Then, I immediately began imagining what drug or combination of drugs the author was on when he wrote this article for not noticing, “Gee, I’m getting terrible throughput here.”

If you’re carving out and pushing I/O to a single LUN, you’re setting up an inherently bad design. Just stop and think about what you’re doing from a logical perspective. You have created the human equivalent of a stadium full of people entering and exiting through one single turnstile; with one LUN, you are effectively serializing your I/O. It is a massive, massive bottleneck. You need to distribute your I/O if you expect to fully leverage what your SSD array (or any array, for that matter) is offering.

I am constantly amazed by the number of people who don’t understand the concept of ensuring parallelism at the SAN layer…this is 101, folks. You can’t just carve out a single giant LUN and say “DERP, ITS OKAY ITS TEH SSD”… Looking at the benchmark results, about the only thing it reveals is how terrible a job Windows is doing at write-combining at the device layer.

JZ

March 10th, 2015

Well said. These test results are ridiculous. This article does accidentally make an important point.

Spending a ton of money on a basic all flash array changes nothing in how that array needs to be integrated into your environment. If you’re frustrated wasting all that time managing an EMC or NetApp array, don’t expect that bringing in a flash array is going to make life much simpler. Managing storage for virtual workloads is complicated.

If you don’t know what you’re doing, you may get similar results to this blog post: around 1,000 IOPS at 1-3 ms (at 64K)…instead of up to 150,000 IOPS at <1 ms like you'd expect looking at Pure's marketing material.

IOPS and throughput are only part of the problem. Bottlenecks inherent to LUNs need to be accounted for. Storage management is a pain in the ass. All the IOPS in the world aren't going to change that.
