Dispelling Myths about IOPS

IOPS alone are not a relevant measure of performance unless they are measured with a defined workload and correlated with response time and cost. The best way to measure relative performance is a comparison between products whose configurations are fully disclosed, run against a common workload that represents a real production environment, with results certified by a third party. That is what the Storage Performance Council SPC-1 tests provide. Buyers should require SPC-1 results from their vendors to verify performance claims.

MYTH #1: IOPS ARE NOT RELEVANT

MYTH #2: LOCAL STORAGE IS ALWAYS FASTER THAN SHARED STORAGE

MYTH #3: HYPER CONVERGED SCALE OUT SYSTEMS CAN MATCH THE PERFORMANCE OF SCALE UP SYSTEMS

All flash storage performance testing

Only by controlling for garbage collection, using a standard, data-reducible data load, and returning to a cache-friendly (or at least write-cache-friendly) workload will we truly understand all-flash storage performance.

There are some serious problems with measuring the IO performance of all-flash arrays using the methods we use on disk storage systems. Mostly, these are due to the inherent differences between current flash- and disk-based storage.

NAND garbage collection

First off, garbage collection is required by any SSD or NAND storage to be able to write data. Garbage collection coalesces free space by moving non-modified data to new pages/blocks and freeing up the space held by old, no-longer current data.

The problem is that NAND garbage collection takes place only after a suitable amount of write activity. Measuring all-flash array performance without taking garbage collection into account is misleading at best and dishonest at worst.

The only way to control for garbage collection is to write lots of data to an all-flash storage system and measure its performance over a protracted period of time. How long this takes depends on the amount of storage in the all-flash array, but filling it to 75% of its capacity and then measuring IO performance as you fill another 10-15% of its capacity with new data should suffice. Of course, this would all have to be done consecutively, without any idle time between runs (which would allow garbage collection to sneak in).
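To make the fill-then-measure idea concrete, here is a minimal sketch that sizes the two phases for a given array. The names `PreconditionPlan`, `plan_precondition` and `measured_phase_seconds` are made up for illustration, and the defaults simply reuse the 75% and 10% figures above; nothing here comes from a real benchmark tool.

```python
from dataclasses import dataclass


@dataclass
class PreconditionPlan:
    fill_bytes: int      # data written before any measurement starts
    measure_bytes: int   # new data written while performance is recorded


def plan_precondition(capacity_bytes: int,
                      fill_fraction: float = 0.75,
                      measure_fraction: float = 0.10) -> PreconditionPlan:
    """Split a test run into a fill phase and a measured phase (hypothetical helper)."""
    if not 0.0 < fill_fraction < 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    if measure_fraction <= 0.0 or fill_fraction + measure_fraction > 1.0:
        raise ValueError("measured phase must fit in the remaining capacity")
    return PreconditionPlan(
        fill_bytes=int(capacity_bytes * fill_fraction),
        measure_bytes=int(capacity_bytes * measure_fraction),
    )


def measured_phase_seconds(plan: PreconditionPlan,
                           write_bytes_per_second: float) -> float:
    """Rough duration of the measured phase at an assumed sustained write rate."""
    return plan.measure_bytes / write_bytes_per_second
```

The two phases are meant to run back to back, so the duration estimate is mainly useful for checking that the measured window is long enough to catch garbage collection in action.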

What we need is a standard corpus of reducible data for an IO workload. Such data would need to be both compressible and deduplicable. It is unclear where such a data corpus could be found, but one is needed to properly measure all-flash system performance. What would help is some real-world data reduction statistics from a large number of customer installations, which could identify what real-world dedup and compression ratios look like. We could then use these statistics to construct a suitable data load that can be scaled and tailored to required performance needs.

Perhaps SNIA or maybe a big (government) customer could support the creation of this data corpus that can be used for “standard” performance testing. With real world statistics and a suitable data corpus, standard IO benchmarks could control for data reduction on flash arrays and better measure system performance.
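Until such a corpus exists, a synthetic stand-in can at least show what "controlling for data reduction" might look like. The sketch below is purely illustrative: `make_block`, `make_corpus` and `measured_ratios` are hypothetical names, the target ratios are placeholders rather than real-world statistics, and zlib stands in for whatever compression an array actually uses.

```python
import random
import zlib


def make_block(rng, block_size, compress_ratio):
    """One block: a random prefix of about block_size / compress_ratio bytes,
    padded with zeros, which compress to almost nothing."""
    random_len = min(block_size, max(1, int(block_size / compress_ratio)))
    prefix = rng.getrandbits(random_len * 8).to_bytes(random_len, "little")
    return prefix + bytes(block_size - random_len)


def make_corpus(n_blocks, block_size=4096, dedup_ratio=2.0,
                compress_ratio=2.0, seed=0):
    """Build a block stream with roughly the requested reduction ratios."""
    if dedup_ratio < 1.0 or compress_ratio < 1.0:
        raise ValueError("ratios must be at least 1.0")
    rng = random.Random(seed)
    n_unique = max(1, round(n_blocks / dedup_ratio))
    unique = [make_block(rng, block_size, compress_ratio)
              for _ in range(n_unique)]
    # Every unique block appears once; the rest of the stream repeats them.
    blocks = unique + [rng.choice(unique) for _ in range(n_blocks - n_unique)]
    rng.shuffle(blocks)
    return blocks


def measured_ratios(blocks):
    """Return (dedup ratio, zlib compression ratio of the unique blocks)."""
    unique = set(blocks)
    raw = sum(len(b) for b in unique)
    packed = sum(len(zlib.compress(b)) for b in unique)
    return len(blocks) / len(unique), raw / packed
```

Calling `measured_ratios(make_corpus(1000, dedup_ratio=4.0))` is a quick way to check that the generated stream lands near the requested targets before pointing a load generator at it.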

Block IO differences

Third, block heat maps (access patterns) need to become much more realistic. For disk-based systems it was important to randomize the IO stream to minimize the advantage of DRAM caching. But with all-flash storage arrays, cache is less useful, and because flash can’t be rewritten in place, having IO occur to the same block (especially overwrites) causes NAND page fragmentation and more NAND write overhead.
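As a rough illustration of a less uniform heat map, the sketch below draws block addresses from a simple hot/cold split. The function names and the default split (20% of blocks receiving 80% of IOs) are assumptions picked for the example, not measured access statistics.

```python
import random


def skewed_block_stream(n_ios, n_blocks, hot_fraction=0.2,
                        hot_share=0.8, seed=0):
    """Block numbers where hot_share of the IOs land on the first
    hot_fraction of the blocks (a hypothetical hot/cold model)."""
    rng = random.Random(seed)
    n_hot = max(1, int(n_blocks * hot_fraction))
    stream = []
    for _ in range(n_ios):
        if n_hot == n_blocks or rng.random() < hot_share:
            stream.append(rng.randrange(n_hot))            # hot region
        else:
            stream.append(rng.randrange(n_hot, n_blocks))  # cold region
    return stream


def rewrite_fraction(stream):
    """Fraction of IOs that hit a block already touched earlier in the stream."""
    seen = set()
    repeats = 0
    for block in stream:
        if block in seen:
            repeats += 1
        seen.add(block)
    return repeats / len(stream)
```

Setting `hot_share` equal to `hot_fraction` gives a uniform stream, so comparing `rewrite_fraction` for the two cases shows how much more often a skewed workload revisits blocks it has already touched.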

Hitachi VSP G1000: The fastest all flash array validated by SPC-1

by Hu Yoshida on Feb 19, 2015

The Storage Performance Council SPC-1 test results for Hitachi VSP G1000 have just been published on the SPC website, and the results show that the G1000 is the clear leader in storage performance against the leading all-flash storage arrays! VSP G1000 can be configured as an all-flash storage array simply by not adding any spinning disks.

The chart above is constructed from data on the SPC-1 website, which shows the audited results of the SPC-1 test run on different all-flash storage systems. This test consists of a single workload designed to demonstrate the performance of a storage subsystem while performing the typical functions of business-critical applications. Those applications are characterized by predominantly random I/O operations and require both query and update operations. Examples of these types of applications include OLTP, database operations, and mail server implementations. The chart above compares the performance results of several all-flash arrays with the VSP G1000 configured with only 64 Hitachi Flash Module devices.

The results show that the G1000 is the clear performance leader, with less than 1ms response time at over 2 million IOPS! The G1000 had 60 percent greater throughput and one-third the full-load response time of the previous performance leader, which used a specialized, niche DRAM-based system.

This table comes from the SPC-1 report and shows the response time and throughput at different percentages of benchmark load with different Application Storage Unit (ASU) capacities. To net it out, at 100% load, using the total ASU of 64 Flash Module Devices, VSP G1000 produced a throughput of 2,004,941.89 IOPS with an average response time of 0.96ms.
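One way to read throughput and response time together is Little's Law, which says the average number of requests in flight equals throughput multiplied by average response time. The sketch below applies it to the two figures quoted above. The function name is made up for illustration, and the result is a back-of-the-envelope steady-state estimate, not a number taken from the SPC-1 report.

```python
def outstanding_ios(iops: float, response_time_ms: float) -> float:
    """Little's Law: average IOs in flight = throughput x average response time."""
    return iops * (response_time_ms / 1000.0)


# Figures quoted above for VSP G1000 at 100% load.
in_flight = outstanding_ios(2_004_941.89, 0.96)
print(f"about {in_flight:,.0f} IOs in flight on average")
```

With these inputs the estimate comes out to roughly 1,925 concurrent IOs, which is a reminder that an IOPS figure only means something alongside the response time and the load that produced it.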

The G1000 is not only an all-flash array; it is also a full-function enterprise storage system, the first enterprise storage system to achieve US$1.00/SPC-1 IOPS. Since it is an enterprise array, the G1000 also has the capability to dynamically provision and automatically tier data across internal and external pools of lower cost, higher capacity disk drives. While the G1000 provides additional functionality, scalability and cost savings over all-flash arrays, this capability disqualifies the G1000 as an all-flash array according to Gartner. Gartner defines an all-flash or solid-state array as one that cannot attach any disk drives. As a result, you will not find the G1000 in the Gartner Magic Quadrant for solid-state arrays, even though it has, by far, the highest performance of any solid-state or all-flash array. Customers who rely on the Gartner Magic Quadrant may not realize the all-flash benefits of VSP G1000 and HUS VM.

If you are looking for the best all-flash storage system, check out the SPC-1 report on the SPC-1 website first, since performance is the primary reason for an all-flash array. Then check out the Gartner Critical Capabilities for General-Purpose High-End Storage Arrays. Or, if you are into Magic Quadrants, you can check out Gartner’s Magic Quadrant for General-Purpose Disk Arrays and see where the leader in performance ranks in terms of Gartner’s other comparative reports. The G1000 ranks at the top for Gartner’s critical capabilities and HDS ranks near the top in the Magic Quadrant. Unfortunately, the leader in this Magic Quadrant does not participate in the SPC-1 evaluation, so we have no way to compare their performance or validate their performance claims.

You can find the SPC-1 results for the VSP G1000 here and the SPC-1 results for the HUS VM here.