I recently submitted a manuscript to the EMC XtremIO Business Unit covering some compelling lab results from testing I concluded earlier this year. I hope you’ll find the paper interesting.

There is a link to the full paper at the bottom of this blog post. I’ve pasted the executive summary here:

Executive Summary

Physical I/O patterns generated by Oracle Database workloads are well understood. The predictable nature of these I/O characteristics has historically enabled platform vendors to implement widely varying I/O acceleration technologies including prefetching, coalescing transfers, tiering, caching and even I/O elimination. However, the key presumption central to all of these acceleration technologies is that there is an identifiable active data set. While it is true that Oracle Database workloads generally settle on an active data set, the active data set for a workload is seldom static—it tends to move based on easily understood factors such as data aging or business workflow (e.g., “month-end processing”) and even the data source itself. Identifying the current active data set and keeping up with its movement is complex and time-consuming due to variability in workloads, workload types, and the number of workloads. Storage administrators constantly chase the performance hotspots caused by the active data set.

All-Flash Arrays (AFAs) can completely eliminate the need to identify the active data set because flash can service any part of a larger data set equally well. But not all AFAs are created equal.

Even though numerous AFAs have come to market, obtaining the best performance required by databases is challenging. The challenge isn’t just limited to performance. Modern storage arrays offer a wide variety of features such as deduplication, snapshots, clones, thin provisioning, and replication. These features are built on top of the underlying disk management engine, and are based on the same rules and limitations favoring sequential I/O. Simply substituting flash for hard drives won’t break these features, but neither will it enhance them.

EMC has developed a new class of enterprise data storage system, XtremIO flash array, which is based entirely on flash media. XtremIO’s approach was not simply to substitute flash in an existing storage controller design or software stack, but rather to engineer an entirely new array from the ground-up to unlock flash’s full performance potential and deliver array-based capabilities that are unprecedented in the context of current storage systems.

This paper will help the reader understand Oracle Database performance bottlenecks and how the XtremIO AFA can help address such bottlenecks with its unique capability to deal with constant variance in the I/O profile and load levels. We demonstrate that it takes a highly flash-optimized architecture to ensure the best Oracle Database user experience. Please read more: Link to full paper from emc.com.

Isn’t EMC sometimes a close competitor to Oracle in market terms? I hope not, but many vendors call something an all-flash array when it is really populated by flash cards plugged into PCIe. Flash-based SSDs are also fast, but SCSI commands add latency. It is a big difference to have latency of 100–150 µs versus 300–600 µs.

In my opinion, EMC is only chasing the other big players with its all-flash array. Violin, HP x7 (or Hitachi, of course) seem to be an order of magnitude ahead of XtremIO.

I also have to say the sizing of the DRAM cache surprises me. I have seen a configuration with 1 TB of cache. Well, that would be great for VMAX; however, 1 TB of cache in front of expensive SSD drives… am I missing something?

Well, I would be quite interested to know what the “competitor’s array” was, though it is clear to me there is no chance of finding out. (It is no secret to me that Exadata F40 cards can experience huge write performance degradation after several hours, but an all-flash Exadata is not an AFA, at least not for me.)

I’d be really interested in SLOB numbers from a 100% read, 8 kB test—from XtremIO, of course. And please, did you use HEAVY REDO STRESS or not (I would prefer not to)?
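For context, a slob.conf fragment of the kind that would drive such a 100% read, 8 kB test might look like this (parameter names as in SLOB; the values here are illustrative assumptions, not the paper’s actual settings):

```shell
# Hypothetical slob.conf fragment for a pure-read SLOB run.
# Values are assumptions for illustration, not from the paper.
UPDATE_PCT=0        # 0% updates => 100% random 8 KB reads
RUN_TIME=300        # seconds per run
WORK_LOOP=0         # run for RUN_TIME rather than a fixed operation count
SCALE=10000         # blocks per SLOB schema (sets the active data set size)
WORK_UNIT=256       # blocks visited per execution (the SLOB default)
REDO_STRESS=LITE    # moot at UPDATE_PCT=0, but LITE avoids the heavy-redo path
```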

For me, even with a dual X-Brick, 230,000 IOPS with a service time of about 1.5 ms is an extremely high number. The same goes for 100,000 IOPS from a single brick with a service time of about 1 ms.
I agree there can be some competitors which are worse, or which have significant issues with sustained writes, but when I check some numbers from a guy named flashdba… and I have also tested several AFAs (with real flash, not SSDs), the results are an order of magnitude better.
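As a sanity check on the figures quoted above, Little’s Law relates throughput and per-I/O latency to the concurrency needed to sustain them. This is an editorial sketch using the commenter’s numbers, not data from the paper:

```python
# Little's Law: L = lambda * W
# (average I/Os in flight = IOPS x per-I/O response time).
# IOPS and latency figures are the ones quoted in the comment above.

def outstanding_ios(iops: float, response_time_s: float) -> float:
    """Average number of concurrent I/Os needed to sustain `iops`
    at the given per-I/O response time."""
    return iops * response_time_s

# Dual X-Brick: 230,000 IOPS at ~1.5 ms => ~345 I/Os in flight
dual = outstanding_ios(230_000, 0.0015)
# Single X-Brick: 100,000 IOPS at ~1 ms => ~100 I/Os in flight
single = outstanding_ios(100_000, 0.001)

print(dual, single)
```

Whether ~1–1.5 ms at those IOPS levels is “high” depends on the outstanding I/O depth the host is driving, which is what this relationship makes explicit.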

@Pavol: Nice to discuss platform performance with a SLOB-knowledgeable reader. Thanks. The slob.conf settings are all spelled out in the paper.

I don’t doubt any of your observations about other vendors’ arrays. I say it all the time, and sometimes folks even quote me on it:

All platforms have strengths and weaknesses

I would feel weird saying that any technology is best for all use cases and workloads under any/all circumstances. In fact, there is only a single vendor in IT that I know of pushing that sort of message, and I don’t work for them. I have not tested every All-Flash Array, so I can’t say I have numbers that suggest EMC XtremIO is better than all of them for every use case and workload under every condition. What I can say is that EMC XtremIO demonstrates extremely predictable low latency (with high bandwidth), and the performance characteristics scale as the array is scaled up (it scales both capacity and performance in modular, incremental units).

In summary, I have no problem conceding there are other storage platforms that outperform XtremIO under certain conditions. All technology in the Enterprise IT space has its strong points…and, incidentally, its Achilles’ heel. I think it’s very important to have a platform that not only performs very well but does so predictably, and does so while 100% of the features (such as compression, deduplication, and writable snapshots/clones) are in use.

If I said any more it would sound like pure marketing, so you probably wouldn’t believe it anyway.

Kevin, I’m obviously missing something regarding “The slob.conf settings are all spelled out in the paper.” Am I making some mistake, or what? But I’m not able to find any slob.conf in the paper 🙂 The active data set was initially 1 TB, 96 processes, 25% updates—OK. But I have too many questions 🙂

1) What was the WORK_UNIT? It must have been much lower than the “default” of 256 (otherwise 100k IOPS for a single brick, 83k of them read IOPS, wouldn’t translate to 29,000 executions per second).
2) How many db_writers did you have for a 1 GB buffer cache? The default value depends on the buffer cache size, but also on the number of logical threads in the system (i.e., the Oracle parameter CPU_COUNT).
3) Was SMT disabled on the E5 servers? From your AWR excerpts, it seems you or other EMC engineers might have disabled it.
4) I was also NOT able to find closer details of the XtremIO configuration. How many SSD drives, and of what size, were used? How much DRAM cache? 🙂
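The back-of-the-envelope arithmetic behind question 1 can be sketched as follows (the IOPS and execution figures are the ones quoted in the comment; the inference about WORK_UNIT is the commenter’s, not the paper’s stated configuration):

```python
# If the single X-Brick run delivered ~83,000 read IOPS while SLOB
# reported ~29,000 executions/second, each execution touched only a
# few 8 KB blocks on average -- far fewer than the default WORK_UNIT
# of 256, which is why the question above infers a lowered WORK_UNIT.

read_iops = 83_000           # physical 8 KB reads per second (quoted)
executions_per_sec = 29_000  # SQL executions per second (quoted)

blocks_per_execution = read_iops / executions_per_sec
print(round(blocks_per_execution, 1))  # -> 2.9, i.e. << 256
```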