Performance Starved Applications

In my role here at Xiotech I get to spend a lot of time with prospects as well as customers. It’s probably why I love my job so much. My wife seems to think it has more to do with my A.D.D. (Attention Deficit Disorder) and the ability to change topics and discussions on a daily basis than it does anything else. She’s probably right; at least she likes to point out just how right she typically is!! But that’s probably a separate blog posting 🙂 So back to my role and the blog topic at hand.

Have you heard of the phrase “Performance Starved Applications” or PSA? Hopefully this term is not new to you. PSAs are simply applications that are not performing correctly due in large part to one of two things: either the customer purchased a solution that the vendor failed to size correctly, or, if it was sized properly, the customer failed to understand the caveats once it was deployed. What sort of caveats might these be? In typical SBOD (Switched Bunch of Disks) arrays, as the storage system fills up, the performance drops like a hammer. This isn’t just a one-vendor issue. Have you noticed that a lot of storage vendors use the same parts to build their arrays? For example, place a Clariion, EVA, FAS, Storage Center and Magnitude 3D next to each other. About the only difference is the logo and the software loaded on the controllers. In most cases they are using the exact same drive bays, shuttles and drives. So, they all suffer from the same issue and they all have a similar caveat when it comes to just how full you can load up their system. The industry has settled on about 75%.

So hopefully we all agree: as the storage system fills up, the performance goes down at just about the same rate. To give you an example, a typical enterprise-quality Seagate drive can produce about 300 “typical” IOPS empty. At 50% full, it drops to about 50% of the performance, or 150 IOPS; at 75% it’s closer to 119 IOPS. So, if we can agree on these numbers, the rest should make sense.

Let’s play with some numbers. Say a customer needs 20TB of 15K enterprise drives, and let’s say we use those new spiffy 300GB 15K Enterprise Seagate drives. Using my “South Texas” math that’s 20TB / 300GB per drive = 67 drives. (Before you say it, yes, I know a 300GB drive is NOT 300GB, but that’s another blog topic altogether, not to mention all the sparing needed for that number of spindles, RAID overhead, etc.) So if we take 67 drives times 300 IOPS we get 20,100 IOPS. SWEET.

But let’s be realistic, they have no intention of keeping their storage solution empty. So, should we use 50%? That drops the number down to 10,050 IOPS. Not bad, but let’s be really serious here, NO WAY does a customer only use 50% of their capacity, right? Most of my customers use 90% of their capacity, but we’ll settle on the industry standard 75%, so that IOPS pool would be around 8,000 (just in case you were curious about the 90% number, it’s 105 IOPS per drive, so that would be about 7,000). So I put this into a nice graph because I’m a big fan of pictures !!


So, that’s eye opening.
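If you’d like to rerun the South Texas math yourself, here is a minimal Python sketch. The per-drive IOPS figures are simply the ones quoted above (they are not measured by this code), and the function names `drive_count` and `iops_pool` are made up for illustration. Like the post, it ignores sparing, RAID overhead and the fact that a 300GB drive is not really 300GB.

```python
import math

# Per-drive IOPS at each fill level, as quoted in the post (assumed, not measured).
IOPS_PER_DRIVE = {0: 300, 50: 150, 75: 119, 90: 105}


def drive_count(capacity_tb, drive_gb):
    """Drives needed for the capacity, using decimal units (1 TB = 1000 GB)."""
    return math.ceil(capacity_tb * 1000 / drive_gb)


def iops_pool(drives, percent_full):
    """Total IOPS across all drives at one of the quoted fill levels."""
    return drives * IOPS_PER_DRIVE[percent_full]


drives = drive_count(20, 300)
print(f"{drives} drives")
for percent_full in sorted(IOPS_PER_DRIVE):
    print(f"{percent_full:>2}% full: {iops_pool(drives, percent_full):,} IOPS")
```

The exact products at 75% and 90% come out slightly off the round numbers in the text (7,973 and 7,035), which the post rounds to about 8,000 and 7,000.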

You essentially paid for 20,100 IOPS but you only get to use 8,000. Now think of some of the transactional applications you are running, like e-mail and databases, and ask yourself “How are they running?” It wouldn’t surprise me if you start to remember a user here or there mentioning that at times their email just sits and waits, or that the database reports you need take hours to run. Those are what we like to call PSAs. And truthfully, most of the time the storage array is the last thing people think of.

So, the next time you think your applications are running slow or your end users complain of performance problems, you might want to check and see just how full your array is. If you enjoy using 100% of both the performance and capacity you purchased then give Xiotech a call !!!

hey tommy tee,
one minor point about PSA – there are only 3 ways to solve this challenge today (excluding ISE):

1) More RAM
2) More Spindles
3) FTEs and highly-paid DBAs

it’s been my experience that it’s not enough to talk about PSAs to people who are in the data center – they may not understand there is a problem or even if they do, they don’t have the overall responsibility (or the budget) to fix it. I try to map the PSA challenge back to the business – and the business owner. and that means calling higher and connecting the dots back to the business issues – not the technology issues.

and you’ll find that the folks in the data center are not offended when you are a trusted advisor, as well…wdyt?

Tommy,
I don’t get it !!!
All this talk about how you can deliver IOPS better with high space utilization and then you point us to the SPC-1 benchmark where you have a utilization of less than 40% (you use mirroring).
Here the unit delivers an average of approximately 300 “typical” IOPS per drive. This is not the same as the drives delivering 300 IOPS each, as anybody who has used an array understands the effects of caching.
So in effect, while you have a lot to say on reliability, some of which is new, there seems to be nothing new from a performance aspect, other than an ingenious new marketing spin comparing your device to an individual drive instead of against competitive arrays.

Thanks for responding to the blog. You’re right, we test using mirroring for data protection, as does nearly everyone else. That’s common. We have also published 348.1 IOPS/disk, not 300. These IOPS are indeed disk IOPS, not cache; there is no correlation (Pearson, R nearly zero) between cache size and IOPS on SPC-1. We get those 348 IOPS/disk by using very intelligent head-level control (RAGS) and opportunistic algorithms in the ISE. 348 IOPS/disk is an extremely good figure. As for the Eagle RP disks, 600 GB each, we are $9.57/IOP @ 363 watts consumed, the only SPC-1/E measurement in the industry so far. No one else has passed the test. They are more expensive per GB at acquisition than other drives, true, but they are worth it and then some over 5 years of operation – at $0 for hardware warranty in the ISE. Also, no one else has dared measure 600 GB drives in SPC-1 (let alone /E); if they did, you would see similar pricing from other vendors.

I’ll add a little more commentary to Rob’s response (thanks for the comment Rob !! )

Hopefully we can clear up some confusion. Utilization rates are basically the same thing as “useable capacity” once things like RAID are taken into consideration, and we simply have the BEST in the industry. In fact, we do all of our performance benchmarking at a 96% utilization rate. Others do things like “Short-Stroking” the drives, which means they only use the outermost part of the platters (the fastest part) to read/write their data. Their utilization rates at that point would be in the 30% range. This means if you want that type of performance, you will only get to use 30% of the capacity you purchase (after RAID). Make sense? If not, go check out my Cost Per TB blog post; I go into more detail in it.
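To put rough numbers on that, here is a small Python sketch of how much post-RAID capacity you would have to buy to actually use a given amount at each utilization rate. The 20TB requirement is borrowed from the sizing example in the post purely for illustration, and `purchased_tb` is a hypothetical helper name, not anything from a real tool.

```python
def purchased_tb(needed_tb, utilization):
    """Post-RAID capacity to buy so that `needed_tb` is usable at this utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be a fraction between 0 and 1")
    return needed_tb / utilization


# Utilization rates mentioned above: ~30% when short-stroking, 96% in our benchmarking.
for label, utilization in (("short-stroked (~30%)", 0.30), ("96% utilization", 0.96)):
    print(f"{label}: buy {purchased_tb(20, utilization):.1f} TB to use 20 TB")
```

Under those assumptions the short-stroked case needs a bit over three times the capacity for the same usable 20TB.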

From an IOPS point of view, SPC-1 numbers are NOT to cache. That’s why the SPC created the test. Remember, cache really only helps solve response time issues. Yes, you can do some “creative” stuff to get unbelievable IOPS numbers (1 Billion IOPS etc.) but as we all know, it’s all about the SPINDLES !! If you want to understand a little more about how we can do what we do, then check out this blog I posted on our ISE 101.

As far as cost per disk – TELL ME ABOUT IT !! BUT, in our defense, those were brand-new drives to the market and that usually demands a premium. Not to mention, we tend to be a little conservative on our published pricing. Trust me – that number has gone down and we will be publishing new SPC-1 and SPC-2 numbers shortly. I’m pretty sure we will be adjusting the “Executive Summary Quote” on that system which should drop the cost per spindle down a bunch. I’ll put another bug in our sales op’s ears to get new pricing published.

Thanks again for commenting!!

@StorageTexan
