an insider's perspective, technical tips n' tricks in the era of the IT Revolution

May 06, 2014

XtremIO just got, well, more Xtreme – and we’re throwing down the gauntlet.

The world’s best AFA, with the most consistent performance envelope, the only dependable inline dedupe – and the only truly scale-out design – just got better :-)

Free snapshots: no performance impact, no limits. The OLTP use cases with XtremIO just got more Xtreme.

AIX support!

ViPR Controller support is coming shortly.

Here’s a demonstration of one of the use cases: OLTP snapshots for accelerating test and dev cycles. Note how simple it is to do everything, and how linear the performance of the system is! Also note how the benefit of a scale-out architecture means the whole workload just keeps singing as it triples…

It’s very notable what this demo shows:

Snapshots work easily.

Snapshots have no performance impact (same metadata advantage used for inline dedupe).

As customers have workloads that “strain” a single X-Brick, when they have multiple test/dev instances, all being hammered, the fact that XtremIO scales out matters (truly unique in AFA land – scaling out the front end, metadata all distributed and in memory, and also a distributed persistence pool). It means the total cluster load can keep growing. You might be OK with 80K IOps with a non-scale-out solution, but as you use rich test/dev cycles, if the load exceeds what a “non-scale out” approach can deliver, you are kinda hosed.

This area (the AFA battleground) is one of the hardest ones to maintain my “never go negative on the other guy” mantra (as our competitors – even their CEOs – slag us publicly, mostly erroneously – and not a little, but a lot). I’m going to keep trying, but well – it’s starting to get to “drop the gauntlet” point. I’m sure others feel the same about us – so maybe it’s quid pro quo.

Customers are LOVING XtremIO, and it has rapidly become the market share leader in the AFA space. Customers use it for all sorts of use cases: VDI, Virtual Servers, OLTP – and more!

Don’t let anyone tell you otherwise (and as we vaulted into the lead, people started to really come out swinging). XtremIO has the full support of EMC behind it, and as you can see, the roadmap is strong and accelerating. Is there room for improvement? Yes – there always is!

If you want to know where we ARE having issues, or want to see my personal view on the “inline” vs “not inline” debate, the most recent back and forth on the interwebs, and the $1M guarantee, read on…

Our biggest challenge with XtremIO right now is manufacturing fast enough (and we’re ramping like mad), and in some cases, that we as the EMC field don’t actively cannibalize workloads that are on a different EMC platform (see my plea to customers and the field here). There’s a tendency to sometimes be “conservative” (sometimes with the customer interest in one’s heart). There’s a tendency to find reasons to “play it safe” (“the customer has built these scripts/tools”, “the customer really likes feature X on what they have now”).

Beyond linking to my longer argument at that link, here’s the nutshell of the argument:

In general, if you find yourself telling your customer why they SHOULDN’T be moving OLTP workloads that need predictably low write latency to an AFA, or VDI at scale shouldn’t be on an AFA, or Virtual Server infrastructure that has a lot of commonality shouldn’t be on an AFA… Well, you might be doing your customer a disservice. And if you need an AFA – XtremIO is very, VERY compelling.

BTW – there are three “big bucket” things still to do with XtremIO (compression, native in-array remote replication, dynamic scale out):

When it comes to compression, I would argue (just like with the early NetApp days of dedupe never being expressed in absolute utilization, but as dedupe ratios) – efficiency is not about ANY given feature, but rather your platform’s overall efficiency (space/power/cooling, and ultimately cost). Competitors with 20-40% lower utilization than XtremIO, but who offer compression – I wouldn’t be too proud of that. And if you have a platform where performance goes partially, or worse, REALLY non-linear as you approach upper limits, well – I wouldn’t trumpet “efficiency” as a strong point.

When it comes to remote data protection, XtremIO customers that want the ultimate in performance, dedupe, snapshots - but also disaster recovery (with compression, dedupe, consistency groups, scaling, CDP and more), and continuous availability just include VPLEX and RecoverPoint. It’s a great combination – and a great choice for many customers TODAY. Yes, you can expect native RecoverPoint splitter integration (and integrated management) in the near term for customers that don’t also want the continuous availability and the “NEVER have a disruption” protection that VPLEX offers, and perhaps more….

When it comes to dynamic scale out… Ahem, that’s coming from people with NO scale out (and Scale out matters – see the snapshot example above). People talk about “adding” Scale Out. That is something I’ve never seen done right – it’s really hard to “add”. If anyone claims that “federated management” is scale-out, I would push back. It’s not that Federated management isn’t good (we do it with VNX!) – it’s a stretch to call it scale-out. The true litmus test to me is that IO can come in from anywhere, and the data itself is distributed.
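That litmus test can be sketched in a few lines of Python. This is a toy model, not a description of how XtremIO (or any product) places data: the node names, the `owner_of` and `handle_write` helpers, and the use of SHA-256 with simple modulo placement are all assumptions made for illustration. The point it shows is that the node an IO arrives on has no bearing on where the data lands, and that the data ends up spread across the cluster.

```python
import hashlib
from collections import Counter

# Hypothetical four-node cluster; names are illustrative only.
NODES = ["node-0", "node-1", "node-2", "node-3"]

def owner_of(block: bytes, nodes=NODES) -> str:
    """Pick the node that stores a block from the block's content hash.

    SHA-256 plus modulo is an assumption for this sketch.
    """
    digest = hashlib.sha256(block).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]

def handle_write(entry_node: str, block: bytes) -> str:
    """Any node may receive the IO (entry_node); placement ignores it
    and depends only on the block content."""
    return owner_of(block)

# 1000 distinct toy blocks, counted by the node each one lands on.
blocks = [str(i).encode() * 512 for i in range(1000)]
placement = Counter(owner_of(b) for b in blocks)

# The same block lands on the same owner no matter where it came in.
same_owner = handle_write("node-0", blocks[0]) == handle_write("node-3", blocks[0])
```

Federated management, by contrast, would look like four separate `placement` tables behind one console: a volume's data stays on whichever array it was created on.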

The latest battleground is “dedupe” implementations. We’re getting into pissing matches over “what is ‘inline’”. My definition of inline data reduction is that you dedupe completely, always, BEFORE you commit the IO to persistence. If PART of your data reduction (if one has “multipart data reduction”) is always inline, but the “meatier” parts happen during ANY post process (whether it happens often, or not often, or variably, or whatever) – you wrote the IO to flash. You will need to come around later (1s, 10s, 1m, 1h, 1 day, whatever) and remove it. The name of the game with flash is to avoid any write you can (write amplification). If you write it and come back to it later (particularly if the dedupe or compression process that comes back to it later is a function – IN ANY WAY – of capacity, system load, free space, or other system operations), that’s not inline, and it’s not predictable. It’s arguably a “good” post process (again – analogy – a NetApp post process that got automated – is that inline? Most customers say “no”). Ultimately, customers choose. The customers I know like that XtremIO dedupe is always inline – meaning BEFORE ANY IO GETS WRITTEN TO NAND.
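To make the distinction concrete, here is a minimal Python sketch that counts flash writes for the same stream of blocks under the two policies. It is a toy model under stated assumptions, not anyone's actual implementation: the function names are hypothetical, SHA-256 stands in for whatever fingerprint a real array uses, and metadata writes and garbage collection are ignored.

```python
import hashlib

def fingerprint(block: bytes) -> str:
    """Content fingerprint used to spot duplicates (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()

def inline_dedupe_writes(blocks) -> int:
    """Dedupe BEFORE persisting: only never-seen blocks are written to flash."""
    seen = set()
    flash_writes = 0
    for block in blocks:
        fp = fingerprint(block)
        if fp not in seen:
            seen.add(fp)
            flash_writes += 1
    return flash_writes

def post_process_writes(blocks):
    """Persist every block first, dedupe in a later pass.

    Returns (flash_writes, blocks_reclaimed_later).
    """
    blocks = list(blocks)
    flash_writes = len(blocks)  # every incoming block hits flash
    unique = len({fingerprint(b) for b in blocks})
    return flash_writes, flash_writes - unique

# Five 4 KB writes, three of them identical.
workload = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096, b"C" * 4096]
inline = inline_dedupe_writes(workload)          # 3 flash writes
post, reclaimed = post_process_writes(workload)  # 5 flash writes, 2 cleaned up later
```

Both policies end up storing three unique blocks. The difference is that the post-process path wrote two blocks it then has to find and remove, and when that cleanup runs depends on whatever schedules it.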

Sidebar: One other attack vector from startups has been to say 2014 is about the “All Flash Datacenter” – and to paint anyone who offers ANY magnetic media as “not getting it”. This is an example of the “powerfully simple (but ignorant) arguments” I talk about here. Facepalm. If I were an analyst or the press, I would feel bad about repeating this – as it’s nonsense.

EMC (and I) firmly believe in “Flash Everywhere”, just not the “All Flash Datacenter” (in 2014), and we even believe in “All Flash” TODAY (in the form of AFAs and all-flash pools) for write latency-gated workloads.

But until flash closes the 10x $/GB economic gap for capacity-centric workloads – everyone will have a mix. Dedupe and compression arguments are red herrings. To varying degrees you can apply dedupe/compression on different media.

Furthermore, you can count on the fact that we have all sorts of crazy things (3D NAND Cubes, >1 Write Per Day/WPD for write-once use cases, post-NAND successors like phase change and carbon nanotubes) in the labs that might over time (2017-ish likely) make magnetic media irrelevant (if there aren’t innovations in magnetic media), and we will embrace them like crazy.

But, at this time frame, can everyone just “facepalm” every time someone says “All Flash Datacenter”? It’s a sign of a narrow-minded worldview – and maybe the person saying it is just repeating a buzzword or a “talking point”. This is why customers need AFAs, Hybrids, and also capacity-geared persistence layers.

Oh – and of course, since we’re here in Vegas with some real friends (customers, partners), and some “ahem” friends trying to crash a party, why not have a little fun with it :-)

How about this:

We’ve made a heavy point on our data services like inline data dedupe being something you can COUNT ON – and that if you can’t (like with everyone else), you really need to think about economics differently, and you need to think about performance linearity (which along with simplicity is why you want an AFA). So – how could we put our money where our mouth is? How about this?

Likewise we’ve made a heavy point about how with other AFAs, their performance varies over time, with capacity use. To us this is important. It’s all about the software architecture – and is something fundamental (you can’t bolt it on). We are willing to make this point more strongly as well. The IDC paper on AFA testing is the best example we’ve seen (open to others if people know other good 3rd party examples). You can get that IDC whitepaper by clicking on it (click on right)

Of course – even a good testing framework isn’t a “test harness”, and as the IDC framework points out, quick and dirty testing often doesn’t reflect what a customer will find a few months in, data loaded, data services on. Performance testing isn’t easy…

… So we’ve publicly shared an AFA PoC toolkit (thank you Miroslav!) with great detailed HOWTO videos (also on YouTube), and VMs that we provide for workload generation and for analyzing the data (click on left). This isn’t just for XtremIO; we think it follows the IDC framework and is generally applicable.

This is all open, and transparent – which means surely we’ll be attacked furiously by competitors, but hey – that’s what you get when you’re in front (and people want to play the “David vs. Goliath” game). If another AFA vendor suggests this is a “bad” or “loaded” test harness, that’s fine – do whatever you like, but follow the IDC framework. I’d encourage you to post your recommended CHANGES to our harness, and we will add them. But if someone claims that we shouldn’t pre-condition, we’ll respectfully disagree. If someone claims that the workload shouldn’t vary (in block sizes, in read/write mixes), we’ll respectfully disagree. Most vendors want to do a “quick and dirty” test that will have them shine and just close your business.

Frankly, as we’ve started to become louder about this topic, we’ve seen some vendors dramatically decrease their usable capacity in response (depending on architecture, more free space can dramatically lessen the impact of system processes – including but not limited to system garbage collection).

If I were a customer, and all of a sudden had only 75% of what I had originally paid for – and in some cases approaching about 50-60% usable – well, I might be a little PO’ed (particularly since we’re talking about an AFA – which aren’t cheap – even when you factor in data reduction techniques like thin/dedupe/compression).
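The arithmetic behind that is simple, and a short Python sketch makes it visible. The price and capacity below are made-up round numbers chosen only for illustration, and `cost_per_usable_gb` is a hypothetical helper, not a vendor formula.

```python
def cost_per_usable_gb(price: float, paid_for_gb: float, usable_fraction: float) -> float:
    """Effective $/GB once only a fraction of the paid-for capacity is usable."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return price / (paid_for_gb * usable_fraction)

# Hypothetical purchase: $100,000 for 10,000 GB of usable capacity as sold.
PRICE = 100_000.0
PAID_FOR_GB = 10_000.0

as_sold = cost_per_usable_gb(PRICE, PAID_FOR_GB, 1.00)   # $10.00 per GB
at_75 = cost_per_usable_gb(PRICE, PAID_FOR_GB, 0.75)     # about $13.33 per GB
at_50 = cost_per_usable_gb(PRICE, PAID_FOR_GB, 0.50)     # $20.00 per GB
```

Dropping to 75% usable raises the effective price by a third; dropping to 50% doubles it, whatever the starting price was.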

If you are someone who is, let’s just say, “less than ecstatically happy” with your current AFA, which isn’t working quite as well as when you first got it or as you were promised it would, we have kicked in a program for you:

This “AFA Rescue Program” helps the EMC field and partners respond when a customer is “less than happy” with their existing AFA.

Perhaps they didn’t do a comprehensive PoC (just something quick and dirty).

Perhaps the team that sold it to them didn’t deeply explain some of these behaviors that manifest over time with some array software designs.

Perhaps their workloads grew and they need more than the “scale up” or “partial scale out” architectures can support in all other AFA platforms out there.

Perhaps they acquired an AFA which doesn’t have much in the way of data services – and they are discovering they need rich snapshots and replication (via VPLEX and RecoverPoint), VAAI support, vCenter Integration, northbound APIs, inline dedupe they can count on, whatever!

Whatever the reason, the EMC field and our partners are enabled to now rescue you, and offer a more than fair value on your existing asset.

XtremIO is a great platform, and has huge customer response. It also has (perhaps along with ViPR and Isilon) THE MOST aggressive near term roadmap in EMC, with feature after feature coming – we are doubling down! It’s a great time to be an XtremIO customer.

Are you an XtremIO customer? I’d love to hear the good/bad/ugly! Are you a customer of another AFA? Do you see some of the things we point out, or are we just mental? Are you a partner with experience with XtremIO (and would be great if you have experience with other AFAs) interested in commenting?

Comments


Chad, I'm having some difficulty with the "truly scale out" comment, seeing as it can only have 4 bricks, and it isn't dynamically expandable. I think your to-do item of dynamic scaling speaks to the lack of online expansion, but did I miss new announcements to the number of bricks possible?

Hi Chad, once again a mind-blowing post. Thanks for that. Where did you find the time for this again? I highly appreciate your thoughts and comments on the "All-Flash-Array-Discussions" and can copy that it's incredibly hard to stay with the "mantra" ;-). Especially when you are discussing predictable performance over a longer productive run-time while having all enterprise features up and running. Looking forward to your next blogs. Too bad I couldn't make it to the EMC World this year. CU soon, Michael

Speaking from experience with XIO, customers love this technology! Thanks for the update each day from World. I think the previous comment on truly scale out is irrelevant; the logical capacity served from 4 bricks is huge!

Chad, this "scale out" claim won't go away until EMC provides guidance on how XtremIO can (or will) be scaled non-disruptively. Now, none of us would disagree that the architecture is scalable; the design allows for a many-node architecture. However what customers want is for scale out to mean more than just a few fixed configurations but the ability to scale a production system to cater for demand and growth in the field. At the moment, VMAX is more scalable than XtremIO.


Disclaimer

The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by Dell Technologies and does not necessarily reflect the views and opinions of Dell Technologies or any part of Dell Technologies. This is my blog, it is not a Dell Technologies blog.