Want 12TB of flash memory for your server rack? $331,000, please

IBM is investing $1 billion in R&D for new flash memory systems.

IBM said yesterday that it will spend $1 billion on the research and development of new flash memory systems to accompany its various servers, storage, and middleware products.

Of course, IBM is expecting a big return on that investment. IBM already purchased flash memory vendor Texas Memory Systems last October for an undisclosed sum. Along with the $1 billion R&D investment, IBM announced the availability of all-flash storage appliances based on Texas Memory technology.

The suggested price list for the new IBM FlashSystem appliances shows why the big investment in solid state storage should be well worth it. The IBM FlashSystem 720 and FlashSystem 820, both 1U rack-mounted systems, start at $16,000 each. A choice of four interface ports, either 8Gbps Fibre Channel or 40Gbps quad data rate InfiniBand, adds another $8,000.

Then you actually buy the flash storage. FlashSystem 720 comes with 6TB or 12TB of SLC flash (5TB or 10TB of usable capacity) for $166,000 and $331,000, respectively. FlashSystem 820 comes with 12TB or 24TB (10TB or 20TB usable) of eMLC flash for $166,000 and $331,000, respectively.
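
For a rough sense of what those list prices work out to per gigabyte, here is a quick back-of-the-envelope sketch (in Python, using only the price and usable-capacity figures above; it treats each figure as the all-in storage price and ignores the base chassis and interface add-ons):

```python
# Cost per usable gigabyte for each FlashSystem configuration listed above.
configs = {
    "FlashSystem 720, 6TB SLC (5TB usable)":    (166_000, 5_000),
    "FlashSystem 720, 12TB SLC (10TB usable)":  (331_000, 10_000),
    "FlashSystem 820, 12TB eMLC (10TB usable)": (166_000, 10_000),
    "FlashSystem 820, 24TB eMLC (20TB usable)": (331_000, 20_000),
}

for name, (price_usd, usable_gb) in configs.items():
    print(f"{name}: ${price_usd / usable_gb:.0f} per usable GB")
```

That comes to roughly $33 per usable gigabyte for the SLC-based FlashSystem 720 and about $17 for the eMLC-based FlashSystem 820.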

SLC flash is well-suited to write-heavy workloads distributed across multiple servers, while eMLC flash is suitable for read-heavy workloads, according to Big Blue.

"FlashSystem 720 and FlashSystem 820 storage systems may deliver significant benefits to server-based infrastructures which depend on large quantities of locally attached flash devices, such as PCIe Flash cards or SAS/SATA SSDs," IBM said. "While locally attached flash devices typically work well for accelerating applications located on single servers, such devices can be more difficult to share across multiple servers, scale to large capacities, and centrally manage than FlashSystem 720 and FlashSystem 820 storage systems. Additionally, FlashSystem 720 and FlashSystem 820 storage systems have sophisticated reliability features like Variable Stripe RAID that are typically not present on locally attached Flash devices."

FlashSystem is at least the fourth flash storage product released by IBM, alongside the Storwize V7000, System Storage DS8870, and XIV Storage System.

IBM's expanded embrace of flash shouldn't be called revolutionary. Old storage industry stalwarts like EMC have been selling flash products into data centers for years, as have a variety of newer vendors like Fusion-io and Violin Memory.

But as it so often does, IBM will likely find a way to make flash storage a profitable part of its business. Already, Sprint Nextel has purchased 150TB of flash storage from IBM. And IBM has a good sales pitch, saying flash will improve data center reliability while cutting response times from milliseconds to microseconds.

"Flash systems can provide up to 90 percent reductions in transaction times for applications like banking, trading, and telecommunications; up to 85 percent reductions in batch processing times in applications like enterprise resource planning and business analytics; and up to 80 percent reductions of energy consumption in data center consolidations and cloud deployments," IBM said.

IBM did not say how many years the $1 billion R&D investment will be spread over. But as part of its sales pitch, it will open 12 "Centers of Competency" across the globe to show customers proof-of-concept scenarios for speeding up performance of workloads such as "credit card processing, stock exchange transactions, manufacturing and order processing systems."

I recently bought a very reliable 512GB SSD (an M4) for $350 in Canada. They're cheaper in the U.S.

So I thought: 12TB is 24 of my SSDs. Let's make six RAID 5 arrays for extreme reliability and speed, five disks in each (each array gives four disks' worth of usable capacity).

Six very high-end 5-bay RAID 5 NASes at, say, $1,500 each: $9,000.

6 x 5 SSDs = 30 x $350 = $10,500.

Total: roughly $20k for an extremely fast and extremely reliable system. That's about 16 times less money.
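
A quick sketch of that math (Python; the per-drive and per-NAS prices are the commenter's assumptions above, and the comparison point is the $331,000 FlashSystem 720 12TB list price from the article):

```python
# DIY comparison from the comment above: six 5-disk RAID 5 arrays of 512GB SSDs.
ssd_price, ssd_gb = 350, 512          # assumed consumer SSD price and size
nas_price = 1_500                     # assumed high-end 5-bay NAS enclosure
num_arrays, disks_per_array = 6, 5

total_ssds = num_arrays * disks_per_array                       # 30 drives
usable_tb = num_arrays * (disks_per_array - 1) * ssd_gb / 1000  # RAID 5 gives n-1 disks of capacity
diy_cost = total_ssds * ssd_price + num_arrays * nas_price      # $10,500 + $9,000 = $19,500

ibm_list_price = 331_000  # FlashSystem 720 with 12TB of SLC
print(f"DIY build: {usable_tb:.1f}TB usable for ${diy_cost:,}")
print(f"IBM list price is about {ibm_list_price / diy_cost:.1f}x the DIY cost")
# Rounding the DIY total up to $20k gives the "16 times less" figure in the comment.
```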

Well now that you've let the cat out of the bag, IBM is doomed to failure. All of those enterprise, research, and military customers looking for high availability flash storage will just order a stack of SSDs and a cheap NAS enclosure from Newegg and call it a day instead of overpaying IBM. Thanks for ruining the American economy.

The main part of the price in hardware at this level isn't the actual components - it's the fact that you can call IBM and yell at them when it breaks and they'll send a technician out to repair it. The price of being able to CYA is quite high.

$31/GB seems a bit pricey, even for enterprise SLC. These drives are probably targeted at financial institutions that can afford the price and need the speed.

Of course, you could buy a 180TB Backblaze pod (45 drives x 4TB each) for around $20k. Slap an InfiniBand controller on it and you have something that can serve up large amounts of data (with relatively poor IOPS).

As I understand it (probably not too well), SLC is the "best" NAND, but is it really worth roughly $30 per GB when a run-of-the-mill SSD is only $1/GB? Why not strap a bunch of 840 Pros together?

If you owned a major stock market exchange, or a massive country-wide utility, or a global shipping and logistics company, or a billion-dollar research institution, would you trust your data to a handful of consumer-grade hard drives?

Surprisingly, we find that 13 out of the 15 devices, including the supposedly “enterprise-class” devices, exhibit failure behavior contrary to our expectations. Every failed device lost some amount of data that we would have expected to survive the power fault. Even worse, two of the fifteen devices became massively corrupted, with one no longer registering on the SAS bus at all after 136 fault cycles, and another suffering one third of its blocks becoming inaccessible after merely 8 fault cycles.

Yup. The actual hardware is just a small part of the cost. You're paying for reliable, guaranteed technicians and 1-hour delivery of replacement parts / expert advice on a 24/7 basis for the entire lifetime of the contract. And high-quality software to make it all work properly. On time. Integrated with the rest of your massively complex software and hardware systems.

This is stuff that HAS to work. With the amount of critical data flowing through a high-speed storage layer like this, anything going wrong can seriously mess up your entire multinational enterprise.

To be honest, I'd say it's cheap. $330,000 for 24TB of highly engineered Flash that integrates well with your $100 million computing/datacentre is bloody peanuts.

Surprisingly, we find that 13 out of the 15 devices, including the supposedly “enterprise-class” devices, exhibit failure behavior contrary to our expectations. Every failed device lost some amount of data that we would have expected to survive the power fault. Even worse, two of the fifteen devices became massively corrupted, with one no longer registering on the SAS bus at all after 136 fault cycles, and another suffering one third of its blocks becoming inaccessible after merely 8 fault cycles.

I think there's more work to be done in SSDs before serious financial and military sectors rely on them for critical data.

Organizations that really need the speed that SSDs provide most likely have a high-availability site with a redundant, real-time (or very near real-time) backup, as well as an offsite disaster recovery copy (likely not on SSD) holding an additional copy of their data. The reliability issue isn't as big and nasty as that makes it out to be.

Surprisingly, we find that 13 out of the 15 devices, including the supposedly “enterprise-class” devices, exhibit failure behavior contrary to our expectations. Every failed device lost some amount of data that we would have expected to survive the power fault. Even worse, two of the fifteen devices became massively corrupted, with one no longer registering on the SAS bus at all after 136 fault cycles, and another suffering one third of its blocks becoming inaccessible after merely 8 fault cycles.

I think there's more work to be done in SSDs before serious financial and military sectors rely on them for critical data.

I have neither the time nor energy required to read that document right now, but every bit of information I have read for the past several years indicates that SSDs are far more reliable than mechanical hard disks in enterprise environments, meaning they will not only be able to survive a real-world enterprise power fault, but that they'll also last longer than a mechanical equivalent. This, combined with their performance characteristics, heavily contributes to the amount of adoption we're seeing in the industry these days. If you're using an old SandForce based SSD, then let the fear of sudden data disappearance be close to your mind at all times, but other SSDs (such as from Intel, Crucial, and Samsung) are extremely reliable... and the fewer moving parts, the less likely any product is to have sudden failure. (a generalization, but something worth considering.)

EDIT: and I just fully read the section you quoted. I am honestly clueless as to who considers that to be valid research. Under what scenario would any enterprise class hardware be subjected to 136 fault cycles? Have they ever heard of a battery backup? a generator? Were they writing data to the SSD at the moment the power faulted? Contrary to popular belief, faulting the power on a storage medium while it is flushing the cache will result in a loss of data. But first and foremost, why didn't they test against mechanical hard disks in the exact same way, to provide a basis of comparison for their results? That is really shoddy research on their part, if this is the point of it. I hope for their sake that they were researching something else and this was just part of the data.

and the fewer moving parts, the less likely any product is to have sudden failure. (a generalization, but something worth considering.)

I've heard it described this way: When solid-state fails, it does so in a predictable manner rather than in the unexpected, warning-less way mechanical drives can fail. Simply from the lack of moving parts.

I don't know if that's strictly true, but it's certainly the perception a lot of people have.

Also, I hate to double post, but isn't EMC already offering configurations like these? I remember hearing talk at work about EMC offering solid-state solutions.

I'd be surprised if HP wasn't in the process of pushing their own solid-state or hybrid solution to the market too.

They already do; it's been available since at least December as an all-flash 3PAR array. Their Tier 1/2 arrays are all going to 3PAR. The way it's been told, 3PAR didn't need a rework to handle flash, which was one of the reasons HP wanted to buy 3PAR. The end goal is to consolidate onto that platform.

and the fewer moving parts, the less likely any product is to have sudden failure. (a generalization, but something worth considering.)

I've heard it described this way: When solid-state fails, it does so in a predictable manner rather than in the unexpected, warning-less way mechanical drives can fail. Simply from the lack of moving parts.

I don't know if that's strictly true, but it's certainly the perception a lot of people have.

That's the point, of course; the research shows that this, though intuitive, is absolutely not the case. Hard disks have been failing and suffering power cuts for decades. We know how they work and how they fail. They have achieved predictability in their failure. We're still researching how SSDs behave when they fail.

Regarding a high-availability site: it's either synchronous or asynchronous. If it's async, you've lost the most recent data if the local copy loses it in a power outage. If it's synchronous, then your expensive flash setup suffers from long latency anyway.

Were they writing data to the SSD at the moment the power faulted? Contrary to popular belief, faulting the power on a storage medium while it is flushing the cache will result in a loss of data.

While I also find the paper's methodology quite strange, that's just not true. Enterprise SSDs have large capacitors for a good reason.

elizibar wrote:

coder543 wrote:

and the fewer moving parts, the less likely any product is to have sudden failure. (a generalization, but something worth considering.)

I've heard it described this way: When solid-state fails, it does so in a predictable manner rather than in the unexpected, warning-less way mechanical drives can fail. Simply from the lack of moving parts.

I don't know if that's strictly true, but it's certainly the perception a lot of people have.

It would be true in a perfect world. In reality, the biggest problem for SSDs isn't the flash but the controller. That's getting better and better, but the #1 failure mode for SSDs is controller bugs, which have plagued pretty much everybody in the last few years. Even Intel, with its huge QA resources, had pretty bad problems (although their enterprise drives were exempt from all the firmware problems, I think?).

If you assume perfectly working firmware and a perfectly working controller, then yes: if you run out of write cycles you're just left with read-only flash, which is rather nice for data recovery.

For those that suggest a stack of consumer-grade SSDs would give you the same (or even remotely close to it) performance, I'm going to point out that you just don't get it.

To start with, these are 1U servers. That's a thin little rack-mounted server. Slap 20 SSDs together and you've got a 4U box pretty fast. And 20 commercial-grade SSDs lashed together with some form of controller will likely draw more power to boot. And it'll result in nowhere near the performance levels of these units.

As many have also pointed out, for the associated price tag you also get world-class service. You're not calling Peggy in Alaska (Edit: Apparently Peggy lives in Siberia and I just have Alaska on my brain.) when you call in for support. You're not even getting some semi-trained monkey reading from a script. This kind of product gets you a phone number that goes straight to an engineer who knows his ass from his elbow even without a map.

Sure, you COULD buy four Ford Mustang Cobras for $70k each, but you still won't have a Bugatti Veyron Super Sport. In the same way, slapping together 20 SSDs or 40 SSDs will still not get you the performance, reliability, or technical dedication you get with one of these boxes.

These are simply NOT devices the average consumer will understand... this is truly kit geared towards large, high-demand server infrastructure clients. Guys who build hobby PCs for gaming, or who work in medium or even large corporate IT departments, don't have any need for this stuff and likely don't understand what it's really for.

This is cool shit. But it's cool shit for a very select few customers who will pay these kinds of prices for their cool shit.

For those that suggest a stack of consumer-grade SSDs would give you the same (or even remotely close to it) performance, I'm going to point out that you just don't get it.

If you understand "performance" as "how fast the storage can read/write," then that's pretty wrong. Enterprise SSDs are often slower than their consumer counterparts due to more conservative controller designs, timings, longer update cycles, etc. So putting several slower consumer drives into a RAID 0 will certainly improve performance over an enterprise drive.

The problem with all of that is that enterprises hardly care about raw performance in the first place; they care much more about reliability, and all those speed advantages of consumer SSDs come with a huge reliability deficit. With all the other points I wholeheartedly agree.

Since I'm a little behind on some of this, can someone tell me what the current number of write/rewrite cycles is on SSDs? I may be ignorant of current technology, but my thought was that this was a major issue once upon a time.

Since I'm a little behind on some of this, can someone tell me what the current number of write/rewrite cycles is on SSDs? I may be ignorant of current technology, but my thought was that this was a major issue once upon a time.

IMFT doesn't give official numbers for their 20nm MLC flash apart from saying "no changes from the 25nm process," which was rated for 3,000 P/E cycles.

In practice, for consumers it really doesn't matter. A 160GB drive with 3k P/E cycles is good for almost 470TB of writes, and even with a write amplification factor of 1.5 from imperfect wear-leveling, we end up with more than 300TB of writes.

Not sure what current SLC is rated at, but it used to be 100k P/E cycles for the 45nm flash, IIRC.
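
For anyone who wants to check those numbers, the underlying arithmetic is just capacity times rated P/E cycles, divided by write amplification. A minimal sketch (Python; the drive size, cycle ratings, and 1.5 write-amplification figure are the ones mentioned in this thread):

```python
# Lifetime writes before flash wear-out, in decimal TB.
def lifetime_writes_tb(capacity_gb, pe_cycles, write_amp=1.0):
    return capacity_gb * pe_cycles / write_amp / 1000

# 160GB drive with 3,000-cycle MLC
print(lifetime_writes_tb(160, 3_000))       # ~480 decimal TB; the "almost 470TB" above likely counts capacity in GiB
print(lifetime_writes_tb(160, 3_000, 1.5))  # ~320 TB with write amplification of 1.5

# Same drive size with 100k-cycle SLC, which is why write-heavy arrays use it
print(lifetime_writes_tb(160, 100_000))     # ~16,000 TB
```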

Since I'm a little behind on some of this, can someone tell me what the current number of write/rewrite cycles is on SSDs? I may be ignorant of current technology, but my thought was that this was a major issue once upon a time.

IMFT doesn't give official numbers for their 20nm MLC flash apart from saying "no changes from the 25nm process," which was rated for 3,000 P/E cycles.

In practice, for consumers it really doesn't matter. A 160GB drive with 3k P/E cycles is good for almost 470TB of writes, and even with a write amplification factor of 1.5 from imperfect wear-leveling, we end up with more than 300TB of writes.

Not sure what current SLC is rated at, but it used to be 100k P/E cycles for the 45nm flash, IIRC.

Yeah, SLC has very high P/E cycle limits, and 100k is the number I remember as well... don't really know how modern SLC flash compares.

For those that suggest a stack of consumer-grade SSDs would give you the same (or even remotely close to it) performance, I'm going to point out that you just don't get it.

If you understand "performance" as "how fast the storage can read/write," then that's pretty wrong. Enterprise SSDs are often slower than their consumer counterparts due to more conservative controller designs, timings, longer update cycles, etc. So putting several slower consumer drives into a RAID 0 will certainly improve performance over an enterprise drive.

The problem with all of that is that enterprises hardly care about raw performance in the first place; they care much more about reliability, and all those speed advantages of consumer SSDs come with a huge reliability deficit. With all the other points I wholeheartedly agree.

That's actually not very true. Performance can be measured in lots of ways, including reliability over time, speed and throughput, and ease of scalability.

As the article clearly points out, on the speed and throughput scale these are much faster than most large-scale enterprise storage arrays and would be insanely faster than 20 commercial SSDs slapped in a box together. Combine that with the other ways of rating "performance" and these are still a world away from consumer kit.

Suggesting that RAID 0 on consumer-grade SSDs would even approach this is silly. You've now doubled the 20 SSDs to 40, and you still won't have the speed and throughput, the reliability, the ease of scalability, or the dedicated support.

Since I'm a little behind on some of this, can someone tell me what the current number of write/rewrite cycles is on SSDs? I may be ignorant of current technology, but my thought was that this was a major issue once upon a time.

Also note that if these are anything like the previous generation of Texas Memory systems, they don't use SSD drives like consumer systems use. This is the next step of the RAMSAN - the same SAN that Eve Online uses.

They provide several hundred thousand 4K IOPS and several GB/s of continuous write speed. These have more in common with PCIe SSD cards than with regular SSD drives. In fact, these are essentially PCIe SSD cards on crack. Check out the prices on PCIe SSD cards and you'll find they are WAAAY more expensive than even regular SLC SSDs.

While these seem "expensive" they are actually pretty cheap. Take a look at what an EMC SAN full of SSDs will cost you.

Mostly 'cuz I couldn't remember where she's supposed to be, and the person in the next office over just moved here from Alaska. And we were talking about a flight on Alaska Airlines earlier this morning, so I have Alaska on my mind. I'll go make a quick edit to not offend my Alaskan buddies.

If you owned a major stock market exchange, or a massive country-wide utility, or a global shipping and logistics company, or a billion-dollar research institution, would you trust your data to a handful of consumer-grade hard drives?

That would be on the list of things I didn't give a shit about if I were a billionaire.

As I understand it (probably not too well), SLC is the "best" NAND, but is it really worth roughly $30 per GB when a run-of-the-mill SSD is only $1/GB? Why not strap a bunch of 840 Pros together?

If you owned a major stock market exchange, or a massive country-wide utility, or a global shipping and logistics company, or a billion-dollar research institution, would you trust your data to a handful of consumer-grade hard drives?

I would. But then again I am a cynical computing professional.

I am not a member of the upper level of the corporate aristocracy.

As a big fat aristocrat, I would probably trust another aristocrat rather than my own people.

If you owned a major stock market exchange, or a massive country-wide utility, or a global shipping and logistics company, or a billion-dollar research institution, would you trust your data to a handful of consumer-grade hard drives?

That would be on the list of things I didn't give a shit about if I were a billionaire.

At that point you're not playing with your own money-- if you're OK being sued by all of your shareholders or customers when you lose their data because you didn't adequately fund your IT department, I guess that's your problem.

When you get subpoenaed and are asked "Did you take adequate preemptive measures to ensure the uptime of your mission-critical systems?" you probably don't want to have to answer "No, I just bought it all on Newegg and hoped for the best."