The future of flash memory: tiny (and extremely tough to build)

In the past few weeks, we've looked at how solid state disks work, discussing the practical effects of reduced latency on the computing experience and walking through a bit about the nature of NAND flash and floating gate transistors. We've also talked about flash's impact on the overall direction of the mobile space and some ways in which modern operating systems are adapting to SSDs. Last week, we asked you all to tell us about what SSDs do for you, both at home and at work. This week, in the fourth and final entry in our feature series on the solid state revolution, we turn our gaze to the future. What's ahead for flash?

SSD Revolution

Flash memory continues to shrink in size and grow in capacity. Hard disk drive technology continues its inevitable march toward greater areal densities, and hybrid drives are being purchased in greater numbers. Hewlett-Packard is hard at work on a new type of storage, one based on fancy little things called memristors, which may hit the market in the medium term.

There's a lot happening with solid state storage, and a lot more set to happen—but some serious problems need to be solved first.

It's getting small in here

For electronics, smaller is almost always better. Moore's Law—which says that the number of transistors one can cram into a given amount of space tends to double about every eighteen months—still holds roughly true for NAND flash. SSDs based on a 25nm or 20nm manufacturing process are common today, and in early June, Toshiba announced a line of SSDs based on a 19nm process size.

The advantages of smaller flash are huge. Silicon costs money, and shrinking the cells means more flash fits on a single chip, which most obviously means bigger drive sizes. Even better, smaller floating gate transistors use less electricity and operate more efficiently, and gains in power efficiency are extremely important for flash memory's most important growth area: the mobile market.

When coupled with the eventual switch to TLC (triple-level cell) flash over the current MLC (multi-level cell) flash, the future should look rosy: physically smaller, higher-capacity SSDs which use less power. We can't lose!

Resistance becomes futile

Ah, but things are never that simple. SSDs have that proverbial Achilles' heel: the more writes their cells experience, the closer those cells get to dying.

As we discussed at length in the first article in this series, a flash cell is a specific type of transistor called a floating gate transistor. It has an extra structure (the floating gate itself) into which electrons can be stuffed to alter the transistor's threshold voltage, the potential difference that must be applied to the transistor in order for it to conduct electricity. The presence or absence of charge in the floating gate, and hence a high or low threshold voltage, determines whether the flash cell holds a 0 or a 1.
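To make that concrete, here is a minimal Python sketch of a single-bit read. It is a toy model, not how any real controller or sense amplifier is implemented, and the voltage values are invented placeholders: the cell "conducts" at the read voltage only when its floating gate is uncharged, and conduction is interpreted as a 1.

```python
# Toy model of reading a single-bit (SLC) flash cell.
# Voltage values are illustrative only, not taken from any datasheet.

V_THRESHOLD_ERASED = 1.0      # low threshold: gate holds little/no charge -> logical 1
V_THRESHOLD_PROGRAMMED = 4.0  # high threshold: gate holds charge -> logical 0
V_READ_REFERENCE = 2.5        # read voltage applied between the two thresholds

def read_cell(gate_is_charged: bool) -> int:
    """Return the stored bit by checking whether the cell conducts at the read voltage."""
    threshold = V_THRESHOLD_PROGRAMMED if gate_is_charged else V_THRESHOLD_ERASED
    conducts = V_READ_REFERENCE > threshold
    # An erased cell (low threshold) conducts at the read voltage and reads as 1;
    # a programmed cell (high threshold) does not conduct and reads as 0.
    return 1 if conducts else 0

print(read_cell(gate_is_charged=False))  # -> 1 (erased)
print(read_cell(gate_is_charged=True))   # -> 0 (programmed)
```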

However, the process of coaxing electrons back out of the floating gate once they've been coaxed into it requires quite a bit of voltage, and some electrons are left behind in the floating gate every time. These extra electrons aren't a problem at first, but over the life of the flash cell, their presence can substantially alter the cell's electrical resistance. When a cell gets written to, it must be able to pull electrons into its floating gate in a very small amount of time; as its resistance becomes greater, the controller has to use greater and greater amounts of current to get the electrons to jump into the cell. Eventually, the amount of current required becomes so high, and the amount of time it takes for the electrons to jump into the cell becomes so long, that the cell can't be written to anymore.

As flash cell process sizes decrease and the cells themselves get smaller, the fundamental problem remains; worse, shrinking the manufacturing process actually aggravates it. As flash cells get smaller, they can tolerate a commensurately smaller amount of residual charge before the controller must mark them as unusable. This dampens the enthusiasm for smaller flash chips quite a bit.

The smaller the flash cells get, the less residual charge it takes before they need to be marked as bad.
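To illustrate the trend (and only the trend), here is a toy wear model in Python. The per-cycle trapped charge and the tolerance assigned to each process node are made-up numbers; the point is simply that if a smaller cell can tolerate less residual charge, it is retired after fewer program/erase cycles.

```python
# Toy wear model: each program/erase cycle leaves a little residual charge behind.
# A cell is retired once residual charge exceeds its tolerance, and smaller cells
# tolerate less. All numbers are invented to illustrate the trend, not measured.

TRAPPED_CHARGE_PER_CYCLE = 1.0  # arbitrary units left behind per P/E cycle

def cycles_until_bad(charge_tolerance: float) -> int:
    residual = 0.0
    cycles = 0
    while residual < charge_tolerance:
        residual += TRAPPED_CHARGE_PER_CYCLE
        cycles += 1
    return cycles

# Pretend the tolerance shrinks with the process node (purely illustrative scaling).
for node_nm, tolerance in [(34, 5000.0), (25, 3000.0), (20, 1500.0)]:
    print(f"{node_nm} nm cell: ~{cycles_until_bad(tolerance)} P/E cycles before retirement")
```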

The number of bits stored in a flash cell makes a difference here, too. The most basic and reliable form of flash, as we've previously discussed, is SLC (Single Level Cell) flash, which stores a single bit in each floating gate. If the gate holds a charge, it represents a 0. If the gate holds little or no charge, it represents a 1. SLC flash is fast to read from and write to, since the cells just have to report on whether or not they contain a charge. More importantly, SLC flash cells can sustain the greatest number of writes, because pumping electrons into and out of the cell doesn't have to be done with great finesse and attention to differing charge levels. The entire process is less sensitive to the cell's accumulated charge.

Multi-Level Cell (MLC) flash stores two bits to SLC's one. Instead of distinguishing just "no charge" from "some charge," it uses four discrete voltage levels: little to no charge for 11, some charge for 10, a bit more charge for 01, and even more charge for 00. That makes an MLC cell far more sensitive to changes in its resistance, because read and write operations have to be carried out more carefully; the controller isn't just pumping in some charge, it's pumping in specific amounts of charge.
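A hypothetical sketch of an MLC read might look like the following. The charge thresholds are placeholders, and real NAND uses calibrated reference voltages and Gray-coded bit assignments, but the idea is the same: the sensed charge level selects one of four two-bit values.

```python
# Toy MLC read: four charge levels map to two bits. Thresholds are placeholders;
# real NAND uses calibrated reference voltages and Gray coding for the mapping.

MLC_LEVELS = [
    # (upper bound of the level in arbitrary charge units, bits stored)
    (1.0, "11"),  # little to no charge
    (2.0, "10"),  # some charge
    (3.0, "01"),  # a bit more charge
    (4.0, "00"),  # even more charge
]

def read_mlc_cell(sensed_charge: float) -> str:
    for upper_bound, bits in MLC_LEVELS:
        if sensed_charge <= upper_bound:
            return bits
    return "00"  # fully charged

print(read_mlc_cell(0.3))  # -> "11"
print(read_mlc_cell(2.6))  # -> "01"
```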

Additionally, the increased density of MLC guarantees that the individual NAND cells will undergo more writes for the same amount of data. The MLC drive has fewer NAND flash transistors for a given amount of storage, so a workload run on a 100GB SLC disk and on a 100GB MLC disk will make the MLC disk work harder; there are simply fewer transistors to bear the write load. The difference is dramatic, too. The average cell life in an SLC SSD is 100,000 writes, while a good 20nm MLC backed by a good controller will bear perhaps 3,000.
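A back-of-the-envelope way to see what those cycle counts mean for a whole drive: multiply capacity by per-cell program/erase cycles and divide by write amplification. The cycle counts below come from the figures above; the write amplification factor of 1.0 is an idealization.

```python
# Back-of-the-envelope drive endurance: capacity times per-cell P/E cycles,
# divided by write amplification. Cycle counts come from the article; the
# write-amplification factor defaulting to 1.0 is an idealized assumption.

def total_host_writes_tb(capacity_gb: float, pe_cycles: int,
                         write_amplification: float = 1.0) -> float:
    return capacity_gb * pe_cycles / write_amplification / 1000.0  # in TB

print(f"100 GB SLC @ 100,000 cycles: ~{total_host_writes_tb(100, 100_000):,.0f} TB of host writes")
print(f"100 GB MLC @   3,000 cycles: ~{total_host_writes_tb(100, 3_000):,.0f} TB of host writes")
# With a more realistic write amplification of, say, 2x, the MLC figure halves again.
```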

Can the situation get even crazier in the future? It certainly can. TLC flash—that's "Triple Level Cell"—cranks up the density even further, storing three bits in each cell. This is done by keeping track of eight different voltage levels per cell—000, 001, 010, 011, 100, 101, 110, and 111. TLC cells have been around for quite some time but aren't yet used in consumer SSDs because there are significant engineering challenges in making them work well. For one thing, error correction algorithms need to be modified; for another, they're currently only good for a few hundred program/erase cycles. In order for TLC flash to be viable in consumer devices, SSD controllers will need to be incredibly stingy with writes and will have to take extraordinary measures to keep write amplification to a minimum.
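Write amplification itself is just a ratio: the data the controller physically writes to NAND divided by the data the host asked it to write. The numbers in this small sketch are invented purely to show the calculation.

```python
# Write amplification = bytes physically written to NAND / bytes the host asked to write.
# The figures below are invented purely to illustrate the arithmetic.

def write_amplification(nand_bytes_written: float, host_bytes_written: float) -> float:
    return nand_bytes_written / host_bytes_written

# e.g. the host writes 4 KB, but to free a block the controller also relocates
# 16 KB of still-valid data, so 20 KB ends up written to flash in total.
print(write_amplification(nand_bytes_written=20_000, host_bytes_written=4_000))  # -> 5.0
```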

TLC flash should enable bigger SSDs at lower costs, but the engineering challenges are formidable. And it gets worse when we consider the possible limits to flash technology.

Lee Hutchinson
Lee is the Senior Technology Editor at Ars and oversees gadget, automotive, IT, and culture content. He also knows stuff about enterprise storage, security, and manned space flight. Lee is based in Houston, TX. Email: lee.hutchinson@arstechnica.com // Twitter: @Lee_Ars

Fantastic article. I must be getting mellower in my old age. I could care less about seeing faster SSDs, I just want more storage, even if it means the physical size of the device is larger. Give me 512MB in a device the size of an "ancient" 3.5" drive and I'll be perfectly happy.

Fantastic article. I must be getting mellower in my old age. I could care less about seeing faster SSDs, I just want more storage, even if it means the physical size of the device is larger. Give me 512MB in a device the size of an "ancient" 3.5" drive and I'll be perfectly happy.

You're definitely showing your age =P I hope you meant 512GB because we can fit 64GB in the space of a fingernail nowadays =)

The real future? A many core CPU + many core GPU + several TB of non-volatile L3 cache implemented with memristor RAM. Who wants disks, when everything can already be loaded? All your files are already open! Disks will then be for seldom used archival purposes, at which point speed will be a moot point, while capacity and super high reliability will be key, think SSD cartridges with 200 year or better retention.

The near future? I would like an SSD with 128GB of SL flash, and a good routine to demote the oldest blocks onto 512 GB of ML flash, which could also hold a partition for direct dumping of data that I know is read only, like most bulk media content. I wouldn't even mind it if there was a program that managed that migration, so it could use more intelligent cues to decide what to demote into the ML flash.

I also wouldn't mind SSD's that use gigabit ethernet for the interface, and focus on reliability rather than speed. I don't know why nobody seems to do that, it would also be great in spinny disks. Would make for really easy NAS builds.

<snip>The near future? I would like an SSD with 128GB of SL flash, and a good routine to demote the oldest blocks onto 512 GB of ML flash, which could also hold a partition for direct dumping of data that I know is read only, like most bulk media content. I wouldn't even mind it if there was a program that managed that migration, so it could use more intelligent cues to decide what to demote into the ML flash.

<snip>

++This

Tiering is definitely the future. Cache already does it, so why not SSDs?

I've never understood why they don't make stackable flash chips. They could place pass-through BGA contacts on both the top and bottom of the package with a few shifted enable lines. That way, the flash chips could be soldered one on top of the other.

Fantastic article. I must be getting mellower in my old age. I could care less about seeing faster SSDs, I just want more storage, even if it means the physical size of the device is larger. Give me 512MB in a device the size of an "ancient" 3.5" drive and I'll be perfectly happy.

Yup. Same here. I don't see any real-life benefit between an older SSD (~220MB/s) and the new ones (>500MB/s). But they're both a heck of a lot faster than any hard drive, so I'd love to see 2TB+ SSDs that are not necessarily any faster than last generation but affordable and reliable.

In the past few weeks, we've looked at how solid state disks work, discussing the practical effects of reduced latency on the computing experience and walking through a bit about the nature of NAND flash and floating gate transistors. We've also talked about flash's impact on the overall direction of the mobile space and some ways in which modern operating systems are adapting to SSDs. Last week, we asked you all to tell us about what SSDs do for you, both at home and at work. This week, in the fourth and final entry in our feature series on the solid state revolution, we turn our gaze to the future. What's ahead for flash?

The URL in the above blurb from the article gets a 404 for me... What is the correct link?

Thank you for this, a very reasonable update to my knowledge from 2010 on flash. I do think you take a rosier view than many, but not to an extreme level. Also, while I am not holding my breath for memristors, at the same time HP promised '18 months' just six months ago, and in that report they stated that Samsung was about six months behind them. That puts its arrival well ahead of the typical '3-5 years' but as with anything from HP these days, skepticism is warranted.

"Read latency at the 6.5 nm process size rises significantly, becoming almost twice as high as it is today, and write latency becomes almost three times as high. Remember that low read and write latency is SSD's primary advantage over spinning disk, so an increase in latency drastically reduces the value of SSD over regular hard disks and makes their increased cost a lot tougher to swallow."

Hard drives have limited mechanical heads. SSDs... don't.

If -any- metric of the SSD starts seriously being a problem, we'll see RAID-like structures aimed at fixing them - inside the "drive". And... what's preventing more 'pure striping' inside the chips? Or, at least, arranging the innards of the chips to facilitate this usage from the controller? The article itself discusses drive-internal RAID-like possibilities for other aspects.

All you're effectively doing is "adding more read/write heads". They aren't mechanical. There are no platters. There's a threshold beyond which it isn't useful or practical to add more 'heads'. But there isn't the same set of physical constraints.

There's an initial hurdle of overcoming the added complexity. And, of course, this does nothing for "read/write a single bit" speeds. But it does such crushing things to reading blocks, stretches of blocks, and streams of data we'll get there eventually. (Unless, of course, this type of memory is abandoned for something better along the way. Naturally.)

I've never understood why they don't make stackable flash chips. They could place pass-through BGA contacts on both the top and bottom of the package with a few shifted enable lines. That way, the flash chips could be soldered one on top of the other.

It gets depressing reading articles like this, b/c while we get all excited about something that's newer/faster (flash drives), we're now reminded of some kind of limitation and about how something newer/faster/better may be just around the corner. It makes a person start to wonder if they're really investing or wasting dollars in current tech.

With that said, you can pry my ssd out of my cold dead hands. I don't regret buying one at all. But, every time new tech comes out I wonder if we're looking at a new zip drive or something .. a flash in the pan that just turns out to be a waste of money or a small stepping stone before something else more cool quickly replaces it.

"Read latency at the 6.5 nm process size rises significantly, becoming almost twice as high as it is today, and write latency becomes almost three times as high. Remember that low read and write latency is SSD's primary advantage over spinning disk, so an increase in latency drastically reduces the value of SSD over regular hard disks and makes their increased cost a lot tougher to swallow."

Hard drives have limited mechanical heads. SSDs... don't.

If -any- metric of the SSD starts seriously being a problem, we'll see RAID-like structures aimed at fixing them - inside the "drive". And... what's preventing more 'pure striping' inside the chips? Or, at least, arranging the innards of the chips to facilitate this usage from the controller? The article itself discusses drive-internal RAID-like possibilities for other aspects.

All you're effectively doing is "adding more read/write heads". They aren't mechanical. There are no platters. There's a threshold beyond which it isn't useful or practical to add more 'heads'. But there isn't the same set of physical constraints.

There's an initial hurdle of overcoming the added complexity. And, of course, this does nothing for "read/write a single bit" speeds. But it does such crushing things to reading blocks, stretches of blocks, and streams of data we'll get there eventually. (Unless, of course, this type of memory is abandoned for something better along the way. Naturally.)

This is actually what is already done. Individual cells are not actually that fast. SSD's are fast in part because they are parallel devices. The numbers he is talking about here actually take that approach into account because it is actually the default way that SSD's today work.

I just upgraded from a 128GB SSD to a 256GB model (which was also slightly faster). I paid $310 for the 128GB SSD in May 2011, and $190 (on sale) for the 256GB model at the end of June. Prices are dropping again by a large margin. If you haven't looked in a while at an SSD check again.

The real future? A many core CPU + many core GPU + several TB of non-volatile L3 cache implemented with memristor RAM. Who wants disks, when everything can already be loaded? All your files are already open! Disks will then be for seldom used archival purposes, at which point speed will be a moot point, while capacity and super high reliability will be key, think SSD cartridges with 200 year or better retention.

I wanted to ask about this... are there any techs (memristors?) that would be able to replace volatile with non-volatile RAM? Not today, but theoretically/on the roadmap?

While we would probably have to virtualize separate "RAM" and "disk" partitions at least for a while, it would definitely make for some interesting architectural changes. I imagine it would be very welcome in smartphones, but what enthusiast wouldn't want a desktop machine like that?

This is actually what is already done. Individual cells are not actually that fast. SSD's are fast in part because they are parallel devices. The numbers he is talking about here actually take that approach into account because it is actually the default way that SSD's today work.

Exactly.

Except I get the sense you're disagreeing with the possibility of "more".

Why?

I mean, if you're Samsung (or whomever), and actually manufacturing the chips, what's stopping one from basically extending the striping yet-another-level on an as needed basis?

The obvious factor is cost, but it's presented as more of an insurmountable physical hurdle than an economic one.

Haha can't understand what you mean by 3-5 years... Or wait I do thanks to xkcd (http://xkcd.com/678)

And a good idea is to spread the kinds of flash, a ~100GB SLC, ~500GB MLC and ~2TB TLC... to make it durable and ridiculously fast and loads of storage. I guess it's possible but at what price range...

But it's great that SSDs are coming down in price and getting into the masses' minds; once you've gone SSD you are going to hate HDDs... At least I did... If you also add an OS that's actually made to work with SSDs from the beginning (like Win8 or perhaps Mountain Lion) it's going to be insane speeds and perhaps some new ways of using them... more than starting your browser 10s after pushing the on button...

Just off the top of my head, if they manufactured the circuitry of the cells a little more resilient such that it could withstand (relatively) high voltages over a period of time to restore the bad cells to operation. I would find it perfectly acceptable to be able to restore a SSD once a year to a new-like state if all I had to do was backup the SSD, move a jumper (poor forgotten jumpers), and let it sit there clearing out residual charge for 30 minutes.

This is actually what is already done. Individual cells are not actually that fast. SSD's are fast in part because they are parallel devices. The numbers he is talking about here actually take that approach into account because it is actually the default way that SSD's today work.

Exactly.

Except I get the sense you're disagreeing with the possibility of "more".

Why?

I mean, if you're Samsung (or whomever), and actually manufacturing the chips, what's stopping one from basically extending the striping yet-another-level on an as needed basis?

The obvious factor is cost, but it's presented as more of an insurmountable physical hurdle than an economic one.

What I'm saying is that 'more' is factored into these types of predictions already. The amount of 'more' is not infinite, it adds cost and there are technical barriers, so a reasonable amount of 'more' is already taken into account with estimates like these.

My guess for the future of storage is ferroelectrics. Like ferromagnetics (hard drives) they're non-volatile; like current solid state they shouldn't have any moving parts. They may also be a lot more durable than either current solid state or phase change memory in terms of how long they last. Also they can be flipped from 1 to 0 at will, AND current estimates place the maximum density of circuitry spacing at as little as 5 nanometers, which is friggen awesome!

I don't know that I trust SSDs enough yet. I've only probably had 1 or 2 regular spinning drives die on me in the last 15 years or so. I've had 2 SSDs die on me in just the last 4 months. So far I've been lucky in that I've had disk images that I could recover to, but it definitely hasn't been a confidence builder.

I don't know that I trust SSDs enough yet. I've only probably had 1 or 2 regular spinning drives die on me in the last 15 years or so. I've had 2 SSDs die on me in just the last 4 months. So far I've been lucky in that I've had disk images that I could recover to, but it definitely hasn't been a confidence builder.

In my experience, with literally hundreds of flash devices, failure rates are considerably higher on SSD's in general than they are with HDD's. That said, Intel and Samsung SSD's have very good reliability, even compared to HDD's. The same cannot be said for other brands, such as OCZ or Crucial, however.

I don't know that I trust SSDs enough yet. I've only probably had 1 or 2 regular spinning drives die on me in the last 15 years or so. I've had 2 SSDs die on me in just the last 4 months. So far I've been lucky in that I've had disk images that I could recover to, but it definitely hasn't been a confidence builder.

Unless the SSDs were OCZ products (you knew better than to buy from them, though, right?), your experience is a largely irrelevant anecdote. Unless the whole line is defective in some manner (hello, OCZ), the failure rate of SSDs is, last I heard, lower than that of spinning-disk HDDs.

One or two of something isn't enough to meaningfully judge reliability of the whole class.

Unfortunately, other metrics get a lot less rosy. Read latency at the 6.5 nm process size rises significantly, becoming almost twice as high as it is today, and write latency becomes almost three times as high. Remember that low read and write latency is SSD's primary advantage over spinning disk, so an increase in latency drastically reduces the value of SSD over regular hard disks and makes their increased cost a lot tougher to swallow.

I do have a bit of a problem with this - read and write latency on SSDs is hundreds or thousands of times lower than it is on mechanical hard disks. If SSD latency doubles or even triples, SSD latency will still be hundreds or thousands of times lower than hard disks. The difference is still dramatic and still makes a massive difference to how fast your computer *feels* (not to mention all the server/database advantages that remain).

Unless the SSDs were OCZ products (you knew better than to buy from them, though, right?), your experience is a largely irrelevant anecdote. Unless the whole line is defective in some manner (hello, OCZ), the failure rate of SSDs is, last I heard, lower than that of spinning-disk HDDs.

One or two of something isn't enough to meaningfully judge reliability of the whole class.

Yeah, no OCZ (I did my due diligence when shopping...). I currently have a Corsair Force 3 series running in a netbook with Win7 (which BSODs on hibernate -- so I disabled hibernate, so far it's running okay outside of that glitch), a Kingston 96GB V100+ which had to be RMA'd due to freezing boot issues on XP, and now a Kingston 90GB HyperX series which is about to be RMA'd due to intermittent freezing boot issues. I love the speed increase, just not all the headaches...

(And yes, before anyone brings it up, I have looked for firmware upgrades for these drives and there have not been any released that I am aware of.)

Unless the SSDs were OCZ products (you knew better than to buy from them, though, right?), your experience is a largely irrelevant anecdote. Unless the whole line is defective in some manner (hello, OCZ), the failure rate of SSDs is, last I heard, lower than that of spinning-disk HDDs.

One or two of something isn't enough to meaningfully judge reliability of the whole class.

Yeah, no OCZ (I did my due diligence when shopping...). I currently have a Corsair Force 3 series running in a netbook with Win7 (which BSODs on hibernate -- so I disabled hibernate, so far it's running okay outside of that glitch), a Kingston 96GB V100+ which had to be RMA'd due to freezing boot issues on XP, and now a Kingston 90GB HyperX series which is about to be RMA'd due to intermittent freezing boot issues. I love the speed increase, just not all the headaches...

(And yes, before anyone brings it up, I have looked for firmware upgrades for these drives and there have not been any released that I am aware of.)

I seem to remember that series of Corsair having a known controller error that causes BSOD's and random drive failure. I'd have to look it up again though.

Just off the top of my head, if they manufactured the circuitry of the cells a little more resilient such that it could withstand (relatively) high voltages over a period of time to restore the bad cells to operation. I would find it perfectly acceptable to be able to restore a SSD once a year to a new-like state if all I had to do was backup the SSD, move a jumper (poor forgotten jumpers), and let it sit there clearing out residual charge for 30 minutes.

I'm not up on the latest in device physics or flash device structures, but based on what I know, it's not gonna happen.

You can have density or you can have voltage tolerance -- you can't have both. Voltage tolerance requires thicker oxide layers in the process. There may be a way to have two different oxide thicknesses (I doubt it), but it would definitely pump up the cost.

There is also the issue of electrons that get trapped in crystal defects in the floating gates. There's no way to get these out short of hitting it with a lightning bolt equivalent.

The reality is that it's probably best to budget for a new SSD every few years and just replace it before it dies. Even better would be to use your SSD as a single drive for a year (or two) then add another SSD in a RAID 1 configuration and then every year (or two) after that replace the older drive. The ethically-challenged will probably sell these old drives on Craigslist.

While I consider myself a techie/geek (not as much geek cred as some, but my first computer was back in 1983, first connected via 300 baud, then 1200, then 2400, then 14.4K, etc..), lately I've hit a "good enough" wall.

We have 2 HTPCs, 2 laptops, a "main" machine for me, and a file server. My machine is a q9550 w/ 8 GB. File server has 1 TB raid 1+0 (I think), on 5400 rpm Hitachi Deskstars. Throughput is 40-50 MB/s (house is wired for Gigabit ethernet) when copying files. Streaming video takes only a fraction of that. File copying on a large scale doesn't happen very often - one of the laptops has GBE, the other one is the "slowpoke" - and 100 MB isn't that bad.

Laptops have HDs, but they usually hibernate/resume, and that's only a few seconds. I wouldn't mind the power and speed improvements of an SSD, but first I want to find one at a reasonable price where 25% of the feedback isn't "died after 6 months" or "blue screens every other day". Reliability/compatibility is #1, #2 and #3 for me for buying SSDs. Price is secondary - I'll pay 2x or 3x of HD prices if I can be sure it won't be dead in 6 months.

I'm leaning towards more/faster storage myself. One thing that is left out of this article is DRAM prices. I just bought 32GB of DRAM for about $5 per GB. For the first time ever, I've had more RAM than I can realistically use for my applications. That's still 5x as expensive as SSD, but anything within an order of magnitude in prices tends to raise the issue of whether or not DRAM will ever provide an alternative to SSD for improving IO performance.

I know the OS is already using the extra RAM as a cache, but the problem with that is that the OS isn't tuned to only use the RAM for disk cache. Also, the OS has to deal with bottlenecks and latency between the drive and the CPU. Since the OS's caching is general purpose, we probably won't see as much of an IO benefit as we would if some RAM were specifically at the service of a specialized controller located "near" the hard disk. Also, a performance optimized, power sipping controller could handle flushing the cache in the event of a power failure. I'll ask the hardware engineers: could large amounts of onboard DRAM (much more than the 32MB seen on typical mechanical drives), combined with a controller chip that is specifically tailored to hard drive caching, and finally a small battery for handling power failures, be a better solution than SSD going forward? What are the physical limits of RAM size in comparison to flash RAM size?