Friday, April 17, 2009

Talk on Codes for Flash Memory

While here at SIGCOMM, I gave a talk at University College London on some new results for codes for flash memory. Here are the slides. Although it was a "theory" talk to a "systems" audience, it seemed to go well, I think because people are generally interested in flash memory and how it works.

As a reminder (also discussed at this old post) flash memory is strange. As a simplified model, you have blocks of cells, where each cell can take on a value in the range 0 to q-1 (where q is currently pretty small -- say 4, but seems to be growing with technology), and a block can consist of thousands (or more) of cells. Reads of cells are fast. When writing, you can INCREASE the value of a cell, but in order to decrease any cell value, you first have to ERASE (set back to 0) the entire block. And erases are bad -- it wears out the memory, and the number of erases is the primary factor in the memory lifetime.

So, for example, if you have a block of 6 cells, and q = 4, it's "easy" to make the change

0 2 1 3 1 2 -> 2 2 1 3 1 2

but "hard" to make the change

0 2 1 3 1 2 -> 0 2 1 1 1 2

since you'd have to erase to change that 3 back to a 1. In this setting, when talking about a code, I was not talking about error-correction, but rather data representation: how can we use blocks of cells to represent data effectively while minimizing erasures.

The talk had an overarching question, and some specific results.

1) The big question: how does flash memory change things, at the scientific level? Should we be designing algorithms and data structures specifically for flash? If so, what should these things look like? Similarly, what about memory hierarchies that utilize flash at one or more levels? [This seemed to be the part that interested in the systems people.]2) Some deterministic, worst-case codes: this work was primarily the senior thesis work of Hilary Finucane, who is heading to Weizmann next year. We've made it available as a Technical Report which can be found here.3) Average-case codes. We've actually extended our previous analysis from Allerton and submitted a journal version; at some point I'll put it up as a preprint (but mail me if you're interested).

I think flash memory is a potentially interesting area that seems to have slipped in under the radar of many people. After getting into it by way of codes, I'm currently trying to learn more and see if there really are worthwhile algorithmic questions there.

4 comments:

I would answer (1) in the affirmative. On a theoretical level I'm interested to see what happens when the computational model includes the idea of wear and tear, particularly with such a counterintuitive rule for when wear happens. On a practical level it will be a shame, for consumers and the environment, if all these solid state disks wear out prematurely because inadequate effort is put toward conserving them.

My experience with the NAND flash based Solid State disks (and I experimented with many different brands of SSDs for more than a year) is that the recent models only allow a (small) constant number of write operations before forcing an ERASE. This is also observed in Woodhouse 2001.

Also, the wear levelling techniques are now in-built with the device micro-controller. This means that you can no longer control the physical location where the write will take place. On the plus side, this ensures that the writes are more even-out. So, unless you continously write on the device for more than 30 years (and this figure is improving), the erasures are unlikely to cause any serious memory corruption.