SSD Revolution

Way back in 1997, when dinosaurs roamed the earth and I was working part-time at the local Babbage's for $4.25 an hour, I scraped together enough spare change to purchase a 3Dfx Voodoo-based Diamond Monster 3D video card. The era of 3D acceleration was in its infancy and the Voodoo chipset was the chipset to beat. It all seems a bit silly now, but when I slapped that sucker into my aging Pentium 90 and fired up the new card's pack-in version of MechWarrior 2—which had texture-mapping and visual effects that the original 2D version lacked—my jaw hit the floor. I couldn't wait to speed-dial my buddy Matt and tell him that his much-faster Pentium 166 no longer brought all the polygons to the yard.

That video card was the most important PC upgrade I ever made, sparking a total change in my perception of what computers could do. I didn't think I would ever again experience something as significant as that one single upgrade—until the first time I booted up a laptop with a solid-state drive (SSD) in it. Much like that first glimpse of a texture-mapped MechWarrior 2, that first fast boot signaled a sea change in how I thought and felt about computers.

The introduction of 3D graphics changed our perceptions of computing not because it made colors brighter or virtual worlds prettier—though it did those things and they are awesome—but because it made smoothly responsive gaming at 30 and 60 frames per second the standard. Solid-state drives have a similar effect. They're faster than spinning disks, to be sure, but their most important contribution isn't just that they are faster; it's that they make the whole computer feel faster. They remove barriers between you and your PC, in effect thinning the glass between you and the things that you're doing with and through your computer.

Solid-state drives are odd creatures. Though they sound simple in theory, they hide some surprisingly complex secrets. Compare an SSD to a traditional magnetic hard drive, for instance. A modern multi-terabyte spinning hard disk plays tricks with magnetism and quantum mechanics, the result of decades of research, billions of dollars, and multiple Nobel Prizes in physics. The drives contain complex moving parts manufactured to extremely tight tolerances, with drive heads moving around just thousandths of a millimeter above platters rotating at thousands of revolutions per minute. A modern solid-state drive performs much more quickly, but it's also more mundane on the inside: it's really just a hard drive-shaped bundle of NAND flash memory. Simple, right?

However, the controller software powering an SSD does some remarkable things, and that little hard drive-shaped bundle of memory is more correctly viewed as a computer in its own right.

Given that SSDs transform the way computers "feel," every geek should know at least a bit about how these magical devices operate. We'll give you that level of knowledge. But because this is Ars, we're also going to go a lot deeper—10,000 words deep. Here's the only primer on SSD technology you'll ever need to read.

Varying degrees of fast

It's easy to say "SSDs make my computer fast," but understanding why they make your computer fast requires a look at the places inside a computer where data gets stored. These locations can collectively be referred to as the "memory hierarchy," and they are described in great detail in the classic Ars article "Understanding CPU Caching and Performance."

It's an axiom of the memory hierarchy that as one walks down the tiers from top to bottom, the storage in each tier becomes larger, slower, and cheaper. The primary measure of speed we're concerned with here is access latency, which is the amount of time it takes for a request to make its way from the CPU to that storage tier and for the data to come back. Latency plays a tremendous role in the effective speed of a given piece of storage, because latency is dead time: time the CPU spends waiting for a piece of data is time that the CPU isn't actively working on that piece of data.

The table below lays out the memory hierarchy:

Level          | Access time             | Typical size
Registers      | "instantaneous"         | under 1KB
Level 1 Cache  | 1-3 ns                  | 64KB per core
Level 2 Cache  | 3-10 ns                 | 256KB per core
Level 3 Cache  | 10-20 ns                | 2-20 MB per chip
Main Memory    | 30-60 ns                | 4-32 GB per system
Hard Disk      | 3,000,000-10,000,000 ns | over 1TB
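To put those numbers in perspective, here's a back-of-the-envelope sketch of how many cycles a CPU sits idle while waiting on each tier. The 3GHz clock is an assumed round number, and the latencies are simply the midpoints of the ranges in the table above:

```python
# Rough illustration: cycles a 3 GHz core idles while waiting on
# each tier of the memory hierarchy. Latencies are midpoints of
# the ranges from the table; the clock speed is an assumption.
CLOCK_HZ = 3_000_000_000  # assumed 3 GHz core clock

tiers_ns = {
    "L1 cache":    2,
    "L2 cache":    6,
    "L3 cache":    15,
    "Main memory": 45,
    "Hard disk":   6_500_000,  # 6.5 ms
}

for tier, ns in tiers_ns.items():
    # convert nanoseconds of latency into whole clock cycles
    cycles = ns * CLOCK_HZ // 1_000_000_000
    print(f"{tier:12s} ~{ns:>9,} ns -> ~{cycles:,} cycles idle")
```

The disk line is the one to stare at: a single random read costs the CPU roughly 19 million cycles of thumb-twiddling, versus about 135 for a trip to main memory.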

At the very top of the hierarchy are the tiny chunks of working space inside a CPU where the CPU stores things it's actively manipulating; these are called registers. They are small—only a few hundred bytes total—and as far as memory goes, they have the equivalent of a Park Avenue address. They have the lowest latency of any segment of the entire memory hierarchy—the electrical paths from the parts of the CPU doing the work to the registers themselves are unfathomably tiny, never even leaving the core portion of the CPU's die. Getting data in and out of a register takes essentially no time at all.

Adding more registers could potentially make the CPU compute faster, and as CPU designs get more advanced they do indeed tend to gain more (or larger) registers. But simply adding registers for the sake of having more registers is costly and complicated, especially as software has to be recompiled to take advantage of the extra register space. So data that the CPU has recently manipulated but that isn't being actively fiddled with at the moment is temporarily placed one level out on the memory hierarchy, into level 1 cache. This is still pricey real estate, being a part of the CPU die, but not as pricey as the registers. In a modern CPU, getting data out of the L1 cache takes three or four cycles (typically around a nanosecond or so) compared to zero cycles for the registers. The trade-off for that slower performance is that there's a lot more space in this tier—32KB of data (plus another 32KB of instructions) per CPU core in an Intel Ivy Bridge i7 CPU.

Data that the CPU expects to access again shortly is kept another level out, in the level 2 cache, which is slower and larger, and which carries still more latency (typically between 7 and 20 cycles).

Modern CPUs have level 3 caches as well, which have higher latencies again, and which can be several megabytes in size.

Even further down the hierarchy is the computer's main memory, which has much higher effective latency than the CPU's on-die cache. The actual RAM chips are rated for very low latency (DDR2 DRAM, for example, is typically rated for five nanoseconds), but the components are physically distant from the CPU and the effective latency is therefore higher—usually between 40 and 80 nanoseconds—because the electrical signals from the CPU have to travel through the motherboard's traces to reach the RAM.

At the bottom of the hierarchy sits our stalwart hard disk, the repository of all your programs, documents, pictures, and music. All roads lead here. Any time a program is executed, an MP3 is played, or any kind of data needs to be viewed or changed by you, the user, the computer calls on the disk to deliver it up.

Disks these days are large, but they are also glacially slow compared to the other tiers in the memory hierarchy, with latency roughly a hundred thousand times higher than that of main memory. While waiting for main memory to respond, the processor might have nothing to do for a few dozen cycles. While waiting for the disk to respond, it will twiddle its thumbs for millions of cycles.

Worse, the latency of a spinning hard disk is variable, because the medium itself is in motion. In order to start an application like, say, Google Chrome, the hard disk may have to read data from multiple locations, which means that the drive heads have to seek around for the right tracks and in some cases even wait whole milliseconds for the correct blocks to rotate underneath them to be read. When we've been defining latency in terms of billionths of a second in the previous tiers, suddenly having to contend with intervals millions of times larger is a significant issue. There are many tricks that modern computers and operating systems use to lessen this latency, including trying to figure out what data might be needed next and preemptively loading that data into RAM before it's actually requested, but it's impossible to overcome all of the latency associated with spinning disks.
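Those "whole milliseconds" fall straight out of the platter's rotation speed. A quick sketch of the arithmetic for a typical 7,200 RPM consumer drive (the RPM figure is an assumed typical value):

```python
# Average rotational latency: on average, the drive must wait half
# a revolution for the right sector to swing under the head.
rpm = 7200                    # assumed typical consumer drive
ms_per_rev = 60_000 / rpm     # one full revolution, in milliseconds
avg_rotational_ms = ms_per_rev / 2

print(f"One revolution: {ms_per_rev:.2f} ms")              # 8.33 ms
print(f"Avg rotational latency: {avg_rotational_ms:.2f} ms")  # 4.17 ms
```

And that's just the rotational component; head seeks add several more milliseconds on top, which is why random reads on spinning disks land in the 5-10 ms range.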

On one hand, human beings like us don't operate in terms of milli-, micro-, or nanoseconds, at least not without the aid of serious drugs. A thousandth of a second is the same to us as a billionth of a second—both are intervals so small that they might as well be identical. However, with the computer doing many millions of things per second, those tiny fractions of time add up to very real subjective delays, and it can be frustrating when you click on Microsoft Word and stare at a spinning "Please wait!" cursor for seconds at a time. Waiting on the computer while it drags something off of a slow hard disk is disruptive to workflow and can be a jarring experience, especially if you've got a rapidly derailing train of thought barrelling through your head that you need to write down.

Solid-state drives provide an immediate boost to the subjective speed of the computer because they take a big chunk out of the largest source of latency you experience. First and most obviously, solid-state drives don't have moving heads and rotating platters; every block is accessible at the same speed as every other block, whether they're stored right next to each other or in different physical NAND chips. Reading and writing data to and from the solid-state drive is faster as well, so not only does the computer wait fewer milliseconds for its requests to be serviced, but the drive also moves data more quickly once a request is underway. Quicker responses (lower latency) plus faster transfer speeds (more bandwidth) mean that an SSD can move more data faster—its throughput is higher.

Even just halving the latency of a spinning disk (and SSDs typically do far more than that) provides an incredible subjective improvement to the computing experience. Looking at the higher tiers in the memory hierarchy, it's easy to see why. If, for example, a sudden breakthrough in RAM design decreased the effective latency to and from a system's RAM by a factor of 10, calls to and from RAM would drop from a best case of 60ns to 6ns. Definitely impressive, but measured against the total delay of an I/O operation from CPU to RAM to disk, there's still so much time spent waiting for the disk that it's an insignificant change. On the other hand, if you cut the disk's effective latency from 5-10 milliseconds for a random read to less than a single millisecond—because any block of an SSD is always as readable as any other block, without having to position heads and wait for the platter to spin—you've just knocked out a tremendous percentage of the total time that entire "CPU to RAM to disk" operation takes. In other words, the speed increase provided by an SSD is aimed squarely at the slowest link in the memory hierarchy.
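The argument above can be sketched numerically. Using round illustrative figures (60 ns RAM, an 8 ms spinning-disk random read, a 0.1 ms SSD random read—assumptions for the sake of the example, not measurements), compare a 10x RAM speedup against swapping the disk for an SSD on a single CPU-to-RAM-to-disk read:

```python
# Illustrative latencies, in nanoseconds (assumed round numbers)
ram_ns = 60
disk_ns = 8_000_000   # ~8 ms spinning-disk random read
ssd_ns = 100_000      # ~0.1 ms SSD random read

baseline = ram_ns + disk_ns             # RAM access plus disk access

faster_ram = ram_ns / 10 + disk_ns      # scenario 1: 10x faster RAM
with_ssd = ram_ns + ssd_ns              # scenario 2: same RAM, SSD for disk

print(f"baseline: {baseline:,} ns")
print(f"10x RAM:  {faster_ram:,.0f} ns ({baseline / faster_ram:.4f}x speedup)")
print(f"SSD:      {with_ssd:,} ns ({baseline / with_ssd:.1f}x speedup)")
```

The 10x RAM breakthrough shaves the whole operation by a fraction of a percent; the SSD makes it roughly 80 times faster, because it attacks the term that dominates the sum.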

Lower latency improves throughput by letting you read more data in a given amount of time. Here, the spinning disk spends most of its time waiting on the platter and heads to find the right data to be read.

Now, a solid-state drive isn't always going to be faster than a spinning hard disk. You can construct benchmarks that highlight a hard disk's advantages over an SSD. For example, a synthetic benchmark that repeatedly writes and rewrites data blocks on a full SSD, without giving the drive time to perform garbage collection and cleaning, can overwhelm the SSD controller's ability to manage free blocks and lead to low observed performance (we'll get into what garbage collection is and why it's important in just a bit).

But in everyday use in the real world, when performing under an organic workload, there are almost no areas where simply having an SSD doesn't make the entire computer seem much faster.

Lee Hutchinson
Lee is the Senior Technology Editor at Ars and oversees gadget, automotive, IT, and culture content. He also knows stuff about enterprise storage, security, and manned space flight. Lee is based in Houston, TX. Email: lee.hutchinson@arstechnica.com // Twitter: @Lee_Ars

185 Reader Comments

That video card was the most important PC upgrade I ever made, sparking a total change in my perception of what computers could do.

I got the same feeling when I switched my S3 Trio for a GeForce 256 (the first T&L graphics card according to nVidia). I literally spent the next few months drooling at the sight of nVidia's own demos of what GeForce could do.

Obviously, for home use, it's a good idea to use SSD for OS drives and traditional HDD for bulk data. It doesn't matter much if your MP3 collection or home movies have low latency access, but having a word processor fire up twice as fast helps make one's system feel much more responsive.

Very nice article. Might also be worth mentioning for home use the Seagate hybrid drives (Momentus XT, I think); several hundred gigabytes of spinning disk paired with 8 (IIRC) GB worth of SSD, plus a controller that tries to put the most frequently read chunks of data on the SSD portion. Not as fast as a pure SSD, but also nowhere near as expensive on a per-gigabyte basis.

Great to see the traditional Ars deep tech dives making a comeback. I don't know how accurate it is, but my perception is that there have been fewer of them since Hannibal slowed down and then moved on. Ars does the long form better than most on the web.

Similar nostalgia here, upgrading from 4MB Rage II+ to a 16MB VooDoo card. Went from 320x240 pixel-doubled (no kidding, a blurry mess) to 1024x768 Quake_GL in one swoop. Oh Mah Gawd. Oh to be a teen in the 90s. Wonderful memories. "Don't pick up the phone I'm onliiiii.... NOOOOOO!"

I'm slowly putting SSD's in everything now. I gave up on hoarding all kinds of 'stuff' long ago after who knows how many burnt cd's/dvd's.

I remember paying $500+ in 99 for a pair of brand spanking new cutting edge capacity 40 gig drives.... Along with my existing 20 that was 100(!) unbelievable gigs of pure leetness!

Now you can buy 1 or 2 TB's for $99. lol. Which will be equally as quaint in another 10 years of course.

Great article, but you missed a growing trend in the consumer space: hybrid drives. I recently put a 750G Seagate Momentus hybrid (750G spinning, 8G "magic" SSD cache) into my old MBP and it is night and day faster, without the price penalty of keeping things like mp3s and ebooks on expensive sub-ms latency storage.

There seem to be two basic flavors of hybrid right now. I got the 'magic' style, where the cache is managed automatically by the drive firmware (and is therefore completely agnostic to OS, encryption, filesystems, etc.). There is also an earlier type with a helper app that pushes the core system libraries and such into the cache, effectively manually. (And I believe the first hybrids showed up as two full drives, but that is the worst of both worlds. Just buy a small independent SSD and save the headache.)

As with standard SSDs, apps seem to appear instantly, booting is faster, etc. (It takes a little while to heat up the cache with your most-used blocks/files/etc, but even before that it was quite fast.) Unlike the manual flavor, it adjusts automatically to your workload. (For example, why cache boot files instead of Chrome if you reboot once a month but restart Chrome daily?) Right now, only reads are cached - writes go straight to spinning disk - but Seagate has promised an upcoming firmware update that will fix that.

Timely article, I just finished building my new PC yesterday with an SSD for a main drive. Everything else is new, too (ok, I'm "recycling" an ATI HD3870, it's mostly new...bought it way back to set up crossfire, but I never played a game that took advantage of it), but with what I do, there's not much difference between an i5-2500K (new core) and my E6700 (old core).

And I'm now on a x64 OS, so I can take full advantage of my 4G ram as opposed to only 3.25G. But I never got close to hitting 3 in the old box.

But. Boot times? Maybe 10 seconds, tops. Down from a minute or so. And more notable a change simply because it's something I'm very familiar with, WoW zone load screens: half a minute to 2 seconds. The progress bar just zips across now, something no amount of defragging could ever fix. So awesome.

Dunno if "always on internet" really counts as a hardware upgrade. But in how I used my computer it mattered a lot, especially since analog phonelines in Denmark were metered, you paid per minute you used it. So a dialup modem wasn't "always on".

That's probably also why DSL caught on way faster in Europe. A 64kbps DSL line was something completely different from 56kbps dial-up, because the DSL was "always on".

For me, the hardware 3D thing was more gradual as the triumvirate of hardware, drivers, and software improved. SSD's were more of an overnight shock - suddenly I could buy one device that would make almost any computer feel 10x faster.

I skimmed the whole article and I'm looking forward to reading it slowly when I get home from work. It's a great summary for consumers (and pro-sumers) trying to understand the new technology.

I'm curious about why RAM disk adoption is so slow in the industry. The last page of the article mentions DRAM caching for enterprise systems, but I'm pretty sure that end users would also benefit from a huge speed bump.

New high-end laptops, desktops, and workstations ship with up to 16 to 32 GB of RAM – I'm sure that some of that memory could be usefully allocated to a RAM disk for application and page file caching. Adobe's Creative Suite applications imply this in their Scratch Disk nomenclature, for example.

That's why I found Microsoft's ReadyBoost so shockingly backward when it was first announced with Windows Vista. Flash memory via USB 2.0 is magnitudes slower than DRAM. Why are operating system developers so slow to integrate native RAM disk support?


This is a topic that comes up once in a while over in the Ars Windows Technical Mojo discussion forum. There are a lot of extremely smart fellows who post regularly there, and the consensus is that RAM disks aren't more widely used because, except in a few edge cases, system RAM is better used as system RAM. Rather than borrowing from one tier up in the memory hierarchy to bolster a lower tier, it's better to leave that tier alone to do its thing. Unless you have truly ridiculous quantities of RAM in a computer, RAM is more efficiently used as RAM; the corollary is that if you've got the money to spend on truly ridiculous quantities of RAM that would otherwise sit there doing nothing unless you turn it into a RAM disk, you probably could have afforded a faster disk subsystem in the first place.

The problem with ramdisks is that you already have several things doing that job: a block cache on the disk itself, a block cache in the OS, and a filesystem cache plus private (process- or application-specific) copies of any 'active' file data. The OS (or drive firmware, for the disk cache) can manage them pretty well, and independently of specific applications.

Adding a forced ramdisk means that most (or all) of that ram is not available to the OS when needed, whereas the other caches can be flushed to disk (or just dropped if they are unchanged) and reused. Linux/*nix systems display the caches fairly well with 'free' or 'top'. So long as you aren't actively using the RAM, they will keep recently used blocks and files in memory so that going back to them is faster, but if the RAM is needed they can be freed immediately. (On a healthy Linux system, for example, you will generally see about 95% of memory 'used' but very little swap space. That just means that memory not used by the apps/kernel/etc is being used to hold a disk cache.)
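You can see this for yourself without `free` or `top`. Here's a minimal sketch (assuming a Linux-style /proc/meminfo layout) that works out how much RAM is currently serving as file cache; the sample text is made up for illustration:

```python
def cached_fraction(meminfo_text: str) -> float:
    """Parse /proc/meminfo-style text and return the fraction of
    total RAM the kernel is currently using as file cache."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, rest = line.split(":")
        fields[key.strip()] = int(rest.split()[0])  # values are in kB
    return fields["Cached"] / fields["MemTotal"]

# On a real Linux box you'd feed it open("/proc/meminfo").read();
# this trimmed sample is a hypothetical stand-in:
sample = "MemTotal: 8000000 kB\nMemFree: 400000 kB\nCached: 5200000 kB"
print(f"{cached_fraction(sample):.0%} of RAM is holding file-cache data")  # 65%
```

On a long-running box that "Cached" number grows to fill most of free RAM, and the kernel hands it back the instant an application asks for memory—which is exactly the behavior a forced ramdisk gives up.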

There is a lot of complexity that I've skipped (CoW, shared pages, etc) but that is a rough explanation. If you want details I'll see if I can find a good summary article about it.


The operating system already manages a RAM-based cache. It's called the file system cache.



Not sure if it was Torvalds or Carmack who recently stated that switching to an SSD was the single biggest upgrade one could do with a computer these days.

(anyone who says the "I" stands for "Independent" needs to learn about RAID 0 data recovery)

Hmm, and there I would think Lee would have problems with the word "redundant", not independent. If you put two disks in a RAID 0 array, they are still independent disks, but they are not redundant. (Perhaps that's why it was named 0.) Strangely, Lee doesn't have any problem with the manufacturers using the word independent in their naming conventions—though I don't think any of them are sticking them in a RAID 0 class configuration, so perhaps that is why.

Aside from your insistence that 'inexpensive' is correct being a bit grating ('independent' makes much more logical sense, regardless of the history of the thing—RAID0 is an argument against 'redundant' more than it is against 'independent'), very nice article. I too am glad to see Ars coming back to long-form, technical articles. It's really a surprisingly simple system, just with a lot of moving parts; aside from the SandForce stuff it's all rather straightforward.

As several people have posted windows already does a bunch of different caching in unused ram. If you want a marketing term MS uses for Windows 7 they called it SuperFetch which is basically trying to cache applications it thinks you are going to launch before you actually launch them.

The only situation I've seen where using a ramdisk makes sense is when you are dealing with an application that can't properly utilize all the memory you have available. Back in the day I was programming in G in LabVIEW for an internship (http://en.wikipedia.org/wiki/LabVIEW). The version they had only required a couple MB of RAM to run, and even though the system had 16MB of RAM (a lot at that time) it would always use the temp directory when compiling. With a couple-MB RAM drive set up and the LabVIEW temp directory pointed to it, things were significantly faster, but only because it insisted on using the temp directory instead of the memory that was available.


RAID has always stood for "Redundant Array of Inexpensive Disks." The "I" has been bastardized of late and most folks say "Independent", but they're wrong. I can call a duck a cow, but that doesn't make it so.

RAID's opposite, which we don't hear much these days, is SLED--"Single Large Expensive Disk". When it first appeared in the late 80s, RAID was originally a cost-saving measure as much as anything else, since large hard disk drives at the time were extremely costly.


Different times, different bottlenecks

At that time, graphics were seriously holding computers back. Even the Windows desktop benefited immensely from 2d acceleration. Later it was the RAM race, which seems to have stabilized at the 4 GB mark. I remember updating my laptop's 1GB to 4GB and it seemed like a new machine. The SSD is responding to the most pressing need nowadays: faster storage.

While CPUs haven't been on the line for some time in everyday tasks (remember the Mhz race?), I wonder if abundant RAM, generalized use of SSDs and enough graphics power for most tasks are going to shift the burden to the CPU again.

One thing that the article does not seem to cover is that many (but not yet that many) home PC users are starting to deploy Bitlocker or Truecrypt or other kind of full disk encryption. And even more users are forced to encrypt their laptops if they wish to work on them.

Using full disk encryption absolutely destroys SSD performance. There is zero use for an SSD if you do full disk encryption - it slows down to hard-drive speeds from 1996.

Another obvious problem with SSDs is that they can never be truly erased, which means all your passwords/pictures/etc. will be stored on them forever, and recoverable if you choose to sell your computer, discard it, or send it in for repairs.

RAID 0 argues against both redundant and independent. It makes the disks entirely dependent on one another, and entirely non-redundant.

I contest that: Redundant Array of Independent Disks, where independent refers to the disks, not the array itself. By arguing against independent, you are arguing that the disk is no longer an independent disk when in an array, which simply doesn't make any sense.