
The Hybrid Memory Cube Consortium, which consists of such silicon luminaries as Micron, Samsung, and IBM (but not Intel), has finally finished hammering out the Hybrid Memory Cube 1.0 standard. The HMC is a complete paradigm shift away from conventional DDR1/2/3 SDRAM sticks (DIMMs), offering up to 15 times the performance of DDR3, while using 70% less energy. Just to whet your appetite, HMC 1.0 has a max bandwidth of 320GB/sec to a nearby CPU or GPU — PC3-24000 DDR3 SDRAM, on the other hand, maxes out at just 24GB/sec.
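Those peak-bandwidth figures are easy to sanity-check. A quick back-of-the-envelope comparison in Python, using only the numbers quoted above (real-world throughput will of course be lower than either peak):

```python
# Peak bandwidth figures quoted in the article, in GB/s.
HMC_BW_GBPS = 320    # HMC 1.0 max bandwidth to a nearby CPU/GPU
DDR3_BW_GBPS = 24    # PC3-24000 DDR3, per module

# Ratio of raw peak bandwidth (the "up to 15x" claim covers overall
# performance, not just this single figure).
speedup = HMC_BW_GBPS / DDR3_BW_GBPS
print(f"HMC 1.0 peak bandwidth is ~{speedup:.1f}x a PC3-24000 module")
```

On raw peak bandwidth alone the ratio works out to roughly 13x; the consortium's "up to 15 times the performance" figure presumably folds in latency and efficiency gains as well.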

The Hybrid Memory Cube is essentially a stack of up to eight memory dies, connected to each other with through-silicon vias (TSVs), sitting atop a logic and switching layer that controls input and output to all eight dies. This stacked approach is fundamentally different from a conventional DIMM, which generally consists of a bunch of DRAM dies placed side by side on a stick. Almost all of HMC’s advantages over conventional DRAM stem from the dies being stacked.

As we’ve covered before, chip stacking is the future of computing. By putting dies on top of each other, the wires between them are much, much shorter. In turn, this means that data can be sent at higher speed, while at the same time using less energy. There are a few different chip stacking methods, though, with some being far more advanced and powerful than others. The most basic is package-on-package (pictured above), which essentially takes two finished chips and places them on top of each other, with the connecting pins of the top chip fitting into the bottom chip. This approach is already being extensively used by smartphone SoCs, where a memory chip is stacked on top of the CPU/GPU, allowing the completed device to be significantly smaller.

The more advanced method of chip stacking uses through-silicon vias (TSVs). With TSV, vertical copper channels are built into each memory die, so that the dies can be stacked directly on top of each other (pictured right). Unlike package-on-package, which sees two complete chips placed on top of each other, dies connected with TSV are all inside the same chip. This means the wires between the dies are as short as they can possibly be, and because each die is very thin, the complete package is only fractionally taller than normal. In theory, any number of dies can be connected this way, with heat generation and dissipation being the only real limitations. For now, it seems like the HMC 1.0 spec allows for up to eight dies, with a max addressable capacity of 8GB. There’s no reason you couldn’t have multiple HMCs connected to a CPU or GPU, though, if you’re looking for more than 8GB of RAM.
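To illustrate the multiple-cube idea: a system wanting more than 8GB would spread its physical address space across several cubes. This is a hypothetical sketch (the actual mapping would live in the memory controller, and the spec does not prescribe this scheme), assuming simple capacity-based partitioning:

```python
CUBE_CAPACITY = 8 * 2**30  # 8GB max addressable per HMC 1.0 cube

def locate(addr: int, num_cubes: int) -> tuple[int, int]:
    """Map a flat physical address to (cube index, offset within cube).

    Illustrative only: real controllers might interleave at a much
    finer granularity to balance bandwidth across cubes.
    """
    total = num_cubes * CUBE_CAPACITY
    if not 0 <= addr < total:
        raise ValueError("address out of range")
    return addr // CUBE_CAPACITY, addr % CUBE_CAPACITY

# A 9GB address in a two-cube (16GB) system lands 1GB into the second cube.
print(locate(9 * 2**30, num_cubes=2))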

Beyond the TSVs, the other reason that the HMC is so much faster and more efficient is that it removes the logic transistors from each DRAM die and places them all in one central location, at the base of the stack. In conventional DRAM, each and every memory chip has its own logic circuitry, which is in charge of getting data in and out of the individual memory cells. Each of these logic circuits needs to be powerful enough to read and write at huge data rates, which costs a lot of power and adds a lot of complexity to the I/O process. In the HMC, there is just one logic circuit that drives all eight memory dies. This centralized logic allows for higher and more efficient data rates — up to 320 gigabytes per second, while consuming 70% less energy than DDR3. (See the full Hybrid Memory Cube spec on the Consortium’s site.)
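Conceptually, the centralized logic layer acts as a single front door for the whole stack: every request hits the logic die, which routes it to whichever memory die owns that address. A toy model of that routing, with assumed (purely illustrative) capacities:

```python
NUM_DIES = 8
DIE_CAPACITY = 2**30  # assume 1GB per die in an 8GB stack (illustrative)

class LogicLayer:
    """Toy model of HMC's base logic die fronting the stacked DRAM dies.

    Not the real protocol -- just shows one controller serving all dies,
    versus a DIMM where every chip carries its own I/O logic.
    """

    def __init__(self):
        # Model each stacked die's contents as a sparse dict.
        self.dies = [dict() for _ in range(NUM_DIES)]

    def _route(self, addr: int) -> tuple[dict, int]:
        # The single logic layer decides which stacked die owns the address.
        return self.dies[addr // DIE_CAPACITY], addr % DIE_CAPACITY

    def write(self, addr: int, value: int) -> None:
        die, offset = self._route(addr)
        die[offset] = value

    def read(self, addr: int) -> int:
        die, offset = self._route(addr)
        return die.get(offset, 0)
```

For example, an address 3GB into the stack is served by the fourth die, but the CPU only ever talks to the one logic layer — which is where the power and complexity savings come from.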

The HMC Consortium consists of most major players in the chip industry, with the notable exception of Intel. Intel did collaborate with Micron when the Hybrid Memory Cube was first demonstrated at IDF in 2011, but for unknown reasons there are no TSV products on its roadmap. The consortium plans to launch the first HMCs later in 2013, and it is already working on version 2.0 of the HMC spec. There’s no word on cost, but we’ll probably see HMCs in supercomputers and networking devices first, where the ultra-high bandwidth will really come into its own, and then perhaps consumer devices in the next year or two.


The IO subsystem is getting interesting. First rotational HDDs (by far the slowest aspect) got replaced by SSDs. Now, traditional DRAM could be replaced by HMC. Makes me wonder where (if?) ReRAM will fit into this.

Per the APU comment, this is exactly what is needed. High bandwidth for graphics is crucial. Now, I wonder what kind of error correction is built into the spec — especially since they removed a lot of the logic that would do error checking à la DRAM modules with ECC.

matt_helm

From the spec, “ECC provides error coverage of the data from the vault controller to and from the DRAM arrays.”

matt_helm

Are they going to make a low power version like DDR has? Just wondering, and I saw nothing on their site about it.

It sounds like there are two versions — short reach (DIMM replacement) and ultra short reach (for RAM close to the CPU/GPU). Short reach is apparently the more energy efficient of the two. Both should be more efficient than low-power DDR, though, I think.

I’ll believe it when I see it. Nice concept, though — I really love the idea of chip stacking. They should make it available for sale.

dwell lewd

‘…which consists of such silicon luminaries as Micron, Samsung, and IBM…’

Luminaries???

‘Luminary’ can refer to a person. A corporation is not a person. Confusion like this just furthers the injustice recently spread with the help of SCOTUS.

Corporations are not people. Money is not speech.

Google it.

Matthew Hunter

Well, now that you have Googled a definition today, I doubt you are useful anymore. I wish I could say your effort mattered here.

These would make stupidly powerful replacements for GDDR. At least current GPUs are much closer to handling the raw bandwidth of one of these than a modern CPU is. That kind of bandwidth would also do amazing things for APUs. Forget the complex interconnect bus — route all traffic through the RAM.