Memory 2.0 -- The Persistent Memory Era

Chief Scientist at Cloudistics. IBM Fellow Emeritus. Previously served as Global CTO for multibillion dollar businesses at IBM and Dell

As DRAM approaches scaling limits, there is significant industry investment in alternatives. An approach called persistent memory (PM) is emerging that is likely to influence enterprises in important ways. This article describes PM, how applications will benefit and why customers should care.

Persistent memory is byte-addressable, non-volatile memory with performance close to that of DRAM, and it is expected to be cheaper than DRAM. PM latencies (nanoseconds) will be significantly lower than those of Flash (microseconds) or disk (milliseconds). Applications will be able to access PM directly using load and store instructions, as we describe below.

2. How Is PM Attached?

PM can be directly attached on a memory interface like DDR4, or it can be attached using new emerging memory fabrics like Gen-Z, CCIX or OpenCAPI. Gen-Z and CCIX fabrics support switching, allowing multiple servers to share PM. The Gen-Z 1.0 specification was only published this past February, and chips are not expected until late 2019.

3. OS, File System And Software Support For PM

The SNIA NVM Programming Technical Work Group (TWG) has proposed a persistent memory programming model. PM support is available in Linux and Windows in two modes: a direct access mode called DAX, and a block access mode. DAX mode is faster because it lets an application reach PM with ordinary load and store instructions on a memory-mapped file, bypassing the block I/O path.
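To make the direct access idea concrete, here is a minimal Python sketch of byte-granular access through a memory mapping. It is an illustration under assumptions, not a PM-specific API: the file lives in an ordinary temporary directory as a stand-in, whereas on a real system it would sit on a file system mounted in DAX mode on top of a PM device. The function names and the 4096-byte region size are hypothetical.

```python
import mmap
import os
import tempfile


def write_bytes_in_place(path, offset, payload, size=4096):
    """Map a file into memory and update a few bytes in place."""
    # Create the backing file if needed and make sure it is large enough.
    with open(path, "a+b") as f:
        if os.path.getsize(path) < size:
            f.truncate(size)
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            # Slice assignment touches only the addressed bytes; there is
            # no explicit read() or write() call in the application code.
            region[offset:offset + len(payload)] = payload
            # flush() asks the OS to make the mapped changes durable.
            region.flush()


def read_bytes_in_place(path, offset, length):
    """Map the file again and read bytes directly from the mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            return bytes(region[offset:offset + length])


if __name__ == "__main__":
    # Stand-in path; a DAX-mounted directory would be used with real PM.
    demo_path = os.path.join(tempfile.mkdtemp(), "pm_region.bin")
    write_bytes_in_place(demo_path, 128, b"hello")
    print(read_bytes_in_place(demo_path, 128, 5))
```

In block access mode the same update would instead go through read and write calls that move whole blocks, which is where the extra cost comes from.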

To insulate programmers from the complexities of accessing PM, the Persistent Memory Development Kit (PMDK) is available for both Linux and Windows.

Microsoft and VMware have already modified several of their products to support PM. Microsoft products with PM support include SQL Server 2016, Storage Spaces, SMB3 and Hyper-V. VMware vSphere also has PM support.

4. Memory Technologies For Building PM

There are two classes of technologies that can be used to build PMs. The first class, which we call DRAM replacement PMs, has the potential to replace DRAM. Everspin STT-MRAM and Nantero NRAM are examples of such technologies.

The second class has the potential to fill the gap between DRAM and Flash. Our focus for this article is on these gap-filler PMs. They are cheaper and denser than DRAM and allow more memory to be attached to a server than what is possible today. Gap-filler PM technologies include Intel 3DXP/Optane (the front-runner) and resistive RAM (e.g., Crossbar ReRAM).

5. Use Cases For PM

We see the use of PMs by enterprise applications rolling out in three waves.

• Wave 1: The application wants more memory -- non-volatility not needed. These are applications that are memory starved or can benefit significantly from additional memory. With PM, the application effectively gets to use more memory at the same cost. Since these applications do not make use of the non-volatility, they work unmodified.

Consider the Aerospike database system, which has been experimenting with PM. They store the database on Flash and indices (3-6% of database size) in RAM. PM could allow Aerospike to store four times as many indices for the same price, enabling support for four-fold larger databases.
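The arithmetic behind the four-fold claim can be sketched in a few lines of Python. This assumes the index stays a fixed fraction of database size (the 3-6% range quoted above) and that the index is the only memory constraint; the 256 GB budget and the function name are hypothetical, chosen only for illustration.

```python
def max_database_size_gb(index_memory_gb, index_fraction):
    """Largest database whose index fits in the given amount of memory."""
    if not 0 < index_fraction <= 1:
        raise ValueError("index_fraction must be in (0, 1]")
    return index_memory_gb / index_fraction


if __name__ == "__main__":
    dram_budget_gb = 256               # hypothetical index memory today
    pm_budget_gb = 4 * dram_budget_gb  # four times the capacity, same price
    for fraction in (0.03, 0.06):      # the 3-6% range quoted above
        before = max_database_size_gb(dram_budget_gb, fraction)
        after = max_database_size_gb(pm_budget_gb, fraction)
        print(f"index fraction {fraction:.0%}: "
              f"{before:,.0f} GB -> {after:,.0f} GB "
              f"({after / before:.0f}x)")
```

Because the supported database size is proportional to index memory, the four-fold gain holds whatever the exact index fraction is.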

• Wave 2: The application uses PM with persistency instead of DRAM. Here, the app will need to be modified to use the persistence property of PM.

In May 2018, Intel compared a Cassandra database running on 256GB of RAM plus 1TB of PM with one running on 1TB of DRAM. The PM configuration eliminated write-backs to storage, delivered nine times more read transactions and supported 11 times more users.

A second example demonstrates recovery time improvements. Intel demonstrated restarting Aerospike’s database on a server with Optane in 16.9 seconds, versus 35 minutes on a server with DRAM, which is roughly 124 times faster. With PM, there was no need to recreate the state of memory following a failure.
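The recovery benefit comes from state that is still in place after a restart. The Python sketch below illustrates the idea with a counter kept in a memory-mapped file; it is not how Aerospike or Intel implemented their demonstration, and a production application would more likely rely on a PM library with transactional updates. The class name and file layout are hypothetical, and an ordinary temporary file stands in for a PM region.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Q")  # one unsigned 64-bit counter


class PersistentCounter:
    """A counter whose value lives in a memory-mapped file."""

    def __init__(self, path):
        # Create and zero-fill the backing file on first use only.
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(bytes(RECORD.size))
        self._file = open(path, "r+b")
        self._map = mmap.mmap(self._file.fileno(), RECORD.size)

    @property
    def value(self):
        return RECORD.unpack_from(self._map, 0)[0]

    def increment(self):
        RECORD.pack_into(self._map, 0, self.value + 1)
        self._map.flush()  # make the update durable before returning

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "counter.bin")
    first = PersistentCounter(path)
    for _ in range(3):
        first.increment()
    first.close()  # simulates the process stopping

    second = PersistentCounter(path)  # simulates a restart
    print(second.value)  # the state is read back, not rebuilt
    second.close()
```

The second instance does no replay or reload step; it simply maps the region and finds the value already there, which is the property the restart demonstration relies on.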

• Wave 3: The application uses PM instead of Flash. This allows for faster performance at a higher cost point.

There is a range of options for what to move from Flash to PM. An extreme case is to move the entire database. An intermediate case is to move an active subset of the database or just the transaction logs.

IBM Research prototyped MongoDB using PM instead of SSDs for storing the entire database and found it ran 2.8 times faster. This improvement is less impressive than expected. While the IBM team modified the storage engine in MongoDB to use PM, they did not modify the rest of MongoDB. Storage latencies were reduced significantly, but much of the total latency comes from code outside the storage engine, which did not get any faster, and that limited the overall gain.
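This effect is an instance of Amdahl's law: if only a fraction f of the run time is sped up by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The Python sketch below works backward from the 2.8 times result to the share of time that must have been spent in storage. It is an illustrative calculation, not IBM's analysis; the storage speedup factors are hypothetical, since the article does not report them.

```python
import math


def overall_speedup(storage_fraction, storage_speedup):
    """Amdahl's law: speedup when only the storage part gets faster."""
    remaining = (1.0 - storage_fraction) + storage_fraction / storage_speedup
    return 1.0 / remaining


def implied_storage_fraction(observed_speedup, storage_speedup=math.inf):
    """Fraction of run time spent in storage implied by an observed speedup."""
    if storage_speedup < observed_speedup:
        raise ValueError("storage speedup cannot be below the overall speedup")
    # Solve 1/S = (1 - f) + f/s for f.
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / storage_speedup)


if __name__ == "__main__":
    observed = 2.8  # overall speedup reported for the MongoDB prototype
    for s in (10, 100, math.inf):  # hypothetical storage speedups
        f = implied_storage_fraction(observed, s)
        print(f"storage {s}x faster -> storage share {f:.1%}, "
              f"other code {1 - f:.1%}")
```

Even in the limiting case where storage time drops to zero, a 2.8 times overall result implies that about 36% of the original run time was spent outside the storage engine, so further gains have to come from changing that code.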

6. Summary

High-performance, byte-addressable persistent memory will have a significant influence on enterprises, starting in 2019. The front-runner technology for PMs is Intel 3DXP/Optane. It will be cheaper than DRAM and almost as fast. Software support for PM is already available in Windows and Linux.

Use of PMs by enterprise applications will roll out in three separate waves. The first wave, starting in 2019, will exploit the fact that PMs are cheaper and denser than DRAM. These apps will need no modification, as they will not exploit the persistence property. Memory-intensive applications such as fine-grained simulations, bioinformatics, data science, real-time BI and financial analytics will benefit.

A second wave of applications will replace DRAM with PMs and exploit their persistence (non-volatility) property. This will require modifications to existing applications. As a result, these will likely not roll out until 2020.

Finally, there will be a third wave of apps that will use PMs instead of SSDs. Early results indicate that simply replacing SSDs with PMs does not lead to large performance improvements; more work on the surrounding application code is needed to reap larger benefits.