Intel Optane Memory 32GB Review - Faster Than Lightning

Introduction, Specifications, and Requirements

Introduction:

Finally! Optane Memory sitting in our lab! Sure, it’s not the mighty P4800X we remotely tested over the past month, but this is right here, sitting on my desk. It’s shipping, too, meaning it could be sitting on your desk (or more importantly, in your PC) in just a matter of days.

The big deal about Optane is that it uses 3D XPoint memory, which has fast-as-lightning (faster, actually) response times of less than 10 microseconds. Compare that to the fastest modern NAND flash at ~90 microseconds, and the difference adds up fast. What's wonderful about these response times is that they hold true even when an Optane product scales all the way down to just one or two dies of storage capacity. Since managing fewer dies means less work for the controller, latencies can actually fall even further in some cases (as we will see later).
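As a rough back-of-the-envelope illustration of why those per-access latencies matter (using the figures from the paragraph above, and assuming strictly serial QD=1 access with no queuing or overlap):

```shell
# Back-of-envelope: total time for 100,000 strictly serial (QD=1)
# 4K random reads at ~10 us (3D XPoint) vs ~90 us (fast NAND).
awk 'BEGIN {
  reads = 100000
  printf "XPoint (~10us each): %.1f s\n", reads * 10 / 1e6
  printf "NAND   (~90us each): %.1f s\n", reads * 90 / 1e6
}'
```

Real workloads overlap requests and rarely sit at pure QD=1, so this is the best-case gap, not a benchmark prediction.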

No real surprises here. We had these specs a bit earlier, but things appear to have been de-rated slightly, mostly in the interest of keeping power consumption low from the looks of it.

Requirements:

Optane Memory requires Intel 7th generation hardware. We suspect you might be able to get away with an older CPU, but the motherboard is a must, since BIOS-level support is required to properly boot a system that overlays Optane Memory over other storage devices.

Most of you may not remember this, but eight years ago I reviewed the DDRdrive. It consisted of 4GB of DRAM on a PCB. Despite being connected to the system via a single PCIe 2.0 x1 lane, its QD=1 performance of over 100,000 IOPS trounced the competition (and still does today, actually). While the limited capacity made it useless for storing your OS, the product worked its way into many enterprise applications as a very fast storage cache device. The limited capacity, the price, and the lack of an easy way to use it as a storage cache kept this sort of thing from catching on in consumer PCs, but one could dream (back then). Now it is basically a reality, as Optane Memory can actually surpass the DDRdrive! It only took eight years for that particular pipe dream to come true.

Allyn, I considered "borrowing" your idea of having a 96GB RAID-0 array for OS, and 1TB 960 EVO for apps/games/data. But I have a dilemma: That combination would require 4 M.2 slots, but 3 slots are the most that I find available on Z270 motherboards. Any suggestion?

The only solution to the lack of M.2 slots would be a riser card plugged into the PCIe x16 slot. I "think" PC Perspective just reviewed a 4- or 8-slot M.2 card on the last podcast. Would the single 960 EVO go into the slot, or the RAID'd Optane drives? Thank you for reading and potentially responding to my comments.

I was trying to keep the focus on the read performance there. The write results appear inverted because NAND SSDs handle burst writes faster than they handle burst reads, but the important factor determining application launch performance falls on the random reads.

I'm sorry to say it, but for desktop, Intel probably couldn't care less about <2% of the market. It is a standard NVMe part though, so feel free to pick one up and code your own caching driver (I hope it fares better than BCache). Coding your own additions is a big benefit of Linux, after all.

Well, they have one of the biggest open source teams, so they do care about Linux. But fair enough, not necessarily for desktop.

Michael at Phoronix got confirmation that it should work fine as standard NVMe storage. Heck, I think that means you could install a whole Linux system on it, or mount parts of the filesystem there (like a fairly common Linux speed tweak: mounting /var/tmp in main memory with tmpfs).
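For reference, the tmpfs tweak mentioned above is just a one-line mount. This is a sketch only: the 512M size is an arbitrary example, and the usual caveat applies that tmpfs contents disappear on reboot.

```shell
# Mount /var/tmp as a RAM-backed tmpfs (example size; contents are lost on reboot)
sudo mount -t tmpfs -o size=512M,mode=1777 tmpfs /var/tmp

# Or make it persistent across reboots via /etc/fstab:
# tmpfs  /var/tmp  tmpfs  size=512M,mode=1777  0  0
```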

Caching an NVMe SSD with another NVMe SSD gets to the point where overhead becomes an issue, mostly because you are still bottlenecked by DMI throughput at the end of the day. I'm sure there would be a benefit in some cases, but we were already seeing diminishing returns with Optane caching a SATA SSD. You're still limited by NAND flash latencies on a 960 PRO, so QD1 performance while cached by Optane should end up similar to caching an 850 series (SATA) part.

I also wouldn't recommend that specific configuration because I doubt Intel supports it, so you may run into odd compatibility issues.

The 16GB will be even lower latency. The 32GB is rated at 9/30us while 16GB is at 7/18us.

I think the reason they gave the 32GB model to reviewers is that the 16GB's sequential write is halved to 145MB/s, which might garner unfavorable reviews.

The difference is interesting, as is the lower latency compared to the P4800X. It's likely the overhead of running multiple channels. The throughput suggests the 16GB has a 1-channel controller while the 32GB has 2. The P4800X has 7. The 16GB version thus has less overhead, exposing more of the media-level performance.

1 channel for 16GB and 2 channels for 32GB is most certainly the case considering a single XPoint die is 16GB of capacity :). 4KB random reads on the P4800X would still be only hitting one die at a time. Lower latencies at smaller capacities are likely down to simpler controller design enabled by the fewer channels / less addressable capacity.

Possibly, but I know that there have been issues with the Linux kernel and NVMe; if your distro is happy with NVMe, it should physically work. Convincing the OS to use it as a cache might be a different story.
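For anyone who wants to experiment, one hedged sketch of what "convincing the OS" could look like is bcache (mentioned earlier in the thread). The device names below are placeholders only, and make-bcache destroys existing data on the devices it formats:

```shell
# Hypothetical bcache setup: the Optane NVMe device acts as a cache
# in front of a larger, slower backing drive. Device names are examples,
# and make-bcache wipes existing data on both devices.
sudo apt install bcache-tools           # or your distro's equivalent package

sudo make-bcache -B /dev/sda            # format the backing device (big, slow)
sudo make-bcache -C /dev/nvme0n1        # format the cache device (Optane module)

# Attach the cache set (UUID reported by make-bcache) to the backing device,
# then format and use /dev/bcache0 as the combined volume.
sudo bash -c 'echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach'
sudo mkfs.ext4 /dev/bcache0
```

Whether this actually beats Intel's Windows-only caching driver on real workloads is an open question; this only shows that the plumbing exists.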

I feel your pain. Trying to get 9 plot lines on the same chart, where there were three 'brackets' of three results each, well, that's as good as I could make it look. Good idea on the legend thing. I'll try and implement a form of that in the next review using these charts, as well as trying to get more creative with the dash styles to differentiate plot lines even further.

The Optane benefits are so far ahead of the HDD, and bring the results so close to those of a SATA SSD, that it will look very much the same as it did when caching the HDD, with the occasional burst resembling how it looks when caching the SATA SSD (very similar in most timed tests).

On the subject of typos...
The second graph in the Client Performance section specifies 'ud' as the latency measurement units, when it's clearly supposed to be 'us' and while you're at it, you could even fire up the character map and spoil us with a μ instead of a 'u'.

On the subject of the article...
1) Fantastic work!
2) It's a bummer Optane requires Kaby Lake to run. If it didn't, I'd be enthusiastically badgering my boss to get a handful of the 32GB drives for our web servers to store databases on.

1) Thanks!
2) It only requires Kaby Lake to function as a storage cache. It's a standard NVMe SSD otherwise. You can even chipset RAID up to three of them if you can find the appropriate Z170 board, but again, no caching.

That could perhaps have been made clearer, then.
Every piece of news coverage of Optane I've seen has warned that Optane needs the latest Intel platform, without mentioning that this is only the case if you want to use it with Intel's caching software.