NVMe

Roughly 6-7 years ago (around 2012), flash storage became affordable as a performance tier, at least for the companies I was visiting. It was the typical “flash tier” story: buy 1-2% of flash capacity to speed everything up. All-flash storage systems were still far in the future for them. Such systems existed, and they were incredibly fast, but they also drove the €/GB price up, out of these companies’ reach.

However, in the background you could already hear the drums: it is going to be an all-flash future! Not just for performance, but also for capacity/archive storage. In fact, one of those people beating that drum was my colleague Rob. I recall vividly our “not yet!”-discussions…

And it makes sense. Solid-state drives are:

More reliable: there are no moving parts in SSDs, and media failures are easier to correct with software/design.

More power-efficient at rest: there is no little motor to keep platters spinning 24/7.

Faster: the number of heads and the rotational speed of the platters limit a hard drive’s performance. Not so with flash!

They are still quite expensive, looking at €/TB. Fortunately, cost is coming down too. Over the last year or two, all-flash arrays have taken flight in general-purpose workloads. Personally, I have not installed a traditional tiered SAN storage system in over a year. Hyper-converged infrastructure: same story, all flash. The development of newer, cheaper types of QLC flash only helps close the gap in €/GB between HDD and SSD. But there is still a 20x gap. And one company we met at Storage Field Day 18 has a pretty solid plan to bridge that gap: VAST Data.

It is a fact of IT life: hardware becomes faster and more powerful with every new generation on the market. That absolutely applies to CPUs. A few weeks ago at Intel’s Data Centric Innovation Day in San Francisco, Intel presented their new Intel Xeon Scalable processors. These beasts now scale up to 56 cores per socket, with up to 8 sockets per system/motherboard. This incredible amount of compute power enables applications to “do things”, whether it’s analytics, machine learning, or running cloud applications.

One thing all applications have in common is that they don’t want to wait for data. As soon as your %iowait goes up, you are wasting your precious and expensive compute power because the storage subsystem is not fast enough. Fortunately, WekaIO wants to make sure this will not be the case for your applications.
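For readers who want to see where that %iowait number comes from on Linux, here is a minimal Python sketch. It assumes the standard /proc/stat layout, where the aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq and steal counters in that order. The function names and the two sample lines are made up for illustration; real monitoring tools do more smoothing and per-core accounting than this.

```python
import time


def parse_cpu_line(line):
    """Return the counters from an aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' line samples."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    deltas = [n - o for o, n in zip(old, new)]
    # Use only the first eight counters (user .. steal); guest time is
    # already included in user time, so adding it would count it twice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_iowait(interval=1.0):
    """Measure %iowait on the live system over `interval` seconds."""
    before = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(before, read_cpu_line())


# Made-up sample lines, so the arithmetic can be followed by hand.
SAMPLE_BEFORE = "cpu 100 0 50 800 50 0 0 0 0 0"
SAMPLE_AFTER = "cpu 200 0 100 1500 200 0 0 0 0 0"
print(f"iowait: {iowait_percent(SAMPLE_BEFORE, SAMPLE_AFTER):.1f}%")
```

On a Linux box, calling sample_iowait() should give a figure in the same ballpark as the %iowait column in tools like iostat or top, although those tools may average over different intervals.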

A few weeks ago I received a 1TB Western Digital Black SN750 M.2 SSD, boasting an impressive 3470 MB/s read speed on the packaging. I already had a SATA SSD installed in my gaming/photo editing PC. Nevertheless, those specs got me to pick up a screwdriver and install the new M.2 SSD. The physical installation is dead simple: remove graphics card, install M.2 SSD, reinstall graphics card. I wasn’t really looking forward to a full reinstallation of Windows 10 though. There are just too many applications, settings and licenses on that system that I didn’t want to recreate or re-enter. Instead, I wanted to clone Windows 10 from the SATA SSD to the M.2 SSD.

After a little bit of research, I ended up with Macrium Reflect, which is freeware disk cloning software. Long story short: I cloned the old SSD to the M.2 SSD, rebooted from the M.2 SSD, and… was greeted with a variety of errors. The main recurring error was Inaccessible Boot Device; however, in my troubleshooting attempts I saw many more errors.

The first SSDs in our storage arrays were advertised with 2500-3500 IOps per drive. That is much quicker than spinning drives, considering the recommended 140 IOps for a 10k SAS drive. But it was in fact still easy to overload a set of SSDs and reach their max throughput, especially when they were used in an (undersized) caching tier.
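To put those per-drive figures side by side, here is a quick back-of-the-envelope sketch in Python. The IOps numbers are the ones quoted above; the function name and the four-drive cache tier are purely illustrative assumptions, not taken from any real array.

```python
# Per-drive figures quoted in the text above.
SSD_IOPS_LOW = 2500
SSD_IOPS_HIGH = 3500
SAS_10K_IOPS = 140


def drives_needed(target_iops, per_drive_iops):
    """Smallest number of drives whose combined IOps reach target_iops."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-target_iops // per_drive_iops)


# How many 10k SAS drives does it take to match a single early SSD?
sas_per_ssd_low = drives_needed(SSD_IOPS_LOW, SAS_10K_IOPS)
sas_per_ssd_high = drives_needed(SSD_IOPS_HIGH, SAS_10K_IOPS)

# Hypothetical (assumed) cache tier of four SSDs, rated at the low end.
CACHE_TIER_SSDS = 4
cache_tier_ceiling = CACHE_TIER_SSDS * SSD_IOPS_LOW

print(f"One SSD ~ {sas_per_ssd_low} to {sas_per_ssd_high} 10k SAS drives")
print(f"Four-SSD cache tier ceiling: {cache_tier_ceiling} IOps")
```

Ignoring RAID overhead and read/write mix, a single SSD stood in for roughly 18 to 25 spindles. Yet a small cache tier sitting in front of a busy array still had a hard ceiling, and that ceiling was easy to hit.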

A year or so later, when you started adding more flash to a system, the collective “Oomph!” of the flash drives would overload other components in the storage system. Systems were designed based on spinning media, so with the suddenly faster media, buses and CPUs were hammered and couldn’t keep up.

Cue all sorts of creative ways to avoid this bottleneck: faster CPUs, upgrades from FC to multi-lane SAS. Or bigger architectural changes, such as offloading to IO caching cards in the servers themselves (e.g. Fusion-io cards), scale-out systems, etc.

Dr. J. Metz talked with us about NVMe at Storage Field Day 16 in Boston. NVMe is rapidly becoming one of the new hypes in the storage infrastructure market. A few years ago, everything was cloud. Vendors now go out of their way to mention their array contains NVMe storage, or is at the very least ready for it. So should you care? And if so, why?

SNIA’s mission is to lead the storage industry worldwide in developing and promoting vendor-neutral architectures, standards and educational services that facilitate the efficient management, movement, and security of information. They do that in a number of ways: standards development and adoption for one, but also through interoperability testing (a.k.a. plugfest). They aim to help in technology acceleration and promotion: solving current problems with new technologies. So NVMe-oF fits this mission well: it’s a relatively new technology, and it can solve some of the queuing problems we’re seeing in storage nowadays. Let’s dive in!

Excelero Storage launched their NVMesh product back in March 2017 at Storage Field Day 12. NVMesh is a software-defined storage solution using commodity servers and NVMe devices. Using NVMesh and the Excelero RDDA protocol, we saw some mind-blowing performance numbers, both in raw IOps and in latency, while keeping hardware and licensing costs low.

Once upon a time there was a data center filled with racks of physical servers. Thanks to hypervisors such as VMware ESX it was possible to virtualize these systems and run them as virtual machines, using less hardware. This had a lot of advantages in terms of compute efficiency, ease of management and deployment/DR agility.

To enable many of the hypervisor features such as VMotion, HA and DRS, the data of the virtual machine had to be located on a shared storage system. This had an extra benefit: it’s easier to hand out pieces of a big pool of shared storage than to predict capacity requirements for hundreds of individual servers. Some servers might need a lot of capacity (file servers), while others might need just enough for an OS and maybe a web server application. This meant that the move to centralized storage was also beneficial from a capacity allocation perspective.