Pernixdata – The solution to the storage IO bottleneck?

The recent news that the one and only Frank Denneman – VMware’s Senior Architect, Technical Marketing – is leaving the company for the relatively unknown start-up “PernixData” is bound to raise some eyebrows and get people talking. It certainly caused me to take a closer look at the company and their product. I’ll try to summarize here what it is aiming to accomplish (to the best of my knowledge), and why it is so exciting for us virtualization geeks.

Arguably one of the most challenging pieces to get right in any virtualization architecture is shared storage. You primarily need to plan for two things: capacity and performance. Capacity is relatively easy: you just need enough space to put stuff. Performance is not so simple; many things come into play. You have to be keenly aware of several factors, including the anticipated read/write workload profile, the limitations of your chosen storage protocol (iSCSI, FCoE, FC, NFS), the hypervisor-level features you will be employing, the overall performance profile of the array, and of course the spindle speed and RAID configuration on the back-end.

All of these things need to come together in a way that ensures your design will meet the functional requirements for the storage environment. It’s never an “easy” answer, and there is no “one-size-fits-all” solution (regardless of what your storage vendor may tell you).

Not too long ago, designing for performance was fairly straightforward. Given a projected read/write profile, a storage requirement, and a projected growth rate, I could compute the necessary disk type, disk count, RAID type, and protocol to support that workload with a fairly high degree of accuracy. Nowadays virtually all storage vendors front-load their volumes with an ever-increasing amount of read and write cache in the form of both RAM and SSDs. This is a good thing: it greatly increases the IO capacity of the volume. But it certainly muddies the water for an architect. It is no longer easy to tell a customer exactly how many read and write IO operations the datastore will be able to handle.
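That old back-of-the-envelope computation can be sketched in a few lines. This is a generic illustration using the well-known RAID write-penalty rule of thumb (each front-end write costs roughly 4 back-end IOs on RAID 5, 6 on RAID 6, 2 on RAID 10); the function name and the example numbers are mine, not from any vendor sizing tool.

```python
def usable_iops(disks, iops_per_disk, raid_penalty, write_pct):
    """Rough front-end IOPS a RAID group can sustain.

    Reads cost one back-end IO; each write costs `raid_penalty`
    back-end IOs (RAID 5 ~ 4, RAID 6 ~ 6, RAID 10 ~ 2).
    """
    raw = disks * iops_per_disk                       # aggregate spindle IOPS
    return raw / ((1 - write_pct) + write_pct * raid_penalty)

# e.g. 16 x 15k-rpm disks (~180 IOPS each) in RAID 5 with a 30% write mix:
# usable_iops(16, 180, 4, 0.30) -> roughly 1,516 IOPS
```

Note how the same sixteen spindles deliver very different front-end numbers as the write percentage or RAID level changes; that sensitivity is exactly what made the read/write profile so important to pin down.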

The problem is that while cache performance is of course astronomical, all of those write IOs have to be evacuated to disk at some point. Cache-heavy filers are banking on the idea that almost all workloads go through periods of peaks and valleys, and they work great as long as you have more “valley” time (to flush the write operations to disk) than peak time. But if you manage to fill up that cache, you will be restricted to the speed of the supporting RAID back-end. And that will be a pretty substantial speed bump. Like going from a Ferrari to a Prius, or maybe from iOS to Android (zing!)
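To see why the “valley” time matters, here is a rough calculation of how long a write burst can outrun the back-end before the cache is exhausted. The numbers are invented purely for illustration:

```python
def seconds_until_cache_full(cache_gib, inflow_mib_s, drain_mib_s):
    """How long a write burst can exceed the back-end drain rate
    before the cache fills and latency drops to spindle speed."""
    if inflow_mib_s <= drain_mib_s:
        return float("inf")  # the back-end keeps up; the cache never fills
    return cache_gib * 1024 / (inflow_mib_s - drain_mib_s)

# e.g. a 64 GiB write cache absorbing an 800 MiB/s burst while the
# RAID back-end drains 300 MiB/s: the cache is full in about 131 s,
# and after that the workload runs at raw back-end speed.
```

In other words, the cache buys you a fixed-size buffer of peak time, not unlimited performance, which is why sustained-write workloads are where cache-heavy designs get uncomfortable.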

When you are designing an infrastructure around a cache-heavy filer, the conversation becomes less about what the actual IO capacity will be and more about what the theoretical IO capacity could be. As architects, we despise this kind of ambiguity.

Enter PernixData. Their goal is essentially to take the performance piece out of the loop and let you rely on the filer for capacity only. They do this through flash virtualization. You install a certain amount of flash at the hypervisor level, and the PernixData solution runs as a kernel-level hypervisor module that lets the cluster share that flash as a single aggregated pool. The module intercepts the VM’s IO stream and determines whether each operation needs to go to the SAN or to the local cache.
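As a rough mental model of that intercept-and-decide behavior, a write-through cache sitting in front of the array might look like the sketch below. This is purely hypothetical: the class and function names are mine, and the real product is a kernel-level module, not Python.

```python
# Hypothetical, greatly simplified sketch of the intercept logic
# described above; all names are illustrative, not PernixData's code.

class FakeArray:
    """Stand-in for the backing SAN."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.reads = 0
    def read(self, addr):
        self.reads += 1          # count trips to the array
        return self.blocks[addr]
    def write(self, addr, data):
        self.blocks[addr] = data

local_flash = {}  # host-local flash cache: block address -> data

def read(addr, san):
    if addr not in local_flash:          # miss: go to the array once...
        local_flash[addr] = san.read(addr)
    return local_flash[addr]             # ...then serve from flash

def write(addr, data, san):
    local_flash[addr] = data             # stage the block in flash
    san.write(addr, data)                # write-through to the array

san = FakeArray({0: b"boot"})
read(0, san)
read(0, san)
# san.reads is 1: the second read was served entirely from local flash
```

The sketch is write-through (every write still lands on the array synchronously), which is the conservative mode; the interesting engineering in a real product is making write-back safe, which is where the replication discussed next comes in.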

This is huge because it allows you to simply scale out the cluster to increase the amount of cache capacity. It also means you do not have to replace anything in your existing environment. You simply add the flash and kernel modules to your hosts and bada-bing you’ve just astronomically increased the IO capacity of your datastores.

The application provides its “clustered” capability by using the vMotion network to replicate the cached blocks. When a VM is migrated to another host, it no longer has access to its locally cached data footprint. But instead of going back to the storage array for those blocks, the solution pulls them over from the previous host (assuming it is still up) and populates the local cache on the new host. This is obviously a much more efficient way to “warm up” the local cache than going all the way back to the SAN, which has more network hops in the path (assuming IP-based storage). When pulling blocks from a host on the same switch fabric/stack (especially when running at 10Gb), you are probably talking microseconds in terms of latency.
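The miss path described above (previous host first, SAN as the fallback) can be sketched as follows. Again, this is a hypothetical illustration with invented names, not PernixData’s actual implementation:

```python
# Hypothetical sketch of post-vMotion cache warming; names are
# illustrative only, not the product's real interfaces.

class PeerHost:
    """The host the VM just left, reachable over the vMotion network."""
    def __init__(self, flash):
        self.flash = flash
    def has(self, addr):
        return addr in self.flash
    def fetch(self, addr):
        return self.flash[addr]

local_flash = {}  # the new host's flash cache, cold after the migration

def read_after_vmotion(addr, peer, san_read):
    data = local_flash.get(addr)
    if data is not None:
        return data                  # already warmed locally
    if peer is not None and peer.has(addr):
        data = peer.fetch(addr)      # one hop across the switch fabric
    else:
        data = san_read(addr)        # peer down or block absent: hit the array
    local_flash[addr] = data         # warm the new host's cache
    return data

peer = PeerHost({42: b"hot block"})
blk = read_after_vmotion(42, peer, lambda a: b"from SAN")
# blk came from the peer host, not the SAN, and 42 is now cached locally
```

The key design point is that the peer fetch is an optimization, not a dependency: if the previous host is gone, every miss simply degrades to an ordinary SAN read.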

The solution currently works only with the VMware hypervisor, though Microsoft’s Hyper-V and KVM are in the plan as well. It also supports only block-based protocols; NFS is on the roadmap, but not yet supported. The concept is very exciting. We’ll have to see if they can pull it off as advertised.

The company certainly doesn’t suffer from a lack of experience or brain power. Not only do they now have Frank Denneman on hand, but the company was co-founded by Satyam Vaghani, who has more than 50 patents to his name. Oh yeah, he also wrote VMFS.

I highly recommend you take the time to view the videos from the PernixData SFD3 Presentation for a full demo:

PernixData Introduction:

Product Overview:

Product Demonstration:

Technology Deep Dive:

As you can clearly see, this product has the potential to be a game changer.