First, I'd like to observe that VSAN is a very interesting idea, but
based on the content of this book, it's still in its early days. In
fact, I'd even go as far as to say that things like VSAN just might be
the barbarian at the gate for certain storage use cases, as they aim to
get virtual machine storage to be highly integrated with the virtual
machine infrastructure.

Yet, as I've said, VSAN is very young. My read of the book
is that while the promise of VSAN in small and medium-sized deployments
is there, the simplicity isn't yet. Nor should anyone expect it to
be. Storage projects have a very high bar to reach, and reaching that
bar takes a long time, especially when your goal is to run on a variety of
general-purpose hardware.

What really got my attention, though, was how the authors laid out the need for VSAN in the first place. In the introduction, they say:

When talking about virtualization and
the underlying infrastructure that it runs on, one component that
always comes up in the conversation is storage. The reason is fairly
simple: In many environments, storage is a pain point. Although the
storage landscape has changed with the introduction of flash
technologies that mitigate many of the traditional storage issues, many
organizations have not yet adopted these new architectures and are still
running into the same challenges.

Hoo boy! That's quite a statement relative to the current
market-leading storage offerings. And yet, readers of this blog already
know it's true. Virtualization masks storage bottlenecks, and therefore
the easiest way to solve for this is to serve I/O from faster media. It
removes some of the need to know what the hypervisor is doing. And
while the traditional storage leaders have each bolted some flash into
their systems, generally it's been via relatively clunky (and expensive)
caching algorithms that don't solve the challenges all that well. The
best example that pops to mind is the new VNX8000, for which EMC took
the unusual (for them) step of publishing a SPECsfs2008 result. It's a fantastic result, but they built the entire back end with flash drives (over 500 of them!) to get there. Makes you think that maybe their caching isn't all that great yet.

Managing VSAN continues:

The majority of these problems stem from
the same fundamental problem: legacy architecture. The reason is that
most storage platform architectures were developed long before
virtualization existed, and virtualization changed the way these shared
storage systems were used.

In a way, you could say that virtualization forced the storage industry to look for new ways of building storage systems.

I'm smiling. One of my oft-repeated statements about Oracle ZFS
Storage is that the workloads are now coming to us. ZFS was
architected to serve the majority of I/O from memory (you might even
call it in-memory storage). We back that memory with a large flash
cache. This leapfrogs the all-flash band-aid altogether. And in a
sense, the VSAN approach concurs with this thought.

Each disk group must include at least one flash drive and at least
one conventional disk. Moreover, the book explains that you need to be
careful to provision such that the flash drive is big and fast enough to
deliver most of the read I/O, because ordinary disks deliver 80-175
IOPS each, while flash drives deliver 5,000-30,000 IOPS each.
This makes cache misses VERY expensive. So, the statement here is to
make sure that you have enough cache, and it's implied that the OS knows
what to do with it.
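
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a deliberately simple model in which every read is served either by flash or by disk, one at a time, so the blended rate is a weighted harmonic mean of the two. The function name and the sample figures (20,000 IOPS for flash and 150 IOPS for disk, both picked from inside the ranges quoted above) are my own illustrative choices, not numbers from the book.

```python
def effective_iops(hit_ratio, flash_iops, disk_iops):
    """Blended read IOPS for a cache in front of a slower disk.

    Assumption: each I/O is served serially by exactly one device, so the
    average service time is the hit-weighted mix of the two service times.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Average seconds per I/O: hits cost 1/flash_iops, misses cost 1/disk_iops.
    avg_service_time = hit_ratio / flash_iops + (1.0 - hit_ratio) / disk_iops
    return 1.0 / avg_service_time


if __name__ == "__main__":
    # Illustrative values only, chosen from within the ranges cited in the text.
    FLASH_IOPS = 20_000
    DISK_IOPS = 150
    for hit_ratio in (0.50, 0.90, 0.99, 1.00):
        blended = effective_iops(hit_ratio, FLASH_IOPS, DISK_IOPS)
        print(f"hit ratio {hit_ratio:.2f}: about {blended:,.0f} IOPS")
```

Under this model, even a few percent of misses pull the blended figure far below the flash drive's rating, which is the book's point about sizing the flash tier generously.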

We agree. And so do all of these new guys promoting all-flash arrays
for virtualization. But all-flash means that EVERYTHING has to be
stored on flash disks, including the vast majority of data currently at
rest. This scales well, but it gets pricey pretty fast. We think we've
found a better way.

What if instead you had as much as 2 TB of DRAM in your system, and
all of the hottest read data lived THERE? What if, when the data started
to cool a bit, it moved to a secondary flash cache of more than 12 TB per
system? And what if you'd been working on the algorithms to manage these
caches for the better part of a decade? The logic is the same as
what's being explained for VMware Virtual SANs: Fast media for hot data,
cheaper media for "cold storage".
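
As a toy illustration of "fast media for hot data," here is a minimal Python sketch of a two-tier read cache: a small DRAM tier, a larger flash tier that catches whatever DRAM evicts, and a backing store standing in for disk. It uses plain least-recently-used eviction as a stand-in. It is not the algorithm used in ZFS Storage or in VSAN, and the class name, method names, and tier sizes are all hypothetical.

```python
from collections import OrderedDict


class TieredReadCache:
    """Toy DRAM -> flash -> disk read path with LRU eviction (illustrative only)."""

    def __init__(self, dram_blocks, flash_blocks, backing_store):
        self.dram = OrderedDict()    # hottest blocks, smallest tier
        self.flash = OrderedDict()   # blocks that have cooled off a bit
        self.dram_blocks = dram_blocks
        self.flash_blocks = flash_blocks
        self.backing_store = backing_store  # any mapping of block_id -> data
        self.hits = {"dram": 0, "flash": 0, "disk": 0}

    def read(self, block_id):
        if block_id in self.dram:
            self.dram.move_to_end(block_id)  # mark as most recently used
            self.hits["dram"] += 1
            return self.dram[block_id]
        if block_id in self.flash:
            data = self.flash.pop(block_id)  # warm again: pull it out of flash
            self.hits["flash"] += 1
        else:
            data = self.backing_store[block_id]  # cache miss: go to disk
            self.hits["disk"] += 1
        self._promote(block_id, data)
        return data

    def _promote(self, block_id, data):
        # Newly read data always lands in DRAM.
        self.dram[block_id] = data
        if len(self.dram) > self.dram_blocks:
            # The least recently used DRAM block cools down into flash.
            cold_id, cold_data = self.dram.popitem(last=False)
            self.flash[cold_id] = cold_data
            if len(self.flash) > self.flash_blocks:
                # The coldest flash block is dropped; disk still has it.
                self.flash.popitem(last=False)


if __name__ == "__main__":
    disk = {i: f"block-{i}" for i in range(10)}
    cache = TieredReadCache(dram_blocks=2, flash_blocks=3, backing_store=disk)
    for block_id in (0, 1, 2, 0, 0):
        cache.read(block_id)
    print(cache.hits)
```

A real system has to decide much more than this (what to admit, what to prefetch, how to weigh recency against frequency), which is where years of tuning come in.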

So, bravo to VMware on the VSAN concept. And bravo to EMC for
promoting the idea that virtualized workloads need better caching. We've struggled to get customers fully engaged in our Hybrid Storage
Pool discussion, so having others out there telling the same general
story can only help. Especially since we're better at it.

Tuesday Jul 31, 2012

By 2015, archived data will exceed 150 exabytes. If printed, that would use 8 trillion trees. More data = more challenges. But how can you consolidate it? How about backing it up, recovering it, and archiving it? How can YOU manage storage growth?

Watch the video below to find out how the industry’s most advanced and cost-effective enterprise storage solution simplifies IT. Welcome to Oracle’s Optimized Storage, welcome to the future.
