The world of storage is changing – fast. When I started consulting, I installed clusters and supercomputers. My specialty was IBM’s SP supercomputer (like Deep Blue, the one that played the Russian chess champion Garry Kasparov). My wife asked if they wore capes.

The supercomputer market gave way to grid-based systems over a decade ago. What used to cost millions of dollars was swept away by inexpensive commodity Intel-based servers, usually running Linux and grid software. Supercomputers started to become extinct. What used to be a scale-up model became a modular, massively parallel model, which in turn became a highly distributed model.

When you deploy an Internet-facing app today, you are not building one large web server, but many web servers, backed by application servers, backed by database servers. A load balancer distributes the incoming traffic amongst the various servers to maximize throughput and minimize response time. This again is a scale-out, not a scale-up, model.
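To make the scale-out idea concrete, here is a minimal sketch of a load balancer that sends each request to the server with the fewest open connections. The class name, the server names and the least-connections policy are assumptions chosen for illustration; real load balancers offer several policies and are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class LeastConnectionsBalancer:
    """Toy load balancer: route each request to the least-busy server."""

    # Hypothetical bookkeeping: server name -> number of open connections.
    active: dict = field(default_factory=dict)

    def add_server(self, name: str) -> None:
        # Scaling out is just registering one more server.
        self.active.setdefault(name, 0)

    def acquire(self) -> str:
        # Pick the server with the fewest open connections.
        # Ties go to the server that was added first.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call when a request finishes.
        self.active[name] -= 1


lb = LeastConnectionsBalancer()
for server in ("web1", "web2", "web3"):
    lb.add_server(server)

# Six requests that stay open are spread evenly over the three servers.
routed = [lb.acquire() for _ in range(6)]
```

Adding a fourth server with `lb.add_server("web4")` means the next request lands on it, because it has no open connections yet. Capacity grows by adding a node, not by replacing the one big server.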

This is the basis of the grid (or farm, if you prefer). Many nodes are load-balanced, distributed and protected so there is no single point of failure. Server and desktop virtualization are protected N+1 and can easily scale. We repeat this model throughout the enterprise.

Storage is no different.

EMC has Avamar, Isilon, Atmos and Centera; HP has LeftHand and 3PAR; IBM has XIV and SONAS; NetApp has Cluster-Mode. There are others, like Hadoop, which distributes data across nodes with different functions. There are also products that go halfway, such as EMC VMAX and IBM SVC; these effectively partition the data amongst redundant controllers, but the data and throughput are siloed rather than fully distributed.

Why Grids

Why grids? Mainly because we need them. Imagine storage that scales with you. Instead of swapping heads to move to a larger controller, you simply add more nodes. You now have a system that scales not just at the spindle level, but also at the controller level.
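One generic way to get "just add more nodes" behavior is consistent hashing, where adding a node moves only a fraction of the data. The sketch below is an illustration under that assumption; it does not describe how any of the products named above actually place data, and `HashRing`, the node names and the block IDs are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Map a string to a position on the ring (first 8 bytes of SHA-256).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, node: str) -> None:
        # Each node owns many small arcs of the ring, which evens out load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def locate(self, block_id: str) -> str:
        # A block belongs to the first ring position after its hash,
        # wrapping around at the end of the ring.
        idx = bisect.bisect(self._points, _hash(block_id)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing()
for node in ("node1", "node2", "node3", "node4"):
    ring.add_node(node)

blocks = [f"block-{i}" for i in range(10_000)]
before = {b: ring.locate(b) for b in blocks}

ring.add_node("node5")  # scale out by one node
after = {b: ring.locate(b) for b in blocks}

moved = sum(before[b] != after[b] for b in blocks)
print(f"{moved / len(blocks):.0%} of blocks moved to the new node")
```

With four nodes growing to five, you would expect somewhere around a fifth of the blocks to move, all of them onto the new node; the rest stay where they were. That is the property that lets a grid grow without a wholesale data migration.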

As solid-state/flash disk becomes more prevalent in the data center, the bottleneck moves from the spindles to the controllers. As storage efficiencies such as compression, deduplication, thin provisioning, WAFL, etc. continue to grow in use, they put more strain on the controller and its processing power. Snapshots, mirroring, replication and NAS all add overhead. The result is that the controller, which has long been overpowered, is starting to strain.

We’ve been lucky. For most of my customers, as throughput has grown, so has capacity and spindle count. We’ve mostly been keeping pace. FlashCache, FAST Cache, auto-tiering and SSD have helped control this, but we’re getting close to the breaking point. Storage efficiency software combined with low-latency solid state is pushing controllers to the brink.

We’re growing beyond a dual-controller solution. When IBM came out with real-time compression, they stated you need a lot of free CPU on the V7000/SVC. If EMC or IBM adds dedupe, we’ll see what it does to controller loads. NetApp Cluster-Mode arrived just in time for two of my larger customers. Whatever the cause, we’re running out of bandwidth within the controllers.

This isn’t a terrible worry. We’ve been moving toward grids for some time. We have to. Without them, we’ll eventually run out of gas. Today, I can architect any solution, but I may have to partition the data across multiple controllers, multiple SVC nodes, or multiple VMAX engines. It’s not the best solution; we end up with silos of data. Where do we want to go? To a singly managed, intelligent grid: one set of data that self-balances, self-tiers and distributes itself elegantly, without external add-on tools.

This is where we’re heading. It will be a better world. We will be able to start small and grow performance and capacity like we are used to in the virtualization cluster, the web farm, the VDI farm. We’ll grow our storage the same way. We’ll be much better for it.

Disclaimers:

I am often privy to confidential and non-disclosure information; none of that should appear here. I take non-disclosure agreements (NDAs) seriously. I will never talk about any product that has not yet been announced, unless it is publicly disclosed by the vendor. My intention is to let you know how to use announced products.

I am not actively moderating comments, but reserve the right to pull comments that break any of the rules on copyright, confidentiality, NDA information or disparaging remarks.

The postings on this site are entirely my own and don’t necessarily represent my employer's positions, strategies or opinions.