If you are like most of the clients I deal with, you are starting to recognize that the storage part of your infrastructure represents a BIG opportunity for improvement in 2013 – in agility, in efficiency, and in cost. When demand (data growth) outpaces supply (the ability of hardware vendors to increase areal density and drive down costs) as dramatically as it has begun to, something has to change in the way storage infrastructure is approached in order to balance the equation again. That ‘change’ creates a perfect economic environment for vendor innovation, resulting in creative new solutions for clients. If you have been paying attention to the storage space, you’ve noticed an increased investment pace as vendors explore technical innovations and try to explain them to potential clients. One of my biggest frustrations, though, is when the industry can’t settle on terminology for describing a solution approach, leaving clients thoroughly confused and paralyzed.

Think about how long it took us to settle on the term ‘cloud’. Most folks felt like ‘cloud’ was going to help them, but it has taken quite a while for the industry at large to understand what exactly ‘cloud’ means and how to get there. Software-defined Storage (SDS) is another of those terms that holds great promise for IT managers but is suffering from a lack of definition. ESG analyst Mark Peters recently noted in an InformationWeek article that “When new things emerge we love to give them names, but pretty soon the generic name can become an impediment to understanding the specific products and values that it’s meant to describe. Worse still, we all quickly reach a point where it’s even deemed embarrassing to profess to not understand something, so we genuinely get the blind leading the blind - or at minimum some level of blandness - or a mistaken idea that everything sporting the same name is necessarily similar”. SDS is such an important shift for clients in their quest to regain balance in the storage infrastructure that we all need to settle on terminology so clients can get on with serious evaluation of the available offerings.

Recently, Ashish Nadkarni led work by the insightful storage team at IDC to create a taxonomy for what they called Software-Based Storage. The report, IDC’s Worldwide Software-Based (Software-Defined) Storage Taxonomy, 2013, is a must-read for clients who are serious about understanding the innovations taking place and how they can make use of them in their datacenters. It’s also a must-read for vendors and marketers who are trying to describe their offerings to potential clients. The report notes that “IDC believes that software-based storage will slowly but surely become a dominant part of every datacenter, either as a component of a software-defined datacenter or simply as a means to store data more efficiently and cost-effectively compared with traditional storage”. I agree – but am hopeful that settling on some definitions can help us accelerate the ‘slowly but surely’. Let’s take a quick look at IDC’s view.

Core component of a Software-defined Data Center (SDDC): I like IDC’s definition of an SDDC, “a loosely coupled set of software components that seek to virtualize and federate datacenter-wide hardware resources such as storage, compute, and network resources and eventually virtualize facilities-centric resources as well. The goal for a software-defined datacenter is to tie together these various disparate resources in the datacenter and make the datacenter available in the form of an integrated service…” Most IT managers I work with have understood the economic and agility value of shifting to virtualized compute resources and treating their server hardware as a commodity that simply provides horsepower. This is software-based compute, using tools like VMware vSphere, Microsoft Hyper-V, or KVM. SDS expands the thought (and benefit) to the storage part of the infrastructure, treating storage hardware as a commodity that provides capacity. In a conversation I had with Ashish, he noted that “Software-based data centers will fundamentally alter IT – and how it is run… Software-based storage is a core component of this metamorphosis! The time to start planning is now.”

IDC Software Defined Data Center

Defining Software-defined Storage: According to IDC’s taxonomy, there is a set of key attributes that IT managers can look for in identifying software-based storage solutions.

First, a software-based storage solution is, well, software. It is designed to run on commodity hardware and leverage commodity persistent data resources. This is in contrast to most traditional storage systems that, while they may have software microcode at their core, depend on some custom ASIC, a specialized CPU, or a controller to perform some or all of their storage functions.

Second, in IDC’s view, a software-based storage solution offers a full suite of storage services – equivalent to those of traditional hardware systems. I have had clients ask, “if all I get is equivalent function, what’s the point?” It’s a good question. Aside from the obvious cost benefit of being able to treat your storage hardware as a commodity, I might suggest that a ‘good’ software-based storage solution not only matches the functions of traditional physical systems, but also enables key capabilities across an entire datacenter’s worth of infrastructure that are simply not possible inside the confines of a physical system – things like data mobility, replication consistency, and tier management.

Third, a software-based storage solution federates physical storage capacity from multiple locations – internal disks, flash systems, other external storage systems, and soon the cloud and cloud object platforms. In my next post, I plan to discuss one of the more powerful outcomes of this type of single software-based view of storage infrastructure.
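To make the federation idea concrete, here’s a minimal sketch in Python. Everything in it – the class names, the backend types, the naive placement rule – is mine, purely illustrative, and not drawn from IDC’s report or any shipping product:

```python
from dataclasses import dataclass, field

@dataclass
class Backend:
    """A physical capacity source: internal disk, flash, an external array, or cloud."""
    name: str
    kind: str          # e.g. "internal", "flash", "external-array", "cloud-object"
    capacity_gb: int
    used_gb: int = 0

    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb

@dataclass
class FederatedPool:
    """Aggregates heterogeneous backends behind one provisioning interface."""
    backends: list[Backend] = field(default_factory=list)

    def total_free_gb(self) -> int:
        return sum(b.free_gb() for b in self.backends)

    def provision(self, size_gb: int) -> Backend:
        # Naive placement: pick the backend with the most free space.
        # A real SDS layer would weigh tier, performance, and policy.
        candidate = max(self.backends, key=lambda b: b.free_gb())
        if candidate.free_gb() < size_gb:
            raise RuntimeError("pool exhausted")
        candidate.used_gb += size_gb
        return candidate

pool = FederatedPool([
    Backend("server-internal", "internal", 2_000),
    Backend("flash-drawer", "flash", 500),
    Backend("legacy-array", "external-array", 10_000),
])
home = pool.provision(100)   # the caller never names a physical box
```

The point of the sketch is the shape of the interface: the consumer asks the pool for capacity, and the software layer decides which physical resource actually holds it.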

IDC’s taxonomy goes further to flesh out distinguishing aspects of software-based storage solutions such as how data is organized (block, file, object), where data is persistently stored (internal, DAS, SAN, NAS, cloud, object), what sort of services are offered, and how the solution is packaged and delivered. In the area of packaging, I think IDC did a good job in observing that while software-based storage is software, clients don’t always have to consume it as a download or off a stack of DVDs. Consuming SDS in an appliance form factor or as a cloud service is perfectly reasonable.
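It can help to see those distinguishing dimensions laid out as data. Here’s a tiny sketch that encodes the axes as Python enumerations – the categories are the ones just described, but the names and the example offering are mine, not IDC’s:

```python
from enum import Enum

class DataOrganization(Enum):
    BLOCK = "block"
    FILE = "file"
    OBJECT = "object"

class PersistenceLocation(Enum):
    INTERNAL = "internal"
    DAS = "das"
    SAN = "san"
    NAS = "nas"
    CLOUD = "cloud"
    OBJECT_PLATFORM = "object-platform"

class Packaging(Enum):
    DOWNLOAD = "software-download"
    APPLIANCE = "appliance"
    CLOUD_SERVICE = "cloud-service"

# One hypothetical SDS offering described along the taxonomy's dimensions:
offering = {
    "organization": {DataOrganization.BLOCK},
    "persistence": {PersistenceLocation.SAN, PersistenceLocation.INTERNAL},
    "packaging": {Packaging.DOWNLOAD, Packaging.APPLIANCE},
}
```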

IDC Software-based Storage

As I said before, I think IDC’s report is a well-done piece of work and might just be what the industry needs to settle on definitions so we can get on with transforming the way IT managers deal with storage infrastructure.

I’ve been on the topic of software-defined storage for three posts now – one with my perspective, one covering a multi-vendor round table at Storage Networking World, and now this one on an intriguing bit of research.

Earlier this year, IBM sponsored EMA research into Demystifying Cloud. The project was intended to collect lessons learned from organizations of all sizes that had completed at least the first stage of their initial private cloud deployment, and then use that data to provide guidance to organizations considering the purchase of cloud technologies. Along the way, EMA discovered what most folks would not have predicted: the critical role of storage for companies of any size and vertical when planning and implementing a private cloud.

EMA Senior Analyst Torsten Volk did a masterful job conducting the primary research and then distilling his findings into clear and understandable thoughts. The full report, linked above, is a highly recommended read. To whet your appetite, take a look at a storage-focused excerpt.

The vast majority of folks Torsten spoke with in conducting his research had been directly involved in the private cloud decision process – influencers, evaluators, and the decision makers, both technical and financial. I got a kick out of Torsten’s description of the ‘cloud’ that these respondents are hoping to achieve in their projects – “…transition from a traditional infrastructure and resource-oriented approach to enterprise IT—let’s call it IT-as-a-Nuisance or Cost Center if you will—toward a business service focused definition of IT.”

The top two strategic goals identified for private cloud were probably expected – “less application downtime” and “shortened resource provisioning times”. It was numbers 3 and 4 on the list that grabbed my attention – “Easy storage provisioning” and “storage tiering”. Let’s think about those for a moment…

One topic that gets discussed a lot in storage circles is the negative effect that widespread use of server hypervisors has on the storage infrastructure – and if anything, cloud is characterized by widespread use of server hypervisors. Server hypervisors have been blamed for unexpected bumps in storage capacity growth and are notorious for creating very dense, non-sequential I/O patterns that cause performance issues in traditional storage infrastructure. EMA’s research noted that as organizations progressed further into their private cloud deployments, the use of flash storage increased, and that a remarkably large number of organizations had added capacity to deal with performance issues. They also noted, however, that “Simply throwing more spindles or Solid State Drives (SSDs) at a storage performance problem constitutes significant waste in situations where there is still plenty of storage capacity.” Enter software-defined storage and storage tiering. For some expanded thoughts, see the Flash storage ‘everywhere’ paragraph in my last post. My view is that the rise of private cloud, the widespread use of server hypervisors, the increase in flash deployments, and the need for a software-defined storage layer to manage storage tiering are all interconnected. IBM’s recent announcement of a $1 billion investment in Flash and the ensuing #flashahead traffic on Twitter are a testament that this interconnection has created a real technical and business need that vendors are willing to sink significant effort into solving.
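To see why “throwing SSDs at it” is wasteful, a back-of-envelope calculation helps. Every number below is a made-up assumption for the arithmetic – not a figure from EMA’s research or anyone’s price list:

```python
# Hypothetical numbers purely for illustration -- not from EMA's research.
capacity_needed_gb = 50_000          # total data
hot_fraction = 0.10                  # assumed share of data driving most of the I/O
hdd_cost_per_gb = 0.05               # assumed $/GB, spinning disk
ssd_cost_per_gb = 0.50               # assumed $/GB, flash

# Option A: "throw SSDs at it" -- put everything on flash.
all_flash = capacity_needed_gb * ssd_cost_per_gb

# Option B: tier -- keep hot blocks on flash, the rest on disk.
tiered = (capacity_needed_gb * hot_fraction * ssd_cost_per_gb
          + capacity_needed_gb * (1 - hot_fraction) * hdd_cost_per_gb)

print(f"all-flash: ${all_flash:,.0f}  tiered: ${tiered:,.0f}")
# all-flash: $25,000  tiered: $4,750 -- under these assumptions
```

Under these (invented) assumptions, tiering delivers the same hot-data performance for roughly a fifth of the spend, which is exactly the waste EMA is pointing at.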

On the provisioning side, it’s always been interesting to me how much attention is paid to the rapid deployment of workloads on virtual server farms, and until now how little has been paid to the rapid provisioning of the storage that houses the data those workloads require. What the EMA research uncovered is that organizations that are well along in their private cloud deployments have had the ah-ha moment for storage. A majority now see provisioning and management of storage as a bottleneck, leading to storage automation being the highest-ranked integration point for private cloud – ahead of things like OS Image Management, Server Automation, Network Automation, Application Performance Management, and Single Sign-On. To get a workload rapidly deployed, it needs compute resources (virtual machines), storage resources (virtual storage), and network resources (virtual networks). Clouds don’t deal in physical infrastructure; it’s entirely too rigid – or as Torsten called it, IT-as-a-Nuisance. For all the reasons discussed in my previous two posts, a software-defined storage layer is required to complete the datacenter transition to cloud. My view is corroborated by EMA’s respondents: every one of them that reported adopting a hardware-independent storage hypervisor stated that their storage hypervisor software was “important,” “very important,” or “critical” to their cloud deployment.
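To illustrate what storage as a first-class citizen of provisioning looks like, here’s a hypothetical orchestration sketch. The class and method names are invented for this post – it’s a shape, not a real product API:

```python
from dataclasses import dataclass

@dataclass
class WorkloadRequest:
    """Everything one workload needs, requested in a single pass."""
    vm_count: int
    storage_gb: int
    storage_tier: str   # e.g. "gold" (flash-backed) or "silver"
    network: str        # named virtual network

class CloudOrchestrator:
    """Stand-in for the automation layer; each step would call a real API."""

    def provision_compute(self, count: int) -> list[str]:
        return [f"vm-{i}" for i in range(count)]

    def provision_storage(self, size_gb: int, tier: str) -> str:
        # With an SDS layer this is one call against a virtual pool,
        # not a ticket to the storage team.
        return f"vvol-{tier}-{size_gb}gb"

    def attach_network(self, vms: list[str], net: str) -> None:
        print(f"attached {vms} to {net}")

    def deploy(self, req: WorkloadRequest) -> None:
        vms = self.provision_compute(req.vm_count)
        vol = self.provision_storage(req.storage_gb, req.storage_tier)
        self.attach_network(vms, req.network)
        print(f"workload up: {vms} backed by {vol}")

CloudOrchestrator().deploy(WorkloadRequest(2, 500, "gold", "app-net"))
```

The design point is that storage provisioning happens inside the same automated pass as compute and network – not as a manual follow-up that becomes the bottleneck EMA’s respondents described.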

Join the conversation! Share your point of view here. Follow me on Twitter @RonRiffe and the industry conversation under #SoftwareDefinedStorage

I’m just returning from the SNW Spring conference in Orlando. It seemed sparsely attended, but my 5-foot-tall wife of almost 28 years has always told me that dynamite comes in small packages (I believe her!).

As I noted in my last post, I was in Orlando to participate in a round table discussion on storage hypervisors hosted by ESG Senior Analyst Mark Peters. I was joined by Claus Mikkelsen, Chief Scientist at Hitachi Data Systems; Mark Davis, CEO of Virsto (now a VMware company); and George Teixeira, CEO of DataCore. Conspicuously missing from the conversation, both at this SNW and at a similar round table held during the SNW Fall 2012 conference, was any representation from EMC. More on that in a moment.

The session this time drew a crowd roughly three times the size of the Fall 2012 installment – a completely full room. And the level of audience participation in questioning the panel members further demonstrated just how much the industry conversation is accelerating. I was pleased to see that most of the discussion focused on use cases for what was interchangeably referred to as storage virtualization, storage hypervisors, and software-defined storage. Following are a few of the use cases that were explored.

Data migration was noted as an early and enduring use case for software-defined storage. Today’s physical disk arrays are capable of housing many terabytes of data, often from MANY simultaneous business applications. When one of these physical disk arrays has reached the end of its useful life (the lease is about to terminate), the process of emptying the data from that old disk array onto a newer, more modern disk array can be time-consuming. The difficult part isn’t the volume of data; it’s the number of application disruptions that have to be scheduled to make the data available for moving. And if you happen to be switching physical disk array vendors, there is related effort on each of the host machines accessing the data to ensure the correct drivers are installed. Clients we have worked with tell us the process can take months. That’s not only hard on the storage administration team, but it’s also wasteful because a) you have to bring in the new target array months ahead of time and b) both it and the source array remain only partially used during those months as the data is migrated. The economic value of solving this data migration problem is an early use case that has fueled solutions like IBM SAN Volume Controller (SVC), Hitachi Virtual Storage Platform, and DataCore SANsymphony-V. Each of these is designed to provide the basic mechanics of storage virtualization and mobility across almost any physical disk array you might choose – all without disruption of any kind to the business applications that are accessing the data.
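For readers who want a feel for those mechanics, here’s a heavily simplified sketch of transparent migration under a virtualization layer. It’s illustrative only – not how SVC, VSP, or SANsymphony-V are actually implemented – but it shows why hosts never notice the move: they address a virtual volume while the layer repoints extent mappings underneath:

```python
class VirtualVolume:
    """Hosts address this; the mapping below points at physical extents."""
    def __init__(self, name: str, extents: list[int], array: str):
        self.name = name
        # virtual extent -> (physical array, physical extent)
        self.extent_map = {i: (array, i) for i in extents}

    def read(self, extent: int) -> str:
        array, phys = self.extent_map[extent]
        return f"data from {array}:{phys}"

def migrate(vol: VirtualVolume, new_array: str) -> None:
    """Move extents one at a time; hosts see no change in the volume."""
    for vext, (old_array, phys) in list(vol.extent_map.items()):
        # 1. copy the extent's data from old_array to new_array (elided)
        # 2. atomically repoint the map entry; I/O is quiesced per-extent,
        #    never per-application
        vol.extent_map[vext] = (new_array, vext)
    # the old array is now empty and can be decommissioned

vol = VirtualVolume("erp-data", extents=list(range(4)), array="old-array")
migrate(vol, "new-array")
print(vol.read(2))   # "data from new-array:2" -- same volume, new hardware
```

Because applications only ever see the virtual volume, the months of scheduled disruptions collapse into a background copy.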

A quick side comment. While the data migration use case carries a strong economic benefit for IT managers (transparent migration from old to new disk arrays), it can just as easily be used to migrate from old to new disk array ‘vendors’. For the IT manager, this has the potential for even greater economic benefit because it creates the very real threat of competition among physical disk array vendors, driving cost down and service up. But for an incumbent disk array vendor, there’s not a lot of built-in motivation to introduce their client to such a technology. At SNW this week, it was suggested that this dynamic may be responsible for the relatively low awareness and deployment of storage virtualization technologies. Incumbent vendors are happy to keep their clients in the dark about software-defined storage and data migration use cases. Interestingly, almost 10 years after these technologies were first introduced, EMC (whose market share makes them the most frequent incumbent physical disk array vendor) is still only talking about this topic in the shadows of ‘small NDA sessions’. See Chuck’s Blog from earlier this week.

Flash storage ‘everywhere’ was identified as a more recent, and perhaps more powerful, use case. SNW drew a strong contingent of storage industry analysts from firms like IDC, ESG, Evaluator Group, Silverton Consulting and Mesabi Group. A consistent theme from the analysts I spoke with, as well as from the panel discussion, is that data- and performance-hungry workloads are driving an unusually rapid adoption of flash storage. Early deployments were as simple as adding a new ‘flash’ disk type into existing physical disk arrays, but now flash is showing up ‘everywhere’ in the data path, from the server on down. The frontier now is the efficient management of this relatively expensive real estate, whether it is deployed in disk arrays, in purpose-built drawers, or in servers. Flash is simply too expensive to park whole storage volumes on, because a lot of what gets stored isn’t frequently accessed and would be better stored on something slower and less expensive. This is where the basic mechanics of storage virtualization and mobility from the data migration use case come in. At IBM, we’ve evolved the original SVC capabilities to couple the basic mechanics with analytics and automation that guide how and when to employ the mechanics most efficiently. The evolved offering, SmartCloud Virtual Storage Center, was introduced last year.

Consider this scenario. You are an IT manager who has invested in two tiers of physical disk arrays. You have also added a third disk technology – a purpose-built flash drawer (perhaps an IBM TMS RamSan). You have gathered all that physical capacity and put it under the management of a software-defined storage layer like the SmartCloud Virtual Storage Center. All of your application data is stored in virtual volumes that SmartCloud Virtual Storage Center can move at will across any of the physical disk arrays or flash storage. Knowing which volumes to move, when, and where to move them is where SmartCloud Virtual Storage Center excels.

Here’s an example. Suppose there is a particular database-driven workload that is only active during month-end processing. The analytics engine in SmartCloud Virtual Storage Center can discover this and create a pattern of sorts that has this volume living in a hybrid pool of tier-1 and flash storage during month end and on tier-2 storage the rest of the month. In preparation for month end, the volume can be transparently staged into the hybrid pool (we call it an EasyTier pool), at which point more real-time analytics take over, identifying which blocks inside the database are being accessed most. Only these are actually staged into flash, leaving the less-utilized blocks on tier-1 spinning disks. Do you see the efficiency? The icing on the cake comes when all this data is compressed in real time by the storage hypervisor. This kind of intelligent analytics – directing the mechanics of mobility from a software-defined layer – is critical to economically deploying flash.
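If you’re curious what “analytics directing the mechanics” might look like in miniature, here’s a toy sketch of heat-based tiering. To be clear, this is not the EasyTier algorithm – just an illustration of the promote/demote idea, with invented names:

```python
from collections import Counter

class TieringEngine:
    """Count accesses per extent; promote the hottest into a flash budget."""

    def __init__(self, flash_extents: int):
        self.flash_budget = flash_extents
        self.heat = Counter()          # extent id -> access count this window
        self.in_flash = set()

    def record_io(self, extent: int) -> None:
        self.heat[extent] += 1

    def rebalance(self) -> None:
        # Keep only the hottest extents on flash; everything else stays
        # on (or is demoted to) spinning disk.
        hottest = {e for e, _ in self.heat.most_common(self.flash_budget)}
        promote = hottest - self.in_flash
        demote = self.in_flash - hottest
        self.in_flash = hottest
        print(f"promote {sorted(promote)}, demote {sorted(demote)}")
        self.heat.clear()              # start a fresh measurement window

engine = TieringEngine(flash_extents=2)
for ext in [7, 7, 7, 3, 3, 9, 1]:      # month-end hammering extents 7 and 3
    engine.record_io(ext)
engine.rebalance()                      # promote [3, 7], demote []
```

The real systems are vastly more sophisticated, but the principle is the same: measure block-level heat, then move only what earns its place on the expensive tier.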

Commoditization of physical disk capacity. Yikes!!! One of the more insightful observations offered by panel members, including VMware, was that if you follow the intent of a software-defined storage layer to its conclusion, it leads to a commoditization of physical disk capacity prices. From a client perspective, this is welcome news – and really, it’s economically required to keep storage viable. Think about it: data is already growing at a faster pace than disk vendors’ ability to improve areal density (the primary driver behind reduced cost), and the rate of data growth is only increasing. Intelligence, analytics, efficiency, mobility… in a software-defined storage layer will increase in value, freeing IT managers to shift, en masse, toward much lower-cost storage capacity.
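One more back-of-envelope illustration of why the economics force this outcome. The growth and cost-decline rates below are assumptions I made up for the arithmetic, not industry data:

```python
# Hypothetical rates purely to illustrate the imbalance.
data_growth = 0.40        # assumed: 40% more data each year
cost_decline = 0.20       # assumed: 20% cheaper $/GB each year

spend = 1.0               # storage budget, normalized to year 0
for year in range(1, 6):
    spend *= (1 + data_growth) * (1 - cost_decline)
    print(f"year {year}: {spend:.2f}x the original budget")
# year 5: ~1.76x -- capacity spend compounds upward unless something
# (an SDS layer over commodity capacity) changes the equation
```

When growth outruns the cost curve, the budget compounds upward every year; the only remaining levers are the software ones – intelligence, efficiency, mobility – over cheaper and cheaper hardware.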

Another quick side comment. With EMC still lurking in the shadows of this conversation and VMware agreeing with the ultimate end state, it seems the two still have some internal issues to resolve. I don’t fault them. It’s a sobering thought for any vendor with a substantial business in physical disk capacity. But speaking for the two disk vendors represented on this week’s SNW panel, we are actively engaged in helping clients achieve the necessary end goal.