To this growing list of taxonomies, I think we're going to have to add another: pre-integrated storage vs. do-it-yourself.

And, strangely, I think that there will be certain places where this is going to be popular. But most organizations will probably never consider it seriously.

Here's why ...

So, What Prompted This?

A recent announcement by Sun around "open source storage" (???), basically offering people their open-sourced ZFS running on a Sun server, presumably a Thumper.

Now, I'm not going to comment as to the true genealogy of ZFS (that's one for the lawyers), or whether Sun can be successful in this sort of business model (I'm dubious).

What I find interesting is the growing number of offerings in this relatively new category of "just add your favorite server hardware" storage: DataCore, LeftHand and probably a bunch of others I'm forgetting.

So, what's going on here?

The Siren Song Plays Again

There seems to be a basic pitch to this newer category that I see repeated:

Storage functionality? Sure, you can get a decent subset of the best that the array vendors offer via a software-only model -- I'll grant you that. And, if that subset is what you'll need for the foreseeable future, check this box and move on.

Any server hardware? Yes, sort of. Most of the vendors support qualified configurations (rather than just anything) in an effort to provide a better customer experience. Sun specifically is steering people toward its own server hardware (duh), but I guess -- theoretically -- you have a lot of options here.

Save Money?

Here's where I turn a bit skeptical.

First, let's look at hardware costs. In our business, parts are parts. Disk drives cost pretty much the same for everyone, as do the processors and RAM we all use, and so on. Server vendors and storage vendors largely draw from the exact same parts bins, so -- based on everything I've ever seen -- there's no real cost advantage for servers in like-for-like.

One problem is getting like-for-like. As an example, if you use server-based technology to implement dual redundant controllers, RAID 5 protection and the like, you'll end up with a server that -- well -- costs the same as (or usually more than!) a low-end storage array.

OK, maybe you don't need all that protection and redundancy with your storage. Fine. But, before you get too excited about this approach, do a clear-eyed cost comparison around usable, protected storage, and you may be surprised at which option turns out to be the low-cost one.
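To make that comparison concrete, here's a minimal sketch of the math in Python (every price and capacity below is an invented placeholder, not a quote from any vendor -- plug in your own numbers):

    # Hypothetical cost-per-usable-TB comparison: DIY server storage
    # vs. a low-end array. Every figure is an invented placeholder.

    def usable_tb(drives, tb_per_drive, parity_drives, hot_spares=1):
        """Usable capacity after parity/mirror overhead and spares."""
        return (drives - parity_drives - hot_spares) * tb_per_drive

    def cost_per_usable_tb(base_cost, drives, cost_per_drive,
                           tb_per_drive, parity_drives):
        total = base_cost + drives * cost_per_drive
        return total / usable_tb(drives, tb_per_drive, parity_drives)

    # DIY: two x86 servers for controller redundancy, RAID 5 (one parity drive)
    diy = cost_per_usable_tb(base_cost=2 * 6000, drives=12,
                             cost_per_drive=300, tb_per_drive=0.75,
                             parity_drives=1)

    # Low-end array: one chassis with dual controllers built in
    array = cost_per_usable_tb(base_cost=15000, drives=12,
                               cost_per_drive=300, tb_per_drive=0.75,
                               parity_drives=1)

    print(f"DIY:   ${diy:,.0f} per usable TB")
    print(f"Array: ${array:,.0f} per usable TB")

The point isn't which side wins with these particular made-up numbers; it's that the answer flips depending on controller redundancy, RAID overhead, spares and chassis cost, which is exactly why the clear-eyed comparison is worth doing.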

The other problem is scale. With most server designs, there are only so many drives that can go behind a motherboard. Fine for entry and mid-level, but if you're talking hundreds -- or thousands -- of drives, you'll end up wasting a lot of sheet metal, power supplies, RAM, processors, etc.

Fine, don't believe me -- just work the numbers for a truly large configuration, and you'll notice the effect.
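Here's a rough back-of-the-envelope sketch of that effect (the drives-per-chassis and overhead figures are invented for illustration):

    import math

    # How much non-drive hardware repeats as a server-based design scales.
    # Drives-per-chassis and overhead cost are invented for illustration.

    DRIVES_PER_SERVER = 48      # e.g. one dense 4U storage server
    OVERHEAD_PER_SERVER = 7000  # motherboard, CPUs, RAM, PSUs, sheet metal

    for total_drives in (100, 500, 1000, 2000):
        servers = math.ceil(total_drives / DRIVES_PER_SERVER)
        overhead = servers * OVERHEAD_PER_SERVER
        print(f"{total_drives:>5} drives -> {servers:>3} chassis, "
              f"${overhead:,} in repeated non-drive hardware")

A purpose-built array backend amortizes that repeated overhead across far more spindles, which is the effect you'll notice.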

Next, let's look at software costs. As an example, EMC, NetApp, HDS, IBM et al. charge for their software, either as a specific line item, or as part of the hardware cost, since the hardware isn't really usable without the software.

People like DataCore and LeftHand charge explicitly for their software, since there's no hardware in their primary models.

And Sun has decided to give the software away. OK, I'll grant you that's pretty cheap. In the world of open-source storage software, clearly software costs are much lower, aren't they?

Now, let's talk about support costs. I'm talking about things like qualification and interoperability testing (think EMC's eLab), performance and use case testing -- all of which would ideally be done before the customer gets involved. Not usually the case for open source software, is it?

And, let's not forget, we have to be very clear about customer support if and when you've got a problem. For traditional storage (e.g. EMC, HDS, IBM, NetApp, et al.) the support model is pretty clear -- sure, we could argue whose support is better, but at least the model is pretty consistent.

The software-only vendors can control what goes on with their software, but don't have as much control with the hardware they're running on. Despite everyone's best intentions, there's an opportunity for a bit of vendor crossfire between the server vendors and the storage software vendor. I see that as a "cost" that has to be accounted for.

And the open source model? I'm not sure who you're gonna call when you're having a bad storage day, or how responsive they'll be once you get someone on the phone. Sure, there are a few organizations who have very strong technical bench strength on these topics, and are willing to invest some of their cycles making stuff work, or fixing it when it's broken.

But, from where I sit, that's more the exception than the rule.

Vendor Lock-In?

I really, really struggle with this concept, I do. Here's why:

Anything I use and get comfortable with -- well, I'm "locked in" to a certain degree. If I use a lot of storage software X, well, I'm sorta locked in, aren't I? Or, if I put my servers-as-storage on a three-year lease, I'm kind of locked in, aren't I?

All storage solutions support relatively standard interfaces and protocols. Unless you use certain advanced features, it's pretty easy (data migration aside!) to move from one to the other. And, in the storage array business, customers swap vendors all the time if they feel the need.

Now, imagine I write some custom scripting or interfaces into something like ZFS -- well, I'm locked in to a certain degree, aren't I?

It just strikes me as posturing with little -- if any -- basis in reality.

A Personal Example

I had been casting around for a home storage sharing device for a few years. I fooled around a bit with various Linux combos, even did the Microsoft thing. Way too fiddly for me, and I was spending more time making things work rather than enjoying what the platform could do.

I mentioned earlier that I got one of those LifeLine-based Intel devices, plugged it in, and got on to actually using the darn thing, rather than tinkering around.

Did I end up spending more than if I got very creative with eBay and SourceForge? Of course I did.

But I had better things to do in my spare time.

So, What Does This Mean For The Storage Industry?

We're seeing a new category forming rapidly: do-it-yourself storage. Nothing wrong with that. And I can imagine a few situations where that'd be very interesting to someone.

But, at the same time, I don't think it's going to change much out there. True costs are true costs -- hardware, software and support -- no matter how you re-arrange the buckets. Take something out of one bucket, and it often ends up in another bucket. Or you end up worse off than where you started.

So, Why Am I Cringing?

Because we're probably going to hear even more nonsensical blather about this stuff in the coming months.

This particular topic is perfect for people trying to crash the party with a "new idea" (smaller vendors, certain industry pundits and curmudgeons, and so on) that really isn't a new idea at all.

If you think about it, people have been using servers as storage for more than a decade -- think Windows CIFS and NFS as simple examples. So there's nothing really new here -- customers have always had the option to press their servers into duty as shared storage devices.

Maybe they didn't like the performance, or the functionality, or the cost model, or the support model, so -- by and large -- they've been moving away from this approach for quite a while.

Which is why -- by and large -- storage arrays are so popular. They're built and designed to do a specific job, and do it well.

Comments

I think you are missing the point. I actually did a price comparison. Building your storage solution (in my case several hundred TBs) from cheap disks using x86 servers + OpenSolaris + Solaris Cluster + ZFS, where all the software is not only open source but also free, *does* make a huge difference. We actually started building our solution on EMC Symmetrix (great box) and EMC Celerra years ago, and ended up on really cheap storage + ZFS as a replacement and a way to move forward. Additionally, all the features like snapshots, cloning, end-to-end checksumming, remote replication, built-in compression, built-in cryptography, NFS, CIFS, iSCSI, ... are also free. Better - they work exactly the same regardless of what cheap storage or server we put underneath.
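For readers who haven't used it, the features Robert lists each map onto a one-line ZFS command. Here's a minimal sketch, wrapped in Python purely for illustration (pool, device and dataset names are placeholders; it assumes a Solaris-family host with ZFS and root privileges):

    import subprocess

    def zcmd(cmd):
        """Echo and run a ZFS/zpool administration command."""
        print("#", cmd)
        subprocess.run(cmd.split(), check=True)

    # Pool, device, and dataset names are placeholders throughout.
    zcmd("zpool create tank raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0")  # RAID-Z pool
    zcmd("zfs create tank/data")
    zcmd("zfs set compression=on tank/data")       # built-in compression
    zcmd("zfs set sharenfs=on tank/data")          # NFS export, no extra license
    zcmd("zfs snapshot tank/data@before-upgrade")  # point-in-time snapshot
    zcmd("zfs clone tank/data@before-upgrade tank/data-test")  # writable clone

Remote replication is a zfs send piped to a zfs recv on the other box, and the same commands work regardless of which disks sit underneath -- which is the portability he's describing.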

What ZFS brings to the market is an open-sourced and free, Google-like approach to storage - how to cheaply build reliable storage, from small to large scale installations.

Sure, especially for the SMB market, what is needed is an easy GUI built on top of Solaris + ZFS. I'm sure you will see one sooner or later.

To me, the fact that this is Sun and not HP, IBM, Microsoft or anybody else makes this interesting. I sort of doubt they know where this is going, but they seem determined to let it play out. This seems to fit their culture and skills - although the business is questionable.

My hunch is that the support end of this will choke Sun's zeal as implementations increase. Support could come from other vendors who implement the technology in their products, but only if the licensing can work.

Chuck - Here's the simple point to Open Storage that Robert is making above: Free software + Market priced disk = Huge savings

Then add Sun Service (which services the same types of customers as EMC) PLUS 3,000 storage community members for support.

Even more, developers and Web 2.0 companies now have an entire storage software stack to use - they don't have to build their own. They can focus on developing their own software on top of a solid platform.

Chuck - it depends on the solution I've deployed. Quite often I do use multi-pathing with MPxIO, which is delivered for free with Solaris (and yes, it even works with EMC storage); sometimes we are talking about Sun's Thumpers (aka the x4500).

When it comes to support - hardware support is easy, I just buy support from the vendor. Software support - well, you buy support from Sun for Solaris, which covers ZFS, MPxIO, etc., and is relatively cheap.
At the same time, yes, you do need specific skills and the ability to support it yourself to some degree, especially with bigger environments. But that is also true if you go with the traditional approach (I remember how much time it took me to come up with a viable backup approach for Symmetrix + Celerra with a lot of small files - other than replicating to another pair of Symmetrix/Celerra - and I ended up with an in-house approach, unsupported by EMC, which actually worked well).

Now you're asking about OLTP - of course, I've deployed many MySQL + Solaris x86 + cheap JBODs or arrays + ZFS setups - except for the hardware, everything else is free and works really well.

I'm not saying that approach is the best one in all cases - of course it is not. But in many cases it is, and it is a big money saver.

Well, last time I checked there was money in helping customers build better, more affordable storage systems - at least that's what Sun is betting on ;-)

And I wouldn't be so quick to discredit Web 2.0 companies. Today's upstarts can be tomorrow's key players - a recent Forrester survey also found that 1/3 of traditional companies are deploying Web 2.0 applications (in fact, we're using one).

And if you think they are not heavy IT investors, well I guess that's the whole point of open storage - they can't afford to deploy traditional storage architectures, so they need to leverage better storage economics...

Come to think of it, maybe that's a good message for traditional customers as well?

But the people I'm talking to have some serious unanswered questions around all of this, which you've declined to answer:

1 -- how is Sun's offering different from other open source storage offerings?

2 -- is there a fully delineated model regarding how support responsibilities differ in an open source approach versus a traditional approach? Do I get enterprise-class storage support, or do I get to replace failed drives using FedEx, or do I throw myself on the mercy of the "community" when I'm up a creek without a paddle? It's not clear to anyone I talk to.

3 -- What happens when your "open source" software becomes the target of an IP suit? More specifically, where can that leave customers? And, Taylor, as you know, this isn't a hypothetical situation these days.

Best of luck -- and all credit -- for trying something new over at Sun. If nothing else, you guys are very creative.

1. Our scope - we have opened the entire stack, from drivers to higher-level apps like snapshots and mirroring. We offer high-level apps like SAM (HSM) and Honeycomb (object archive), as well as the COMSTAR unified target code, which turns a server into a block storage device.

2. Our quality and functionality - read the testimonials from DigiTar and Nexenta about Solaris and ZFS as a storage platform.

Open source doesn't always require new support models, either. EMC's Centera ships with open source Linux on every node - you're not leaving them up the creek without a paddle, right?

Also don't discount support from a community - here is what DigiTar (a Linux shop, btw) said about Open Storage in their blog - "you'll also find a community around OpenSolaris that is by far the friendliest and most mature open source group of folks you've ever dealt with."

SunSpectrum customers can find support for OpenSolaris and customers can buy RTUs of open source projects if they desire traditional support - the market has worked out service for open systems and open source. It's a long subject - I'll blog more on this later, but this is a great topic and a great place to differentiate as well.

I'm glad you asked about IP suits as well. Sun indemnifies customers who use Sun open source. This is unique. Keep in mind that IP suits hit traditional software as well. What happens to EMC customers when EMC is the target of an IP lawsuit? Do you indemnify them?

"From Federal Express to Verisign, SAP and Oracle to Siebel, Veritas and BEA - from across the globe and marketplace - there is tremendous demand and support. They love that we're open sourcing Solaris, and that we'll be the first open source vendor to offer a commercial version of our product with indemnification against intellectual property lawsuits."

Since the key distinction that you're drawing here is essentially an economic one (that is, you don't seem to be making a technical argument against ZFS), what makes you think that Sun can't take these components and integrate and support them? And even if you discount Sun (for lots of good reasons, I might add), what is to stop a start-up from doing it?

Have to agree with Chuck here -- I can't see how you are going to achieve 5 or even 3 nines reliability unless you perform full DVT testing (and bug fixing) against a revision-controlled HW/SW platform. This means locking down the HW rev of your logic board, all instances of microcode such as what runs inside the Fibre HBA chips as well as BIOS, every device driver, and even revisions of disk drives and their firmware. You are part of the way there if you select an off-the-shelf server from a tier-1 player, but even they are pretty lax when it comes to testing hard-core Fibre Channel applications.

My company operates in both worlds -- we make RAID hardware but we also have products based on commodity servers. From my experience, it's very challenging from an operational standpoint to acquire suitable commodity servers and keep the vendors from changing anything, but luckily we have enough experience to know the right questions to ask and have the right tools to perform enough testing.

To give a sense of scale to this issue, more than half the commodity components we test fail to achieve our standards of reliability, and we only test stuff from Tier-1 suppliers.

Excellent point, Bryan -- there is absolutely nothing preventing anyone (including Sun) from assembling the components, testing and delivering them in a highly supported configuration for customers who might want a more traditional approach to how they consume storage.

Ditto for someone doing the same with Windows, or Linux, or any other commodity-software-stack-meets-commodity-server-hardware combination.

I do happen to think ZFS is cool. And I do think there's a potential of a business model to sell it and deliver it in a more traditional way, e.g. as a platform product rather than a set of do-it-yourself components.

In general though, the constraints you mention are somewhat less demanding for NAS environments (TCP/IP can be a very patient protocol, unlike FC), but the point is still valid.

In my mind, the only potential market is people who (a) are looking for 1 or 2 "nines", and can't justify higher availability, and (b) are willing to do a bit of roll-your-own before, during and after the deployment.

This is the same ol' argument that we have over open source time and again, now extended to storage. It boils down to risk. How many people are willing to risk running their mission critical ERP software on Linux, for example? Some, yes, but not many. What most folks do is run their mission critical apps on AIX, HPUX, Windows, etc., where they can get support. Where Linux is making inroads is on other applications that people are willing to take a small risk on in order to save money.

What I question in nearly every case is: what are the real savings with open source? Sure, you don't have to pay any license fees for your software, but do you end up spending more on internal support costs? I've done the math on a couple of occasions, and when you consider the 3-year TCO of open source, you rarely end up coming out ahead.
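As a sketch of that 3-year TCO math (every figure below is an invented placeholder; the whole argument turns on which inputs you believe):

    # Back-of-the-envelope 3-year TCO comparison like the one described.
    # Every figure is an invented placeholder -- plug in your own numbers.

    YEARS = 3

    def tco(hardware, license_per_year, support_per_year,
            admin_salary, admin_fraction):
        """Total cost of ownership over YEARS, including internal admin time."""
        return (hardware
                + YEARS * (license_per_year + support_per_year)
                + YEARS * admin_salary * admin_fraction)

    open_source = tco(hardware=60000, license_per_year=0,
                      support_per_year=5000,    # OS support contract
                      admin_salary=100000, admin_fraction=0.5)

    commercial = tco(hardware=90000, license_per_year=10000,
                     support_per_year=15000,    # vendor support contract
                     admin_salary=100000, admin_fraction=0.1)

    print(f"Open source: ${open_source:,.0f} over {YEARS} years")
    print(f"Commercial:  ${commercial:,.0f} over {YEARS} years")

With these placeholder numbers the commercial option comes out ahead purely on the internal-support line, which is the commenter's point; shift the admin fraction and the answer flips.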

So I suspect that open source storage will end up the same way. Those who want to save every nickel they can on the front end will go open source and pay the ongoing internal support costs; those who are risk averse will pay for the relative "safety" of going with a major storage vendor and getting a partner who will help them through the tough times. I think what we will see is open source storage taking the same slow path into the data center that Linux did, and in the end having about the same, or maybe a little less, of a share as Linux.

So here's an interesting question: if it's all about cheap storage, what about products like EMC's Hulk, which I believe is nothing more than commodity components bundled together and sold dirt cheap? Does something like that address those among us who are more concerned with upfront costs?

I noticed your comment: "I do happen to think ZFS is cool. And I do think there's a potential of a business model to sell it and deliver it in a more traditional way, e.g. as a platform product rather than a set of do-it-yourself components. I don't know if Sun's up for that, though ..."

You've brought up a key point here, and it's something that's certainly been acknowledged and addressed within Sun.

Yes, there's a strong belief that the Open Storage strategy can bring a great deal of benefit to developers and enterprise customers alike; however, you've correctly pointed out that enterprise customers view support and risk differently to developers, and that needs careful addressing and handling.

You are correct in highlighting the "DIY" model, where customers can build storage solutions from open source and industry standard components; however, Sun are also offering "de-risked" solutions that use these components to address more business-focussed needs.

Examples of this are the Greenplum Data Warehouse solution, or the Sun Secure Data Retrieval Server (SDRS) box. The SDRS solution takes the JBOD approach (through the X4500 "Thumper" box) and uses a partially open-source software stack to provide call data record management for telecommunications customers. By utilising industry standard hardware and open source software, Sun can provide a much more cost-effective solution than, say, Centera & Sensage, but without additional risk to customers. Oh, and it's a great deal more energy efficient too ;-)

I think that whilst some of your arguments are sound, and it will take some time for the enterprise customer and open source markets to fully come together without compromise, Sun can address the best of both worlds by providing more "appliance" type approaches that use open source software as the foundation for solutions.

The move from JBOD to intelligent arrays was the right thing to do in the late 90's due to technology constraints; however, are the reasons for that move still technology challenges today? Don't get me wrong, I don't think people should swap their Symmetrix systems for JBOD and open source software; however, the market dynamics are changing, and it will be interesting to see how this plays out over time.

Consider implementing Solaris as a storage stack on one of your products. You can put it on top of commodity hardware, put your value-add into the mix (EMC support, EMC management apps), and then come talk to the OS folks at Sun. Heck, with COMSTAR (http://opensolaris.org/os/projects/comstar) you can download and build a multi-protocol box that will just work on your systems. I am sure we would be happy to help you guys, and you could cut costs. Give me a shout if you want to have a chat about how we can work together to enhance storage for both our customer sets.