Riverbed claims it will de-dupe primary storage

"Hey you storage vendors, listen up. We're gonna tell you how to deduplicate primary data." That's the message given out at Riverbed's Vision Day for financial analysts on Monday.

The startling premise of Riverbed's in-development Atlas appliances is that you can strip out up to 90 per cent of the data stored in data centres by using enhanced WAN acceleration appliances to find and deduplicate data flowing between servers and storage arrays. The figure rises to as much as 95 per cent for raw backup data, Riverbed says, because it is chock full of redundant information. Broadly speaking, effective storage capacity goes up ten times, or twenty times in the case of backup data.
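The capacity arithmetic follows from the reduction figures: if a fraction r of the data is removed, the same disks hold 1 / (1 - r) times as much. A minimal sketch of that sum, where the function name `capacity_multiplier` is purely illustrative and the 90 and 95 per cent inputs are Riverbed's claimed upper bounds, not measured results:

```python
def capacity_multiplier(reduction_ratio: float) -> float:
    """Effective capacity gain when a given fraction of data is removed.

    reduction_ratio is the share of data eliminated, e.g. 0.90 for 90 per cent.
    """
    if not 0.0 <= reduction_ratio < 1.0:
        raise ValueError("reduction_ratio must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction_ratio)


# Riverbed's claimed upper bounds: 90 per cent overall, 95 per cent for backup.
print(round(capacity_multiplier(0.90), 2))  # 10.0
print(round(capacity_multiplier(0.95), 2))  # 20.0
```

Note how steep the curve is at the top end: the last five percentage points of reduction double the effective capacity.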

What the storage array vendors say is impossible, the deduplication of primary storage array data, is what Riverbed says is practicable. It's basing this on its Steelhead appliance, which deduplicates data sent across the WAN as a way of speeding communications to branch and remote offices. The aim there is for those offices to need less local IT infrastructure, with the central data centre having a consolidated and thus hopefully more efficient and cost-effective set-up.

Eric Wolford, Riverbed's marketing and business development SVP, summed up the idea: "With the Atlas appliance, we are doing for data at rest what we have always done for data in motion."

Riverbed's Atlas Appliance will sit along with Steelhead appliances in front of storage arrays. The Steelhead breaks up data coming to it into byte-level patterns. The Atlas maintains an index of master data patterns and will only send new data patterns on to the arrays. So, as servers send data to the arrays it is inspected by the in-band Steelhead & Atlas appliance combination, deduplicated, and sent on - with up to 90 per cent of it removed and replaced by pointers - to the storage array where it rests.
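The general technique described here (split data into patterns, keep an index of the ones already seen, store only new ones and replace the rest with pointers) can be sketched in a few lines. This is not Riverbed's implementation, which has not been disclosed: the fixed 8-byte chunk size, the use of SHA-256 digests as pointers, and the `DedupeStore` class are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 8  # hypothetical pattern size; real products choose their own


class DedupeStore:
    """Toy in-band deduplicator: an index of unique chunks plus pointers."""

    def __init__(self) -> None:
        # digest -> chunk bytes; stands in for the index of master patterns
        self.index: Dict[str, bytes] = {}

    def write(self, data: bytes) -> List[str]:
        """Split data into chunks, store only unseen chunks, return pointers."""
        pointers = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.index:
                self.index[digest] = chunk  # new pattern: keep one copy
            pointers.append(digest)  # a pointer is recorded either way
        return pointers

    def read(self, pointers: List[str]) -> bytes:
        """Rebuild the original data by following the pointers."""
        return b"".join(self.index[p] for p in pointers)

    def stored_bytes(self) -> int:
        """Bytes of unique chunk data actually held."""
        return sum(len(chunk) for chunk in self.index.values())


store = DedupeStore()
data = b"ABCDEFGH" * 10          # 80 bytes made of one repeated pattern
pointers = store.write(data)
assert store.read(pointers) == data
print(len(data), store.stored_bytes())  # 80 8
```

The toy ignores the cost of the pointers themselves and of the index, both of which eat into the savings in any real system, and the highly repetitive input is chosen to make the effect obvious rather than to be representative.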

In an added twist, Atlas can be used to inspect an array's contents and deduplicate it, reclaiming redundant capacity. Initially Atlas will support Windows servers and unstructured/semi-structured data, with Unix servers and structured data coming along later. The first Atlas appliance should be announced next year and will come in a redundant cluster configuration for high availability.

Riverbed is certainly thinking big, with Wolford saying: "When IT infrastructure is overloaded with redundant data, there are efficiency and cost impacts across the organization. Our vision is to eliminate these inefficiencies through removing redundant data at every point between the data center and the end user."

It's a bold idea. Riverbed isn't saying - yet - which server-storage interfaces will be supported. We might presume that the idea is to embrace all the standard storage links, Fibre Channel and Ethernet, and all the main protocols: block-level SAN access plus NAS file interfaces such as CIFS and NFS.

We might presume wrong. Riverbed's statement did say: "The Atlas appliance is designed to help scale existing file storage by enabling customers' existing file servers to serve more users and deliver a larger amount of data per device." Ah, files and file servers. Not quite "removing redundant data at every point between the data center and the end user".

Never mind, these are just details. Yesterday was the big picture day with big picture benefits: lower costs; enhanced user experience; improved manageability; scalability; greater productivity; and, the presentation being to financial analysts, enhanced Riverbed revenues and profits.

Riverbed, by the way, is in a legal dispute with Quantum over its deduplication technology which, Quantum claims, infringes its patents. The stakes just got higher.

Another note: NetApp has Storage Acceleration Appliances sitting in front of its arrays now and it has its ASIS deduplication technology. Perhaps NetApp could offer the same functionality as Atlas to its customers? It is already saying that deduplication applies to much more than backup data. ®