NetApp plays out ASIS dedupe lead

Comment While EMC, Dell, HDS and HP stand impotently by, NetApp is making a killing in virtual desktop infrastructure deals and extending its lead in primary data deduplication, making ASIS run faster and deal with more data. How long can this advantage last?

Last week NetApp announced terrific quarterly results, citing virtualisation as one area where it's selling lots of product. So far NetApp is the only major array vendor offering primary data deduplication. Compellent is working to add deduplication to Storage Center with Replays as the prime target. GreenBytes and Coraid offer it too, Coraid courtesy of ZFS, and GreenBytes with its own technology. Nimbus offers it because its array is an all-flash product.

Our understanding is that NetApp is making a killing selling its FAS arrays into the server virtualisation and Virtual Desktop Infrastructure (VDI) markets because its ASIS deduplication makes its storage much more efficient, storing more data on less disk.

A NetApp-aware contact said the Flash Cache, formerly known as PAM (Performance Acceleration Module), which fits into a FAS controller, is dedupe-aware. Common block reads from, for example, VDI implementations are satisfied from cache regardless of the VMDK that they live in. He said: "Coping with bootstorms is one of the real benefits of deduplication and dedupe-aware cache."
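To see why a dedupe-aware cache helps in a bootstorm, here is a minimal sketch of the idea, not NetApp's implementation. It assumes the cache is keyed by physical block rather than by file and offset, so that deduplicated blocks shared by many VMDKs occupy one cache slot. The class `DedupeAwareCache`, the function `simulate_boot_storm` and the desktop and block counts are all hypothetical names and numbers chosen for illustration.

```python
class DedupeAwareCache:
    """Toy read cache keyed by physical block id, not by (file, offset)."""

    def __init__(self, block_map, disk):
        self.block_map = block_map  # (vmdk name, logical block) -> physical block id
        self.disk = disk            # physical block id -> data
        self.cache = {}             # physical block id -> cached data
        self.hits = 0
        self.misses = 0

    def read(self, vmdk, logical_block):
        physical = self.block_map[(vmdk, logical_block)]
        if physical in self.cache:
            self.hits += 1
        else:
            # Only the first reader of a shared block goes to disk.
            self.misses += 1
            self.cache[physical] = self.disk[physical]
        return self.cache[physical]


def simulate_boot_storm(desktops=100, os_blocks=50):
    """Every desktop boots from its own VMDK, but the OS blocks are deduplicated."""
    disk = {b: b"os-block-%d" % b for b in range(os_blocks)}
    # After dedupe, block b of every VMDK points at the same physical block b.
    block_map = {
        (f"desktop{d}.vmdk", b): b
        for d in range(desktops)
        for b in range(os_blocks)
    }
    cache = DedupeAwareCache(block_map, disk)
    for d in range(desktops):
        for b in range(os_blocks):
            cache.read(f"desktop{d}.vmdk", b)
    return cache.hits, cache.misses
```

With the default arguments, `simulate_boot_storm()` returns 4,950 hits and 50 misses: the first desktop warms the cache and the other 99 boot from it. A cache keyed by file and offset would instead miss on all 5,000 reads in this toy model.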

WAFL, NetApp's file system, turns random writes into sequential writes, which lets it drive SATA hard drives at phenomenal rates. With sequential writes in big stripes, the effective IOPS rate is huge. It's similar to the I/O patterns and performance one would see in an audio-visual streaming scenario.
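The general technique of turning random writes into sequential ones can be sketched as follows. This is a toy write-anywhere log under our own assumptions, not WAFL's actual design: incoming writes are buffered, then flushed to the next free run of physical blocks, with a map recording where each logical block now lives. The class name `WriteAnywhereLog` and its methods are hypothetical.

```python
class WriteAnywhereLog:
    """Toy model: buffer random logical writes, flush them as one contiguous run."""

    def __init__(self):
        self.next_free = 0   # next unused physical block address
        self.location = {}   # logical block -> current physical block
        self.physical = {}   # physical block -> data
        self.pending = {}    # buffered writes: logical block -> data

    def write(self, logical_block, data):
        # Nothing touches disk yet; a later write to the same block replaces this one.
        self.pending[logical_block] = data

    def flush(self):
        """Write all buffered blocks to consecutive addresses; return those addresses."""
        used = []
        for logical_block, data in self.pending.items():
            self.physical[self.next_free] = data
            self.location[logical_block] = self.next_free
            used.append(self.next_free)
            self.next_free += 1
        self.pending.clear()
        return used

    def read(self, logical_block):
        return self.physical[self.location[logical_block]]
```

Writing to logical blocks 9041, 17, 530000 and 3 and then calling `flush()` lands the data on physical blocks 0 to 3, so the drive sees one sequential stripe rather than four seeks. The sketch leaves out everything that makes a real file system hard, such as reclaiming overwritten blocks and surviving a crash.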

NetApp has no intention of adding write caching, as its testing has demonstrated a lot of overhead and no real-world benefit. Sharing out the Flash Cache between reads and writes would make the read portion smaller and hence reduce its benefit.

It appears that NetApp has rejected the idea of doing inline deduplication. The ASIS dedupe process only has to work on changed blocks, and change rates are generally quite low, being less than 10 per cent for the majority of data types, and normally lower than that for virtual machines (VMs), home directories etc. ASIS activity is scheduled during quiet periods, so it's not that intrusive, and may be much less intrusive than continuous inline deduplication.
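A scheduled, post-process dedupe pass that only looks at changed blocks might work roughly like this. It is a sketch under stated assumptions, not a description of ASIS internals: the SHA-256 fingerprint, the byte-for-byte check before sharing, and the `Volume` class with its fields are our own illustrative choices, and reclaiming blocks orphaned by overwrites is left out.

```python
import hashlib


class Volume:
    """Toy volume: writes land immediately, dedupe runs later over changed blocks."""

    def __init__(self):
        self.store = {}         # physical block id -> data
        self.pointers = {}      # logical block -> physical block id
        self.fingerprints = {}  # digest -> physical block id that holds that data
        self.changed = set()    # logical blocks written since the last dedupe pass
        self.next_id = 0

    def write(self, logical_block, data):
        # The write path does no hashing at all; it only logs the change.
        self.store[self.next_id] = data
        self.pointers[logical_block] = self.next_id
        self.changed.add(logical_block)
        self.next_id += 1

    def dedupe_pass(self):
        """Fingerprint only the changed blocks; return how many were examined."""
        examined = len(self.changed)
        for logical_block in sorted(self.changed):
            physical = self.pointers[logical_block]
            data = self.store[physical]
            digest = hashlib.sha256(data).digest()
            keeper = self.fingerprints.get(digest)
            if keeper is None:
                # First time this content has been seen: remember where it lives.
                self.fingerprints[digest] = physical
            elif keeper != physical and self.store[keeper] == data:
                # Duplicate confirmed byte for byte: share the keeper, free our copy.
                self.pointers[logical_block] = keeper
                del self.store[physical]
        self.changed.clear()
        return examined
```

Write the same block to ten logical addresses plus one unique block, run `dedupe_pass()`, and the store shrinks from 11 physical blocks to two. Write one more block afterwards and the next pass examines just that one block, which is the point our contact was making about low change rates keeping the scheduled job cheap.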

Our contact thought that EMC's Data Domain Boost design is a case in point where inline forces compromises and costs in the shape of significant extra horsepower.

NetApp has been raising the fairly conservative dedupe volume size limits it started with at launch. It gets information automatically from the AutoSupport feature on its customers' systems, and analysis of this has enabled it to raise the limits considerably. They will be lifted again with the forthcoming release of ONTAP, NetApp's O/S for its arrays.

The AutoSupport analysis has also helped with developing new use cases for dedupe, and may even help with compression, should NetApp introduce it. Unless its competitors respond, NetApp will continue creaming off VDI business, offering storage-efficient dedupe plus performance-enhancing flash cache, which beats their less efficient offerings. ®