Data Domain doubles up dedupe speed

Four-socket DD880 is a 'Ferrari of a dedupe engine'

Data Domain has added a 4-socket quad-core processor set-up and produced a top-end DD880 box with double the performance of its old range-topper, reminding everybody why EMC and NetApp fought so hard to get it.

The DD880 has twice the performance of the previous top-of-the-range DD690 and achieves it with a 4-socket, quad-core Xeon processor engine room instead of the DD690's 2-socket, quad-core one. Sixteen cores do the dedupe work instead of eight. Data Domain says its software has been substantially reworked to make use of the extra cores.

The aggregate backup performance increases 100 per cent, from the DD690's 2.7TB/hour to 5.4TB/hour. The DD880 has a raw capacity of 96TB (the DD690 has 48TB), with 71TB of usable RAID-6 storage (the DD690 has 35.3TB).
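For readers who want to check the sums, here is a minimal Python sketch using only the figures quoted above. The helper names and the 20TB nightly backup are illustrative assumptions, not Data Domain numbers, and the time-to-ingest figure assumes the box sustains its quoted peak rate, which real backups may not.

```python
# Figures as quoted in the article; everything else is illustrative.
DD690 = {"tb_per_hour": 2.7, "raw_tb": 48.0, "usable_tb": 35.3}
DD880 = {"tb_per_hour": 5.4, "raw_tb": 96.0, "usable_tb": 71.0}


def percent_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


def usable_fraction(box):
    """Share of raw capacity left as usable RAID-6 storage."""
    return box["usable_tb"] / box["raw_tb"]


def hours_to_ingest(box, backup_tb):
    """Hours to ingest backup_tb at the quoted peak rate (an idealisation)."""
    return backup_tb / box["tb_per_hour"]


throughput_gain = percent_increase(DD690["tb_per_hour"], DD880["tb_per_hour"])

HYPOTHETICAL_BACKUP_TB = 20.0  # assumed workload, not from the article

if __name__ == "__main__":
    print(f"Throughput gain: {throughput_gain:.0f} per cent")
    for name, box in (("DD690", DD690), ("DD880", DD880)):
        print(
            f"{name}: {usable_fraction(box):.1%} of raw capacity usable, "
            f"{hours_to_ingest(box, HYPOTHETICAL_BACKUP_TB):.1f} hours "
            f"for {HYPOTHETICAL_BACKUP_TB:.0f}TB"
        )
```

On these numbers both boxes leave roughly three-quarters of raw capacity usable, so the doubling applies to capacity and speed alike.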

It has redundant 10GigE and 1GigE connectivity plus redundant dual-port 4Gbit/s Fibre Channel links for virtual tape library (VTL) connectivity. The VTL functionality comes as a software option, as do Replicator (with which the DD880 can deduplicate globally across remote sites), Retention Lock and NetBackup OpenStorage support. NFS and CIFS are supported by default.

The ingest rate of 5.4TB/hour is faster than that of any other backup/dedupe controller product, whether it does inline deduplication (as Data Domain does), post-process deduplication, or no deduplication at all. It is faster than EMC's DL4406 VTL, which reaches 4TB/hour with deduplication turned off. It is also quicker than Quantum's DXi 7500 VTL, which does 4TB/hour with dedupe turned off and around 3.2TB/hour with post-process deduplication. Data Domain says it's using public spec sheets to get these numbers.

Up to 180 small DD120s can be connected to the 880 to send in their data from remote and branch offices.

There is a new release of Enterprise Manager, the management GUI, which provides centralised management of multiple nodes and configuration of system replication and migration capabilities.

Several backup software suppliers have added or are adding deduplication to their products - Symantec, for example - and saying that customers no longer need to buy a specialised piece of deduping hardware. Phil Turner, Data Domain's head of sales for the UK and Ireland, said: "If you're deduping in the media server, that puts an extra load on it, plus there are compromises made in the dedupe algorithm to minimise the extra load." In short, he says, you don't want to move CPU-bound middleware applications up the stack.

He classifies the addition of deduplication to backup software as a form of source deduplication. He says this is for companies that measure backup data in gigabytes, whereas target dedupe is for companies with terabytes of data.

He added: "(The DD880) just silences everything else on the market. We are the only game in town (and) we're already working on the next processor generation", meaning Nehalem.

He wouldn't comment on whether clustering of Data Domain products was on the company's roadmap, but did point out that dedupe suppliers with clustered products often limit deduplication to a subset of the nodes. For example, he said a Sepaton dedupe cluster can have up to eight nodes, but only two of them can dedupe with Tivoli Storage Manager and only four with NetBackup, quoting an HP technical note.

Turner feels that the capacity uplift with the DD880 puts to bed any idea that Data Domain's products don't scale. He didn't say "who needs clustering", but you get the idea.

Data Domain offers the non-clustered DDX Array with 16 DD880 controllers, an aggregate throughput of up to 86TB/hour and up to 56PB of usable capacity. A DD880 gateway product will be available later, and it can front-end other suppliers' drive arrays.
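The aggregate throughput figure follows directly from the per-controller rate, as this short Python sketch shows. The variable names are illustrative, and 1PB is assumed to be 1,000TB. The capacity figure does not scale the same way: 16 controllers at 71TB usable each come to about 1.1PB, so the 56PB number must be counted on some other basis that the article does not spell out.

```python
# Figures as quoted in the article; names and unit conversion are assumptions.
CONTROLLERS = 16
TB_PER_HOUR_PER_DD880 = 5.4
USABLE_TB_PER_DD880 = 71.0
QUOTED_ARRAY_CAPACITY_PB = 56.0
TB_PER_PB = 1000.0  # decimal petabytes assumed

# Independent controllers, so throughput simply adds up.
aggregate_tb_per_hour = CONTROLLERS * TB_PER_HOUR_PER_DD880

# Naive physical usable capacity if each controller has the standalone 71TB.
naive_usable_pb = CONTROLLERS * USABLE_TB_PER_DD880 / TB_PER_PB

# How far the quoted array capacity exceeds that naive sum.
capacity_ratio = QUOTED_ARRAY_CAPACITY_PB / naive_usable_pb

if __name__ == "__main__":
    print(f"Aggregate throughput: {aggregate_tb_per_hour:.1f}TB/hour")
    print(f"16 x 71TB usable: {naive_usable_pb:.2f}PB")
    print(f"Quoted 56PB is about {capacity_ratio:.0f} times that")
```

The 86.4TB/hour result matches the quoted "up to 86TB/hour" after rounding down.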

Data Domain people are very bullish about their new Ferrari of a dedupe engine. Here's Brian Biles, Data Domain's product management VP: "It used to be that VTLs running at top speed could go faster without dedupe, storing straight to disk. This was one of the last defensible arguments for considering a post-process dedupe system architecture. That is so over. The DD880 doesn't just change the game. It pulls the rug out from under the post-process argument."

The Data Domain people made no attempt to temper their remarks with regard to the Quantum-based EMC products, which were lumped in with all the other competing deduplication products that have had the rug pulled out from under them. Read into that what you will concerning the future of the Quantum DXi technology within the EMC camp.

The DD880 will be generally available in the third quarter of this year, and an entry-level system with 22TB of usable storage is about £240,000. ®