Symantec leak hints at PureDisk on NetBackup 5000

Secret dedupe tech

Symantec appears to be close to introducing a NetBackup 5000 appliance running PureDisk deduplication software, if a leaked manual is real.

The Symantec NetBackup 5000 Getting Started Guide looks like a standard Symantec document. The Register has seen v6.6.0.2 revision 2, with a 2010 date. It describes a NetBackup 5000 appliance that runs PureDisk and is called a PureDisk node.

One or more NetBackup 5000 appliances exist in a storage pool. The appliances run PureDisk services which consist of one storage pool authority, a content router to store file content, a metabase engine to store file metadata, a metabase server which manages metabase engine queries, and a NetBackup export engine to send information to a NetBackup environment.

The storage pool authority runs on just one storage node. The other services can run on one or more storage nodes.

What is a metabase? The manual states:

When PureDisk performs a backup, it separates the file content from its metadata. PureDisk uses global deduplication technology to reduce the amount of backup data that it stores. It writes file content to secondary disk storage, and it writes file metadata to a distributed database that is called a metabase. The metadata consists of information about the file such as its owner, where it resides on a client, when it was created, and other information. The metadata also includes a unique fingerprint that identifies the file’s content to PureDisk.

When you restore files, you restore the files or directories that you need. You do not need to restore an entire data selection.
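As a rough illustration of the split the manual describes, here is a minimal Python sketch. It is not PureDisk code: the class and field names are made up for the example, the two in-memory dictionaries stand in for the content router and the metabase, and the use of SHA-256 is our assumption, since the manual does not say how fingerprints are computed.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical metabase record: facts about a file, plus its fingerprint."""
    client: str
    path: str
    owner: str
    created: float
    fingerprint: str


@dataclass
class ToyStoragePool:
    """Toy stand-in for a storage pool; not Symantec's implementation."""
    # fingerprint -> file content (the role the content router plays)
    content_store: dict = field(default_factory=dict)
    # (client, path) -> FileMetadata (the role the metabase plays)
    metabase: dict = field(default_factory=dict)

    def backup(self, client, path, owner, created, content):
        # SHA-256 is an assumption; any collision-resistant hash would do here.
        fingerprint = hashlib.sha256(content).hexdigest()
        # Store the content only if this fingerprint has not been seen before.
        self.content_store.setdefault(fingerprint, content)
        # Metadata is always recorded, even when the content is a duplicate.
        self.metabase[(client, path)] = FileMetadata(
            client, path, owner, created, fingerprint
        )
        return fingerprint

    def restore(self, client, path):
        # Restore a single file: look up its fingerprint, then fetch the content.
        record = self.metabase[(client, path)]
        return self.content_store[record.fingerprint]


pool = ToyStoragePool()
report = b"quarterly numbers"
pool.backup("laptop-a", "/docs/report.txt", "alice", 1262304000.0, report)
pool.backup("laptop-b", "/home/bob/report.txt", "bob", 1262390400.0, report)
print(len(pool.metabase), "metadata records,", len(pool.content_store), "stored copy")
```

Two clients back up the same file, so the pool holds two metadata records but only one copy of the content, and either file can be restored on its own by following its fingerprint.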

The NetBackup 5000 can be used to reduce the volume of NetBackup-stored data in a data centre by deduplicating that data, which is done by enabling the PureDisk Deduplication Option (PDDO). This lets a NetBackup media server send data to a PureDisk storage pool for deduplication. A PureDisk storage pool can be based on the NetBackup 5000 appliance, on hardware the customer selects, or on a virtual machine.

This tells us that the NetBackup 5000 is most probably an Intel x86 server with directly-accessible disks and a link of some kind to other NetBackup 5000s for the global deduplication. It is rack-installable. The manual states that "Each PureDisk node includes hard disks. The PureDisk operating system (PDOS) resides on a mirrored disk RAID set. Upon the other disks is a storage directory, and PureDisk writes all your backups to this storage directory. The PureDisk application is preinstalled on the appliance."

The appliance is configured using a notebook computer running Windows with Internet Explorer 6 or 7, or Firefox. The notebook connects to the appliance over Ethernet, and the appliance has a pre-installed Ethernet NIC (network interface card). The appliance appears to run a Unix-type operating system and is not fitted with a keyboard and monitor.

The appliance nodes in a storage pool appear not to support clustering or high-availability features, judging by this statement from the manual: "On a NetBackup 5000 appliance, PureDisk does not support high availability. Information in the PureDisk documentation that refers to clustering or high availability does not apply to storage pools configured from one or more NetBackup 5000 appliances."

We would speculate that this product is a Symantec response to Data Domain and that Symantec thinks that selling its deduplication software via an OEM deal with Dell is no longer sufficient. The hardware supplier (or suppliers) is not known, though Symantec has partnered with Huawei for its FileStore software.

We have no information about availability or pricing or any further configuration data. Symantec was not immediately able to answer our questions. ®