STORAGE

Deduplicating Elsewhere

Deduplication technology discussions usually center on deduplicating the backup target. That makes sense, as this is where the technology has its biggest payoff. Increasingly, the discussion is shifting toward using deduplication on archive disk or primary storage. Deduplication, however, is also branching out beyond standard disk, and in several of these areas it is worth considering whether applying the technology justifies the investment.

Solid State Disk (SSD) could hold the most promise. It's more expensive than its mechanical drive counterparts, but it is also substantially faster. Clearly, adding a deduplication/compression capability to an SSD system will impact its performance. However, many, maybe even most, environments don't need the full performance boost that an SSD can provide. If an SSD system takes even a 25 percent performance hit, it would still be substantially faster than many mechanical systems, and if in doing so you doubled the effective capacity of the SSD, you would cut the cost per GB of the technology in half. This rationale doesn't apply to just an isolated pocket of data centers. Many need a performance bump beyond what their mechanical drives can deliver but don't need the full performance boost of standard SSD. For these environments, an SSD with deduplication and compression may be the perfect solution.
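The trade-off is simple arithmetic, and a minimal sketch makes it concrete. The function names and the dollar and IOPS figures below are hypothetical, chosen only to mirror the 25 percent hit and doubled capacity described above; substitute numbers from your own environment.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per GB of data actually stored once dedupe/compression is applied.

    reduction_ratio is logical data stored / physical capacity used,
    e.g. 2.0 when the capacity is effectively doubled.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


def effective_iops(raw_iops: float, performance_hit: float) -> float:
    """Throughput left after the dedupe/compression overhead.

    performance_hit is a fraction, e.g. 0.25 for a 25 percent hit.
    """
    if not 0 <= performance_hit < 1:
        raise ValueError("performance_hit must be in [0, 1)")
    return raw_iops * (1 - performance_hit)


# Hypothetical figures, for illustration only.
cost = effective_cost_per_gb(raw_cost_per_gb=2.00, reduction_ratio=2.0)
iops = effective_iops(raw_iops=50_000, performance_hit=0.25)
print(f"effective cost per GB: ${cost:.2f}, effective IOPS: {iops:,.0f}")
```

With these assumed inputs, the cost per GB is halved while three quarters of the raw performance remains, which is the comparison to make against a mechanical tier.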

Tape as a deduplication target may seem a little odd. In essence, it is the opposite of SSD. We move from very high performance, limited capacity and expensive media to medium performance, high capacity and inexpensive media. The justification is simply the sheer quantity of that media. In environments where the number of tapes to manage is in the thousands, reducing that number by 10 or 20 percent could represent a substantial cost savings. The savings would come not only from the cost of the actual media, but also from the cost of storing that media off-site and of returning those tapes to the data center when needed. The concept of deduplicating to tape makes many admins nervous, and I think the jury is still out on whether or not this makes sense. You really need to weigh the potential cost savings against any potential risk associated with using the technology on tape.

Cloud storage is another area where deduplication will gain traction. Technically, cloud storage is still storage, but as we discuss in our article Cloud Storage Deduplication, it's storage with a bandwidth cost associated with it, and it is storage that is often billed per GB used. The more you save on both, the better off you are. The storage savings can be relatively simple to achieve and can be done entirely at the destination side, but if the technology can be moved to the source data center, so that data leaves the site already in deduplicated format, the value increases.
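To see why source-side deduplication saves bandwidth as well as capacity, consider a minimal sketch. It assumes fixed-size chunks identified by SHA-256 hashes and a set of hashes the destination is already known to hold; the names `plan_upload`, `remote_index` and the 4 KB chunk size are hypothetical choices for illustration, not any vendor's actual method.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products vary


def split_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Yield consecutive fixed-size chunks of data."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def plan_upload(data: bytes, known_hashes: set):
    """Return (recipe, new_chunks) for one object.

    recipe is the ordered list of chunk hashes needed to rebuild the object;
    new_chunks maps hash -> bytes for chunks the destination does not hold yet.
    Only new_chunks has to cross the WAN link.
    """
    recipe = []
    new_chunks = {}
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in known_hashes and digest not in new_chunks:
            new_chunks[digest] = chunk
    return recipe, new_chunks


# Hypothetical data: four 4 KB chunks, three of them identical.
remote_index = set()
backup = b"A" * 8192 + b"B" * 4096 + b"A" * 4096

recipe, to_send = plan_upload(backup, remote_index)
bytes_sent = sum(len(chunk) for chunk in to_send.values())
print(f"raw size: {len(backup)} bytes, sent: {bytes_sent} bytes")

# Once the destination holds these chunks, a repeat of the same data sends nothing.
remote_index.update(to_send)
_, resend = plan_upload(backup, remote_index)
print(f"second upload sends {len(resend)} chunks")
```

In this sketch only the unique chunks are transmitted and stored, so both the bandwidth bill and the per-GB storage bill shrink together.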

Many cloud services are using some form of an on-premises cache as a gateway to the cloud. Gaining better storage efficiency on that gateway allows more data to be kept in the local cache. If that data can be kept in its compressed and deduplicated form when it is sent to the cloud, it is going to use significantly less bandwidth. This is important because some vendors charge for bandwidth used as well as storage used. In either case, moving less data across a slow segment is always a good thing.

Deduplication technology will continue to be used in more than just the traditional storage tiers that we think of today. In any situation where you have a relatively high cost of capacity or a relatively low availability of that capacity, there may be some justification for implementing the technology.