Essential Guide

Curbing data storage capacity demand

This series of tips provides insight on common ways IT pros can make more efficient use of capacity. Learn which methods work, and which don't.

Introduction

Depending on the analyst you consult, data
growth is driving data storage capacity demand within businesses at a rate of 40% to 650%
annually. If that strikes you as an extraordinarily wide range for an analyst estimate, it is. And
there are two explanations.

First, no one really knows how fast data is growing. Second, capacity
demand trends have little to do with actual data growth trends. They are based instead on
estimates of how much capacity consumers buy year over year, not on how fast data is growing.

That means planners who want to work out a capacity
management strategy are starting with little more than a mandate from management to bend the
storage cost curve -- recognition of the fact that storage now accounts for between 33 cents and 70
cents of every dollar spent on IT hardware. The heavy lifting of identifying real capacity
requirements, growth drivers, and procedural and technological approaches for reducing
capacity demand is entirely on them.

In 2011, Framingham, Mass.-based IDC estimated that 21.2 exabytes of external storage were
deployed worldwide. That capacity was used to store not only production data (roughly 55% of which is
files, according to the analysts), but also data duplicates and dreck. According to the analyst firm, we
used about half of our disk to store copies of the data written on the other half. And our
reluctance to throw away anything has turned our storage infrastructure into something approximating
the kitchen junk drawer.

Recently, with the introduction of flash
memory-based storage devices, so-called silicon storage devices, a "new" Tier 0 has been
introduced into the storage hierarchy. Technically, silicon storage has always been a part of
storage tiering architecture. IBM's hierarchical storage management (HSM)
paradigm -- in existence since the earliest days of mainframe computing -- typically included
system memory, direct access storage
devices (DASDs), which are essentially disk arrays, and tape.

The purpose of multiple storage tiers, and the software functionality inherent in HSM to move
data between tiers, was simply to manage storage capacity and cost. The scheme was predicated on
data access frequency and data modification frequency characteristics. Data that was accessed and
updated with high frequency used silicon storage. However, this storage was extremely costly and
limited, so data was migrated as quickly as possible to DASDs, from Tier
0 to Tier 1, where access and update could be accommodated at fairly high rates. In a classic
HSM strategy -- articulated when DASDs were the size of refrigerators, provided limited capacity,
and required their own buildings (DASD farms) to handle power and HVAC requirements -- pressure was
on to migrate data as quickly as possible from disk to tape, which was the storage capacity tier
(then Tier 2) optimized for storing data that was much less frequently accessed or modified.
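As a rough illustration of the placement logic described above, the sketch below assigns a data set to Tier 0, 1 or 2 based on how often it is accessed and updated. The `DataSet` class, the `assign_tier` function and both threshold values are hypothetical, chosen only to make the idea concrete; a real HSM product would use its own policies and measurements.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    accesses_per_day: float
    updates_per_day: float


# Hypothetical cutoffs (assumptions, not taken from any HSM product).
TIER0_MIN_ACTIVITY = 1000.0  # very hot data stays on silicon storage
TIER1_MIN_ACTIVITY = 1.0     # moderately active data lives on disk


def assign_tier(ds: DataSet) -> int:
    """Return 0 (silicon), 1 (disk) or 2 (tape) from combined activity."""
    activity = ds.accesses_per_day + ds.updates_per_day
    if activity >= TIER0_MIN_ACTIVITY:
        return 0
    if activity >= TIER1_MIN_ACTIVITY:
        return 1
    return 2


if __name__ == "__main__":
    for ds in (
        DataSet("order-queue", 5000, 2000),
        DataSet("monthly-report", 3, 0.5),
        DataSet("2009-audit-logs", 0.01, 0),
    ):
        print(ds.name, "-> Tier", assign_tier(ds))
```

The point of the sketch is that placement follows measured activity rather than who owns the data or which array happens to have free space.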

Without belaboring the point, tiered
architecture and HSM provided a straightforward methodology for capacity management but one,
unfortunately, that did not transition into the distributed computing environments deployed in many
firms. Part of the reason is historical and technical: Early distributed computing environments
interconnected minicomputers (servers) and microcomputers (PCs) over low-speed LANs that
could not handle the burden of HSM data movements. Moreover, the industry sought to expand disk
products to provide specialized capacity storage that would compete with tape. High-capacity,
low-cost SATA disk arrays, some featuring "data reduction" value-added software (so-called deduplicating
virtual tape library [VTL] appliances), were among the first. They were followed by tiered storage arrays
that provided trays of both Tier 1 and Tier 2 disks, as well as HSM software to automatically move
data from one tier to the other. Finally, massive arrays of idle disks were
tested in the market as a new capacity storage tier.

But the cost of specialty disk appliances, especially with the price acceleration generated by
value-added software embedded on the array controller, has limited adoption. Where products such as
deduplicating VTLs have been adopted, they've mostly been relegated to a niche role -- augmenting
rather than replacing tape, which continues to store roughly 80% of the world's data.

What's needed to manage
data storage capacity isn't an appliance that crams more data onto the same number of spindles,
but a strategy that leverages the right storage tier to store the right data. Instead of focusing
narrowly on capacity allocation efficiency -- which is the point of data reduction technologies
such as compression and
deduplication -- planners need to consider capacity utilization efficiency. That's a fancy way of
saying that an effective capacity management strategy includes not only tactical space management
(deduplication and compression), but strategic data management (archiving, for example).

The process begins by analyzing your situation. Using a storage management reporting tool such
as SolarWinds'
Storage Manager (formerly Tek Tools Storage Profiler), you can run a report that identifies
files that haven't been accessed or modified in the last 30, 60 or 90 days. Sorting these files by
their owners (also in the file metadata) will provide a way to begin a dialog with the user (or his
or her manager) who owns the files so that those files can be moved into an archive or deleted.
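For readers without a commercial reporting tool, the kind of report described above can be approximated with a short script. The following is a minimal sketch, not a substitute for a storage management product: the function names `find_stale_files` and `group_by_owner` are illustrative, owners are reported as numeric user IDs, and it assumes the file system records access times reliably (volumes mounted with options such as `noatime` will not).

```python
import os
import time
from collections import defaultdict

SECONDS_PER_DAY = 86400


def find_stale_files(root, min_idle_days, now=None):
    """Yield (path, owner_uid, size_bytes, idle_days) for files under root
    that have been neither accessed nor modified for min_idle_days."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # The most recent of last access and last modification.
            last_touched = max(st.st_atime, st.st_mtime)
            idle_days = (now - last_touched) / SECONDS_PER_DAY
            if idle_days >= min_idle_days:
                yield path, st.st_uid, st.st_size, idle_days


def group_by_owner(stale_files):
    """Group stale files by owner UID so each owner can be contacted."""
    report = defaultdict(list)
    for path, uid, size, idle_days in stale_files:
        report[uid].append((path, size, idle_days))
    return dict(report)


if __name__ == "__main__":
    for days in (30, 60, 90):
        report = group_by_owner(find_stale_files(".", days))
        total = sum(size for files in report.values() for _, size, _ in files)
        print(f"idle >= {days} days: {total} bytes across {len(report)} owners")
```

Running it against a share for the 30-, 60- and 90-day thresholds gives a first estimate of how much capacity is a candidate for archiving or deletion, and who to talk to about it.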

As much as 40% of the data currently stored on disk could be more cost-effectively hosted in an
archive
platform, whether disk-based, tape-based or in a cloud service. Archiving that data
and returning 40% of your capacity to productive use may save enough to pay for
your entire data storage capacity management strategy going forward.

1. Wasted space

Making the most of disk storage: How not to waste space

According to Jon Toigo, CEO and managing principal of Toigo Partners International, and chairman of the Data Management Institute, one of the biggest problems with data storage capacity is wasted space. This is largely due to enterprises storing stale data, such as duplicates, data with low re-reference rates or orphan data. Furthermore, many enterprises don't have a method in place to determine which data can be deleted or moved to an archive.

Toigo explains how hierarchical storage management and other technologies can help make the most of disk capacity.

2. Thin provisioning

Refuting common capacity containment methods

The idea behind thin provisioning is that storage admins know well in advance when they'll have to add more capacity to an environment, and can avoid purchasing excess capacity by waiting until it's actually needed. The problem, Toigo says, is that thin provisioning does nothing to reduce capacity demand directly; rather, it defers the cost of additional disk arrays.

Get Toigo's view on the capabilities and limitations of thin provisioning, and how it can help with data storage capacity allocation and reduced storage cost. But beware: It isn't a flawless strategy.

3. Dedupe and compression

Some capacity fixes only work short-term

Deduplication and compression are seen as surefire ways to reduce data storage capacity demand, and there's no denying they can make a difference -- to an extent. Deduplication can eliminate copies of data that aren't needed, but arrays with a deduplication process built in are often more expensive. Compression allows the same amount of data to be stored in less capacity, but it's not clear whether the capacity savings are worth the price tag.

The value of dedupe and compression software is declining, according to Toigo, and there are more effective tools to help you achieve real capacity savings.

4. ILM strategy

How the right information lifecycle strategy can save capacity

Data lifecycle management (DLM), also referred to as information lifecycle management (ILM), isn't a new concept, but it's often overlooked when it comes to keeping data storage capacity requirements under control. Creating policies that automate the movement of data is the basis for DLM. For example, all data created by a certain department within an organization can be tagged as such in the metadata, and from there can be directed to specific storage. This is a big benefit to storage professionals when it comes to determining which data is stored where, and controlling the amount of capacity on a given array.
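To make the department example concrete, here is a minimal sketch of a metadata-driven placement policy. Everything in it is an assumption for illustration: the `department` metadata key, the target names in `POLICY`, and the `route` function do not come from any particular DLM product.

```python
# Hypothetical policy table: department tag -> storage target.
POLICY = {
    "finance": "tier1-array",
    "engineering": "tier2-array",
}

# Untagged or unrecognized data falls through to the archive platform.
DEFAULT_TARGET = "archive"


def route(metadata: dict) -> str:
    """Return the storage target for a file based on its department tag."""
    department = str(metadata.get("department", "")).strip().lower()
    return POLICY.get(department, DEFAULT_TARGET)


if __name__ == "__main__":
    print(route({"department": "Finance"}))    # tier1-array
    print(route({"department": "marketing"}))  # archive
    print(route({}))                           # archive
```

Keeping the policy in a table rather than in code means storage staff can adjust where a department's data lands without touching the logic that applies it.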
