User survey: ILM and data classification

Posted on February 01, 2007

On the one hand, a lot of IT managers still aren’t sure what ILM is. On the other hand, a lot of them have implemented it to some degree.

By Farid Neema

In the screening survey portion of Peripheral Concepts’ recent end-user survey on information lifecycle management (ILM), which had more than 4,000 respondents, about one-third of the IT managers weren’t sure what the term meant, and 18% knew what ILM was but didn’t know if their company had implemented it. Of the remaining half, 26% claim they have implemented ILM and 7% plan to implement it within the next 12 months, leaving 15% with no implementation plans for ILM (see figure, below). In general, the more raw disk capacity a company has, the more likely the company is to have implemented at least some ILM features.

In the full survey portion of our research, which had 117 respondents, we first defined ILM as follows:

Information lifecycle management is a comprehensive approach to managing the flow of an information system’s data and associated metadata from creation to deletion. ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures as, for example, hierarchical storage management (HSM) does. Within this definition, an ILM deployment process usually encompasses the following key steps:

When asked to what extent they have achieved each step, few IT managers believe they have fully implemented any of the steps toward ILM (see figure, right). Among those respondents who do not have any plans to implement ILM, the primary reasons were lack of staff and money. The median annual IT budget for the surveyed corporations was $2 million.


ILM is implemented progressively. Our survey shows that, among those who have implemented ILM, the median respondent applies ILM to about 20% of its data, but most of those respondents plan to extend their use over the next few years.


Faster backup and recovery is one of the major reasons for implementing ILM. Users also cited better storage utilization, compliance conformance, and improved data protection as key reasons for implementing ILM.

Scalability tops the list of features that IT managers want in an ILM product, followed by automatic data retrieval by users, disaster recovery, and remote replication.

While the screening survey addressed a widely distributed population in terms of raw disk capacity (see figure, below), in the full survey we primarily targeted four capacity tiers with the following characteristics:


40% in Tier 1: 1TB to 10TB

29% in Tier 2: 11TB to 50TB

19% in Tier 3: 51TB to 200TB

10% in Tier 4: 201TB to more than 1PB

An additional 2% of the companies in the full survey had less than 1TB.

The survey also revealed where companies stand with regard to implementation of two major ILM components: data classification and tiered storage.

Data classification

Data classification is a process that defines the access, recovery, and discovery characteristics of an enterprise’s different sets of data, grouping them into logical categories to assign them service-level objectives based on their value to the business. Data classification is a first step on the road to ILM. It reduces business risks by ensuring the appropriate data is managed with the appropriate standards for compliance, retention, protection, and security.
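To make the idea concrete, the following is a minimal sketch of rule-based classification. It is an illustration only, not a method described by the survey: the four category names, the service-level values, the dataset attributes, and the rules in `classify` are all hypothetical assumptions chosen for readability. A real scheme would be derived from an organization's own business and compliance requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Service-level objectives attached to one classification category."""
    name: str
    recovery_time_hours: float  # how quickly the data must be restorable
    retention_years: int        # how long the data must be kept
    encrypted: bool             # whether the data must be encrypted at rest


# Hypothetical categories and objectives (assumed values, not survey data).
LEVELS = {
    "critical": ServiceLevel("critical", 1, 7, True),
    "important": ServiceLevel("important", 8, 5, True),
    "standard": ServiceLevel("standard", 24, 3, False),
    "archive": ServiceLevel("archive", 72, 10, False),
}


@dataclass(frozen=True)
class DataSet:
    """Hypothetical attributes used to judge a data set's business value."""
    name: str
    regulated: bool
    business_critical: bool
    days_since_last_access: int


def classify(ds: DataSet) -> ServiceLevel:
    """Assign a data set to a category; rules are checked in priority order."""
    if ds.business_critical:
        return LEVELS["critical"]
    if ds.regulated:
        return LEVELS["important"]
    if ds.days_since_last_access > 365:
        return LEVELS["archive"]
    return LEVELS["standard"]
```

For example, `classify(DataSet("old_reports", False, False, 900))` falls through the first two rules and lands in the hypothetical "archive" category, which carries a relaxed recovery objective but a long retention period.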


Much of the need for classification is driven by the consolidation of previously scattered data from various sources. This stresses the issue that data varies in its importance, accessibility requirements, and value.

Almost one-third of the ILM users in our survey have identified four levels of classification, while about 27% of the population has five or more levels of classification (see figure, above). Although the ability to classify older data is important to users, it is rated much lower than security, which is ranked at the top of the classification parameters. Data classification is applied mainly to database applications, followed by e-mail applications.

About 25% of the survey population rates data classification “very” or “extremely” important, and faster retrieval is cited as the most important reason why users want to classify data.

Tiered storage

IT managers are looking for cost-effective storage resources for storing less important data to preserve the performance of their critical data, as well as to maximize the efficiency of their IT spending. Online transaction processing applications require high-performance storage resources with the highest level of reliability to ensure uptime. Environments with less-intensive workloads can use more cost-effective storage resources.

Tiered storage establishes a hierarchy of storage systems based on service requirements such as performance, business continuity, security, protection, compliance, and cost.

Adoption of tiered storage is fueled in part by the explosion of low-cost, high-capacity Serial ATA (SATA) disk arrays. These capacity-oriented systems have found a home in tiered storage architectures for secondary storage, backup, online archives, and other applications.
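One way to picture a storage hierarchy is as a lookup that picks the least expensive tier still meeting a workload's service requirements. The sketch below assumes just two requirements (latency and replication) and three tiers; the tier names, latency figures, and relative costs are hypothetical values for illustration, not figures from the survey or from any product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageTier:
    """One level of a hypothetical storage hierarchy."""
    name: str
    max_latency_ms: float  # worst-case access latency the tier delivers
    cost_per_tb: float     # relative cost (assumed units)
    replicated: bool       # whether the tier is remotely replicated


# Hypothetical tiers, ordered from most to least expensive.
TIERS = [
    StorageTier("high-performance array", 2.0, 10.0, True),
    StorageTier("SATA array", 15.0, 3.0, True),
    StorageTier("online archive", 100.0, 1.0, False),
]


def cheapest_tier(required_latency_ms: float,
                  needs_replication: bool) -> Optional[StorageTier]:
    """Return the lowest-cost tier that satisfies both requirements, or None."""
    candidates = [
        tier for tier in TIERS
        if tier.max_latency_ms <= required_latency_ms
        and (tier.replicated or not needs_replication)
    ]
    return min(candidates, key=lambda tier: tier.cost_per_tb, default=None)
```

Under these assumptions, a transaction-processing workload that needs 5 ms latency and replication is placed on the high-performance array, while a workload that tolerates 20 ms moves down to the less expensive SATA array. A requirement that no tier can meet returns `None` rather than silently choosing a tier that falls short.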

Compliance with regulations is the major reason for implementing tiered storage, and continuous data protection (CDP) is the service that users would most like to see included in a tiered storage offering, followed closely by backup, replication, and snapshots (see figure, below).

Conclusion

ILM implementation can be evaluated by analyzing how far each of its components has been completed. One major reason IT managers are reluctant to install ILM components is fear of additional management complexity. However, many applications can be serviced by ILM components without jeopardizing the corporate IT storage infrastructure. A company’s management of storage can be greatly enhanced by correctly deploying each phase of ILM.

Greater emphasis on security, compliance, and recovery performance stands out as the major reason for the increased attention given to ILM, data classification, and tiered storage.

Peripheral Concepts (www.periconcepts.com) conducted the ILM and data classification survey in November 2006 and targeted IT managers with storage management or data-protection responsibilities.

More than 4,000 IT managers completed the screening survey, which had fewer than 10 questions. More than 100 IT managers completed the full survey, which had more than 60 questions.

Farid Neema is president of the Peripheral Concepts research firm (www.periconcepts.com).
