Don't Confuse Big Data With Storage

A large part of big data management is knowing what data to analyze, what to back up and what to dump, says disaster recovery expert.

How much big data should your organization save? And how much should you back up?

Big data plays an important role in today's business world, but it's not up there with mission-critical applications that are essential to an organization's day-to-day operations. That's according to Michael de la Torre, VP of product management for SunGard Availability Services, an IT services company that provides, among other things, disaster recovery services.

Always remember that not all data has equal value, de la Torre advised. And in most cases, big data is just another business application.

"For most companies it's more of a business-critical app," de la Torre told InformationWeek in a phone interview. "It really doesn't need to be up all of the time -- but you're going to lose business opportunities if it isn't up and running."

This isn't to say, however, that big data doesn't matter. On the contrary, de la Torre sees big data as the next generation of business analytics.

"You have all this non-structured or minimally structured data. There's a lot of it. And it's coming from different sources that you would typically think are outside of the business warehouse," he said. "As such, you need new tools and techniques to get value out of that data."

And part of that decision-making process is figuring out what information needs to be saved, and what is expendable.

"Don't just save everything to save everything. That makes very little sense," said de la Torre.

For instance, social media streams -- a classic big data example of high volume, velocity and variety -- don't necessarily need to be hoarded for eternity. But other forms of big data may provide great value many years down the line.

"When you think about social (media), so much of the value of that data is that it's very time-dependent. It's very volatile, and it loses its value almost immediately," de la Torre said. "Other data such as weather, where you're doing long-term correlations, will potentially remain viable for years."

OK, so not all big data is created equal. But what's worth saving?

One solution is to store summary data from a particular time period or event, along with a small amount of anecdotal information. That's better than "saving a million logs," de la Torre advised. "Do you need the summary, or do you need all the detail?"

Obviously, the summary data method is more cost effective and easier to manage than the save-everything approach. It also works with sensor-generated information, a big data category that includes data from field equipment in remote locations.
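To make the summary-versus-detail tradeoff concrete, here is a minimal sketch in Python of how raw records might be collapsed into one summary per time period, with a couple of raw records kept as anecdotal detail. It is an illustration only, not a method de la Torre described: the record fields (`period`, `value`, `msg`), the `summarize` function and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(records, samples_per_period=2):
    """Collapse raw records into one summary per period, keeping a few raw samples."""
    # Group records by their (hypothetical) "period" field, e.g. an hour bucket.
    by_period = defaultdict(list)
    for rec in records:
        by_period[rec["period"]].append(rec)

    summaries = {}
    for period, recs in by_period.items():
        values = [r["value"] for r in recs]
        summaries[period] = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            # A handful of raw records retained as anecdotal detail.
            "samples": recs[:samples_per_period],
        }
    return summaries


# Hypothetical raw log records; a real feed would hold far more of them.
raw_logs = [
    {"period": "2013-04-01T10", "value": 120, "msg": "checkout ok"},
    {"period": "2013-04-01T10", "value": 340, "msg": "checkout slow"},
    {"period": "2013-04-01T10", "value": 95, "msg": "checkout ok"},
    {"period": "2013-04-01T11", "value": 110, "msg": "checkout ok"},
]

summary = summarize(raw_logs)
```

Only the `summary` dictionary would need long-term storage in this sketch; the raw records could then be discarded or archived more cheaply, depending on how much detail the business expects to need later.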

"Manufacturing companies figured this out a long time ago. You don't store the data from spinning equipment," said de la Torre. "You don't want to pay for the bandwidth costs. You don't need all that data."

The solution: "Put an expert system in place. And ultimately that's what big data is: an expert system that makes meaning out of data," he added.

Like many data professionals, de la Torre believes the term "big data" is mostly marketing hype. "It's advanced business analytics using new sources of data," he noted. "And somebody said, 'Hey, let's call it big data.'"

While it may not be a mission-critical app, big data can provide a lot of value to organizations. For instance, it can help companies find "interesting ways to use their proprietary data, and to create business opportunities from it," said de la Torre.

Most IT teams have their conventional databases covered in terms of security and business continuity. But as we enter the era of big data, Hadoop, and NoSQL, protection schemes need to evolve. In fact, big data could drive the next big security strategy shift.

Why should big data be more difficult to secure? In a word, variety. But the business won’t wait to use it to predict customer behavior, find correlations across disparate data sources, predict fraud or financial risk, and more.