How to Define Big Data

“Big data” is a popular term these days – it seems to pop up everywhere. But do people mean the same thing when they say those words?

In The Big Data Management Challenge, a recent report from Information Week, Michael Biddick provides a very useful description of what constitutes big data. He suggests there are four elements needed for data to qualify as “big.”

The most obvious is size. A good point of demarcation is around 30 terabytes.

Next is type. Structured data can be easy to work with even in very large amounts, whereas multiple data types (for example, structured, unstructured, plus semi-structured) can be challenging even when data sets are smaller.

One of the most challenging elements is latency. “Really big” data typically changes fast.
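As a rough illustration, the three elements described above (size, type, and latency) could be checked against a simple profile of a data set. This is only a sketch: the `DatasetProfile` fields, the `changes_fast` flag, and the `big_data_signals` function are hypothetical names invented for this example, not anything from the report, and the 30-terabyte cutoff is the article's approximate figure rather than a hard rule. The function only reports which elements are present; it does not decide whether the data is "big."

```python
from dataclasses import dataclass

# Approximate demarcation point mentioned in the article (not a hard limit).
SIZE_THRESHOLD_TB = 30


@dataclass
class DatasetProfile:
    """Hypothetical summary of a data set, for illustration only."""
    size_tb: float      # total volume in terabytes
    data_types: set     # e.g. {"structured", "unstructured"}
    changes_fast: bool  # assumed flag: does the data change rapidly?


def big_data_signals(profile: DatasetProfile) -> list:
    """Return which of the described elements the data set exhibits."""
    signals = []
    if profile.size_tb >= SIZE_THRESHOLD_TB:
        signals.append("size")
    if len(profile.data_types) > 1:
        # A mix of data types is harder to manage than one type alone.
        signals.append("type")
    if profile.changes_fast:
        signals.append("latency")
    return signals


clickstream = DatasetProfile(45, {"structured", "unstructured"}, True)
ledger = DatasetProfile(5, {"structured"}, False)
print(big_data_signals(clickstream))
print(big_data_signals(ledger))
```

A data set that shows several signals at once is a likely candidate for distinct tools and management approaches, while one that shows none can probably be handled with conventional ones.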

Recognizing big data is the first step to managing it successfully – and the second step is establishing a management strategy specifically designed for big data.

But according to the Information Week survey (drawing on 231 IT professionals from organizations with 10 terabytes or more of data), only 33% of respondents answer “yes” to this question: “Does your organization distinguish ‘data’ from ‘big data,’ using distinct tools and management approaches for higher volume, complexity and dynamic data processing?” Fifty-six percent of respondents say “no,” and 11% admit they don’t know.

Given those numbers, it’s no surprise that only 6% of the survey participants say there are “no barriers” to the successful management of big data at their organizations.

The largest share of respondents identify “budget constraints” as a major barrier, but many also cite problems of limited awareness and capability within their organizations. “Lack of knowledge of big data tool implementation” is cited by 44% of respondents, “cost and availability of training” by 41%, and “lack of expertise or experience” by 34%.

Perhaps the most interesting insight can be derived from the way the respondents rate their own understanding of big data tools and strategies.

The 231 professionals include IT directors and managers (33%), IT/IS staff (38%), and IT executives (9%), as well as a few business executives, non-IT managers, and consultants. Only 8%, however, say they have “ample knowledge” of technologies specifically designed to manage the needs of big data. Most of the survey participants – 63% – describe themselves as “somewhat familiar” with big data technologies, while 25% are “not very familiar.”

All in all, it seems there’s plenty of room in most companies for improvements in the understanding of big data and the implementation of appropriate management strategies.