Of all the myriad terms that the tech industry throws around at the moment, none is as often subverted for marketing spin as “big data”. So much so that few people can actually agree on what big data is. For me, I’ll revert to Wikipedia and its definition, which states:

Big data is a collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to “spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions.”

So, with the definition out of the way, on to the bigger question. Is big data simply a marketing term or is it something that’s actually being looked at within the enterprise? A new survey out from RainStor indicates that Big Data is indeed being taken seriously. Here’s a summary of findings.

The promise of big data and its value to the organization – 75.5% of respondents agree that managing their Big Data and making it available across the enterprise is important to improving overall business value.

Velocity and Variety of Data continue to present some of the biggest challenges – the survey reveals that the speed of data creation (velocity) and the increase in data types (variety) are main challenges, in addition to the ability to provide analytics against this data, which got 37% of respondents’ votes.

New Skills are needed – lack of relevant skills in newer technologies such as Hadoop was a prominent theme whereas standard SQL and SQL statements still appear to be the “enterprise standard” when running queries and analysis against existing data warehouses.
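To make the skills gap in that last finding a little more concrete, here is a minimal sketch of the same aggregation written two ways: as a declarative SQL query, and as explicit map, shuffle and reduce steps in the style that Hadoop jobs follow. The sample rows and function names are invented for illustration and do not come from the survey, and the second function only imitates the MapReduce pattern in plain Python rather than using Hadoop itself.

```python
import sqlite3
from collections import defaultdict
from functools import reduce

# Hypothetical sample data: (region, sale_amount) rows, invented for illustration.
ROWS = [("north", 120), ("south", 80), ("north", 30), ("east", 50), ("south", 20)]


def totals_with_sql(rows):
    """Total sales per region using a declarative SQL query (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
    conn.close()
    return result


def totals_with_map_reduce(rows):
    """The same totals written as explicit map, shuffle and reduce steps."""
    # Map: emit one (key, value) pair per input record.
    mapped = [(region, amount) for region, amount in rows]
    # Shuffle: group the emitted values by key.
    grouped = defaultdict(list)
    for region, amount in mapped:
        grouped[region].append(amount)
    # Reduce: collapse each key's values into a single total.
    return {
        region: reduce(lambda a, b: a + b, amounts)
        for region, amounts in grouped.items()
    }


if __name__ == "__main__":
    print(totals_with_sql(ROWS))
    print(totals_with_map_reduce(ROWS))
```

The SQL version states what result is wanted and leaves the how to the database, while the map and reduce version spells out each processing step, which is part of why the newer tooling asks for skills that many data warehouse teams have not yet built.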

MyPOV: It’s clear that the analysis and extraction of insight from the ever-increasing quantity of data available to an organization is, and will increasingly continue to be, critical. That said, it’s also clear that we’re at a very early stage in the process: the closest most organizations get to “big data” is running traditional data warehouse BI operations. There is, however, a convergence of infrastructure availability (powered by the cloud) alongside ever-increasing quantities of data (from social and other streams). Combine these two trends with some new ways of analyzing unstructured data and you have a space that is going to continue growing. The key to ensuring that organizations can actually use this data for positive outcomes, however, is to simplify its analysis. With the ever-increasing demand for data scientists, it will become increasingly important to automate the identification of what is, and is not, valuable data. Other than the largest enterprises, organizations cannot afford to invest in in-house data scientists, and it is for this reason that both traditional approaches to querying data and a new generation of automated data extraction tools will come to the fore in the next few years. It is worth noting that nearly 90% of respondents are still using SQL queries and SQL statements in their data warehouse environment – an indication that, for all the hype, Hadoop and MapReduce access technologies are currently only available to the most advanced of organizations.

Ben Kepes is a technology evangelist, an investor, a commentator and a business adviser. His business interests include a diverse range of industries from manufacturing to property to technology. As a technology commentator he has a broad presence both in the traditional media and extensively online. Ben covers the convergence of technology, mobile, ubiquity and agility, all enabled by the Cloud. His areas of interest extend to enterprise software, software integration, financial/accounting software, platforms and infrastructure as well as articulating technology simply for everyday users.

2 responses to “Big Data–Over Hyped Buzzword or Enterprise Focus?”

Forrester defines Big Data with 4 main attributes (volume, velocity, variety, value), but in light of the significant growth of unstructured data (80% of new data) a better fourth attribute is variability: the complex interrelationships between data managed in different applications, where the need for semantics is imperative.

Hadoop/NoSQL is usually associated with Big Data for processing large-scale distributed data, but this combination only addresses volume and (somewhat) velocity. However, the biggest challenge CIOs must address today is unifying diverse unstructured data sources, where variety and variability are the most crucial needs in deriving business value and intelligence.