Big data: An overview

Summary: Data is being generated about the activities of people and inanimate objects on a massive and increasing scale. We examine how much data is involved, how much might be useful, what tools and techniques are available to analyse it, and whether businesses are actually getting to grips with big data.

Big data vendors

If you're looking to exploit big data in your business, who are the vendors you should be considering? As might be expected, there's a great deal of activity in this area, with many startups, a few emerging 'star' companies, and established database vendors working hard to adapt to the latest developments in data management, analysis and visualisation.

Current and future 'stars' of the big-data world are likely to be found among the 'pure play' vendors who derive 100 percent of their revenue from this market. These are graphed below, along with MarkLogic, whose big data revenue Wikibon estimates to be 88 percent of its total. This established company (founded in 2001) is the leader (in revenue terms) among those that specialise in Hadoop or NoSQL solutions (highlighted in red). Also prominent in the Hadoop/NoSQL community are Cloudera, MongoDB (formerly 10gen), MapR and Hortonworks:

None of Wikibon's top four pure-play big data vendors are Hadoop/NoSQL specialists: CIA-funded Palantir initially concentrated on data mining for US intelligence and law enforcement agencies, but its software is increasingly widely used in mainstream business; fast-growing Splunk majors on searching for, capturing, indexing, analysing and visualising machine-generated data; Opera Solutions offers big data analytics as a service in a number of business sectors; and Mu Sigma integrates a variety of commercial and open-source tools and technologies into a 'decision support ecosystem', placing much emphasis on training data scientists in its own internal 'university'.

When we look at all big data vendors in Wikibon's analysis (excluding those for which hardware accounts for more than 50 percent of big data revenue), we find several classes of company heading the revenue chart: broad-portfolio tech giants (IBM, HP, Oracle, EMC); leading software houses (Teradata, SAP, Microsoft); and professional services companies (PwC, Accenture):

Also represented on the all-vendors chart are web behemoths like Amazon and Google. Big data analytics is part of these companies' internal DNA, and they have turned their expertise and infrastructure into products and services such as Elastic MapReduce and Redshift (Amazon), and BigQuery (Google).

The sheer number of companies involved in big data and the revenues being generated show that it's definitely not all hype. As ever in a developing market, we can expect plenty of future merger-and-acquisition activity as established companies cherry-pick the startups and growing firms jostle for position.

Outlook: Big, and getting Bigger

The size of the 'digital universe' is growing apace, as is the number of companies involved in developing tools and techniques for managing, analysing and visualising big data. Many companies (especially large enterprises, which by definition routinely deal with 'big' data) are already exploiting big data, but despite widespread awareness of the potential benefits, it has yet to achieve mainstream adoption.

The database world now has two camps: the internet-centric, open-source-based world of scalable distributed databases, where much of the recent big data innovation has occurred; and the enterprise-centric world of traditional, heavily siloed, relational database management systems, where much of the expertise needed to actually run businesses resides. Finding ways to get the best from both worlds, creating a new generation of 'data scientists', will be key to big data's journey from hype to the mainstream.

Big data may well spend some time in Gartner's 'Trough of Disillusionment' as the various barriers to mainstream adoption are dismantled, but there's just too much valuable data out there for it to remain there for long.

Hello, I'm the Reviews Editor at ZDNet UK. My experience with computers started at London's Imperial College, where I studied Zoology and then Environmental Technology. This was sufficiently long ago (mid-1970s) that Fortran, IBM punched-card machines and mainframes were involved, followed by green-screen terminals and eventually the pers...