The 7 Most Data-Rich Companies in the World?

Some companies really get big data. Not only do they realise size matters – they understand you also have to know what to do with it. Here’s a list of seven companies I think are at the top of the game, when it comes to cutting-edge use of data to strategically achieve business goals. If you run a business yourself and are interested in big data projects, there is something to be learned from every one of these. So in no particular order …

GE, with its fingers in every pie from finance to aviation to power, is perfectly positioned to benefit from its championing of “The internet of things”. They clearly see that IOT – the concept that every device can be networked and learn to communicate with other devices in the same way that computers do – is key to huge efficiency savings and potentially revolutionary business change.

As a result they are heavily investing in what they call the Industrial Internet – the subset of IOT dedicated to industrial devices and equipment. Aircraft engines are being fitted with arrays of sensors capable of detecting and measuring the slightest changes, meaning that fine-tuning for efficiency is possible to a higher standard than ever before. And the same is just as true with their medical equipment and power station turbines. In 2012 the company announced it was investing $1 billion into its data projects over four years.

In 2003, 50,000 IBM staff took part in online interviews where they were asked about key business issues, and the direction they thought the company should be heading in. Those interviews were fed into textual analysis software designed to pick out the most common phrases and themes, which became new company objectives.

This was forward-thinking in many ways and encapsulates the idea of a company transforming itself into a data enterprise. Those at the top had come to the conclusion that data in most fields will always trump opinion – even their opinion – and surrendered themselves to something of a destruction of the ego: “letting go” (temporarily) of the reins and seeing what direction the company would head, steered by science and statistics, rather than the possibly jaded or entrenched ideas and opinions of directors and senior managers.
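The kind of textual analysis described above, picking out the most common phrases from a pile of written answers, can be sketched in a few lines of Python. This is only an illustration of the general idea, not the software IBM used: the function name `top_phrases` and the sample answers are invented for the example, and a real tool would also handle stop words, stemming and themes.

```python
from collections import Counter
import re

def top_phrases(responses, n=2, k=3):
    """Return the k most common n-word phrases across all responses."""
    counts = Counter()
    for text in responses:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        # Slide a window of n words over each response and count each phrase.
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts.most_common(k)

# Hypothetical interview answers, invented for illustration only.
responses = [
    "We should focus on customer trust and innovation.",
    "Customer trust matters more than short-term sales.",
    "Innovation that matters depends on customer trust.",
]

print(top_phrases(responses, n=2, k=1))  # [('customer trust', 3)]
```

Even this crude count surfaces the phrase that all three answers share, which is the basic mechanism behind turning thousands of free-text answers into a shortlist of themes.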

Since then IBM has reinvented itself as a data powerhouse, at the forefront of the current boom in business-to-business data infrastructure services. It offers hardware and software for maintaining big databases, such as its DB2 database application and SPSS analytics application, among many other products and services.

It has also become an ambassador for the concept of big data, publishing several papers on how companies can exploit its potential for innovation and increased profits. Books have been written on the turbulent history of this particular tech giant, but by embracing big data with such enthusiasm, they are entering a new chapter.

Amazon not only brought big data to the masses, it made it personal – and customer service was changed forever. One of the shortfalls of online shopping for early adopters of the habit was the lack of a sales assistant or shopkeeper to explain the products and, by getting to know you, helping you find whatever it is you need to solve a particular problem in your life.

With its recommendations and reviews-based structure, Amazon introduced us to the super-powered sales assistant – equipped with a super memory retaining every customer transaction and able to offer lightning-quick, and most importantly accurate, suggestions. In fact, it’s got so good at this that according to rumor (based on patent applications) it is planning to begin predictive shipping – automatically sending out parcels of books, DVDs, videogames and gadgets based on what it thinks its users will want to pay for.
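To make the “super memory” idea concrete, here is a minimal sketch of one simple way to turn past transactions into suggestions: count which items are bought together and recommend the most frequent companions. This is an assumption for illustration, not Amazon’s actual method; the function names `build_co_purchases` and `recommend` and the sample orders are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_co_purchases(orders):
    """Count how often each pair of items appears in the same order."""
    co = defaultdict(Counter)
    for order in orders:
        # Every ordered pair (a, b) of distinct items in the same basket.
        for a, b in permutations(set(order), 2):
            co[a][b] += 1
    return co

def recommend(co, item, k=2):
    """Suggest the k items most often bought together with `item`."""
    return [other for other, _ in co[item].most_common(k)]

# Hypothetical order history, invented for illustration only.
orders = [
    ["book", "dvd"],
    ["book", "dvd", "gadget"],
    ["book", "videogame"],
    ["dvd", "gadget"],
]

co = build_co_purchases(orders)
print(recommend(co, "book", k=1))  # ['dvd']
```

The more transactions the retailer remembers, the more reliable these counts become, which is why scale matters so much for this kind of suggestion.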

Amazon is clearly not blind to the vital role data has played in its success, and has used the vast revenues (if not profits) built up by pioneering online retailing to invest in also providing data services. Much like IBM, mentioned above, it provides infrastructure to allow other businesses to capitalize on data gathering, storage and analysis enterprises. For more on Amazon, see my posts How Amazon Uses Big Data to Boost Its Performance and Amazon: Using Big Data To Read Your Mind.

Facebook has revolutionized the way we communicate with each other, from staying in touch with relatives to organizing weekend activities with friends. There was instant messaging and email before it, but Facebook invited users to build the world’s largest directory of people. It then made them all accessible to each other – depending, in theory, on privacy settings determined by each user. With 1.32 billion active users, it is still by far the world’s largest social network.

In the process, it has collected probably the biggest database of personal information in the history of the world. Its users upload 30 billion pieces of content between them every day, resulting in over 300 petabytes (300 million gigabytes) of information. It has used this information to draw in advertisers, generating $2.68 billion in advertising revenue during the last quarter.

This year the company made moves in an unexpected direction by purchasing the upcoming Oculus Rift virtual reality technology for $2 billion. Speculation says the company is looking ahead to times when we want to be able to experience greater levels of interaction with our data (or our friends, in their Facebook digitized form) than current flat screen technology allows. For more on Facebook, see my post Facebook’s Big Data: Equal Parts Exciting and Terrifying.

No list of the top big data businesses would be complete without mentioning Google, the still-undisputed king of search. Like Facebook, it turned data collection and analysis into a business model by providing a service ostensibly for free, then selling on information it gathers about us by monitoring the way we use that service.

Search is still the key service it provides. Since the early days, when its algorithms were first recognized for their superiority at matching what the user is typing with what they are looking for, they have continued to evolve – moving towards a standard of “natural language processing” which is planned to one day let us converse with computers as easily as with people.

Its activities have often caught the public imagination. From the blistering speed with which it reports (“smugly”, as one comedian described it) having trawled millions of web pages to find what you’re looking for, to the breathtaking scope of Google Earth, it has consistently provided services that people want to use – for education, business or just passing time.

Google offers a range of services – now collected at the Business Hub – to aid with promotion, and has also moved firmly into providing more heavyweight big data services to businesses. These include BigQuery – its analysis engine, and Google Cloud Storage services. For more on Google, see my article: Wow! Big Data At Google.

Less well-known than the other companies I’ve mentioned here, Cloudera has emerged in recent years as one of the most prominent suppliers of Apache Hadoop solutions. Apache Hadoop, as I’ve mentioned before, is a suite of software applications designed for running big data enterprise operations. Although it is open-source (free) in its raw state, an industry has sprung up providing companies with custom-configured systems, intended to simplify the process of data gathering and analysis. Cloudera is a leader in this field, and clearly realises the obligation it owes to the free technology on which it is built, returning a share of its profits to the voluntary foundation which maintains Hadoop. For more on Hadoop, see my article: What’s Hadoop? Here’s a Simple Explanation For Everyone.

Another newcomer – built from the ground up as a big data business, rather than a dinosaur forcing itself to evolve. Kaggle pioneered data science as competition – offering rewards for solving various challenges faced by industry.

Companies post problems they are attempting to overcome, alongside sample data sets – for example, matching movies on a streaming service with what the customer may want to watch next. Prize money is then awarded to the solution which most comprehensively trumps their existing methods.