To the experienced observer, Big Data propaganda may well appear to be a disorganised surfeit of half-truths, sleights of hand and boloney. Indeed, the once famously alliterative characterisation of Big Data as defined by volume, variety and velocity now seems more appropriately applied to the quantity, invariability and quality of the incessant self-aggrandising hype, hokum and Hadoop being astroturfed by every dog and his guru. What is more, the very fact that such an inevitable mega-trend needs so much hype, disingenuousness and spin to support its passage to universal applicability is a massive contradiction, a disservice to professionals, and an artless deception worthy of our criticism and condemnation.

The Big Data Success Story That Never Was

“Researchers have realized that Twitter updates can more quickly and more accurately predict flu outbreaks than traditional CDC tracking methods — in fact, Twitter data can predict an outbreak up to 8 days in advance with more than 90 percent accuracy.”

All fine and dandy, except that it lacks a certain degree of veracity.

However, Time, not the fastest cookie off the news block, reported nine months earlier that the whole Google Flu Trends shtick was pseudo-academic hokum. Which leads me to ask: why was Marr still peddling the same erroneous meme nine months later?

As educator Harvey V. Fineberg said, “The flu is very unpredictable when it begins and in how it takes off.” Therefore, ‘Man flu’ may be responsible for many erroneous claims and dodgy blog pieces.

Big Data Cuckoo

When is a Big Data project not a Big Data project? When it is a Data Warehouse success story that has been dressed up as a Big Data victory and taken out on the town.

What do I mean?

Oh god, I have to mention the same Big Data guru again.

Please forgive me, Bernard, but I really do have to get this off my chest, again.

If anything, it was a success story about near-real-time Data Warehousing and BI, in an implementation that leverages column-oriented and massively parallel storage and retrieval.

I posted a question in the comments section of this piece. Of course, it was easier to ignore it than to refute it. This was the question:

“Mr Marr, can you explain why this is about Big Data and not Data Warehousing (including ODS) and BI?”

Well, no surprise there then.

Big Data without Hadoop

To listen to some Big Data pundits, one may be led to believe that the only Big Data technology game in town is HDFS, Hadoop spin-offs and commodity hardware. Whilst I can see the attraction of focusing on just one collection of approaches to Big Data processing, such as that found in the Hadoop techno-sphere, I don’t think that making Hadoop the de facto ‘go to’ technology will do anyone any favours. Hadoop is not the Microsoft Office of Big Data technology, and there are other technologies around equally worthy of consideration, such as Lustre, GPFS-SNC and Isilon, to name just three.

Is Big Data all boloney?

In spite of all the nonsense written and spoken about Big Data, the ability to process mass data traffic is of value. For example, the ability to carry out statistical analysis on increasingly large datasets could potentially yield benefits in areas such as health and safety, human rights and climate change. Indeed, anything that increases our understanding of ourselves, and the world around us, is to be welcomed. That is, if it doesn’t lead us down the avenue to a dystopian data-driven nightmare. However, I believe that is not going to happen, mainly because we will place Human Rights well above anything that people with bad intentions will extrapolate from Big Data. Right?

That’s all folks!

Well, is that all there is?

Of course, there is much more to the world of data, as those who already know will know, and there are many success stories brought about with data, stretching back into the mists of time. Indeed, we don’t need to be second-sighted to think that there will be more data-based success stories in the future. However, be on guard. In order to avoid ‘tears’ we just need to be aware of the shell games, the false friends and the bad advice, and we should actively seek to avoid the snake-oil merchants, the one-trick ponies and the indentured gurus peddling useless self-interested vapourware.

Last but not least, watch this space. Big Data alchemy isn’t going away anytime soon, and neither is the desire to turn lead into gold, squeeze water from a stone or win the lottery of lotteries, three times in a row.

“Our enemies are innovative and resourceful, and so are we. They never stop thinking about new ways to harm our country and our people, and neither do we.” – George Bush (5th August 2004)

Many thanks for reading.

If you want to connect then please send a request. If you have any questions or comments then fire them off below. Cheers 🙂