I'm Managing Partner at gPress, a marketing, publishing, research and education consultancy. Previously, I held senior marketing and research management positions at NORC, DEC and EMC. Most recently, I was Senior Director, Thought Leadership Marketing at EMC, where I launched the Big Data conversation with the “How Much Information?” study (2000 with UC Berkeley) and the Digital Universe study (2007 with IDC). I blog at http://whatsthebigdata.com/ and http://infostory.com/. Twitter: @GilPress

Big Data News Roundup: From Porn to Data-ism

“We need data about big data,” says The Economist, citing Lord Kelvin “to measure is to know.” The Economist asks the question in the context of the decades-old (remember “Intellectual Capital”?) and elusive search for the value of the data companies hold. But its sister company, The Economist Intelligence Unit (EIU), recently indicated there’s value in the data when it’s used (not just accumulated and stored), finding “a clear link between financial performance and use of data.” An EIU survey of 530 senior executives, sponsored by Tableau (PDF of the report here and press release here), found that respondents were more than three times more likely than average to rate their companies as substantially ahead in financial performance when they also rate their companies as substantially ahead of their peers in their use of data.

More data on big data and its potential value comes from a TechAmerica survey, sponsored by SAP (PDF here), which found that 83% of federal IT officials estimate that the use of big data could lead to cost savings of 10% or more. Based on the size of the federal budget, that would translate into $380 billion in potential savings (see study’s summary at InformationWeek).

Another measure of “value” is what companies and government agencies are willing to spend on the promise of big data, and Wikibon provides the numbers: it estimates that the big data market will reach $18.1 billion in 2013 and will exceed $47 billion by 2017, an annual growth rate of 31%.
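For readers who want to check this kind of growth arithmetic themselves, here is a minimal sketch in Python. It assumes the two dollar figures are the endpoints of a four-year span (2013 to 2017); the function names `cagr` and `project` are made up for illustration. The 31% rate quoted above may well be computed over a different window than the one assumed here, since the text does not state the baseline year, so treat any gap between the two results as a prompt to check the forecast's baseline, not as a correction.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


# Assumption: 2013 and 2017 are the endpoints, i.e. four years of growth.
implied_rate = cagr(18.1, 47.0, 4)

# The quoted 31% rate, compounded over the same assumed four-year window.
projected_2017 = project(18.1, 0.31, 4)

print(f"Implied annual growth, 2013 to 2017: {implied_rate:.1%}")
print(f"$18.1B compounded at 31% for 4 years: ${projected_2017:.1f}B")
```

Running both directions (rate from endpoints, endpoint from rate) is a quick way to see which baseline year a published forecast is likely using.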

Not everyone in the forecasting business is of the same mind, and some are of two minds at once. Gartner analyst Debra Logan thinks big data is “currently still a solution looking for a problem,” although the firm enthusiastically predicted last year that big data will “drive $34 billion of IT spending in 2013.” Of course, not knowing “what question you are trying to answer” (the biggest big data challenge, per Logan) has never stopped anyone from spending money on cutting-edge technology.

Logan is not alone in the rapidly growing big data-skeptics camp. Brian Bergstein at MIT Technology Review writes about The Problem with Our Data Obsession, arguing that the growing volumes of data can make us value the wrong things and grow overconfident about what we know. The article is a review of Evgeny Morozov’s new book To Save Everything, Click Here: The Folly of Technological Solutionism, “solutionism” being the belief that “with enough data about many complex aspects of life we can fix problems of inefficiency.” Or any problem, period.

Nick Bilton at The New York Times also piles on with “Data Without Context Tells a Misleading Story,” citing an article in Nature about how Google’s celebrated Flu Trends (a.k.a. “predicting the present”) over-estimated the number of Americans with influenza last month. Bilton blames Google’s algorithms for “looking only at the numbers, not at the context of the search results.”

Another dig at Google’s algorithms came from none other than Ben Silbermann, Pinterest’s CEO, categorically stating that crowds of people are better than algorithms at finding content that consumers care about: “We often talk about Pinterest as like a human indexing machine. Google built these crawlers that would go out, and these amazing algorithms. We give people tools that let them organize in a way that makes sense to them, and in doing that they organize in a way that makes sense to other people. It just sort of respects our philosophy of how we want to achieve our mission, by helping people organize things. That organization is different than the approach you would take if you were only using machines.”

Vendors killed it. Well, industry leaders helped, and the media got the ball rolling, but vendors hold the most responsibility for the painful, lingering death of one of the most overhyped and poorly understood terms since the phrase ‘cloud computing.’

Any established vendor offering a storage or analytics product for a tiny or a large amount of data is now branded as big data, even if their technology is exactly the same as it was 5 years ago (thank you, marketing departments!). Startups, too, lay claim to the moniker of “big data app” or “big data startup,” eager to soak up some of the big data money floating around in big data-focused VC funds. The phrase ‘big data’ is now beyond completely meaningless.

To which the Dart-Throwing Chimp replied: “I get the sense that anyone who sounds excited about Big Data is widely seen as either a fool or a huckster. As Christopher Zorn wrote on Twitter this morning, ‘Big data is dead’ is the geek-hipster equivalent of ‘I stopped liking that band before you even heard of them.’”

But most of the big data geeks haven’t noticed that big data is dead and are busy engaging in very lively debates and new ventures. Pete Swabey at InformationAge talks to Teradata CTO Stephen Brobst who reports “shouting matches” at a recent large database conference at Stanford University between the “Hadoop guys” and the “relational database guys.” Says Brobst: “As an engineer, my view is that when you see this kind of religious zealotry on either side, both sides are wrong. A good engineer is happy to use good ideas wherever they come from.” Great “compromise-seeking” sentiment, but I’m not sure the “Hadoop guys” are terribly happy with his description of Teradata’s customers using Hadoop as “a very low cost repository to store all their data forever.” You mean, Hadoop is only a replacement for tape?

The real split (or “ideological schism,” to use Swabey’s term) in the market may be better captured by Wikibon in its new assessment of the big data database market. It estimates that the “relational database guys” (as opposed to non-relational or NoSQL databases) are responsible for 79% of the market this year, an impressive share that declines to a still quite dominant 64% in 2017. Long live tradition!

Defying tradition are the big data startups that still get, death or no death, VCs to support their big data ideas. Playnomics, which is making a science out of figuring out which players are the most valuable to a game company, has raised $5 million in a second round of funding from Vanedge Capital. Existing investors FirstMark Capital and XSeed Capital also participated in the round.

Amsterdam-based Big Data startup Elasticsearch, which operates the real-time search and data analytics open-source project of the same name, has secured $24 million in Series B funding, mostly from Index Ventures, with smaller contributions coming from SV Angel and Benchmark Capital. And Sociocast has raised $1 million in new funding from Raptor Ventures to help it expand its “next-generation” predictive analytics product.

Dead or not, there’s also no lack of ingenious new uses of big data, whether the use of the label is appropriate or not. A recent notable example is “a massive data set of 10,000 porn stars [that] has been extracted from the world’s largest database of adult films and performers.” Jon Millward has spent the last six months analyzing it to discover “the truth” about what the average performer looks like, what they do on film, and how their role has evolved over the last forty years.

And there is no lack of fresh enthusiasm about big data’s potential impact on society. The Atlantic reports in The Robot Will See You Now that “technology enthusiasts… imagine the application of data as a ‘disruptive’ force, upending health care in the same way it has upended almost every other part of the economy—changing not just how medicine is practiced but who is practicing it.”

New York University is among the many institutions of higher learning that are trying to fill the void. It has established a Center for Data Science, “the first such program in the United States” (whose director, Yann LeCun, makes the exaggerated claim that “there is no place to learn data science” right now). The Center will offer a Master’s degree in Data Science, “educating the next generation of data scientists.” This fall, NYU will also open the Center for Urban Science & Progress (CUSP), which “aims to use big data to help mitigate urban problems like noise, building efficiency, airborne pollution, parking, and where and when one is most likely to find a cab.” CUSP is expected to generate $5.5 billion in economic activity over the next 30 years and lead to 200 spinoff companies, adding about 4,600 permanent jobs in that same time frame. The first students will start classes this fall, and the center’s goal is to employ 100 Ph.D. students, 30 postdocs and 30 faculty members by 2018.

As for Brooks, he’s partially right. No, we haven’t been able to create a machine that can do what the human brain can do. At the same time, though, it’s folly to think that we can’t and don’t benefit from real-life machines.