It’s unlikely you’ll be pouring freezing water over your head for it, but the marketing world is experiencing its own Peak Oil crisis.

Yes, you read correctly: we don’t have enough data. At least not enough good data.

Pull up any marketing RSS feed and you’ll read the same story: the world is awash in golden insights, companies are able to “know” their customers in real time and predict ever more about their own markets … blah blah blah.

Here’s what you won’t read: it’s really, really hard. And it’s getting harder, for the simple reason that we are all positively drenched in … overwhelmingly bad data. Noisy, incomplete, out-of-context, approximate, downright misleading data. “Big Data” = (Mostly) Bad Data, because it tends to infer explicit behavior from implicit, noisy sources like social media or web visits.

Traditional market research methods are getting less reliable due to dropping response rates, especially among young, tech-savvy consumers. To counteract this trend, marketing research firms have hired hundreds of PhDs to refine the math in their models and try to build a better picture of the zeitgeist, leveraging social media and implicit web behavior. This has proven to be a dangerous proposition, as modeling and research firms have fallen prey to statistics’ number one rule: garbage in, garbage out.

No amount of mathematical genius can fix Bad Data, and simple statistical models on well-measured data will trump extensive algorithms on badly measured data every single time. Sophisticated statistical models might help in political polling, where people are far more predictable based on party and demographics, but they won’t do anything to help traditional marketing research, where people’s tastes and positions are less entrenched and evolve more rapidly.

Parsing the exact sentiment behind a “like”, a follow or a natural-language tweet is extremely difficult: analysts often lack control over the sample population they are covering, as well as any context about why the action occurred and what behavior or opinion triggered it. Since there is no negative sentiment to use as a control, there is no way to separate good from merely popular. Natural language processing algorithms can’t sort out sarcasm, which reigns supreme on social media, and even the best algorithms can’t reliably categorize the sentiment of more than 50% of Twitter’s volume of posts. Others have pointed out how hard it is to develop a more than razor-thin understanding of consumer mindsets and preferences from social media data.

What does a Facebook “Like” mean, exactly? If you “like” Coca-Cola on Facebook, does it mean that you like the product or the company? And does it necessarily mean you don’t like Pepsi? And what is a “like” worth? Nobody knows.

This is where we come in. We at Ranker have developed a very good answer to this issue: the “opinion graph”, which is a more precise version of the “interest graph” that advertisers are currently using.

It’s very simple: instead of the vaguely positive act of “liking” a popular actor on Facebook, Ranker visitors cast 8 million votes every month and thus directly express whether they think someone is “hot”, “cool”, one of the “best actors of all-time”, or just one of the “best action stars”. Not only that, they also vote on other lists of items seemingly unrelated to their initial interest: best cars, best beers, most annoying TV shows, etc.

As a result, Ranker has been building the world’s largest opinion graph since 2008, with 50,000 nodes (topics) and 20 million edges (statistically significant connections between two items). Thanks to our massive sample and our rich database of correlations, we can tell you that people who like “Modern Family” are 5x more likely to dine at “Chipotle” than non-fans, or that people who like the Nissan 370Z also like oddball comedy movies such as “Napoleon Dynamite” and “The Big Lebowski”, and TV shows such as “Dexter” and “Weeds”.
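To make a figure like “5x more likely” concrete, here is a minimal Python sketch of how such a ratio (often called lift) could be computed from raw opinions. Everything in it is an assumption made for illustration: the user IDs, the toy fan sets and the `lift` function are invented, and this is not a description of Ranker’s actual data or pipeline.

```python
def lift(fans_a, fans_b, all_users):
    """Return how many times more likely fans of A are to like B than non-fans of A."""
    non_fans_a = all_users - fans_a
    if not fans_a or not non_fans_a:
        raise ValueError("need at least one fan and one non-fan of A")
    # Share of A's fans who also like B.
    rate_fans = len(fans_a & fans_b) / len(fans_a)
    # Share of everyone else who likes B.
    rate_non_fans = len(non_fans_a & fans_b) / len(non_fans_a)
    if rate_non_fans == 0:
        return float("inf")  # no non-fan likes B, so the ratio is unbounded
    return rate_fans / rate_non_fans


# Hypothetical toy data: 12 users, identified by number.
all_users = set(range(1, 13))
show_fans = {1, 2, 3, 4}       # users who voted for a TV show
restaurant_fans = {1, 2, 5}    # users who voted for a restaurant

print(lift(show_fans, restaurant_fans, all_users))
```

In this toy sample, half of the show’s fans like the restaurant against one in eight of everyone else. A real analysis would also need a significance test before treating such a ratio as an edge in the graph, which this sketch leaves out.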

Our exclusive Ranker “FanScope” about the show “Mad Men” lays out this capability in more detail below:

Our opinion data is also much more precise than Facebook’s, since we not only know that someone who likes Coke is very likely to rank “Jaws” as one of his/her top movies of all time, but we’re able to differentiate between those who like to drink Coke, and those who like Coca-Cola as a company:

We’re also able to differentiate between people who always like Pepsi better than Coke overall, and those who like to drink Coke but just at the movie theater:

47% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Sodas of All Time

65% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Movie Snacks

That’s the kind of specific relationship you can’t get using Facebook data or Twitter messages.
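For readers who want to see how context-specific percentages like these could be tallied, here is a rough Python sketch. The vote records, the user IDs and the rule that a “fan” is anyone who up-voted the item on any list are all assumptions invented for this example, not Ranker’s actual schema or definitions.

```python
from collections import defaultdict

# Hypothetical vote records: (user, list_name, item, direction).
votes = [
    ("u1", "Best Sodas of All Time", "Pepsi", "up"),
    ("u2", "Best Sodas of All Time", "Pepsi", "up"),
    ("u3", "Best Sodas of All Time", "Pepsi", "up"),
    ("u4", "Best Sodas of All Time", "Pepsi", "up"),
    ("u1", "Best Sodas of All Time", "Coke", "up"),
    ("u2", "Best Sodas of All Time", "Coke", "down"),
    ("u3", "Best Sodas of All Time", "Coke", "down"),
    ("u4", "Best Sodas of All Time", "Coke", "down"),
    ("u5", "Best Sodas of All Time", "Coke", "up"),  # not a Pepsi fan, ignored
    ("u1", "Best Movie Snacks", "Coke", "up"),
    ("u2", "Best Movie Snacks", "Coke", "up"),
    ("u3", "Best Movie Snacks", "Coke", "up"),
    ("u4", "Best Movie Snacks", "Coke", "down"),
]


def approval_by_list(votes, fan_item, target_item):
    """Share of up-votes for target_item among fans of fan_item, per list."""
    # Assumption: a fan is any user who up-voted fan_item on any list.
    fans = {user for user, _, item, d in votes if item == fan_item and d == "up"}
    counts = defaultdict(lambda: [0, 0])  # list_name -> [up-votes, total votes]
    for user, list_name, item, direction in votes:
        if user in fans and item == target_item:
            counts[list_name][1] += 1
            if direction == "up":
                counts[list_name][0] += 1
    return {name: up / total for name, (up, total) in counts.items()}


shares = approval_by_list(votes, fan_item="Pepsi", target_item="Coke")
for name, share in shares.items():
    print(f"{share:.0%} of Pepsi fans vote for Coke on {name}")
```

The point of keeping the list name on every vote is that the same person and the same product can produce different answers in different contexts, which a single undifferentiated “like” cannot capture.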

By collecting millions of discrete opinions each month on thousands of diverse topics, Ranker is the only company able to combine internet-level scale (hundreds of thousands of people surveyed on millions of opinions each month) with market research-level precision (e.g. adjective-specific opinions about specific objects in a specific context).

We can poll questions that are too specific (e.g. most memorable slogans) or not lucrative enough (e.g. most annoying celebrities) for other pollsters. And we use the same types of mathematical models that all pollsters, internet-based or not, currently use to address sampling challenges, working with some of the world’s leading academics who study crowdsourcing, such as our Chief Data Scientist Ravi Iyer and UC Irvine Cognitive Sciences professor Michael Lee.

Our data suggests you won’t be dropping gallons of iced water on your face over it. But if you’re a marketer or an advertiser, we predict it’s likely you will want to pay close attention.