It’s unlikely you’ll be pouring freezing water over your head for it, but the marketing world is experiencing its own Peak Oil crisis.

Yes, you read correctly: we don’t have enough data. At least not enough good data.

Pull up any marketing RSS feed and you’ll read the same story: the world is awash in golden insights, companies are able to “know” their customers in real time and make more and better predictions about their own markets … blah blah blah.

Here’s what you won’t read: it’s really, really hard. And it’s getting harder, for the simple reason that we are all positively drenched in … overwhelmingly bad data. Noisy, incomplete, out of context, approximate, downright misleading data. “Big Data” = (Mostly) Bad Data as it tends to draw explicit behavior from implicit and noisy sources like social media or web visits.

Traditional market research methods are getting less reliable due to dropping response rates, especially among young, tech-savvy consumers. To counteract this trend, marketing research firms have hired hundreds of PhDs to refine the math in their models and try to build a better picture of the zeitgeist, leveraging social media and implicit web behavior. This has proven to be a dangerous proposition, as modeling and research firms have fallen prey to statistics’ number one rule: garbage in, garbage out.

No amount of genius mathematical skills can fix Bad Data, and simple statistical models on well measured data will trump extensive algorithms on badly measured data every single time. Sophisticated statistical models might help in political polling, where people are far more predictable based on party and demographics, but they won’t do anything to help traditional marketing research, where people’s tastes and positions are less entrenched and evolve more rapidly.

Parsing the exact sentiment behind a “like”, a follow or a natural language tweet is extremely difficult, as analysts often lack control over the sample population they are covering, as well as any context about why the action occurred, and what behavior or opinion triggered it. Since there is no negative sentiment to use as a control, there is no ability to unconfound good with popular. Natural language processing algorithms can’t sort out sarcasm, which reigns supreme on social media, and even the best algorithms can’t reliably categorize the sentiment of more than 50% of Twitter’s volume of posts. Others have pointed out the issues with developing a more than razor-thin understanding of consumer mindsets and preferences based on social media data. What does a Facebook “Like” mean, exactly? If you “like” Coca-Cola on Facebook, does it mean that you like the product or the company? And does it necessarily mean you don’t like Pepsi? And what is a “like” worth? Nobody knows.

This is where we come in. We at Ranker have developed a very good answer to this issue: the “opinion graph”, which is a more precise version of the “interest graph” that advertisers are currently using.

It’s very simple: instead of the vaguely positive act of “liking” a popular actor on Facebook, Ranker visitors cast 8 million votes every month and thus directly express whether they think someone is “hot”, “cool”, one of the “best actors of all-time”, or just one of the “best action stars”. Not only that, they also vote on other lists of items seemingly unrelated to their initial interest: best cars, best beers, most annoying TV shows, etc.

As a result, since 2008 Ranker has been building the world’s largest opinion graph, with 50,000 nodes (topics) and 20 million edges (statistically significant connections between 2 items). Thanks to our massive sample and our rich database of correlations, we can tell you that people who like “Modern Family” are 5x more likely to dine at “Chipotle” than non-fans, or that people who like the Nissan 370Z also like oddball comedy movies such as “Napoleon Dynamite” and “The Big Lebowski”, and TV shows such as “Dexter” and “Weeds”.
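To make a statement like “5x more likely” concrete, here is a minimal sketch of how such a ratio could be computed from raw positive votes. It is an illustration only, not Ranker’s actual pipeline: the `lift` function, the `(user, item)` vote format and the toy data are all assumptions, and the sketch does no significance testing.

```python
from collections import defaultdict

def lift(votes, item_a, item_b):
    """Return how many times more likely fans of item_a are to like item_b
    than users who are not fans of item_a. Returns None when the ratio is
    undefined (no fans, no non-fans, or no non-fan likes item_b)."""
    likes = defaultdict(set)
    for user, item in votes:          # each vote is a hypothetical (user, item) pair
        likes[user].add(item)

    fans = [u for u, items in likes.items() if item_a in items]
    non_fans = [u for u, items in likes.items() if item_a not in items]
    if not fans or not non_fans:
        return None

    rate_fans = sum(item_b in likes[u] for u in fans) / len(fans)
    rate_non_fans = sum(item_b in likes[u] for u in non_fans) / len(non_fans)
    if rate_non_fans == 0:
        return None
    return rate_fans / rate_non_fans

# Made-up toy data, for illustration only.
votes = [
    ("u1", "Modern Family"), ("u1", "Chipotle"),
    ("u2", "Modern Family"), ("u2", "Chipotle"),
    ("u3", "Modern Family"),
    ("u4", "Dexter"), ("u4", "Chipotle"),
    ("u5", "Dexter"),
    ("u6", "Weeds"),
]
ratio = lift(votes, "Modern Family", "Chipotle")
print(ratio)
```

In this toy data two of the three “Modern Family” fans like Chipotle, against one of the three non-fans, so the ratio is 2. A real edge in the graph would also need a check that the difference is statistically significant.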

Our exclusive Ranker “FanScope” about the show “Mad Men” lays out this capability in more detail below:

Our opinion data is also much more precise than Facebook’s, since we not only know that someone who likes Coke is very likely to rank “Jaws” as one of his/her top movies of all time, but we’re able to differentiate between those who like to drink Coke, and those who like Coca-Cola as a company:

We’re also able to differentiate between people who always like Pepsi better than Coke overall, and those who like to drink Coke but just at the movie theater:

47% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Sodas of All Time

65% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Movie Snacks

That’s the kind of specific relationship you can’t get using Facebook data or Twitter messages.

By collecting millions of discrete opinions each month on thousands of diverse topics, Ranker is the only company able to combine internet-level scale (hundreds of thousands surveyed on millions of opinions each month) with market research-level precision (e.g. adjective specific opinions about specific objects in a specific context).

We can poll questions that are too specific (e.g. most memorable slogans) or not lucrative enough (most annoying celebrities) for other pollsters. And we use the same types of mathematical models to address sampling challenges that all pollsters (internet or not internet based) currently have, working with some of the world’s leading academics who study crowdsourcing, such as our Chief Data Scientist Ravi Iyer, and UC Irvine Cognitive Sciences professor Michael Lee.

Our data suggests you won’t be dropping gallons of iced water on your face over it. But if you’re a marketer or an advertiser, we predict it’s likely you will want to pay close attention.

60+ Everyday Objects That Look Really Happy
What could be nicer than finding a smiley face at the bottom of your coffee mug? How about living in a house that looks super excited to see you every time you come home? This is a simple gallery of pics of everyday things that look like they are happy.

Essential Products For a Gamer Survival Kit
No matter what, gamers gonna game! But for a gamer to play at peak performance levels, there are certain survival essentials for quests, campaigns, and candy crushes. Behold: everything you need to survive an intense gaming sesh.

The Worst Qualities in a Person
Let’s face it: not everyone is perfect. Even the most charming people are guilty of at least a few negative personality traits. But which ones are the worst?

Being a Student Is . . .
Being a student isn’t always as easy as it sounds. Sure, there are the parties, the booze, the moving away from home and living without your parents – but there’s also a whole lot of stress, worry, studying, and exams.

The Internet Remembers Robin Williams
And finally, we lost a beloved comedy legend this month. Robin Williams was the face, voice, and talent of a lot of our childhoods and many people took to social media to share touching memories of him. We’ve rounded up the best tributes and would like to invite fans to vote on their favorite Robin Williams movies. R.I.P. Genie.

Former England international player turned broadcaster Gary Lineker famously said “Football is a simple game; 22 men chase a ball for 90 minutes and at the end, the Germans always win.” That proved true for the 2014 World Cup, with a late German goal securing a 1-0 win over Argentina.

Towards the end of March, we posted predictions for the final ordering of teams in the World Cup, based on Ranker’s re-ranks and voting data. During the tournament, we posted an update, including comparisons with predictions made by FiveThirtyEight and Betfair. With the dust settled in Brazil (and the fireworks in Berlin shelved), it is time to do a final evaluation.

Our prediction was a little different from many others, in that we tried to predict the entire final ordering of all 32 teams. This is different from sites like Betfair, which provided an ordering in terms of the predicted probability each team would be the overall winner. In order to assess our order against the true final result, we used a standard statistical measure called partial tau. It is basically an error measure (0 would be a perfect prediction, and the larger the value grows the worse the prediction) based on how many “swaps” of a predicted order need to be made to arrive at the true order. The “partial” part of partial tau allows for the fact that the final result of the tournament is not a strict ordering. While the final and 3rd place play-off determined the order of the first four teams (Germany, Argentina, the Netherlands, and Brazil), other groups of teams are effectively tied from then on. All of the teams eliminated in the quarter finals can be regarded as having finished in equal fifth place. All of the teams eliminated in the first game past the group stage finished equal sixth. And all of the 16 teams eliminated in group play finished equal last.
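For readers who want to see the idea in code, here is a rough sketch of a swap-counting error measure that tolerates ties in the true result. It is one simple reading of the description above, assumed for illustration; the exact partial tau formula we used may differ in details such as normalization. The function name and the six-team example are hypothetical.

```python
from itertools import combinations

def partial_tau(predicted, true_place):
    """Count the pairs of teams that the prediction orders the wrong way round.

    predicted:  list of teams, best first (a strict order).
    true_place: dict mapping team -> finishing place; equal places mean a tie.
    A pair that is tied in the true result can never count as an error.
    """
    errors = 0
    for ahead, behind in combinations(predicted, 2):
        # `ahead` was predicted to finish above `behind`; it is an error
        # only if `behind` actually finished strictly higher.
        if true_place[ahead] > true_place[behind]:
            errors += 1
    return errors

# Top of the 2014 result, with two quarter-final losers tied in fifth place.
true_place = {"Germany": 1, "Argentina": 2, "Netherlands": 3, "Brazil": 4,
              "Colombia": 5, "France": 5}

perfect = ["Germany", "Argentina", "Netherlands", "Brazil", "France", "Colombia"]
guess = ["Brazil", "Germany", "Argentina", "Netherlands", "Colombia", "France"]

print(partial_tau(perfect, true_place))  # no wrongly ordered pairs
print(partial_tau(guess, true_place))    # Brazil is wrongly placed above three teams
```

Note that the order of France and Colombia does not matter in either prediction, because the two teams are tied in the true result.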

The model we used to make our predictions involved three sources of information. The first was the ranks and re-ranks provided by users. The second was the up and down votes provided by users. The third was the bracket structure of the tournament itself. As we emphasized in our original post, the initial group stage structure of the World Cup provides strong constraints on where teams can and cannot finish in the final order. Thus, we were interested to test how our model predictions depended on each source of information. This led to a total of 8 separate models:

Random: Using no information, but just placing all 32 teams in a random order.

Bracket: Using no information beyond the bracket structure, placing all the teams in an order that was a possible finish, but treating each game as a coin toss.

Rank: Using just the ranking data.

Vote: Using just the voting data.

Rank+Vote: Using the ranking and voting data, but not the bracket structure.

Bracket+Vote: Using the voting data and bracket structure, but not the ranking data.

Bracket+Rank: Using the ranking data and bracket structure, but not the voting data.

Rank+Vote+Bracket: Using all of the information, as per the predictions made in our March blog post.
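As an illustration of what the Bracket model means by “a possible finish”, the sketch below plays out a knockout bracket with every game decided by a coin toss. It is a simplified assumption rather than the model itself: it ignores the group stage and the 3rd place play-off, and the function name and eight placeholder teams are hypothetical.

```python
import random

def coin_toss_knockout(teams, rng):
    """Play a single-elimination bracket, deciding every game by a coin toss.

    teams: list in bracket order (length must be a power of two); neighbours
           in the list meet in the first round.
    Returns a list of tiers, best first: [[winner], [runner-up],
    [semi-final losers], ...]. Teams in the same tier are tied.
    """
    alive = list(teams)
    eliminated_by_round = []
    while len(alive) > 1:
        winners, losers = [], []
        for i in range(0, len(alive), 2):
            first, second = alive[i], alive[i + 1]
            if rng.random() < 0.5:      # the coin toss
                winners.append(first)
                losers.append(second)
            else:
                winners.append(second)
                losers.append(first)
        eliminated_by_round.append(losers)
        alive = winners
    # The last round played produced the runner-up, so reverse the rounds.
    return [alive] + eliminated_by_round[::-1]

rng = random.Random(2014)               # seeded so the run is repeatable
teams = ["A", "B", "C", "D", "E", "F", "G", "H"]
finish = coin_toss_knockout(teams, rng)
print(finish)
```

Whatever the coin does, the two finalists always come from opposite halves of the bracket. That is the kind of structural constraint a purely random ordering of the teams ignores.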

We also considered the Betfair and FiveThirtyEight rankings, as well as the Ranker Ultimate List at the start of the tournament, as interesting (but maybe slightly unfair, given their different goals) comparisons. The partial taus for all these predictions, with those based on less information on the left, and those based on more information on the right, are shown in the graph below. Remember, lower is better.

The prediction we made using the votes, ranks, and bracket structure out-performed Betfair, FiveThirtyEight, and the Ranker Ultimate List. This is almost certainly because of the use of the bracket information. Interestingly, just using the ranking and bracket structure information, but not the votes, resulted in a slightly better prediction. It seems as if our modeling needs to improve how it benefits from using both ranking and voting data. The Rank+Vote prediction was worse than either source alone. It is also interesting to note that the Bracket information by itself is not useful (it performs almost as poorly as a random order), but it is powerful when combined with people’s opinions, as the improvement from Rank to Bracket+Rank and from Vote to Bracket+Vote shows.

50 Incredible Pictures That Just Might Teach You Something
This photo gallery includes pictures of natural phenomena, manmade things, the goings-on inside our own bodies, and tons of other cool sh*t that might even teach you a thing or two. The universe is pretty amazing. Let’s look at it together!

50+ Signs That Will Definitely Make You Giggle
These funny signs range from the whimsical, to the witty, to the downright stupid. In an age where image is everything, you would think people would be more careful with their signage.

34 Things Every Man Should Know. Seriously, Take Notes.
Dudes. Guys. MEN. When it comes to the male gender, there are certain things all guys simply must know. Whether it’s for your own personal safety or to not look like an idiot in public, take a moment to learn these things.

This is Real: Selfies at Funerals Are Officially a Thing
Of all the occasions to commemorate with a selfie, it would seem funerals aren’t exactly the most appropriate. But these selfie lovers don’t seem to mind. On many occasions, their dearly departed is even in the photo with them!

The Greatest Shower Thoughts Ever Thought
We (most of us) bathe in quiet solitude, with neither friends nor social media to entertain us lest we get our devices wet and ruin them. Amidst all that lathering and rinsing, the mind wanders, and for the duration of each shower, anything is possible.

Yesterday’s Technology With Today’s Prices
Ever wonder how much it would cost to buy a Gameboy if it were released today? Or what people were paying for the privilege of having a cell phone when they first came out? This may stop you from complaining for a while.

Who Will Win The 2014 World Cup?
It’s official: the Ranker office has World Cup fever. And so do all of you apparently! Thousands have voted on who they think will win. You may be surprised at who’s currently on top.

Like most Americans, I pay attention to soccer/football once every four years. But I think about prediction almost daily and so this year’s World Cup will be especially interesting to me as I have a dog in this fight. Specifically, UC-Irvine Professor Michael Lee put together a prediction model based on the combined wisdom of Ranker users who voted on our Who will win the 2014 World Cup list, plus the structure of the tournament itself. The methodology runs in contrast to the FiveThirtyEight model, which uses entirely different data (national team results plus the results of players who will be playing for the national team in league play) to make predictions. As such, the battle lines are clearly drawn. Will the Wisdom of Crowds outperform algorithmic analyses based on match results? Or a better way of putting it might be that this is a test of whether human beings notice things that aren’t picked up in the box scores and statistics that form the core of FiveThirtyEight’s predictions or sabermetrics.

So who will I be rooting for? Both methodologies agree that Brazil, Germany, Argentina, and Spain are the teams to beat. But the crowds believe that those four teams are relatively evenly matched while the FiveThirtyEight statistical model puts Brazil as having a 45% chance to win. After those first four, the models diverge quite a bit with the crowd picking the Netherlands, Italy, and Portugal amongst the next few (both models agree on Colombia), while the FiveThirtyEight model picks Chile, France, and Uruguay. Accordingly, I’ll be rooting for the Netherlands, Italy, and Portugal and against Chile, France, and Uruguay.

In truth, the best model would combine the signal from both methodologies, similar to how the Netflix prize was won or how baseball teams combine scout and sabermetric opinions. I’m pretty sure that Nate Silver would agree that his model would be improved by adding our data (or similar data from betting markets like Betfair that similarly thought that FiveThirtyEight was underrating Italy and Portugal) and vice versa. Still, even as I know that chance will play a big part in the outcome, I’m hoping Ranker data wins in this year’s World Cup.

We all tell white lies now and then (yes you do, don’t lie!) but did you know that men and women lie for different reasons? The data from our list of Things People Lie About All the Time shows a pattern that may hint at this difference.

The poll lists 49 common lies and asks respondents to vote “yes” if they’ve lied about that in the past 6 months or “no” if they have not. According to votes cast by over 350 people, women are more likely to lie about things that “keep the peace socially” while men are more likely to lie over matters of “self-preservation.”

On the list, women are 8 times more likely than men to lie about “being too swamped to hang out” and 4 times more likely to claim that their “phone died.” These results imply that women may be more likely to feel guilty about canceling on friends or having alone time.

In contrast, men were 2 times more likely to admit to saying things like “Oh yeah! That makes sense!” when they did not understand something and 5 times more likely to say, “No officer, I do not know why you pulled me over,” when, presumably, they did know why. These types of lies could point to men’s desire to show themselves in the best possible light and cover up wrongdoing.

Differences aside, both men and women voted similarly on many items on this list. In fact, the top 3 most popular lies were the same for both men and women.

The Top 3 Lies for BOTH Men and Women Are:

1. I’m Fine

2. I’m 5 Minutes Away

3. Yeah, I’m Listening.

Which goes to show that men and women may be able to see eye-to-eye after all… just as long as they don’t ask each other how they are doing, where they are and whether or not they are listening.

The North American market for films totaled about US$11,000 million in 2013, with over 1300 million admissions. The film industry is a big business that not even Ishtar, nor Jaws: The Revenge, nor even the 1989 Australian film “Houseboat Horror” manages to derail. (Check out Houseboat Horror next time you’re low on self-esteem, and need to be reminded there are many people in the world much less talented than you.)

Given the importance of the film industry, we were interested in using Ranker data to make predictions about box office grosses for different movies. The Ranker list dealing with the Most Anticipated 2013 Films gave us some opinions (both in the form of re-ranked lists, and up and down votes) on which to base predictions. We used the same cognitive modeling approach previously applied to make Football (Soccer) World Cup predictions, trying to combine the wisdom of the Ranker crowd.

Our basic results are shown in the figure below. The movies people had ranked are listed from the heavily anticipated Iron Man 3, Star Trek: Into Darkness, and Thor: The Dark World down to less anticipated films like Simon Killer, The Conjuring, and Alan Partridge: Alpha Papa. The voting information is shown in the middle panel, with the light bar showing the number of up-votes and the dark bar showing the number of down-votes for each movie. The ranking information is shown in the right panel, with the size of the circles showing how often each movie was placed in each ranking position by a user.

This analysis gives us an overall crowd rank order of the movies, but that is still a step away from making direct predictions about the number of dollars a movie will gross. To bridge this gap, we consulted historical data. The Box Office Mojo site provides movie gross totals for the top 100 movies each year for about the last 20 years. There is a fairly clear relationship between the ranking of a movie in a year, and the money it grosses. As the figure below shows, the few highest grossing movies return a lot more than the rest, following a “U-shaped” pattern that is often found in real-world statistics. If a movie is the 5th top grossing in a given year, for example, it grosses between about 100 and 300 million dollars. If it is the 50th highest grossing, it makes between about 10 and 80 million.

We used this historical relationship between ranking and dollars to map our predictions about ranking to predictions about dollars. The resulting predictions about the 2013 movies are shown below. These predictions are naturally uncertain, and so cover a range of possible values, for two reasons. We do not know exactly where the crowd believed each movie would finish in the ranking list, and we only know a range of possible historical grossed dollars for each rank. Our predictions acknowledge both of those sources of uncertainty, and the blue bars in the figure below show the region in which we predicted it was 95% likely the final outcome would lie. To assess our predictions, we looked up the answers (again at Box Office Mojo), and overlaid them as red crosses.
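The mapping from rank to dollars can be sketched in a few lines. This is a deliberately crude stand-in for the cognitive model, assumed for illustration: it pools the historical grosses for every rank a movie might plausibly take and reads off the central range. The function name and the figures in `history` (in millions of dollars) are made up, chosen only to be consistent with the “about 100 to 300 million” range quoted above for 5th place.

```python
def interval_for_ranks(history, rank_low, rank_high, coverage=0.95):
    """Return a (low, high) gross range for a movie expected to finish
    somewhere between rank_low and rank_high (inclusive).

    history: dict mapping yearly rank -> list of grosses seen at that rank.
    Both sources of uncertainty are covered: the spread of plausible ranks
    and the spread of historical grosses at each rank.
    """
    pooled = sorted(
        gross
        for rank in range(rank_low, rank_high + 1)
        for gross in history[rank]
    )
    # Trim an equal share from each tail; with small samples nothing is
    # trimmed and the range is simply the minimum and maximum.
    tail = (1 - coverage) / 2
    low_index = int(tail * (len(pooled) - 1))
    high_index = len(pooled) - 1 - low_index
    return pooled[low_index], pooled[high_index]

# Hypothetical historical grosses in millions of dollars, for illustration only.
history = {
    4: [310, 290, 260],
    5: [300, 180, 100],
    6: [240, 150, 95],
}

print(interval_for_ranks(history, 5, 5))  # crowd is sure the movie finishes 5th
print(interval_for_ranks(history, 4, 6))  # crowd is unsure between 4th and 6th
```

The second range is wider than the first, which is the point: the less sure the crowd is about a movie’s rank, the wider its predicted range of dollars.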

Many of our predictions are good, for both high grossing (Iron Man 3, Star Trek) and more modest grossing (Percy Jackson, Hansel and Gretel) movies. Forecasting social behavior, though, is very difficult, and we missed a few high grossing movies (Gravity) and over-estimated some relative flops (47 Ronin, Kick Ass 2). One interesting finding came from contrasting an analysis based on ranking and voting data with similar analyses based on just ranking or just voting. Combining both sorts of data led to more accurate predictions than using either alone.

We’re repeating this analysis for 2014, waiting for user re-ranks and votes for the Most Anticipated Films of 2014. The X-Men and Hunger Games franchises are currently favored, but we’d love to incorporate your opinion. Just don’t up-vote Houseboat Horror.

The 13 Craziest Deaths Caused by Social Media
Everyone is guilty of doing their fair share of internet sleuthing, stalking, creeping, and… killing? Well, at least these 13 people are. Read about the weird, effed up things people are doing through social media these days.

Men and women were equally likely to vote on items on this list (each gender averaged six votes per user), but women were twice as likely as men to be affected by scenes of violence toward women, such as Viserys’ lewd treatment of his sister Daenerys, or The Red Wedding, which included the stabbing of a pregnant woman. In contrast, men were made most uncomfortable by hints of homosexuality (Loras and Renly shaving each other’s chests), being seven times more likely to find this scene uncomfortable. These patterns converge with research on mirror neurons, which indicates that people are most likely to be made uncomfortable by situations that threaten their self-identity, as well as with accounts of women being driven to stop watching the show by the prevalence of depictions of violence against women.

None of these findings come from carefully controlled trials, so these patterns could have many explanations. However, all research methods have flaws, and I would argue that it is the convergence of real world behavior with academic research that leads to true understanding. Given Ranker’s new emphasis on Game of Thrones related content (like our Ranker of Thrones Facebook page if you’re a fan), more analyses of the repeated moral ambiguity in Game of Thrones are forthcoming and I would welcome new hypotheses to test. What would you expect men and women to agree or disagree on? Older vs. younger fans? West Coasters vs. East Coasters?