Ignoring the Wisdom of Crowds

In 2007 Michael Mauboussin presented a big jar of jelly beans to his seventy-three Columbia Business School students. How many beans did they think it contained?

Guesses ranged from 250 to 4,100; the actual number was 1,116. The average error was 700 — a massive 62% — demonstrating that the students were awful estimators.

Now here comes the weird part. Even with all these wildly incorrect guesses, the average guess was 1,151 — just 3% off the mark. Not only that, only 2 of the 73 students guessed better than this group average.

So although individually everyone was woefully inaccurate, collectively the group was incredibly accurate.
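As a quick check of the arithmetic above, here is a minimal sketch that recomputes both percentages from the figures quoted in this post. The variable and function names are my own, chosen for illustration.

```python
# Figures quoted in the post for the 2007 jelly bean experiment.
ACTUAL_BEANS = 1116
AVERAGE_INDIVIDUAL_ERROR = 700   # average size of a single student's miss
GROUP_AVERAGE_GUESS = 1151       # average of all the guesses


def percent_of_actual(amount, actual):
    """Express an error as a percentage of the true count."""
    return 100 * amount / actual


typical_miss = percent_of_actual(AVERAGE_INDIVIDUAL_ERROR, ACTUAL_BEANS)
group_miss = percent_of_actual(abs(GROUP_AVERAGE_GUESS - ACTUAL_BEANS), ACTUAL_BEANS)

print(f"typical individual miss: {typical_miss:.1f}%")
print(f"group average miss:      {group_miss:.1f}%")
```

The individual figure lands between 62% and 63%, and the group figure just above 3%, in line with the numbers quoted.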

Was this a fluke? Hardly. The experiment was made famous in 1987 by Jack Treynor. In his case it was 850 jelly beans and 56 students. The group estimate was 2.5% off; only one student guessed better. The study has been repeated many times since with similar results.

This eerie effect goes beyond jelly beans; it’s also a big help when you’re trying to make money on TV.

The best multiple-choice test, ever
A contestant on the game show Who Wants to be a Millionaire can win a million dollars if she correctly answers fifteen consecutive multiple-choice questions. If she’s stumped along the way she has three “life-lines”: (1) eliminate two of the four choices, (2) telephone a friend, or (3) poll the audience. The jelly bean experiments imply that this third choice might be pretty good. Is there as much wisdom in the crowd for pop culture and science as there is in counting jelly beans?

The TV studio audience predicts the correct answer an astonishing 91% of the time. Remember, these are questions from all domains of knowledge, all ranges of difficulty, polling a group of people who happened to spend the (weekday) afternoon in a TV studio.

To quantify how amazing that is, compare with the accuracy of the “phone a friend” life-line where the contestant gets 30 seconds with a pre-determined person. This accomplice is probably considered to be “the smartest person I know,” plus undoubtedly has access to that web of lies, Google and Wikipedia.

The intelligent friend with broadband access to the entirety of human knowledge gets it right only 65% of the time.

Crowd wins again.

Is the rule universal?
There’s seemingly no end to studies like these, all showing that the crowd is smarter than the individual. Is this a universal rule? Should we be leveraging this power more often?

Big companies do use crowd wisdom. You always hear about advertising campaigns being honed by focus groups of “real people.” (I’d like to see the questionnaire that distinguishes “real people” from that elusive other kind of person.)

However, company messaging, product features, advertising layouts, and the other creative aspects of business require innovation, and we know that design-by-committee is the antithesis of innovation. Average products designed for the average consumer are the path to small business failure.

So what should we do? Can we rely on the wisdom of the collective or should we trust a stroke of inspiration?

Analysis of how “crowd wisdom” works
Let’s take another look at Who Wants to be a Millionaire.

Suppose there are 100 people in the audience and only 16 of them know that “A” is the correct answer. Of the rest, none knows the answer and they vote randomly, so the other 84 votes split evenly at 21 per choice. The result of the vote will be 37 for A and 21 each for B, C, and D.

Oh gee, it’s awfully similar to the graphic earlier of a real audience poll.

(For those of you so inclined, it’s fun to try more complex scenarios, although you’ll find the result is always similar. For instance, what if only 11 know the answer is A, three groups of 15 each know that one of B, C, or D is certainly not the answer (and vote randomly for the other three), and the remaining 44 have no clue and vote randomly? In this scenario, the vote distribution is the same as in the given example!)
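If you'd rather check both scenarios than take my word for it, here is a minimal sketch. It computes expected vote counts only, assuming each unsure voter picks uniformly among the choices they haven't ruled out; a real audience would scatter around these numbers. The function name and the group encoding are my own, for illustration.

```python
from fractions import Fraction

CHOICES = "ABCD"


def expected_votes(groups):
    """Expected vote totals per choice.

    groups is a list of (head_count, allowed_choices) pairs. Each voter in a
    group is assumed to pick uniformly at random among its allowed choices.
    """
    totals = {choice: Fraction(0) for choice in CHOICES}
    for head_count, allowed in groups:
        share = Fraction(head_count, len(allowed))
        for choice in allowed:
            totals[choice] += share
    return totals


# Scenario 1: 16 people know the answer is A, the other 84 guess blindly.
simple = expected_votes([(16, "A"), (84, "ABCD")])

# Scenario 2: 11 know it's A; three groups of 15 can each rule out one wrong
# answer (B, C, or D respectively); the remaining 44 guess blindly.
layered = expected_votes([
    (11, "A"),
    (15, "ACD"),   # knows B is wrong
    (15, "ABD"),   # knows C is wrong
    (15, "ABC"),   # knows D is wrong
    (44, "ABCD"),
])

for name, totals in (("simple", simple), ("layered", layered)):
    print(name, {choice: int(votes) for choice, votes in totals.items()})
```

Both scenarios come out to 37 expected votes for A and 21 for each of the others. Swap in your own groups to try other mixes of partial knowledge.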

So we have the interesting result that a mere 16% of the voters were able to make choice A the clear winner — nearly double the next closest answer. The reason? The ignorant people vote randomly and their votes cancel out, leaving the rest in control of the result.

The crowd vetoes innovation
Now that we understand how crowds can be right, let’s see why this same process doesn’t work for creative endeavors.

Eventually you realize there’s only one way to please everyone: Cook something bland, mild, and safe, like chicken and rice. But does chicken and rice actually please anyone? Not really, it was just what everyone hated the least.

Of course if you’re a catering company for weddings, chicken and rice might be the way to go! After all, no one goes to weddings for the food, so your primary goal is to piss off as few of the 300 guests as possible. Come to think of it, chicken and rice does seem to be popular at those sorts of functions…

But this isn’t a strategy for startups. Little companies need a niche — a market space they can completely, unquestionably own, not some gray middle-ground where your attempt to offend no one also means exciting no one.

There is “wisdom in the crowd” only when errors cancel out, like when estimating jelly beans or answering pop culture questions. In creative work, votes eliminate the interesting edges, leaving only the boring residue that no one hated enough to vote off the island.

That’s not how great products are made.

What do you think? When should you follow consensus and when should you run off and trust your gut? Join the conversation and leave a comment!

Further Reading

James Surowiecki’s The Wisdom of Crowds, with more stories and implications for Wall Street. He also writes a terrific financial column and blog for The New Yorker.

Scott Page’s The Difference, explaining how diversity makes a group smarter. The inspiration for my Who Wants to be a Millionaire example.

Thanks Kathy! Nice point about why this means you can’t use conventional wisdom to set your business plan.

Of course some new ideas really are bad, and for reasons "obvious" to those with experience. I guess the problem is that no one knows which is which!

http://delightfulwork.com Tom Volkar / Delightful Work

This made me think about the wisdom shared by Earl Nightingale, considered to be the founder of the audio self-improvement industry. Earl always said, "Look around and see what everyone else is doing and do exactly the opposite."

Your title got me to read this post and the jelly bean studies are fascinating. But I’ve always bucked the crowd and I see no reason why not to now after reading.

However I may be able to use this info when facilitating teams. It could have some interesting implications.

Jason

@Tom — I love that quote. If nothing else, at least you’ll be unique! Uniqueness can get you a long way.

Ratliff

I think group mind and its various manifestations work well when the objective is, well, objective. In other words, the only reason we can look at a graph of how well a group answers a question is because we know the correct answer. But when you’re trying out new ideas, there is no "right" answer, and therefore the group consensus is really just one more (big) opinion.

Furthermore, even if you’re trying to get a sense of people’s preferences, humans aren’t very good at predicting what will make them happy (cf. Daniel Gilbert’s work). I’d be willing to bet that prior to the introduction of the iPhone, a significant number of people would have vetoed the idea of messaging and texting without a physical QWERTY keyboard. But once they’ve experienced the interface, they modify their view. Innovation can’t completely discount people’s past experiences, but it has to look past conventional wisdom about what people do and don’t want.

(Further evidence of the idiocy of people, or at least this person: I actually paused to try and remember the correct spelling of "QWERTY" just now.)

http://www.spiritualpreneurs.com/ Sharon Wilson

Ingenious observation. This goes to show that conventional thinking is not the key to successful business. Doesn’t this just translate to being able to "think outside the box"?

Jason

@Ratliff — I like your connection to group-mind… as in when group-mind goes wrong! Although group-estimation and group-mind are of course separate things. It’s true that knowing what’s "right" is important; I would go further and say the "average" works only when there is one correct answer, and when the task is "zero in on the answer" rather than "come up with new ideas."

@Sharon — I agree it’s akin to "think outside the box," but perhaps this is a more specific way to do so. That is, it means that groups by definition don’t think outside the box; that individual thought is needed for that.

But is that completely true? Brainstorming is a good way to generate new ideas and of course works best in a group. Perhaps the right idea is that brainstorming generates ideas but flashes of insight identify great ideas? I’m not sure…

http://mp3amore.com/ musicisall

The crowds can be right or wrong, but it’s all about you, I guess… musicisall

http://www.marketingprofs.com Allen Weiss

Thanks for the interesting post. I imagine this all depends on context and specifics. When making judgments about mathematical outcomes (as in the experiment), it makes sense that the crowd could converge on the number. Trying to extend this to other contexts, however, seems incredibly risky. When people discount, say, small focus groups, rather than the crowd, they are forgetting that a) the focus group might have been biased when it was originally chosen, b) the questions might have been asked poorly, etc.

It seems people like an easy way out of thinking hard and clearly about things, and are looking for an easy solution, like just letting the crowd decide. It’s easier, for sure, but often a lot more risky.

http://johnleach.co.uk John Leach

I can’t find details of the Mauboussin jelly bean experiment, but I’d expect he actually used the median of the guesses, not the average.

Using the average gives each "voter" too much power over the final result. The median gives each voter one vote, as it should be.

This same mistake was made in the Wisdom of Crowds book when describing Francis Galton and the ox weighing.
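The mean-versus-median point in this comment can be illustrated with a minimal sketch. The guesses below are hypothetical numbers invented for the illustration; they are not data from the Mauboussin, Treynor, or Galton experiments.

```python
from statistics import mean, median

# Hypothetical guesses, made up purely for illustration.
sensible_guesses = [900, 1000, 1100, 1200, 1300]

# Same crowd, except the last person guesses wildly high.
with_one_wild_guess = [900, 1000, 1100, 1200, 13000]

for label, guesses in (
    ("sensible crowd", sensible_guesses),
    ("one wild guess", with_one_wild_guess),
):
    print(f"{label}: mean={mean(guesses)}, median={median(guesses)}")
```

A single extreme guess drags the mean from 1100 to 3440, while the median stays at 1100. That is the sense in which the mean gives one voter outsized power over the result.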

http://www.aboogy.com Jure

C’mon. You have to have everything at your wedding. It’s a wedding, a happy day, and you want to share that happiness with your friends and relatives. You don’t cut costs there. I have read about an experiment with 500 students. They were standing in a few rows, and on the screen at the front there were three lines. Line B was obviously shorter and anybody could see it. The first half of the students were told to say that line B was the longest one, and they answered first, saying out loud into the microphone: "Line B." The other students then said that line B was the longest just because they had heard lots of people saying that line B was the longest. You can’t believe it. People don’t trust their eyes but the crowd. Great post.

g_face

This is quite interesting, however I feel that the Who Wants to be a Millionaire comparison is misleading. The audience knows the answer because they are usually only asked when there is a populist question to which the contestant happens not to know the answer. I.e. He doesn’t follow soap operas, but he’s asked the name of someone – clearly he’s going to ask the audience. Such tactics occur episode after episode. Stats from Ask The Audience are simply not applicable to this, otherwise well founded, observation.

Jason

@John — I took the Mauboussin study from multiple sources in researching this post. All specifically said "average," however I also admit that none of those pointed out that the ox-weighing was median; I didn’t know that either (and yes I’m familiar with that work).

So you may have a point! I think the thrust of the argument — that marketing and creative work generally ought to be the product of "what’s best for me" rather than taking a vote — is still valid. But thanks very much for pointing out a potential factual error.

If you can confirm it, please let me know so I can correct it.

@Jure — ha! Of course you’re right about weddings; just an example I know about because my wife is a chef. :-)

@g_face — this question has actually been measured statistically. I’m sure you’re correct that the "ask" lifeline is used more often when it’s a "popular knowledge" question, and I do agree with you that this sullies the statistics. It would be nice to get the same stats from "non-popular" questions and see if it remains.