Tuesday, February 5, 2013

Big Data vs. Intuition–Why Coexistence Is Important

David Brooks, writing in the New York Times (2.5.13), reflects on the growing power of rational, objective analysis in the age of big data, and on how more and more assumptions based on intuition or subjective observation are being debunked. He begins with one of the most famous examples of the data vs. intuition problem, one whose answer relies not on big data but on basic statistics.

For example, every person who plays basketball and nearly every person who watches it believes that players go through hot streaks, when they are in the groove, and cold streaks, when they are just not feeling it.

But researchers have found that a player who has made six consecutive foul shots has the same chance of making his seventh as if he had missed the previous six foul shots.

When a player has hit six shots in a row, we imagine that he has tapped into some elevated performance groove. In fact, it’s just random statistical noise, like having a coin flip come up tails repeatedly. Each individual shot’s success rate will still devolve back to the player’s career shooting.
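The statistical claim here is easy to check for oneself. What follows is only a minimal sketch, not the researchers' method: it assumes a hypothetical shooter who makes 75 percent of his foul shots and whose every shot is independent of the last (the function name, the 75 percent figure, and the sample size are all invented for illustration). It shows what pure chance looks like; it cannot tell us whether real players actually behave this way.

```python
import random


def simulate_streaks(p=0.75, n_shots=1_000_000, streak=6, seed=0):
    """Simulate independent shots and measure accuracy right after a streak.

    p, n_shots, streak and seed are illustrative assumptions, not data
    from any real player.
    """
    rng = random.Random(seed)
    run = 0                    # current number of consecutive makes
    total_makes = 0
    after_streak_attempts = 0  # shots taken right after `streak` or more makes
    after_streak_makes = 0

    for _ in range(n_shots):
        made = rng.random() < p
        if run >= streak:
            after_streak_attempts += 1
            after_streak_makes += made
        total_makes += made
        run = run + 1 if made else 0

    overall_rate = total_makes / n_shots
    after_streak_rate = after_streak_makes / after_streak_attempts
    return overall_rate, after_streak_rate, after_streak_attempts


if __name__ == "__main__":
    overall, after, attempts = simulate_streaks()
    print(f"overall accuracy:            {overall:.3f}")
    print(f"accuracy after 6 in a row:   {after:.3f} ({attempts} attempts)")
```

Under these assumptions the two percentages come out nearly identical, even though the simulated player produces plenty of six-shot "hot streaks" along the way.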

Relying on new data available concerning school performance, researchers have found that there is no basis whatsoever for the prevailing theory of multiple intelligences – that is, that children learn in different ways given their particular cognitive abilities. Some, the theory goes, will learn better pictorially, others aurally, etc. When subjected to hard measurement of performance for thousands of students, there was no statistical difference whatsoever between classical learning and alternative learning.

This counterintuitive logic applies to just about every other area of inquiry. Although millions of dollars were spent on campaign ads during this last election, there was no significant correlation between dollars spent and votes influenced.

Researchers have also discovered patterns which we had not yet noticed. For example, a study was done on the theory that egocentric people use the personal pronouns ‘I’, ‘me’, and ‘mine’ more than better-adjusted people do. After careful, extensive, and patient inquiry, the researchers found no such correlation. In fact, the opposite was true:

But as James Pennebaker of the University of Texas notes in his book, “The Secret Life of Pronouns,” when people are feeling confident, they are focused on the task at hand, not on themselves. High status, confident people use fewer “I” words, not more.

Pennebaker analyzed the Nixon tapes. Nixon used few “I” words early in his presidency, but used many more after the Watergate scandal ravaged his self-confidence. Rudy Giuliani used few “I” words through his mayoralty, but used many more later, during the two weeks when his cancer was diagnosed and his marriage dissolved. Barack Obama, a self-confident person, uses fewer “I” words than any other modern president.

What are we to make of all this? Are intuition and subjective judgment out the window? In the movie Moneyball, a grizzled baseball scout, a veteran of decades of viewing young prospects and judging physical ability, character, desire, will, resilience, reaction to adversity, etc. is dismissed by Billy Beane. “Getting on base is all that matters”, he says, backing his conviction up with statistical data.

There is no doubt that Beane was right, for when subjected to big data analysis, time and time again reaching first base is the ultimate key to runs, and runs mean wins. At the same time, the scout was also right, for the new player recruited only for his numbers might turn out, after a few seasons or a few months, to have a nasty streak, a malevolence that quickly infects team cohesion and morale. He might produce runs, but team performance over time might suffer and the number of runs might well diminish. While it is true that Billy Beane’s analysis was correct – when analyzed collectively on a large scale, teams that have a high on-base percentage score more runs – it also misses the micro-scale implications of adhering absolutely to numbers.

While it is true that the laws of probability apply to basketball players’ shooting percentages – i.e. that in the long run any one player’s accuracy will always revert to his historical average – there is no doubt that streaks do exist. Players in all sports talk about being in a zone where everything but scoring is tuned out. When in a zone they say they can’t miss. They see a line drawn from putter to hole, or an arc from hand to basket. Their concentration, focus, and perceptive abilities are perfectly attuned to the situation. Golfers can read the green with uncanny accuracy; baseball players can see the rotation of the ball as clearly at 100 mph as at 60. Is there any doubt that if a basketball player is on such a streak or in such a zone, his teammates will look to pass to him first? Would any baseball manager with a player on a hitting streak treat him the same as a mediocre hitter who always hits for the same average? If a child is encouraged to learn in a way that is more congenial to him, then although his number-based performance may not improve, his attitude towards school and learning might.

In other words both intuition and big data analysis can and should co-exist. Moreover, intuition drives big data. Powerful algorithms designed to study a particular phenomenon are not products of spontaneous generation. Someone had to suspect something going on that would be worth looking into. While the ego-pronoun analysis turned up nothing in the way of significant correlation, it was certainly worth looking into, and maybe researchers picked up some other clues to linguistic-personality behavior which can be subjected to rigorous analysis later.

Finally, no amount of statistical correlation or objective analysis will ever convince the millions of conspiracy theorists who, despite the fact that copious data show that Obama was born in Hawaii, believe that he is an alien. Policy makers and programmers routinely dismiss these fringe groups and their irrational claims; but their very presence is sobering, for it suggests limits on objectivity and numbers-based analysis. Changing the minds of doomsday believers or even those who stubbornly resist ‘the plain facts’ will require a very different tack – an approach which somehow enables entry into this irrational world and effects change using the tools found there.

My beloved aunt, who had advanced Alzheimer’s, died recently at 93. She made no sense whatsoever and claimed that the Pope had visited her in her small Connecticut convalescent home, that the facility staged male strip shows for the female residents, and that an impenetrable jungle existed just beyond the parking lot. I gave up early on trying to convince her that her visions were unreal and fictitious, and decided to enter her world. Once I did, I could discuss what the Pope was wearing, whether or not she had seen any wildebeests in the parking lot, and if she got turned on by the strippers. It was a crazy world in which we travelled for the hour or so of my visits, but I found my aunt there, unchanged, and just as funny and engaging as she had always been. There is more than one plane of reality.

No discussion of big data and intuition can be complete without discussing religion. Although more and more Americans profess non-belief, the United States is overwhelmingly religious, taking on faith Biblical stories which to an outsider seem to stretch the limits of plausibility far beyond my aunt’s demented visions. Yet we don’t challenge these intuitive people and demand objective proof from them. Even the great Christian logicians Augustine and Thomas Aquinas, who were the founders of the very logical basis of Catholic belief, had to stop at a point and admit that it all was, after all, a matter of faith (intuition).