All posts tagged statistics

One standard for describing a finding as “statistically significant” — that is to say, unlikely to be the result of chance — is something called the “p-value.” A p-value of .05, for instance, a threshold often used in the social sciences, means that if you assume the null hypothesis is true — that is, that there is no real effect — the probability of getting results at least as extreme as those observed, purely by chance, is under 5%.
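To make that definition concrete, here is a minimal sketch in Python — my own illustration, not anything from the papers discussed below. It runs thousands of two-group experiments in which the null hypothesis is true by construction (both groups are drawn from the same distribution) and counts how often a t-test nonetheless comes out “significant” at the .05 level. The answer hovers around 5%, just as the definition promises:

```python
import math
import random
import statistics

random.seed(0)

def null_experiment_p(n=30):
    """One experiment in which the null is true by construction:
    both groups are drawn from the same normal distribution."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    # Two-sided p-value via the normal approximation (adequate for n = 30)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

pvals = [null_experiment_p() for _ in range(10_000)]
frac = sum(p < 0.05 for p in pvals) / len(pvals)
print(f"share of 'significant' results with no real effect: {frac:.3f}")
```

This is also why the excess of p-values just below .05 described next is suspicious: with honest reporting, p-values under a true null should spread out roughly evenly, not pile up at the threshold.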

A new paper in the Quarterly Journal of Experimental Psychology finds a suspicious number of papers in “three highly regarded [psychology] journals” reporting p-values that just meet the threshold of .05:

p values were much more common immediately below .05 than would be expected based on the number of p values occurring in other ranges. This prevalence of p values just below the arbitrary criterion for significance was observed in all three journals. We discuss potential sources of this pattern, including publication bias and researcher degrees of freedom.

To be clear, the suggestion is not fraud but something more akin to “finding what you really want to find.”

Uri Simonsohn, the researcher who flagged up questionable data in studies by social psychologist Dirk Smeesters, has revealed the name of a second social psychologist whose data he believes to be suspiciously perfect.

The professor, Yong reports, resigned his position at the University of Michigan in May. Such incidents, Simonsohn insists, should be seen not as a sign of weakness in social psychology, but of strength: “We in psychology are actually trying to fix things,” he tells Nature. “It would be ironic if that led to the perception that we are less credible than other sciences are. My hope is that five years from now, other sciences will look to psychology as an example of proper reporting of scientific research.”

The Michigan professor’s research fit a pattern in that it was counterintuitive and media-friendly: He purported to demonstrate, among other things, “that people behave more altruistically if they are physically elevated, for example by riding an ascending escalator.”

Last fall, when I wrote in the Chronicle of Higher Education about a Dutch psychologist who was found to have been committing research fraud, and noted that a growing number of psychologists were raising questions about some of the statistical techniques used by their peers (techniques that, to be clear, allegedly led to findings driven by wishful thinking, not outright fraud), one response from commenters was: Why pick on psychology?

Indeed, statistical problems bedevil many fields. One noted biostatistician has suggested that as many as half of all published findings in biomedicine are false. Given large, complex datasets, and freedom to choose the parameters of a given study, there is a lot of room for error to creep in when you really want to find a result.

But now, as BPS Research Digest reports, in a useful summary of the case, yet another Dutch social psychologist, Dirk Smeesters, has been found guilty by his university of dubious “data selection” and failing to maintain adequate research records. And so the focus is on social psychology once again. Questions about Smeesters’ work were originally raised by Uri Simonsohn, a Wharton School psychologist who was — not coincidentally — co-author of a key recent paper outlining methodological problems in his field, “False-Positive Psychology.” That paper demonstrated how easy it was to produce manifestly false but “statistically significant” results. I described it in that November article (and it was summarized, earlier and more fully, in a post by Wray Herbert):

In “False-Positive Psychology,” the authors

“demonstrated” that listening to the Beatles’ “When I’m 64” made the test subjects literally younger, relative to when they listened to a control song. Crucially, the study followed all the rules for reporting on an experimental study. What the researchers omitted, as they went on to explain in the rest of the paper, was just how many variables they poked and prodded before sheer chance threw up a headline-making result — a clearly false headline-making result.

The authors of that provocative paper, who also included Joseph P. Simmons, of the University of Pennsylvania, and Leif D. Nelson of the University of California at Berkeley, wrote: “Many of us” — and they included themselves — “end up yielding to the pressure to do whatever is justifiable to compile a set of studies that we can publish. This is driven not by a willingness to deceive but by the self-serving interpretation of ambiguity.” …

A new study underscores that people will pay for “transparently useless advice” about chance events.

So just how gullible are we, when it comes to “expert” advice?

Economists have been telling people for ages that most stock-picking tips — even from people who make millions for their “prowess” — are useless: After a mutual-fund manager beats the market for a couple of years, investors will beat a path to his or her door. But such past performance almost never predicts future performance, as investors eventually discover — not, however, before they have shifted their money to the latest fund manager on a hot streak.

Two researchers took it upon themselves to demonstrate, once and for all, how willing people are to pay “for what can only be described as transparently useless advice.” They gave nearly 400 people, in Bangkok and Singapore, tokens representing money, and invited them to bet on five random coin flips.

Before each flip, each participant got the chance to pay to read a written prediction of what the result of the next toss would be. If they didn’t purchase the “tip,” they still got to read the prediction, afterward, free of charge.

Obviously, the chance of the “expert” advice being right, any one time, is — wait for it — 50%. Still, when the advice was correct after the first coin toss, roughly 12% of players paid for advice in the second round (compared with roughly 4% if the prediction was wrong). Two correct predictions in a row inspired about 20% of players to buy a prediction for round three. The proportion of people buying advice continued to climb until, after four correct predictions, more than 40% of players were paying for advice for round five.

As Tim Harford notes, the odds of four random predictions correctly guessing four coin flips “is a not-exactly-stunning one in 16” — yet people were clearly persuaded they were getting hot tips.
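Harford’s one-in-16 is just (1/2)^4: each of four fair coin flips matches a random guess with probability one half. A quick simulation — mine, not the researchers’ — confirms it:

```python
import random

random.seed(1)

TRIALS = 100_000
four_in_a_row = 0
for _ in range(TRIALS):
    flips = [random.random() < 0.5 for _ in range(4)]    # four fair coin tosses
    guesses = [random.random() < 0.5 for _ in range(4)]  # four random "predictions"
    if flips == guesses:                                 # all four predictions correct
        four_in_a_row += 1

print(f"simulated: {four_in_a_row / TRIALS:.4f}")
print(f"exact:     {0.5 ** 4:.4f}")  # 1/16 = 0.0625
```

So out of every 16 purveyors of random guesses, one will look like a four-for-four oracle by luck alone — exactly the sort of “expert” the participants were paying for by round five.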

A study conducted at the University of Zurich and published in the Annals of Epidemiology suggests that people are more likely to die on their birthdays than on other days. Or, as one of the study’s authors puts it, drily, “Birthdays end lethally more frequently than might be expected.”

[The researchers analyzed] data from more than two million people over the age of 40. [The study] showed that people are 18.6 percent more likely to die of a heart attack on their birthdays, while the risk of stroke shoots up by 21.5 percent. There was also a 34.9 percent rise in suicides and a 44 percent rise in deaths from falls on birthdays.

It’s a truism of enthusiast car magazines that traffic tickets have little effect on road safety — that towns that issue many tickets are mostly interested in raising money, and that most people drive prudently given the conditions (if not always wholly legally). A newly published study suggests that towns may well have income in mind when orders trickle down to cops to hand out more tickets, but that the effects on safety of such policy decisions are significant:

Portrait of Franz Anton Mesmer, who discovered “animal magnetism,” or mesmerism, a form of ESP. By W. Pallissot.

So here’s a profile I’ve been waiting for — of Daryl Bem, the Cornell psychologist who published an article, in 2010, purporting to demonstrate the validity of some forms of psychic foresight.

It turns out that Bem was a mentalist in high school, fully aware that his craft relied on making the audience think that his deductions about them, “based on tangible clues,” were the result of magic. Later, as a professor at Carnegie Mellon, Stanford, and then Cornell, he used to perform a version of the act on the last day of the course:

“It was a pure swindle,” Bem says … At the end of the show, he always revealed the truth. The experiment had nothing to do with ESP; the point was to demonstrate that the students shouldn’t always trust their intuition.

One day a University of Pittsburgh researcher who was studying psi — an umbrella term for telepathy, precognition, and other seemingly paranormal phenomena — came to see Bem perform. Amused as he was by the show, he thought Bem was being unfair to psi. “He said, ‘Daryl, I don’t think you know the psi literature,’” Bem recalls. “So he sent me a bunch of stuff to read.” Bem was skeptical but fascinated.

Statistically savvy commentators are applauding this summary of the latest U.S. jobs report, by the Economist’s Ryan Avent, who takes things like “confidence intervals” seriously:

Today, the Bureau of Labour Statistics released its March employment report. There is a 90% chance that employment rose by between 20,000 and 220,000 jobs. The change in the number of unemployed from February to March was probably between (roughly) -400,000 and 150,000, and there’s a good chance that the unemployment rate is between 8.1% and 8.5%. Reported changes for important subsectors are too small relative to the margin of error to be worth discussing….
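Avent’s figures are easy to reconstruct, under the standard assumption that the payroll estimate has a roughly normal sampling distribution (the numbers below are my own back-of-envelope reading of his interval, not BLS code): a 90% interval running from 20,000 to 220,000 implies a point estimate of 120,000 jobs with a margin of error of 100,000, which corresponds to about 1.645 standard errors.

```python
# Reconstructing the 90% confidence interval behind Avent's numbers,
# assuming a normal sampling distribution for the payroll estimate.
Z_90 = 1.645               # z-score leaving 5% in each tail of a normal curve

point_estimate = 120_000   # midpoint of the 20,000-220,000 range
margin = 100_000           # half-width of the 90% interval
std_error = margin / Z_90  # implied standard error of the estimate

low, high = point_estimate - margin, point_estimate + margin
print(f"90% CI: {low:,} to {high:,} jobs (implied standard error ~{std_error:,.0f})")
```

An implied standard error of roughly 60,000 jobs is why Avent calls month-to-month subsector changes too small, relative to the margin of error, to be worth discussing.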

Researchers at UCLA have combined neuroscience and math in order to identify patterns in the acts of a notorious Russian serial killer, executed in 1994. The model makes several assumptions: 1) a certain threshold of neuronal firing has to be passed, at which point the desire to kill becomes impossible to ignore; 2) some time is subsequently required in order to plan and carry out a killing; 3) killing acts as a sort of sedative, calming the neurons temporarily.

When they compared the pattern of killings predicted by such a model to the murders carried out by Andrei Chikatilo, who confessed to 56 killings, according to this post, the fit was striking. There were several corollaries to the findings. For example,

the likelihood of another killing is much higher soon after a murder than it is after a long period has passed.

That’s a well-known property of power-law distributions, one that holds true for all kinds of phenomena. A large earthquake, for example, is more likely soon after another large earthquake.
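The earthquake analogy can be made concrete with a toy simulation — my own illustration, using Pareto-distributed waiting times with a shape parameter chosen arbitrarily (alpha = 1.5), not anything from the UCLA model. With heavy-tailed gaps between events, the probability of another event arriving soon is high just after the last one and falls as the quiet spell stretches on:

```python
import random

random.seed(2)

ALPHA = 1.5  # power-law exponent -- an arbitrary illustrative choice

def pareto_wait(alpha=ALPHA, xmin=1.0):
    """Heavy-tailed (power-law) waiting time between events."""
    u = 1.0 - random.random()  # uniform in (0, 1], avoids division by zero
    return xmin * u ** (-1.0 / alpha)

waits = [pareto_wait() for _ in range(200_000)]

def chance_of_event_soon(t, window=1.0):
    """Estimate P(next event within `window` | already waited time t)."""
    survived = [w for w in waits if w > t]
    return sum(w <= t + window for w in survived) / len(survived)

print(f"just after the last event: {chance_of_event_soon(1.0):.3f}")
print(f"after a long quiet spell:  {chance_of_event_soon(10.0):.3f}")
```

Contrast this with, say, exponential waiting times, which are memoryless: there, the chance of an event in the next unit of time is the same no matter how long you have waited, and no clustering appears.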

Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a team that tried, but failed, to replicate those results. Here, he tells the Chronicle of Higher Education’s Tom Bartlett about the difficulties he’s had getting the results published.

Several journals told the team they wouldn’t publish a study that did no more than disprove a previous study. (This is an example of what researchers call “publication bias” — positive results get published, negative ones don’t. Dog-bites-man findings also face an uphill climb, and few findings could be less surprising than that we can’t see the future.) An editor at another journal said he’d

only accept our paper if we ran a fourth experiment where we got a believer [in ESP] to run all the participants, to control for … experimenter effects.

Biographies

Gary Rosen is the editor of Review and the former managing editor of Commentary magazine. His articles and reviews have appeared in the Wall Street Journal, New York Times, Washington Post, and Los Angeles Times. He is the author of "American Compact: James Madison and the Problem of Founding" and the editor of "The Right War? The Conservative Debate on Iraq."