04 August 2017

A detailed discussion of p values

An article in Vox will be of interest primarily to readers who have had a manuscript rejected (or have reviewed and rejected one) because a crucial p value was >0.05.

Most casual readers of scientific research know that for
results to be declared “statistically significant,” they need to pass a
simple test. The answer to this test is called a p-value. And if your
p-value is less than .05 — bingo, you got yourself a statistically
significant result.

Now a group of 72 prominent statisticians, psychologists,
economists, sociologists, political scientists, biomedical researchers,
and others want to disrupt the status quo. A forthcoming paper in the journal Nature Human Behavior argues that results should only be deemed “statistically significant” if they pass a higher threshold.

“We propose a change to P < 0.005,” the authors write.
“This simple step would immediately improve the reproducibility of
scientific research in many fields.”...

The proposal has critics. One of them is Daniel Lakens, a
psychologist at Eindhoven University of Technology in the Netherlands
who is currently organizing a rebuttal paper with dozens of authors. Mainly, he says the significance proposal might work to stifle scientific progress.

How many statisticians does it take to ensure at least a 50 percent
chance of a disagreement about p-values? According to a tongue-in-cheek
assessment by statistician George Cobb of Mount Holyoke College,
the answer is two … or one. So it’s no surprise that when the American
Statistical Association gathered 26 experts to develop a consensus
statement on statistical significance and p-values, the discussion
quickly became heated.

It may sound crazy to get indignant over a scientific term that few
lay people have even heard of, but the consequences matter. The misuse
of the p-value can drive bad science (there was no disagreement over
that), and the consensus project was spurred by a growing worry that in
some scientific fields, p-values have become a litmus test for deciding
which studies are worthy of publication. As a result, research that
produces p-values that surpass an arbitrary threshold are more likely to
be published, while studies with greater or equal scientific importance
may remain in the file drawer, unseen by the scientific community.

4 comments:

This is a good article; thanks for sharing. As the article says, this change would not solve any of the fundamental problems with using p-values to evaluate scientific merit. The American Statistical Association recently released a statement on p-values (available at http://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108?scroll=top&needAccess=true#_i27). Here are a few relevant points from it:

"Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold."

"A p-value, or statistical significance, does not measure the size of an effect or the importance of a result."

"By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis."

Changing the cutoff from 0.05 to 0.005 would only serve to emphasize the importance of obtaining a small p-value even more; we should move away from this mindset and towards estimates of effect size.
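To make the effect-size point concrete, here is a minimal sketch in Python. It assumes a simple two-group comparison with equal group sizes and a known standard deviation (a z-test), and the function names and the numbers plugged in are purely illustrative, not taken from the article or the ASA statement. It shows how a trivially small effect can produce a tiny p-value in a huge sample, while a moderate effect can miss the 0.05 cutoff in a small one.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_for_two_groups(effect_size_d: float, n_per_group: int) -> float:
    """z statistic for a difference in means between two equal-sized groups.

    effect_size_d is the difference in standard-deviation units
    (assumption: the standard deviation is known and equal in both groups).
    """
    return effect_size_d * math.sqrt(n_per_group / 2)


def p_for_effect(effect_size_d: float, n_per_group: int) -> float:
    """p-value obtained when the observed effect equals effect_size_d."""
    return two_sided_p_from_z(z_for_two_groups(effect_size_d, n_per_group))


# Hypothetical scenarios: a negligible effect in a very large sample,
# and a moderate effect in a small sample.
tiny_effect_big_sample = p_for_effect(0.02, 100_000)
moderate_effect_small_sample = p_for_effect(0.5, 20)

print(f"d = 0.02, n = 100000 per group: p = {tiny_effect_big_sample:.2g}")
print(f"d = 0.50, n = 20 per group:     p = {moderate_effect_small_sample:.2g}")
```

Under these assumptions the negligible effect clears even the proposed 0.005 threshold, and the moderate one does not clear 0.05, which is why the p-value on its own says little about how large or important an effect is.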

The other downside of this mindset is that it places a lot of social science outside the bounds of "legitimacy." Available samples for hard-to-reach populations are necessarily smaller than those for large survey samples, making p-values lower than .05 more difficult to obtain.

For example, I've been doing gang research. It'd be really difficult, and extremely expensive, to find a representative (i.e. non-convenience) sample of 1000 gang members just to meet a stricter p-value threshold. Much of social science falls into this trap, because we work with underrepresented populations.

Being held to that stricter p-value standard would undermine the legitimacy of our findings, on grounds that reflect a misunderstanding of what p-values mean.
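For a rough sense of what a stricter cutoff costs in sample size, here is a small Python sketch using the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_alpha + z_power) / d)^2. The effect size (d = 0.5), the 80% power target, and the function name are assumptions chosen for illustration only; real studies of hard-to-reach populations would need their own design-specific calculation.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-group z-test.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is the assumed true effect in standard-deviation units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the cutoff
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Hypothetical medium-sized effect, compared under both thresholds.
n_at_05 = n_per_group(0.5, alpha=0.05)
n_at_005 = n_per_group(0.5, alpha=0.005)

print(f"alpha = 0.05:  {n_at_05} per group")
print(f"alpha = 0.005: {n_at_005} per group")
print(f"ratio: {n_at_005 / n_at_05:.2f}")
```

With these illustrative inputs the stricter threshold calls for roughly 70 percent more participants per group to keep the same power, which is a real burden when every additional participant is hard to find.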

"Tai-wiki-widbee" is an eclectic mix of trivialities, ephemera, curiosities, and exotica with a smattering of current events, social commentary, science, history, English language and literature, videos, and humor. We try to be the cyberequivalent of a Victorian cabinet of curiosities.
