Sunday, March 21, 2010

Is all statistical analysis wrong?

It's another article along the well-worn lines of p-values being misinterpreted (as the probability of the null hypothesis being true), which I have blogged about ad nauseam (it's not that narrow, though, and talks about multiple comparisons too). It also extensively quotes Steve Goodman, and you can find me doing the same here. Therefore, it should hardly be a surprise to hear that I basically agree with the article, although the message is perhaps a bit overstated and hyperbolic in some places: "in practice, widespread misuse of statistical methods makes science more like a crapshoot." For some other more nuanced and generally well-informed views on it, see here (and the links in the comments there too).

One point that the opponents of p-values sometimes make is to claim that confidence intervals solve the problem that p-values have (by providing an interval within which the unknown parameter probably lies). However, this is not actually true: confidence intervals are built from the same frequentist machinery and are equally open to misinterpretation, since a 95% confidence interval is not an interval that contains the true parameter with 95% probability. As it happens, I've recently written an article which touches on this point, which can be found here.
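One way to see that a p-value is not the probability of the null hypothesis is a quick simulation. This is a minimal sketch in plain Python, not anything from the article itself; the 50% base rate of true nulls, the effect size and the sample size are all illustrative assumptions. The point it makes is the standard one: among "just significant" results, the null turns out to be true far more often than 5% of the time.

```python
import math, random

random.seed(1)

def p_value(sample, sigma=1.0):
    # Two-sided z-test p-value for H0: mu = 0, known sigma.
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

n_exp, n_obs = 20000, 30
borderline = borderline_null = 0
for _ in range(n_exp):
    null_true = random.random() < 0.5          # assume 50% of tested effects are truly zero
    mu = 0.0 if null_true else 0.5             # illustrative effect size when the null is false
    sample = [random.gauss(mu, 1.0) for _ in range(n_obs)]
    p = p_value(sample)
    if 0.03 < p < 0.05:                        # "just significant" results
        borderline += 1
        borderline_null += null_true

print(borderline_null / borderline)  # far above 0.05
```

So conditioning on a result that scraped under p = 0.05 does not give a 5% chance that the null is true; here the fraction is several times larger, and it depends on the base rate and effect size, neither of which the p-value knows anything about.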

Having said that, I still use p-values when it suits me. I do try to make sure I don't misrepresent what they mean though...

David, that is the full link (right-click to copy). I don't think it is on the Wiley site.

Jesús, thanks for the link. I see that the angle people take on it depends to some extent on where they are coming from :-) Clearly it doesn't mean that all (or even most) science is bogus, and in fact in practice the p-value is a useful guide. However, it shouldn't be overinterpreted.

(This can probably be generalised, that all "clever" methods and rules can be misused and don't necessarily provide a hotline to the truth.)

So when the denialists (willfully?) misinterpreted Phil Jones' comments to the BBC on warming since 1995, does this mean that a rational critique of what he said would be that he should have used Bayesian reasoning instead?

Well, I suppose you could say that, though perhaps it's more a case of the futility of using such a small set of data: it was clear at the outset that it would be insufficient to draw any meaningful conclusions. An intelligent frequentist or Bayesian analysis could demonstrate this point equally well, but that wouldn't prevent someone from misrepresenting either!
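The small-sample point can be sketched with a quick simulation (plain Python; the trend, noise level, record length and significance threshold are all illustrative assumptions, and real temperature series are autocorrelated, which would only make matters worse): even when a warming trend is genuinely present, a 15-year record fails to reach significance a substantial fraction of the time, so "not significant" on its own tells us very little.

```python
import math, random

random.seed(42)

def slope_and_se(y):
    # Ordinary least squares slope of y against time index, with its standard error.
    n = len(y)
    t = list(range(n))
    tbar, ybar = sum(t) / n, sum(y) / n
    sxx = sum((ti - tbar) ** 2 for ti in t)
    b = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / sxx
    resid = [yi - ybar - b * (ti - tbar) for ti, yi in zip(t, y)]
    se = math.sqrt(sum(r * r for r in resid) / (n - 2) / sxx)
    return b, se

true_trend = 0.012   # assumed warming of 0.012 C/yr (illustrative, not a real estimate)
noise_sd = 0.1       # assumed interannual variability of 0.1 C (illustrative)
years = 15           # roughly the 1995-2009 interval in question

n_sim = 5000
significant = 0
for _ in range(n_sim):
    y = [true_trend * t + random.gauss(0, noise_sd) for t in range(years)]
    b, se = slope_and_se(y)
    if b / se > 1.771:  # one-sided 5% threshold for t with 13 degrees of freedom
        significant += 1

print(significant / n_sim)  # well below 1: a real trend is often missed
```

In other words, the test has low power at this record length, so failing to reject the null is exactly what one should expect a fair fraction of the time even if the underlying trend is real.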