

Tuesday, November 20, 2012

Three Things the P-Value Can't Tell You about Your Hypothesis Test

Statistics can be confusing, especially when you look under the hood at
the mathematical engines that underlie it. That's why we use statistical software to do so much of the work for us, and why we use tools like p-values to help us make sense of what our data are saying.

The p-value is used in basic statistics, linear models, reliability,
multivariate analysis, and many other methods. It's a concept every
introductory statistics student and every Lean Six Sigma Green Belt
learns at the start. But it's frequently misinterpreted.

Andrew Gelman, director of the Applied Statistics Center at Columbia University, wrote a blog post
that contains (amidst other interesting discussion) a good explanation
of what a p-value is and, probably even more important, is NOT:

"A p-value is the probability of seeing something as extreme as was observed, if the model were true."

In hypothesis testing, when your p-value is less than the alpha level you
selected (typically 0.05), you reject the null hypothesis in favor of
the alternative hypothesis.

Let's say we do a 2-sample t-test to
assess the difference between the mean strength of steel from two mills.
The null hypothesis says the two means are equal; the alternative
hypothesis states that they are not equal.

If we get a p-value
of 0.02 and we're using 0.05 as our alpha level, we would reject the
hypothesis that the population means are equal.
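
The decision rule can be tried out in a few lines of Python. The sketch below is only an illustration under stated assumptions: the strength readings for the two mills are made up, and it swaps the 2-sample t-test for a permutation test so that it runs on the standard library alone. The function name permutation_p_value is hypothetical. The permutation test follows the definition quoted above quite literally: it shuffles the mill labels many times (which is what "the null hypothesis is true" looks like) and counts how often the shuffled difference in means is at least as large as the observed one.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    Hypothetical helper: estimates how often a difference at least as
    large as the observed one appears when group labels are assigned
    at random, i.e. when the null hypothesis of equal means holds.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Made-up strength readings for illustration only (not from the article).
mill_a = [502, 498, 510, 505, 497, 503, 508, 500]
mill_b = [495, 499, 492, 501, 494, 497, 490, 496]

alpha = 0.05
p = permutation_p_value(mill_a, mill_b)
print(f"p-value = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

With real data you would more likely let your statistical software run the 2-sample t-test. The comparison of the p-value against alpha works the same way either way.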

But here are three things we can't say based on the p-value:

"There is a 2% probability that no difference exists, and a 98% probability that it does." In
fact, the p-value only says that IF the null hypothesis were true, we
would see a difference as large as or larger than this one only 2% of the
time. If this seems confusing, just keep in mind that the p-value
is not the probability that the null hypothesis is true. It is the probability of observing data at least this extreme, calculated on the assumption that the null hypothesis is true.

"Since we have a low p-value, this difference is important." A
p-value can tell you that a difference is statistically significant,
but it tells you nothing about the size or magnitude of the difference.

"The p-value is low, so the alternative hypothesis is true." A
low p-value can give us statistical evidence to support rejecting the
null hypothesis, but it does not prove that the alternative hypothesis
is true. If you use an alpha level of 0.05, then whenever the null
hypothesis really is true, there's a 5% chance you will incorrectly reject it.
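
The second and third points can be checked with a small simulation. This is a rough sketch, not a recipe: every number in it is hypothetical, and it uses a z-test with a known standard deviation (the helper z_test_p_value is made up for this example) instead of a t-test, so that only the Python standard library is needed. The first part compares two very large samples whose true means differ by a trivial amount. The second part runs many experiments in which the null hypothesis is true and counts how often the test rejects it anyway.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_test_p_value(a, b, sigma):
    """Two-sided p-value for equal means, assuming a known common sigma."""
    z = (fmean(a) - fmean(b)) / (sigma * sqrt(1 / len(a) + 1 / len(b)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(42)
alpha = 0.05
sigma = 10.0  # assumed known standard deviation (hypothetical)

# Point 2: a tiny true difference (0.5 units against a spread of 10)
# becomes "statistically significant" once the samples are large enough.
n_big = 50_000
a = [rng.gauss(500.0, sigma) for _ in range(n_big)]
b = [rng.gauss(500.5, sigma) for _ in range(n_big)]
p_big = z_test_p_value(a, b, sigma)
observed_diff = fmean(b) - fmean(a)
print(f"observed difference = {observed_diff:.2f}, p-value = {p_big:.2g}")

# Point 3: when the null hypothesis is true (both means are 500),
# roughly alpha of all experiments still reject it.
n_experiments = 2_000
false_positives = 0
for _ in range(n_experiments):
    x = [rng.gauss(500.0, sigma) for _ in range(20)]
    y = [rng.gauss(500.0, sigma) for _ in range(20)]
    if z_test_p_value(x, y, sigma) < alpha:
        false_positives += 1
rate = false_positives / n_experiments
print(f"false positive rate = {rate:.3f}")
```

The first result shows why a low p-value should be read together with the size of the difference: whether half a unit of strength matters is an engineering question, not a statistical one. The second shows that a rejection at alpha = 0.05 is evidence, not proof, because about one in twenty true null hypotheses gets rejected by chance.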

Does this mean that quality practitioners and others shouldn't use
p-values? Of course not. The p-value is a very useful tool! We just need
to be careful about how we interpret the p-value, and particularly careful
about how we explain its significance to others.