Probabilities and P-Values

P-values have been an issue for statisticians for an extremely long time. I’ve seen situations where entire narratives are written without p-values, providing only the effects. The p-value can serve as a data reduction tool, but ultimately it reduces the world into a binary system: yes/no, accept/reject. Not only that, but the binary threshold is set by a roughly subjective criterion. It reminds me of Jon Stewart’s recent “good thing/bad thing” discussion: taking the most complex of issues and reducing them down to two outcomes.

Below is a simple graph (nothing elaborate) that shows how p-values alone don’t tell the whole story. Sometimes data is reduced so much that solid decisions are difficult to make. The graph on the left shows a simulated situation with identical p-values but very different effects. The graph on the right shows a situation where the p-values are very different (one quite low) but the simulated effects are the same.

P-values and confidence intervals have quite a bit in common, and both can be misleading when interpreted incorrectly. Simply put, a p-value is the probability of the observed data, or more extreme data, given that the null hypothesis is true [1, 2, 3; see also Moore’s book The Basic Practice of Statistics, 2nd ed., pp. 321–322].
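That definition can be checked numerically. As a minimal sketch (my own example, in Python rather than the R hinted at later in the post), a one-sample t-test’s p-value matches the tail probability of the t distribution computed by hand:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=0.5, scale=1.0, size=30)  # sample drawn with a true mean of 0.5

# One-sample t-test of H0: true mean == 0
t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)

# By the definition above, the two-sided p-value is the probability, under H0,
# of a test statistic at least as extreme as the one observed:
p_manual = 2 * stats.t.sf(abs(t_stat), df=len(x) - 1)

print(p_value, p_manual)  # the two values agree
```

The two printed values agree because `ttest_1samp` computes exactly this tail probability; nothing here is specific to the original post’s data.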

Ultimately, I could write a fairly lengthy discussion on the topic, and several others (e.g. Gelman) already have. However, for this blog post I’ll highlight one idea: with p-values it’s possible to get different effects with the same p-value, and different p-values with the same effect. In the example below I opted for extreme data to highlight the point. Here’s a quick matplot of the example…

In conclusion, p-values are useful, but we sometimes need to look beyond the test statistic and p-value to understand the data. Below is some code to simulate some extreme datasets.
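The post’s original simulation code isn’t reproduced here; as a hedged stand-in (one-sample t-tests computed from summary statistics, function name my own, Python instead of R), both phenomena can be demonstrated directly:

```python
import numpy as np
from scipy import stats

def one_sample_p(effect, sd, n):
    """Two-sided p-value for a one-sample t-test, from summary statistics."""
    t = effect / (sd / np.sqrt(n))
    return 2 * stats.t.sf(abs(t), df=n - 1)

# Similar p-values, very different effects: a tiny effect with a huge sample
# can produce roughly the same p-value as a large effect with a tiny sample.
p_small_effect = one_sample_p(effect=0.02, sd=1.0, n=10_000)
p_large_effect = one_sample_p(effect=1.30, sd=1.0, n=5)
print(p_small_effect, p_large_effect)  # both near 0.045

# Same effect, very different p-values: the effect is identical; only n changes.
p_n10 = one_sample_p(effect=0.5, sd=1.0, n=10)
p_n1000 = one_sample_p(effect=0.5, sd=1.0, n=1_000)
print(p_n10, p_n1000)  # the first is nonsignificant, the second is tiny
```

The specific effect sizes and sample sizes are arbitrary choices for illustration; the point is that the p-value conflates effect size and sample size, so it cannot be read as a measure of effect on its own.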