How the chi-square goodness of fit test works

The null hypothesis is that the observed data are sampled from a population with the expected frequencies. The chi-square test combines the discrepancies between the observed and expected values into a single statistic.

How the calculations work:

1. For each category, compute the difference between the observed and expected counts.

2. Square that difference and divide by the expected count.

3. Add the values for all categories. In other words, compute the sum of (O-E)²/E, where O is the observed count and E is the expected count in each category.

4. Use a computer program to calculate the P value from the chi-square distribution. The number of degrees of freedom equals the number of categories minus 1.
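The four steps above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not how Prism computes the result: the function names and the die-roll counts are hypothetical, and the P value uses a closed-form series for the chi-square upper tail that holds only for whole-number degrees of freedom.

```python
import math

def chi_square_statistic(observed, expected):
    """Steps 1-3: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p_value(statistic, df):
    """Step 4: upper-tail probability of the chi-square distribution.

    Assumes df is a positive whole number. Starts from the exact tail for
    df = 1 (odd df) or df = 2 (even df) and adds one term per extra 2 df.
    """
    half = statistic / 2.0
    if df % 2 == 0:
        a = 1.0
        tail = math.exp(-half)              # exact tail for df = 2
    else:
        a = 0.5
        tail = math.erfc(math.sqrt(half))   # exact tail for df = 1
    while a < df / 2.0:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return tail

# Hypothetical data: 60 rolls of a die, with 10 of each face expected.
observed = [8, 12, 9, 11, 6, 14]
expected = [10, 10, 10, 10, 10, 10]

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1                      # categories minus 1
p_value = chi_square_p_value(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, P = {p_value:.3f}")
```

For these made-up counts, chi-square is 4.2 with 5 degrees of freedom and the P value is roughly 0.52, so the data give no reason to doubt the expected frequencies.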

The P value answers this question:

Assuming the theory that generated the expected values is correct, what is the probability of observing such a large discrepancy (or larger) between observed and expected values?

A small P value is evidence that the data are not sampled from the distribution you expected.

Yates' correction

When there are only two categories, some statisticians recommend using Yates' correction, which subtracts 0.5 from the absolute difference between each observed and expected count before squaring. This reduces the value of chi-square and so increases the P value. With large sample sizes, the correction makes little difference; with small samples, it matters more. Statisticians disagree about when to use Yates' correction, and Prism does not apply it.
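To see how much the correction changes the result, here is a small sketch that computes chi-square for two categories with and without it. The function names and the counts are hypothetical. The correction is written in its usual form, subtracting 0.5 from each absolute difference before squaring, and the P value uses the fact that with 1 degree of freedom the chi-square tail equals erfc(sqrt(chi-square / 2)).

```python
import math

def chi_square_two_categories(observed, expected, yates=False):
    """Chi-square for two categories, optionally with Yates' correction."""
    shift = 0.5 if yates else 0.0
    # max(0, ...) keeps a tiny discrepancy from growing after the shift.
    return sum(max(0.0, abs(o - e) - shift) ** 2 / e
               for o, e in zip(observed, expected))

def p_value_one_df(statistic):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical small sample: 20 observations, 10 expected in each category.
observed = [15, 5]
expected = [10, 10]

plain = chi_square_two_categories(observed, expected)
corrected = chi_square_two_categories(observed, expected, yates=True)
print(f"uncorrected: chi-square = {plain:.2f}, P = {p_value_one_df(plain):.4f}")
print(f"corrected:   chi-square = {corrected:.2f}, P = {p_value_one_df(corrected):.4f}")
```

With these counts the statistic drops from 5.0 to 4.05, and the P value rises accordingly.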

With only two categories, it is better to use the binomial test, which gives an exact P value. Either form of the chi-square calculation (with or without Yates' correction) is only an approximation.
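For comparison, an exact binomial P value can be computed directly from the binomial probabilities. The sketch below is one common way to define the two-sided P value (sum the probabilities of every outcome that is no more likely than the observed one). It is an assumption that this definition suits your purpose: other two-sided definitions exist, and this is not presented as the method Prism uses. The function name and counts are hypothetical.

```python
import math

def binomial_two_sided_p(successes, n, p_expected):
    """Exact two-sided binomial P value.

    Sums the probability of every outcome that is no more likely than
    the observed count under the expected proportion.
    """
    def pmf(k):
        return math.comb(n, k) * p_expected ** k * (1 - p_expected) ** (n - k)

    p_observed = pmf(successes)
    # Small relative tolerance so ties are not lost to rounding error.
    total = sum(pmf(k) for k in range(n + 1)
                if pmf(k) <= p_observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical data: 15 of 20 observations in the first category,
# with half expected in each category.
p_exact = binomial_two_sided_p(15, 20, 0.5)
print(f"exact binomial P = {p_exact:.4f}")
```

For these counts the exact P value is about 0.041, which falls between the uncorrected and Yates-corrected chi-square P values for the same data.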