Exercise

Goodness of Fit Test

The null hypothesis in a goodness of fit test is a list of specific parameter values for each proportion. In your analysis, the equivalent hypothesis is that Benford's Law applies to the distribution of first digits of total vote counts at the city level. You could write this as:

$$
H_0: p_1 = .30, p_2 = .18, \ldots, p_9 = .05
$$

where \(p_1\) is the height of the first bar in the Benford's Law bar chart. The alternative hypothesis is that at least one of these proportions is different; that is, the first digit distribution doesn't follow Benford's Law.
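The rounded values in the null hypothesis can be reproduced from the standard Benford's Law formula, \(p_d = \log_{10}(1 + 1/d)\). The sketch below is only illustrative: the formula is not stated in the exercise text, and the name p_benford_sketch is hypothetical, chosen so it does not clash with the p_benford vector the exercise provides.

```r
# Benford's Law: P(first digit = d) = log10(1 + 1/d) for d = 1, ..., 9.
digits <- 1:9

# Hypothetical name; the exercise supplies its own vector, p_benford.
p_benford_sketch <- log10(1 + 1 / digits)
names(p_benford_sketch) <- digits

# Rounded to two decimals, the first, second, and ninth entries match
# the .30, .18, and .05 shown in the null hypothesis.
round(p_benford_sketch, 2)

# A valid distribution must sum to 1.
sum(p_benford_sketch)
```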

In this exercise, you'll use simulation to build up the null distribution of the sorts of Chi-squared statistics that you'd observe if in fact these counts did follow Benford's Law.
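
For reference, the Chi-squared statistic in a goodness of fit test is conventionally defined as shown here. The symbols \(O_d\) (observed count of first digit \(d\)) and \(n\) (total number of observations) are notation introduced for illustration and do not appear in the exercise text; \(p_d\) is the null proportion for digit \(d\).

$$
\chi^2 = \sum_{d=1}^{9} \frac{(O_d - n\,p_d)^2}{n\,p_d}
$$

Large values indicate observed counts that sit far from what the null proportions predict.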

Instructions

Select the first_digit column from the iran data and use table() to create a one-way table of the counts of each digit. Save it to tab.

The proportions corresponding to Benford's Law can be found in a vector called p_benford. Check that they form a valid distribution by confirming that they sum to 1. Then use your tab and these proportions to calculate the observed Chi-squared statistic and save it as chi_obs_stat.

Construct a null distribution with 500 samples of the Chisq statistic via simulation under the point null hypothesis that the vector of proportions p is p_benford. Save the resulting statistics as null.
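
The three steps above could be sketched in base R roughly as follows. This is one possible approach under stated assumptions, not necessarily the solution the exercise expects: the iran data is not available here, so iran_sketch is a hypothetical stand-in filled with made-up digits, p_benford is assumed to equal \(\log_{10}(1 + 1/d)\), and the simulation uses rmultinom() directly, so null ends up as a plain numeric vector rather than whatever structure the course's own tooling returns.

```r
set.seed(1)

# Assumption: the provided p_benford matches the standard Benford formula.
p_benford <- log10(1 + 1 / 1:9)

# Hypothetical stand-in for the `iran` data: 400 made-up first digits drawn
# from Benford's Law itself. This is NOT real election data.
iran_sketch <- data.frame(
  first_digit = factor(
    sample(1:9, size = 400, replace = TRUE, prob = p_benford),
    levels = 1:9
  )
)

# Step 1: one-way table of counts for each first digit.
tab <- table(iran_sketch$first_digit)

# Step 2: check that the null proportions sum to 1, then compute the
# observed Chi-squared statistic.
stopifnot(isTRUE(all.equal(sum(p_benford), 1)))
chi_obs_stat <- unname(chisq.test(tab, p = p_benford)$statistic)

# Step 3: simulate 500 sets of counts under the point null (multinomial
# draws with probabilities p_benford) and compute the statistic for each.
n <- sum(tab)
expected <- n * p_benford
sim_counts <- rmultinom(500, size = n, prob = p_benford)  # 9 x 500 matrix
null <- colSums((sim_counts - expected)^2 / expected)

# Share of simulated statistics at least as large as the observed one.
mean(null >= chi_obs_stat)
```

Comparing chi_obs_stat against the spread of null shows how unusual the observed counts would be if Benford's Law held.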