Teaching Glivenko-Cantelli

When data are iid, the empirical distribution function converges uniformly to the true distribution function. This result, the Glivenko-Cantelli theorem, is one of the fundamental theorems underlying statistics. I don’t know if we (meaning I) emphasise this enough in classes.
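
One way to see the theorem in action is a small simulation. The sketch below is only an illustration under assumptions of my choosing: the data are drawn from Uniform(0, 1), so that the true distribution function is simply F(x) = x on [0, 1], and the helper name `sup_distance_uniform` is made up for this example. It computes the largest gap between the empirical and true distribution functions for increasing sample sizes.

```python
import numpy as np


def sup_distance_uniform(sample):
    """Return sup_x |F_n(x) - F(x)| for a sample assumed to come from Uniform(0, 1)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    # F(x) = x on [0, 1]. F_n is a step function, so the largest gap occurs at
    # a data point: just after the jump (i/n - x) or just before it (x - (i-1)/n).
    return max(np.max(i / n - x), np.max(x - (i - 1) / n))


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility
for n in (10, 100, 1_000, 10_000, 100_000):
    d = sup_distance_uniform(rng.uniform(size=n))
    print(f"n = {n:>6}: sup|F_n - F| = {d:.4f}")
```

The printed distances should shrink towards zero as n grows, roughly at rate 1/sqrt(n), though any single run is random and need not decrease at every step.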

What do undergraduates need to know? They certainly don’t need to hear the words “uniform convergence” or “almost surely”. But I don’t think a few explicit or implicit words about laws of large numbers are enough to convince students that statistics works.

“Statistics works because if you gather enough data, you get the right answer.” We know this isn’t quite true, but how much more do we need to tell our undergraduates to avoid misleading them?