formerly cdixon.posterous.com

The most interesting thing about statistics

Suppose you want to poll people in a country where 1 million people will vote, and then do the same in a country where 1 billion people will vote. How many more people do you have to poll in the second country to get the same polling accuracy as you do in the first?

The counterintuitive answer: you need to poll essentially the same number of people in each country to get the same accuracy. Polling error is a function of sample size and, as long as the sample is a small fraction of the population, is practically independent of the size of the whole (voting) population.
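
To put rough numbers on this, here is a minimal sketch using the textbook margin-of-error formula for a simple random sample, including the finite population correction. The function name `margin_of_error`, the sample size of 1,000, the 50/50 split, and the 95% confidence level (z = 1.96) are all assumptions chosen for illustration, not figures from any real poll.

```python
import math

def margin_of_error(sample_size, population_size, share=0.5, z=1.96):
    """Approximate margin of error for a simple random sample drawn
    without replacement (z=1.96 corresponds to roughly 95% confidence)."""
    # Standard error of a sample proportion, ignoring population size.
    standard_error = math.sqrt(share * (1 - share) / sample_size)
    # Finite population correction: the only place population size enters.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

if __name__ == "__main__":
    for voters in (10**6, 10**9):
        moe = margin_of_error(1000, voters)
        print(f"{voters:>13,} voters: +/- {moe * 100:.3f} percentage points")
```

The correction factor is the only term that involves the population, and it is very close to 1 whenever the sample is a tiny fraction of the voters, which is why the two countries come out nearly identical.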


The intuition behind this is that there is some “true” underlying probability you are trying to estimate, and it does not depend on population size. Suppose you are trying to figure out the probability that a human baby is a boy rather than a girl. You take a random sample of 1,000 people to estimate it. Does it matter whether that sample comes from a population of 1 billion people on earth or 6 billion? The earth’s population is just one instantiation of the true underlying probability you are trying to determine, so its size is irrelevant.
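
One way to check this intuition is a small simulation: draw many random samples of 1,000 from populations of very different sizes and see how much the estimates scatter. This is only a sketch. The helper `spread_of_estimates`, the true share of 0.5, the number of trials, and the fixed seed are assumptions made for illustration.

```python
import random
import statistics

def spread_of_estimates(population_size, sample_size=1000,
                        true_share=0.5, trials=500, seed=0):
    """Standard deviation of the sample estimate across repeated samples."""
    rng = random.Random(seed)
    # Treat individuals numbered 0..boys-1 as "boy"; the rest as "girl".
    boys = int(population_size * true_share)
    estimates = []
    for _ in range(trials):
        # Sample distinct individuals without building the whole population.
        sample = rng.sample(range(population_size), sample_size)
        estimates.append(sum(1 for i in sample if i < boys) / sample_size)
    return statistics.pstdev(estimates)

if __name__ == "__main__":
    for size in (10**6, 10**9):
        print(f"population {size:>13,}: spread {spread_of_estimates(size):.4f}")
```

Under these assumptions the spread should land near sqrt(0.25 / 1000), about 0.016, for both population sizes, with only simulation noise separating them.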