Average Doesn't Always Exist

Up until now, I thought an average always exists. We deal with averages all the time in real life: the average height of Japanese people, the average divorce rate in the US, the average salary in France, and so on. It seemed to me that if I took a large enough sample, I would get an average that reflects reality very well.

Today I learned that this is not always the case: an average doesn’t always exist. I found the idea very interesting, so let me share it with you.

Average in Coin Toss Game

Let’s begin with a case where the average exists. I’ll use a very simple coin toss game. You flip a coin and win $1 if it lands heads. You win $0, that is, nothing, if it lands tails. Very simple.

When you toss the coin multiple times, what is the average number of dollars you win per toss? You can compute it very easily with the following formula:

avg. = (total number of $ you won) / number of times you toss the coin

You would expect the average to be 0.5, because the chance of getting heads is ½. But is it really so? Let’s confirm this by writing some code to simulate the game.
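Here is a minimal sketch of such a simulation. The function name `simulate_coin_toss`, the `seed` parameter, and the choice of 100,000 tosses are illustrative assumptions, not part of the original post. It tracks the running average after every toss so you can watch it settle near 0.5.

```python
import random


def simulate_coin_toss(n, seed=0):
    """Toss a fair coin n times and return the running average winnings.

    Heads pays $1 and tails pays $0. The seed is fixed only so that
    repeated runs give the same sequence (an illustrative choice).
    """
    rng = random.Random(seed)
    total = 0
    running_averages = []
    for tosses_so_far in range(1, n + 1):
        # rng.random() is uniform on [0, 1), so this is heads with probability 1/2.
        if rng.random() < 0.5:
            total += 1
        running_averages.append(total / tosses_so_far)
    return running_averages


if __name__ == "__main__":
    averages = simulate_coin_toss(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, averages[n - 1])
```

With more tosses, the printed average should drift closer and closer to 0.5, which is what "the average exists" means in practice.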

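Now consider the coin-toss-until-head game, in which you keep tossing the coin until the first head appears. In a simulation of this game, the average seems to be approaching somewhere around 2.5, but we cannot be fully sure yet. Let’s increase n and see whether the average really is 2.5.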

Here is the graph when n is 20, and you will probably be disappointed by it: the average is approaching 2.5, and then suddenly there is a spike around n=20 which pushes the average way higher! You will also notice a pattern after n=20: the average seems to converge to a number, but shortly afterward another spike appears.

Here is the graph when n is 10000. This time the average seems to be approaching somewhere around 12. Did we finally discover the average?

It turns out we didn’t! This becomes very clear when you increase n to 50000.

Again, there is a spike around 30000 and we lose the average once more. You can increase n as much as you want, and you will never find a stable average in this game.
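If you want to reproduce this kind of behavior yourself, here is a rough sketch. It assumes the standard St. Petersburg payout, where the prize starts at $2 and doubles for every tail before the first head. That payout rule, the function names `play_until_head` and `running_average`, and the seed are all assumptions made for illustration. The exact numbers you see will differ from the graphs described above, because they depend on the random sequence.

```python
import random


def play_until_head(rng):
    """Play one round: toss until the first head and return the payout.

    Assumed payout rule (St. Petersburg style): $2 if the first toss is
    a head, doubling for every tail that comes before the first head.
    """
    payout = 2
    while rng.random() < 0.5:  # treat this branch as "tail": keep tossing
        payout *= 2
    return payout


def running_average(n, seed=0):
    """Play n rounds and return the running average payout after each round."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for rounds_so_far in range(1, n + 1):
        total += play_until_head(rng)
        averages.append(total / rounds_so_far)
    return averages


if __name__ == "__main__":
    averages = running_average(50_000)
    for n in (20, 100, 1_000, 10_000, 50_000):
        print(n, averages[n - 1])
```

Try a few different seeds. The running average tends to look settled for a while and then jump when a rare, very long run of tails produces a huge payout.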

Why Doesn’t the Average Exist?

We can see why the average doesn’t exist with simple math.

You can calculate the average of the normal coin toss game with the following formula.

avg. = (probability of H) × $1 + (probability of T) × $0

Because the probabilities of getting heads (H) and tails (T) are both 0.5, the average number of dollars you win is $0.5, which is the value we expected for this game.

Now let’s compute the average for the coin-toss-until-head game. One big difference from the normal coin toss game is that this game has infinitely many possible outcomes. The result could be T H, or T T H, or T T T T T T H, or any longer run of tails followed by a head.

So we need to account for all the possibilities. You can express this with the following formula.
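As a hedged sketch of that calculation, assume the payout doubles with each toss, so that getting the first head on toss k pays $2^k. This is the standard St. Petersburg setup, and the exact payout rule is an assumption here. The first head lands on toss k with probability (1/2)^k, so each possible outcome contributes exactly $1 to the sum:

```latex
\begin{aligned}
\text{avg.} &= \sum_{k=1}^{\infty} P(\text{first H on toss } k) \times \text{payout}_k \\
            &= \sum_{k=1}^{\infty} \left(\frac{1}{2}\right)^{k} \times 2^{k} \\
            &= 1 + 1 + 1 + \cdots \\
            &= \infty
\end{aligned}
```

Rarer outcomes pay more in exact proportion to how rare they are, so the terms never shrink and the sum never settles.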

The average you get is ∞ (infinity). In other words, the average doesn’t exist.

What does it mean in real life that an average doesn’t always exist?

Another way to interpret these graphs without an average is that, once in a while, a super extreme event happens. Because the impact of that event is astronomical, it skews the average and makes it useless. So, if the model you are studying is similar to the coin-toss-until-head game, you will see the same pattern. In fact, this kind of model is often used to study extreme events such as a sudden stock market crash.

Summary

Hopefully you found this interesting. I’m running out of time, so I’ll stop here, but if you want to learn more, you can search for “St. Petersburg paradox” or “power laws” to dive deeper into the area.