Winning the Lottery, Twice!

You have probably come across this piece of news more than once: "XYZ wins lottery for a second time". Such news leaves the reader wondering how someone can get that lucky. Needless to say, the winner is on cloud nine knowing s/he has won, not once but twice! But are these events truly rare? We are conditioned to believe that winning a lottery is in itself a rare event, let alone winning it twice. Let's explore.

To begin understanding the rarity of the above event, let's revisit the Birthday Problem. In summary, an astonishingly small number of people in a room, about 23, is enough for the probability that two of them share a birthday to exceed \(50\%\). To generalize, there are 365 days in a year. If there are 2 people in a room, the probability that both have different birthdays is \(\frac{364}{365}\). If there are 3, it's \(\frac{364}{365}\times\frac{363}{365}\), and so on. If you assume there are \(N\) days in a year and \(r\) people, we can compute the probability that they all have different birthdays as
$$
P(\text{different birthday}) = \frac{N}{N}\times\frac{N-1}{N}\times\frac{N-2}{N}\times\cdots\times\frac{N-r+1}{N}
$$
This implies that the probability that at least two people have the same birthday is \(1 - P(\text{different birthday})\).
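The product above is easy to check numerically. The following is a minimal sketch, assuming equally likely birthdays and ignoring leap years; the function names are made up for this illustration. It multiplies one factor per person and then searches for the smallest group size at which a shared birthday becomes more likely than not.

```python
def prob_all_different(r: int, n_days: int = 365) -> float:
    """Probability that r people all have different birthdays among n_days."""
    p = 1.0
    for i in range(r):
        # i-th person must avoid the i birthdays already taken
        p *= (n_days - i) / n_days
    return p


def smallest_group_with_shared_birthday(threshold: float = 0.5,
                                        n_days: int = 365) -> int:
    """Smallest r for which P(at least two share a birthday) exceeds threshold."""
    r = 1
    while 1.0 - prob_all_different(r, n_days) <= threshold:
        r += 1
    return r


if __name__ == "__main__":
    print(smallest_group_with_shared_birthday())  # prints 23
```

With the default of 365 days, the search stops at 23 people, matching the figure quoted above.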

These results can be extended to the double-win lottery phenomenon as well. \(N\) can be mapped to the possible numbers in a lottery and \(r\) to the number of lottery players. Let us make some simplifying assumptions.

Lottery winners continue buying tickets, and so do most players who don't win.

The lottery draw is done once a week.

It is the same set of \(r\) people who buy tickets every week, and the winner is one of the \(r\).

Everybody buys one ticket, and every ticket has a distinct number.

For the second week to produce a winner who is different from the first week's winner, two things need to happen.

The winning number has to be one of the \(r\) numbers held by the players. This happens with probability \(\frac{r}{N}\).

The winner in the first round should not be the winner in the second round. This happens with probability \(\frac{r-1}{r}\).

The first week's winning number also had to be one of the \(r\), which contributes another factor of \(\frac{r}{N}\). Thus, the overall probability of not having a double winner in the second week is \(\frac{r}{N}\times\frac{r}{N}\times \frac{r-1}{r} = \frac{r(r-1)}{N^2}\). For the 3rd week, by applying similar logic, we get the probability that all three are distinct winners as

$$
P(\text{all 3 distinct}) = \frac{r(r-1)(r-2)}{N^3}
$$
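The pattern generalizes to \(k\) weeks as \(\frac{r(r-1)\cdots(r-k+1)}{N^k}\). As a sanity check, the sketch below compares that closed form with a brute-force enumeration for small, made-up values of \(r\) and \(N\). It assumes, purely for illustration, that player \(i\) holds ticket number \(i\) every week; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def p_all_distinct(r: int, n: int, k: int) -> Fraction:
    """Closed form r(r-1)...(r-k+1) / n**k for k weekly draws."""
    numerator = 1
    for i in range(k):
        numerator *= r - i
    return Fraction(numerator, n ** k)


def p_all_distinct_brute(r: int, n: int, k: int) -> Fraction:
    """Enumerate every sequence of k winning numbers out of n.

    Players hold tickets 0..r-1 (an assumption of this sketch), so a draw
    has a winner only when the winning number is below r.
    """
    favourable = 0
    for draws in product(range(n), repeat=k):
        every_week_has_winner = all(d < r for d in draws)
        if every_week_has_winner and len(set(draws)) == k:
            favourable += 1
    return Fraction(favourable, n ** k)


if __name__ == "__main__":
    for r, n, k in [(3, 10, 2), (3, 10, 3), (4, 7, 3)]:
        print(r, n, k, p_all_distinct(r, n, k), p_all_distinct_brute(r, n, k))
```

Exact fractions are used so the two results can be compared for equality rather than within a floating-point tolerance.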

Let us examine what happens over a decade. That is a total of 520 draws. Let us also put some real numbers behind this. Assume there are 1 million lottery players (\(r = 10^{6}\)) and 175 million numbers to choose from (\(N = 175\times 10^{6}\)); these are rough estimates based on Powerball numbers. The value of \(P(\text{all distinct})\) works out as
$$
\frac{10^{6}\times(10^{6} - 1)\times(10^{6} - 2)\times\cdots\times (10^{6} - 519)}{(175\times 10^{6})^{520}}
$$
The above expression is far too small to evaluate directly in ordinary floating-point arithmetic; however, we can easily find an upper bound for it. The numerator is clearly less than \(10^{6\times 520}\) and the denominator can be factored as \(175^{520} \times 10^{6\times 520}\). This gives an upper bound for the expression of
$$
\frac{1}{175^{520}} \approx 0
$$
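Although the product is too small to hold in a floating-point number, its logarithm is easy to work with. The sketch below is one way to do that, using one factor per draw (520 factors) and the values from the displayed expression; the function name is made up for this illustration. As derived above, each factor carries \(\frac{r}{N}\), so the quantity is the probability that every draw is won by one of the \(r\) players and no player wins twice.

```python
import math


def log10_p_all_distinct(r: int, n: int, draws: int) -> float:
    """log10 of r(r-1)...(r-draws+1) / n**draws, computed as a sum of logs."""
    if draws > r:
        # More draws than players: distinct winners are impossible.
        return float("-inf")
    log_numerator = sum(math.log10(r - i) for i in range(draws))
    return log_numerator - draws * math.log10(n)


if __name__ == "__main__":
    r, n, draws = 10**6, 175 * 10**6, 520
    exact = log10_p_all_distinct(r, n, draws)
    bound = -draws * math.log10(175)  # log10 of the upper bound 1 / 175**520
    print(f"log10 P(all distinct) = {exact:.2f}")
    print(f"log10 upper bound     = {bound:.2f}")
```

Both values come out at roughly \(-1166\), so the expression is of the order of \(10^{-1166}\), and the simple bound is very close to the exact value.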
This implies that the probability of there being a double winner over a decade, under the prevailing assumptions, is almost \(100\%\)!
Why then have we not heard of double winners in Powerball jackpots? The most likely reason is that winners of big lottery draws don't bother coming back to buy more tickets. Also, from the point of view of an individual buyer, the probability of a double win IS astronomically small; however, the probability that there would be a double winner somewhere (under the prevailing assumptions) is quite the opposite: it is a near certainty!

If you are looking to buy some books on probability, here are some of the best for learning the art of probability.

Discovering Statistics Using R
This is a good book if you are new to statistics & probability while simultaneously getting started with a programming language. The book supports R and is written in a casual humorous way making it an easy read. Great for beginners. Some of the data on the companion website could be missing.

A Course in Probability Theory, Third Edition
This book covers the central limit theorem and other graduate topics in probability. You will need to brush up on some mathematics before you dive in, but most of that can be done online.

Linear Algebra (Dover Books on Mathematics)
An excellent book to own if you are looking to get into, or want to understand, linear algebra. Please keep in mind that you need some basic mathematical background before you can use this book.

Linear Algebra Done Right (Undergraduate Texts in Mathematics)
A great book that exposes the method of proof as it is used in linear algebra. This book is not for the beginner, though: you need at least some prior knowledge of the basics. It would be a good add-on to an existing course you are taking in linear algebra.

If you are looking to learn time series analysis, the following are some of the best books on the subject.

Introductory Time Series with R (Use R!)
This is a good book to get started on time series. A nice aspect of this book is that it has examples in R, and some of the data is part of standard R packages, which makes it good introductory material for learning the R language too. That said, this is not exactly a graduate-level book, and some of the data links in the book may not be valid.

Econometrics
A great book if you are in an economics stream or want to get into it. The nice thing about the book is that it tries to bring out a oneness in all the methods used. Econ majors need to be up to speed on the underlying mathematics for time series analysis to use this book. Beyond those prerequisites, this is one of the best books on econometrics and time series analysis.