Finding the Faulty Coin

Q: You have a large number of coins, of which some small fraction \(p\) are faulty. The faulty coins have a slightly different weight (more or less than a good coin). An error-free weighing test exists, but it is expensive, so you want to minimize the number of tests. You therefore decide to weigh the coins in bulk lots. What lot size minimizes the expected number of weighs per coin?

A: Let's assume a lot size of \(x\). If we can compute the expected number of weighs per coin for a lot of size \(x\), the same figure applies to the entire set of coins, so we need not know the total number of coins.

Next, let us try and estimate the expected number of weighs needed. There are two scenarios.

You weigh the entire lot and you find no discrepancy in the weight

You weigh the entire lot and you find a discrepancy in the weight.

Next, we note the following, assuming that coins are faulty independently of one another:

The probability that any one coin is faulty is \(p\).

The probability that any one coin is not faulty is \(1 - p\).

The probability that none of the \(x\) coins in a lot is faulty is \((1-p)^{x}\).

The probability that at least one coin in the lot is faulty is \(1 - (1-p)^{x}\).
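The four probabilities above can be checked numerically with a small sketch. The function names and the values \(p = 0.01\), \(x = 10\) are illustrative assumptions, not part of the original puzzle, and the code assumes coins are faulty independently.

```python
def prob_lot_clean(p: float, x: int) -> float:
    """Probability that none of the x coins in a lot is faulty.

    Assumes each coin is faulty independently with probability p.
    """
    return (1.0 - p) ** x


def prob_lot_has_fault(p: float, x: int) -> float:
    """Probability that at least one of the x coins in a lot is faulty."""
    return 1.0 - prob_lot_clean(p, x)


if __name__ == "__main__":
    p, x = 0.01, 10  # illustrative values only
    print("P(lot is clean)    =", prob_lot_clean(p, x))
    print("P(lot has a fault) =", prob_lot_has_fault(p, x))
```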

Every lot needs at least one test, the bulk weighing. With probability \(1 - (1-p)^{x}\) that weighing shows a discrepancy, and you then have to weigh all \(x\) coins individually. The expected number of tests per lot is therefore
$$
E(\text{tests}) = 1 + x\big[1 - (1-p)^{x}\big]
$$
The expected number of tests per coin is given by
$$
E(\text{tests/coin}) = \frac{1}{x} + 1 - (1-p)^{x}
$$
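As a sanity check, the per-coin formula can be compared against a simple Monte Carlo simulation of the two-stage procedure (one bulk weighing, then \(x\) individual weighings if the lot shows a discrepancy). This is only a sketch: the function names, the seed, and the parameter values are assumptions chosen for illustration.

```python
import random


def expected_tests_per_coin(p: float, x: int) -> float:
    """Closed form from the text: 1/x + 1 - (1 - p)^x."""
    return 1.0 / x + 1.0 - (1.0 - p) ** x


def simulate_tests_per_coin(p: float, x: int, n_lots: int = 100_000,
                            seed: int = 0) -> float:
    """Estimate the tests per coin by simulating n_lots lots of size x."""
    rng = random.Random(seed)
    total_tests = 0
    for _ in range(n_lots):
        total_tests += 1  # the bulk weighing of the whole lot
        # The lot shows a discrepancy if at least one coin is faulty.
        if any(rng.random() < p for _ in range(x)):
            total_tests += x  # weigh every coin in the lot individually
    return total_tests / (n_lots * x)


if __name__ == "__main__":
    p, x = 0.01, 10  # illustrative values only
    print("formula   :", expected_tests_per_coin(p, x))
    print("simulation:", simulate_tests_per_coin(p, x))
```

The two numbers should agree to within sampling noise, which shrinks as `n_lots` grows.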
If \(p\) is small enough that \(xp \ll 1\), we can approximate the term \((1-p)^{x}\) in the above equation to first order as
$$
(1-p)^{x} \approx 1 - xp
$$
which simplifies the expected number of weighs per coin to
$$
E(\text{tests/coin}) \approx \frac{1}{x} + xp
$$
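To get a feel for how good the linear approximation is, the sketch below tabulates the exact and approximate per-coin costs for a few lot sizes. The function names and the choice \(p = 0.01\) are illustrative assumptions. Because \((1-p)^{x} \ge 1 - xp\) for \(0 \le p \le 1\) (Bernoulli's inequality), the approximate cost is never below the exact one.

```python
def exact_tests_per_coin(p: float, x: int) -> float:
    """Exact expected tests per coin: 1/x + 1 - (1 - p)^x."""
    return 1.0 / x + 1.0 - (1.0 - p) ** x


def approx_tests_per_coin(p: float, x: int) -> float:
    """Small-p approximation: 1/x + x*p."""
    return 1.0 / x + x * p


if __name__ == "__main__":
    p = 0.01  # illustrative value only
    print(f"{'x':>4} {'exact':>10} {'approx':>10}")
    for x in (2, 5, 10, 20, 50):
        exact = exact_tests_per_coin(p, x)
        approx = approx_tests_per_coin(p, x)
        print(f"{x:>4} {exact:>10.5f} {approx:>10.5f}")
```

The gap widens as \(xp\) grows, so the approximation is most trustworthy for lot sizes well below \(1/p\).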
Differentiating the above w.r.t. \(x\) and setting it to 0 yields
$$
\frac{dE}{dx} = -\frac{1}{x^2} + p = 0
$$
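Since the second derivative \(2/x^{3}\) is positive for \(x > 0\), this stationary point is a minimum, implying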
$$
x_{opt} = \frac{1}{\sqrt{p}}
$$
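The closed-form answer rests on the small-\(p\) approximation, so it is worth comparing it with a brute-force search over integer lot sizes using the exact per-coin cost. The following is a minimal sketch; `best_lot_size`, its `x_max` cap, and the sample values of \(p\) are assumptions made for illustration.

```python
import math


def exact_tests_per_coin(p: float, x: int) -> float:
    """Exact expected tests per coin: 1/x + 1 - (1 - p)^x."""
    return 1.0 / x + 1.0 - (1.0 - p) ** x


def best_lot_size(p: float, x_max: int = 10_000) -> int:
    """Integer lot size in [1, x_max] that minimizes the exact cost."""
    return min(range(1, x_max + 1),
               key=lambda x: exact_tests_per_coin(p, x))


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.05):  # illustrative values only
        x_exact = best_lot_size(p)
        x_approx = 1.0 / math.sqrt(p)
        print(f"p={p}: exact optimum x={x_exact}, "
              f"1/sqrt(p)={x_approx:.2f}, "
              f"tests/coin={exact_tests_per_coin(p, x_exact):.4f}")
```

For \(p = 0.01\) the exhaustive search picks a lot size of 11, close to the approximate value \(1/\sqrt{p} = 10\), and the cost curve is quite flat near the minimum.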

If you are looking to learn probability, here are some good books to own:

This is a great book to own. The second half of the book may require some knowledge of calculus. It appears to be the right mix for someone who wants to learn but doesn't want to be scared off by the "lemmas".

This one is a must have if you want to learn machine learning. The book is beautifully written and ideal for the engineer/student who doesn't want to get too much into the details of a machine learned approach but wants a working knowledge of it. There are some great examples and test data in the text book too.

Discovering Statistics Using R
This is a good book if you are new to statistics & probability while simultaneously getting started with a programming language. The book supports R and is written in a casual humorous way making it an easy read. Great for beginners. Some of the data on the companion website could be missing.

Linear Algebra (Dover Books on Mathematics)
An excellent book to own if you are looking to get into, or want to understand linear algebra. Please keep in mind that you need to have some basic mathematical background before you can use this book.

Linear Algebra Done Right (Undergraduate Texts in Mathematics)
A great book that exposes the method of proof as it is used in linear algebra. This book is not for the beginner, though: you need at least some prior knowledge of the basics. It would be a good add-on to an existing course you are taking in linear algebra.

If you are looking to learn time series analysis, the following are some of the best books on the subject.

Introductory Time Series with R (Use R!)
This is a good book to get started on time series. A nice aspect of this book is that it has examples in R, and some of the data is part of standard R packages, which makes it good introductory material for learning the R language too. That said, this is not exactly a graduate-level book, and some of the data links in the book may not be valid.

Econometrics
A great book if you are in an economics stream or want to get into it. A nice aspect of the book is that it tries to bring out a unity across all the methods used. Econ majors need to be up to speed on the underlying mathematics for time series analysis to use this book. Given those prerequisites, this is one of the best books on econometrics and time series analysis.