Thursday, October 20, 2011

My favorite Bayes's Theorem problems

This week: some of my favorite problems involving Bayes's Theorem. Next week: solutions.

1) The first one is a warm-up problem. I got it from Wikipedia (but it's no longer there):

Suppose there are two full bowls of cookies. Bowl #1 has 10 chocolate chip and 30 plain cookies, while bowl #2 has 20 of each. Our friend Fred picks a bowl at random, and then picks a cookie at random. We may assume there is no reason to believe Fred treats one bowl differently from another, likewise for the cookies. The cookie turns out to be a plain one. How probable is it that Fred picked it out of Bowl #1?

This is a thinly disguised urn problem. It is simple enough to solve without Bayes's Theorem, but good for practice.

2) A friend of mine has two bags of M&Ms, and he tells me that one is from 1994 and one from 1996. He won't tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow M&M came from the 1994 bag?

3) This one is from one of my favorite books, David MacKay's "Information Theory, Inference, and Learning Algorithms":

Elvis Presley had a twin brother who died at birth. What is the probability that Elvis was an identical twin?

To answer this one, you need some background information: According to the Wikipedia article on twins: ``Twins are estimated to be approximately 1.9% of the world population, with monozygotic twins making up 0.2% of the total---and 8% of all twins.''

4) Also from MacKay's book:

Two people have left traces of their own blood at the scene of a crime. A suspect, Oliver, is tested and found to have type O blood. The blood groups of the two traces are found to be of type O (a common type in the local population, having frequency 60%) and of type AB (a rare type, with frequency 1%). Do these data (the blood types found at the scene) give evidence in favour [sic] of the proposition that Oliver was one of the two people whose blood was found at the scene?

5) I like this problem because it doesn't provide all of the information. You have to figure out what information is needed and go find it.

According to the CDC, ``Compared to nonsmokers, men who smoke are about 23 times more likely to develop lung cancer and women who smoke are about 13 times more likely.'' If you learn that a woman has been diagnosed with lung cancer, and you know nothing else about her, what is the probability that she is a smoker?

6) And finally, a mandatory Monty Hall Problem. First, here's the general description of the scenario, from Wikipedia:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say Door A [but the door is not opened], and the host, who knows what's behind the doors, opens another door, say Door B, which has a goat. He then says to you, "Do you want to pick Door C?" Is it to your advantage to switch your choice?

The answer depends on the behavior of the host if the car is behind Door A. In this case the host can open either B or C. Suppose he chooses B with probability p and C otherwise. What is the probability that the car is behind Door A (as a function of p)?

19 comments:

Suppose there are two full bowls of cookies. Bowl #1 has 10 chocolate chip and 30 plain cookies, while bowl #2 has 20 of each. Our friend Fred picks a bowl at random, and then picks TWO cookies at random. We may assume there is no reason to believe Fred treats one bowl differently from another, likewise for the cookies. BOTH cookies turn out to be plain. How probable is it that Fred picked them out of Bowl #1? I just wanted to know how to solve the cookie problem above if two cookies are selected at a time: what is the probability that both cookies are from Bowl #1? Please tell me how to solve this.

Hi Nikhil, Good question. Maybe I will post an update with a longer answer, but here's the short version. You can extend the analysis shown above to handle this data by computing the likelihood of the data under each hypothesis. For example, the likelihood of drawing two plain cookies (without replacement) from a bowl with 20 plain and 20 chocolate chip cookies is (20 × 19) / (40 × 39), which is a little less than 1/4.
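The short version above can be sketched in a few lines of Python. This is only an illustration, not the longer answer mentioned in the reply: the names `two_plain_likelihood`, `bowls`, and `posterior` are made up for the example, and it assumes the two cookies are drawn without replacement from the same bowl, with each bowl equally likely to be picked.

```python
from fractions import Fraction

def two_plain_likelihood(plain, total):
    """Probability of drawing two plain cookies in a row, without replacement."""
    return Fraction(plain, total) * Fraction(plain - 1, total - 1)

# Bowl contents from the problem statement: (plain cookies, total cookies).
bowls = {"Bowl 1": (30, 40), "Bowl 2": (20, 40)}
prior = Fraction(1, 2)  # assumption: Fred picks either bowl with equal probability

# Likelihood of the data (two plain cookies) under each hypothesis.
likelihoods = {name: two_plain_likelihood(p, t) for name, (p, t) in bowls.items()}

# Multiply prior by likelihood, then normalize so the posteriors sum to 1.
unnormalized = {name: prior * like for name, like in likelihoods.items()}
total = sum(unnormalized.values())
posterior = {name: u / total for name, u in unnormalized.items()}

print(likelihoods["Bowl 2"])  # 19/78, a little less than 1/4
print(posterior["Bowl 1"])    # 87/125
```

Under these assumptions the posterior probability of Bowl #1 comes out to 87/125, or about 0.70.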

Nice post! To answer the Elvis question, you actually still need some additional information (from About.com): Among fraternal twins, 1/3 are two girls, 1/3 are two boys, and 1/3 are a boy and a girl. Also (just to be perfectly thorough) identical twins are certain to have the same gender.

A biometric security device using fingerprints erroneously refuses to admit 1 in 1,000 authorized persons from a facility containing classified information. The device will erroneously admit 1 in 1,000,000 unauthorized persons. Assume that 95 percent of those who seek access are authorized. If the alarm goes off and a person is refused admission, what is the probability that the person was really authorized?
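One possible way to set this problem up, as a rough sketch: treat "authorized" and "unauthorized" as the two hypotheses and "refused" as the data. The variable names are invented for the example, and it assumes that the alarm going off and a refusal are the same event, and that an unauthorized person who is not erroneously admitted is refused.

```python
from fractions import Fraction

# Priors from the problem statement.
p_authorized = Fraction(95, 100)
p_unauthorized = 1 - p_authorized

# Likelihood of a refusal under each hypothesis.
p_refused_given_authorized = Fraction(1, 1000)             # erroneous refusal
p_refused_given_unauthorized = 1 - Fraction(1, 1_000_000)  # everyone not erroneously admitted

# Bayes's Theorem: P(authorized | refused).
numerator = p_authorized * p_refused_given_authorized
denominator = numerator + p_unauthorized * p_refused_given_unauthorized
posterior_authorized = numerator / denominator

print(posterior_authorized)         # exact fraction
print(float(posterior_authorized))  # roughly 0.0186
```

With these assumptions, a refused person is authorized a little under 2% of the time.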

I realize this is an older post, but it is great stuff; I was watching your video on YouTube that covers this material and stopped to work it out myself. On the m&m problem, there are actually two issues: one which you've outlined above, but the second could be stated as, "What are the chances that the dispenser of m&m's is not a good friend, given the early date of those bags of candy?!" http://www.eatbydate.com/other/sweets/mms/

Bayes's Theorem: if I know the value of p(symptoms | disease), let's say 0.3, is it justifiable to take p(~symptoms | disease) = 1 - p(symptoms | disease) and p(symptoms | ~disease) = 1 - p(symptoms | disease)?
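A small table of counts is one way to check the two identities in this question. The counts below are hypothetical, chosen only so that p(symptoms | disease) = 0.3. The helper `conditional` is likewise made up for the illustration.

```python
from fractions import Fraction

# Hypothetical counts, chosen so that p(symptoms | disease) = 0.3.
counts = {
    ("disease", "symptoms"): 30,
    ("disease", "no symptoms"): 70,
    ("no disease", "symptoms"): 90,
    ("no disease", "no symptoms"): 810,
}

def conditional(symptom_state, disease_state):
    """p(symptom_state | disease_state), computed from the counts table."""
    group_total = sum(n for (d, _), n in counts.items() if d == disease_state)
    return Fraction(counts[(disease_state, symptom_state)], group_total)

p_s_given_d = conditional("symptoms", "disease")
p_not_s_given_d = conditional("no symptoms", "disease")
p_s_given_not_d = conditional("symptoms", "no disease")

print(p_s_given_d, p_not_s_given_d)  # 3/10 7/10
print(p_s_given_not_d)               # 1/10
```

The first identity holds for any counts, because "symptoms" and "no symptoms" split the disease group into two parts that sum to the whole. The second does not hold in general: p(symptoms | ~disease) describes a different group of people, so it has to be measured separately. In this table it is 1/10, not 7/10.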

The following data is about a poll that occurred in 3 states. In state1, 50% of voters support Party1, in state2, 60% of the voters support Party1, and in state3, 35% of the voters support Party1. Of the total population of the three states, 40% live in state1, 25% live in state2, and 35% live in state3. Given that a voter supports Party1, what is the probability that he lives in state2?

The simplest way to answer this question is to assume a number for the total population. Assume that 400 is the combined population of all three states. Then state1, state2, and state3 have populations of 160, 100, and 140 respectively. From the data provided, the numbers of people supporting Party1 in state1, state2, and state3 are 0.5*160, 0.6*100, and 0.35*140 (= 80, 60, 49). From this, the probability that a Party1 supporter is from state2 is 60/(80+60+49) = 0.317 approx.
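The same calculation can be written with probabilities instead of an assumed population. The following is a minimal sketch; the dictionary names are invented for the example.

```python
from fractions import Fraction

# Share of the combined population living in each state (the prior).
population_share = {
    "state1": Fraction(40, 100),
    "state2": Fraction(25, 100),
    "state3": Fraction(35, 100),
}

# Fraction of each state's voters who support Party1 (the likelihood).
support = {
    "state1": Fraction(50, 100),
    "state2": Fraction(60, 100),
    "state3": Fraction(35, 100),
}

# Probability of "lives in the state AND supports Party1" for each state.
joint = {s: population_share[s] * support[s] for s in population_share}

# Total probability that a randomly chosen voter supports Party1.
p_party1 = sum(joint.values())

posterior_state2 = joint["state2"] / p_party1
print(posterior_state2)         # 20/63
print(float(posterior_state2))  # about 0.317
```

The result, 20/63, is the same ratio as 60/189 in the head-count version.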

I have a question. Exactly 1/5th of the people in a town have Beaver Fever . There are two tests for Beaver Fever, TEST1 and TEST2. When a person goes to a doctor to test for Beaver Fever, with probability 2/3 the doctor conducts TEST1 on him and with probability 1/3 the doctor conducts TEST2 on him. When TEST1 is done on a person, the outcome is as follows: If the person has the disease, the result is positive with probability 3/4. If the person does not have the disease, the result is positive with probability 1/4. When TEST2 is done on a person, the outcome is as follows: If the person has the disease, the result is positive with probability 1. If the person does not have the disease, the result is positive with probability 1/2. A person is picked uniformly at random from the town and is sent to a doctor to test for Beaver Fever. The result comes out positive. What is the probability that the person has the disease?
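One way to approach this, sketched under the assumption that the doctor's choice of test does not depend on whether the person has the disease: first average over the two tests to get the overall probability of a positive result for sick and for healthy people, then apply Bayes's Theorem. The variable names are invented for the example.

```python
from fractions import Fraction

prior_disease = Fraction(1, 5)

# Probability that the doctor uses each test (assumed independent of disease status).
p_test = {"TEST1": Fraction(2, 3), "TEST2": Fraction(1, 3)}

# Probability of a positive result for each test, with and without the disease.
p_pos_given_disease = {"TEST1": Fraction(3, 4), "TEST2": Fraction(1)}
p_pos_given_healthy = {"TEST1": Fraction(1, 4), "TEST2": Fraction(1, 2)}

# Law of total probability: average over which test was used.
like_disease = sum(p_test[t] * p_pos_given_disease[t] for t in p_test)
like_healthy = sum(p_test[t] * p_pos_given_healthy[t] for t in p_test)

# Bayes's Theorem: P(disease | positive).
numerator = prior_disease * like_disease
posterior_disease = numerator / (numerator + (1 - prior_disease) * like_healthy)

print(like_disease, like_healthy)  # 5/6 1/3
print(posterior_disease)           # 5/13
```

Under that assumption the answer is 5/13, or about 0.38.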