Bayes Theorem v1.1

The Bayesian thing is pretty cool once you wrap your head around it...presuming that
I have... To start, I found this nice page that describes what it means from my
particularly practical standpoint: http://yudkowsky.net/rational/bayes

Basic Bayes Background

This is pretty much verbatim from section 13.5.1 on AIMA page 496...I'm going to use disease for cause and symptom for effect because the subsequent examples are medically motivated. The Bayesian equation is:

The Probability of having a disease, given that you have a symptom
EQUALS
The Probability of having that symptom, given that you have that disease
TIMES
The Probability of that disease
(both) DIVIDED BY
The Probability of having that symptom

So, in the AIMA meningitis example, about 0.14% of the people who have stiff necks also have meningitis.
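The 0.14% figure can be reproduced with a few lines of Python. This is only a sketch: the three input probabilities are not given in the text above, so they are assumed here from the AIMA meningitis example as I recall it (stiff neck in 70% of meningitis cases, meningitis in 1 of 50,000 patients, stiff neck in 1% of patients). The function name is made up for illustration.

```python
def posterior(p_symptom_given_disease, p_disease, p_symptom):
    """Bayes' rule: P(disease | symptom), exactly as spelled out in words above."""
    return p_symptom_given_disease * p_disease / p_symptom

# Assumed inputs (recalled from the AIMA meningitis example, not stated in these notes)
p_stiff_neck_given_meningitis = 0.7
p_meningitis = 1 / 50000
p_stiff_neck = 0.01

p_meningitis_given_stiff_neck = posterior(
    p_stiff_neck_given_meningitis, p_meningitis, p_stiff_neck
)
print(f"{p_meningitis_given_stiff_neck:.2%}")
```

With those assumed inputs the result comes out at about 0.0014, which is the 0.14% quoted above.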

Running the Numbers

The Yudkowsky page is way long. Reduced to the minimum it says
that, for a seemingly binary test, there are four possible results:

True Positives -- Positive results that are correct;

False Negatives -- Negative results that are actual positives;

False Positives -- Positive results that are actual negatives;

True Negatives -- Negative results that are correct.

In order to figure out what a test result means you really need to know
the False bits...and one other piece of information: the expected result, or Prior.
For the Breast Cancer example on the Yudkowsky page it goes like this:

1% of women at age forty who participate in routine screening have
breast cancer. 80% of women with breast cancer will get positive
mammographies. 9.6% of women without breast cancer will also get
positive mammographies. A woman in this age group had a positive
mammography in a routine screening. What is the probability that she
actually has breast cancer?

To recap:

Actual rate in the population: 1% = 0.01

True Positives: 80% = 0.80

False Negatives: 20% = 0.20

False Positives: 9.6% = 0.096

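Out of a Population of 10,000 where the Rate is 0.01,
there should be 100 who are Real Positives, and the test results
will be:

True Positives: 80

False Negatives: 20

False Positives: 950 (9.6% of 9,900, rounded)

True Negatives: 8950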

(Note that the numbers magically add up to 10,000!) So, 1030
[80 + 950] will test Positive, but a positive result is really
positive only about [(80 / 1030) = 0.0777] ~7.8% of the time. Without
knowing the "expected prior probability" of the result we cannot
evaluate its actual probability. This also works nicely in the
degenerate cases. If you have a perfect test with 0% False
Positive/Negative results, you get the expected 100 True Positives and
9900 True Negatives. And if you have 0% Real Positives in
the population you get the expected 960 False Positives and 9040 True
Negatives.
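The counting above is easy to check in code. Here is a minimal sketch, assuming a made-up helper called expected_counts that splits a population into the four result types and rounds each to whole people. It runs the breast cancer numbers and then the two degenerate cases.

```python
def expected_counts(population, rate, true_positive_rate, false_positive_rate):
    """Split a population into the four test results (hypothetical helper)."""
    real_positives = population * rate
    real_negatives = population - real_positives
    tp = real_positives * true_positive_rate
    fn = real_positives - tp
    fp = real_negatives * false_positive_rate
    tn = real_negatives - fp
    # Round to whole people, as the hand calculation does
    return {"TP": round(tp), "FN": round(fn), "FP": round(fp), "TN": round(tn)}

# Breast cancer example: rate 1%, True Positives 80%, False Positives 9.6%
mammography = expected_counts(10_000, 0.01, 0.80, 0.096)
positive_is_real = mammography["TP"] / (mammography["TP"] + mammography["FP"])

# Degenerate cases: a perfect test, and a population with no Real Positives
perfect_test = expected_counts(10_000, 0.01, 1.0, 0.0)
no_disease = expected_counts(10_000, 0.0, 0.80, 0.096)

print(mammography, f"{positive_is_real:.1%}")
print(perfect_test)
print(no_disease)
```

Rounding each count separately happens to keep the total at 10,000 here; with other inputs the rounded counts could be off by one, so treat this as a check on the arithmetic rather than a general tool.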

So, why's that Bayesian?

Let's do it another way then. Here's what we know in a different light:
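One way to see the Bayesian side is to plug the breast cancer rates straight into the equation from the Basic Bayes Background section. The only extra step is that the Probability of a positive test is not given directly, so this sketch builds it from the two ways of testing positive: true positives plus false positives. The function name is an assumption for illustration.

```python
def posterior_from_rates(prior, true_positive_rate, false_positive_rate):
    """P(disease | positive test) from the prior and the two test rates."""
    # P(positive) = P(positive | disease) * P(disease)
    #             + P(positive | no disease) * P(no disease)
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    # Bayes: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
    return true_positive_rate * prior / p_positive

p_cancer_given_positive = posterior_from_rates(0.01, 0.80, 0.096)
print(f"{p_cancer_given_positive:.1%}")
```

This gives about 0.0776, a hair below 80 / 1030, because the False Positive count is not rounded to 950 along the way. Either way it is ~7.8%.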

Finding the Prior

Estimating the expected Prior probability gives you a
better handle on the problem, and a way to revise the actual results.
But I think you can also use all of this to calculate the Real Positive
value when it is not known. Like this.
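One possible reading of this idea, offered as a sketch rather than as the worked example that belongs here: if you know the test's True Positive and False Positive rates and you observe what fraction of the population tests positive, you can solve the "positive test" sum for the unknown Real Positive rate. The function name and the observed figure are assumptions for illustration.

```python
def estimate_prior(observed_positive_rate, true_positive_rate, false_positive_rate):
    """Solve observed = TPR * prior + FPR * (1 - prior) for the prior."""
    return (observed_positive_rate - false_positive_rate) / (
        true_positive_rate - false_positive_rate
    )

# Assumed observation: 10.304% of mammographies come back positive, which is
# what the breast cancer rates (80%, 9.6%) produce when the real rate is 1%.
estimated_rate = estimate_prior(0.10304, 0.80, 0.096)
print(f"{estimated_rate:.2%}")
```

The estimate is only as good as the two test rates, and it breaks down when they are equal, since such a test tells you nothing about who is really positive.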

Just for the Exercise

Chapter 13 of the AIMA book covers this sort of probability reasoning
and Exercise 13.15, AIMA page 508, is exactly this problem:

After your yearly checkup, the doctor has bad news and good news. The
bad news is that you tested positive for a serious disease and that the
test is 99% accurate (i.e., the probability of testing positive when
you do have the disease is 0.99, as is the probability of testing
negative when you don't have the disease). The good news is that this
is a rare disease, striking only 1 in 10,000 people of your age. What
are the chances that you actually have the disease?

To recap:

Actual rate in the population: 1/10,000 = 0.0001

True Positives: 99% = 0.99

True Negatives: 99% = 0.99

Giving:

False Positives: 1% = 0.01

False Negatives: 1% = 0.01

Out of a Population of 1,000,000 where the Rate is 0.0001,
there should be 100 who are Real Positive. The test results
will be:
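The same counting can be run for the exercise numbers. This is a minimal sketch using only the figures from the recap above; the variable names are mine.

```python
population = 1_000_000
rate = 0.0001               # 1 in 10,000
true_positive_rate = 0.99   # tests positive when you do have the disease
true_negative_rate = 0.99   # tests negative when you don't

real_positives = round(population * rate)
real_negatives = population - real_positives

true_positives = round(real_positives * true_positive_rate)
false_negatives = real_positives - true_positives
true_negatives = round(real_negatives * true_negative_rate)
false_positives = real_negatives - true_negatives

# Chance that a positive result is a Real Positive
chance = true_positives / (true_positives + false_positives)

print(true_positives, false_negatives, false_positives, true_negatives)
print(f"{chance:.2%}")
```

So even with a 99% accurate test, a positive result here means a bit under a 1% chance of actually having the disease, because the False Positives from the huge healthy group swamp the True Positives.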