Learning objectives: Learn more about what probability
histograms are. Develop an intuitive feel for a major result of the
Central Limit Theorem: "As the number of draws from a box
grows large, the probability histogram for the sum of the draws comes to
resemble the normal curve."

Remember that we began the present discussion
by seeking license to work problems like this:

A charitable organization knows from long
experience that responses to "begging letters" have an
average of $2.90 and an SD of $10. If the organization sends out
10,000 letters to a random sample of potential donors, then about
what are the chances that this appeal will bring in at least $30,000?

We reasoned that the entire experiment (send
out 10,000 letters; sum the donations) could,
conceptually, be repeated an arbitrarily large number of times, and
that the massive list of sums thereby generated would tell us
about the long-term performance of the sampling exercise. If we
had access to such a collection of data, preferably in histogram
form, then we could easily answer the above question: we'd
simply add up all the area to the right of $30,000.

Obviously it is not practicable to generate
such a histogram empirically. However, I have stated that the
histogram's average and SD must be, respectively, the EV(sum)
and SE(sum). Therefore, if we could somehow show that the
histogram looked like the normal curve, we could perform a normal
approximation. This is where high-level statistical theory comes
in handy. Over the course of a century or two, a number of
mathematicians collectively proved the Central Limit Theorem (CLT).
This intellectual tour de force says, among other things,
that as the number of draws from a box grows large, the
probability histogram for the sum (or average or percent) of the
draws comes to resemble the normal curve.
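Once we grant that the probability histogram for the sum follows the normal curve, the begging-letter problem reduces to a little arithmetic. The sketch below is only an illustration: the variable names are mine, and it assumes the usual formulas EV(sum) = (# of draws) × ave(box) and SE(sum) = sqrt(# of draws) × SD(box).

```python
import math

def normal_upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Figures taken from the begging-letter problem.
n_letters = 10_000     # number of draws from the box
box_average = 2.90     # average response per letter, in dollars
box_sd = 10.0          # SD of responses, in dollars
target = 30_000        # we want the chance of a sum at least this big

# Assumed formulas: EV(sum) = n * ave(box), SE(sum) = sqrt(n) * SD(box).
ev_sum = n_letters * box_average
se_sum = math.sqrt(n_letters) * box_sd

# Convert the target to standard units, then read off the tail area.
z = (target - ev_sum) / se_sum
chance = normal_upper_tail(z)

print(f"EV(sum) = ${ev_sum:,.0f}, SE(sum) = ${se_sum:,.0f}")
print(f"z = {z:.2f}, chance of at least ${target:,} is about {chance:.0%}")
```

The target sits one SE above the expected value, so the answer is the area to the right of 1 under the normal curve, about 16%.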

In our last class we began to define the
probability histogram. We did this two ways. First, we suggested
that hypothetical histograms of the type described above (repeat
the experiment an arbitrarily large number of times; summarize
the sums by means of a histogram) are roughly what we mean by
"the probability histogram." Second, we attempted a
definition by examples. Probably you remember how we drew the
probability histogram for the simple experiment, "take the
sum of one roll of a fair die." We listed the
possible outcomes (1, 2, 3, 4, 5, 6) along the horizontal axis.
Then, above those outcomes, we drew rectangles whose areas were
equal to the outcomes' probabilities. In the "one roll"
example, we made all rectangles of equal size because if the die
is fair, then all possible outcomes are equally likely. Here is a
sketch of that probability histogram:

Note that every rectangle has the same area.
That is what we want; areas represent probabilities, and if the
die is fair, then every spot should have the same probability.

At this point, then, you should know what a
probability histogram is, and you might be willing to accept on
faith the assertion about approach to normality, and that's
good, for we are definitely not going to prove the CLT. On the
other hand, perhaps we can at least offer some gentle persuasion
that will allow your faith to grow. To do so, let us return to
the problem of dice sums. Clearly the above probability
histogram, for the one-die roll, looks nothing like the normal
curve. But next let's consider the sum of two rolls.
Clearly, 2 is not very large; still, it is bigger than 1, and if
the CLT is true, perhaps we can begin to glimpse, even with only
two draws, some approach to normality. To draw the probability
histogram for the two-roll experiment, first erect your axes and
then list the possible outcomes along the horizontal one. Now
figure out how big each of the eleven rectangles should be. Probably
you should do this by brute force. There are 36 different,
equally likely, draw combinations (the first draw has six
different possibilities; for each of these, the second draw has
six different possibilities). How many of these draw combinations
give you a "2"? (Answer: only one.) So how much area
goes in the rectangle to be erected above the "2"? (Answer:
1/36.) Fill in the table, and be sure you can draw the
probability histogram.

Possible sum                 2     3     4     5     6     7     8     9    10    11    12
Number of ways to get it     1    ___   ___   ___   ___   ___   ___   ___   ___   ___   ___
Probability of sum         1/36   ___   ___   ___   ___   ___   ___   ___   ___   ___   ___

Heres the probability histogram, with the
dice-roll combinations written in the rectangles:

I admit that the probability histogram for the
two-dice toss doesn't look very much like the normal curve.
However, (1) it looks more like the normal curve than the
histogram for the one-die toss, and (2) two is not a very
large number of draws from a box. (I leave it to you to draw the
histogram for the three-die toss or the four-die toss, if you are
entirely lacking in more useful employment.)
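If you would rather let a computer do the brute force, the counting argument above (list every equally likely draw combination, then tally the sums) can be sketched in a few lines. This is only an illustration with made-up function names, and it prints each probability as a reduced fraction (so 6/36 appears as 1/6). It handles three or four dice just as easily as two.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def dice_sum_histogram(n_dice):
    """Map each possible sum of n_dice fair dice to its exact probability."""
    faces = range(1, 7)
    # Enumerate all 6**n_dice equally likely draw combinations.
    ways = Counter(sum(rolls) for rolls in product(faces, repeat=n_dice))
    total = 6 ** n_dice
    return {s: Fraction(count, total) for s, count in sorted(ways.items())}

def print_histogram(n_dice):
    """Print a sideways text histogram; bar length is proportional to area."""
    hist = dice_sum_histogram(n_dice)
    print(f"Sum of {n_dice} roll(s):")
    for s, prob in hist.items():
        bar = "#" * round(float(prob) * 216)
        print(f"  {s:2d}  {str(prob):>7}  {bar}")

for n in (1, 2, 3):
    print_histogram(n)
```

Turn your head sideways and compare the three pictures: flat for one roll, a triangle for two, and something with rounded shoulders for three.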

Now we can move on to another, easier
experiment. Consider tossing a fair coin and counting "heads."
Lets start with one toss, then do two tosses, then three
tosses, then four tosses. That is, well increase the number
of draws from one to two to three to four. If the CLT is true,
then we should begin to observe some modest approach to normality.
As a climax, you might want to complete the four-toss table:

Number of    Combinations that give                              Count of       Probability
"heads"      that number of "heads"                              combinations

0            TTTT                                                1              1/16
1            TTTH  TTHT  THTT  HTTT                              4              4/16
2            TTHH  HHTT  (You fill in the rest if you like.)     ____           ____
3            HHHT  HHTH  ...                                     ____           ____
4            HHHH                                                1              1/16

Ill draw the probability histograms for
the one-coin toss (one draw from the box) and the four-coin toss
(four draws from the box). If you like, you can draw probability
histograms for 2 and 3 tosses.
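The coin-toss table can be checked the same brute-force way: write out every equally likely sequence of tosses and sort the sequences by their number of heads. The sketch below is one way to do that; the function name `heads_table` is my own invention.

```python
from itertools import product

def heads_table(n_tosses):
    """Group every equally likely toss sequence by its number of heads."""
    table = {k: [] for k in range(n_tosses + 1)}
    for seq in product("TH", repeat=n_tosses):
        table[seq.count("H")].append("".join(seq))
    return table

n_tosses = 4
table = heads_table(n_tosses)
total = 2 ** n_tosses  # number of equally likely sequences

for heads, combos in table.items():
    print(f"{heads} heads: {len(combos):2d}/{total}  {' '.join(combos)}")
```

Change `n_tosses` to 1, 2, or 3 to get the tables behind the other probability histograms.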

After you have sketched and examined the
probability histograms, you'll see why mathematicians
suspected the truth of the CLT before it was proved. But 4 isn't
really a very large number. On p. 316 your book draws probability
histograms for 100, 400, and 900 draws from the "fair coin"
box. Then p. 320 shows probability histograms for draws from a box
representing a very unfair coin. All this should help convince you
that as the number of draws from a box grows large, the
probability histogram for the sum (or average or percent) of the
draws will come to resemble the normal curve.
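For a numerical version of the same persuasion, you can compare an exact area from the probability histogram with the corresponding area under the normal curve. The sketch below does this for the chance that the number of heads falls within one SE of its expected value. It is an illustration only: the function names are mine, and I have assumed the usual half-unit adjustment at each end so that whole rectangles are counted.

```python
import math

def binomial_prob(n, k, p):
    """Exact chance of k heads in n tosses when each toss has chance p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mean, sd):
    """Area under the normal curve (given mean and SD) to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def compare(n, p, lo, hi):
    """Return (exact, normal-curve) chances of getting lo through hi heads."""
    exact = sum(binomial_prob(n, k, p) for k in range(lo, hi + 1))
    mean = n * p                        # EV(sum) for a 0-1 box
    sd = math.sqrt(n * p * (1 - p))     # SE(sum) = sqrt(n) * SD(box)
    # Extend half a unit at each end to cover the end rectangles.
    approx = normal_cdf(hi + 0.5, mean, sd) - normal_cdf(lo - 0.5, mean, sd)
    return exact, approx

for n in (4, 100, 400, 900):
    mean, sd = n * 0.5, math.sqrt(n) * 0.5
    lo, hi = math.ceil(mean - sd), math.floor(mean + sd)
    exact, approx = compare(n, 0.5, lo, hi)
    print(f"{n:4d} tosses, {lo} to {hi} heads: "
          f"exact {exact:.4f}, normal curve {approx:.4f}")
```

The parameter `p` is there so you can try an unfair coin as well.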

Heres something else you should note. The
list of numbers that gave rise to the Four-Coin-Toss probability
histogram was [0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4].
You should figure the average and SD of that list the old way (like
you learned back before the first test). You will find that the
average is ____ and the SD is ____. Now you should also figure
the EV(sum) and SE(sum). These are:

EV(sum) = ave(box) × (# of draws) = (1/2)(4) = 2. (Is
this the average of the probability histogram?)
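To check your hand computations, here is a small sketch that figures the average and SD of the sixteen-number list "the old way" and then computes EV(sum) and SE(sum) from the box. It assumes the box for counting heads holds one 0 and one 1, and that SE(sum) = sqrt(# of draws) × SD(box); the helper names are my own.

```python
import math

# The list behind the four-coin-toss probability histogram.
outcomes = [0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4]

def average(values):
    return sum(values) / len(values)

def sd(values):
    """SD the old way: root-mean-square of deviations from the average."""
    avg = average(values)
    return math.sqrt(average([(v - avg) ** 2 for v in values]))

# Assumed box for counting heads on a fair coin: one 0 and one 1.
box = [0, 1]
n_draws = 4

ev_sum = n_draws * average(box)          # EV(sum) = (# of draws) * ave(box)
se_sum = math.sqrt(n_draws) * sd(box)    # SE(sum) = sqrt(# of draws) * SD(box)

print(f"list: average = {average(outcomes)}, SD = {sd(outcomes)}")
print(f"box:  EV(sum) = {ev_sum}, SE(sum) = {se_sum}")
```

Fill in the blanks yourself first, then run it and compare the two printed lines.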