Excel Project

We can use a spreadsheet and the RANDBETWEEN function to simulate the tossing of a fair coin 10 times. We will use a '1' to denote a head and a '2' to denote a tail. Figure 16.1 illustrates possible output of tossing a coin 10 times. To generate this spreadsheet, we place the formula

RANDBETWEEN(1,2)

in cell A2 and copy it down column A to cells A3 through A11. Next, in cell C2, we use the formula

=COUNTIF(A2:A11,1)

which counts the number of times a '1' occurs in column A.
In cell D2, we use the similar formula =COUNTIF(A2:A11,2). The beauty of the spreadsheet is that if we want to run the simulation again in Excel, we simple need to hit CTRL+ALT+F9.

Figure 16.1: Simulation of 10 coin tosses.

Suppose now we would like to repeat this experiment 10 times and a bar graph is to be drawn showing how many heads are tossed on each trial. Figure 16.2 shows one possible design.

Figure 16.2: Simulation of 10 trials of 10 coin tosses.

There are several key formulas in the makeup of this spreadsheet. In cell B3, the formula RANDBETWEEN(1,2)
is used to generate a heads or a tails (represented by the numbers '1' and '2' respectively). This formula is then copied to cells located in the block B3 to K12 in order to generate 100 coin toss outcomes.

Cell B14 contains the formula

=COUNTIF(B3:B12,1)

which counts the number of heads that occured in the first trial (10 flips). Similarly, in cell B15 we type =COUNTIF(B3:B12,2)
which counts how many tails occured in the first trial. Copying these formulas across to cells B14 through K15 finishes our counting of heads and tails in each trial.

Finally, a frequency table is made. After row 17 containing the integers between 0 and 10 is generated (how?), we place in cell B18 the formula

=COUNTIF(\$B14:\$K14,B17)

which is promptly copied to cells B18 through L18. The use of the $ in front of the column indices B and K, but not in front of the column index B, allows us to count the frequency with which each number of heads showed up in the 10 trials.

In order to visually see the results of our process, we use the spreadsheet to generate a bar graph using the information found in rows 17 and 18. The exciting part of our design is that CTRL+ALT+F9 will automatically regenerate both new data and a new bar graph. With only 10 trials, our bar graph is not very nearly normal,
but there is little more work involved in extending this process to 100 trials in which case the bar graph does appear to be more normal.

Suppose N students take a college entrance exam. Further, suppose the scores on the exam have an approximately normal distribution with mean
points and standard deviation
points. We wish to design a spreadsheet which allows the user to input N, , and , and will output to a fourth cell an estimate of the first-quartile score S for the exam. Also, suppose we wish to output to a fifth cell an estimate of the number of students scoring below the first quartile
on the exam.

We start our spreadsheet by placing the value of N in cell A3, the value of in B3, and the value of in C3. Since the first quartile
should be approximately points below the mean ,
we place the formula

=B3-0.675*C3

in cell D3. In cell E3, we type =A1/4 to calculate 1/4 of the number of students taking the exam. Figure 16.3 shows the outcome of the spreadsheet when applied to data for the verbal part of the 1996 SAT exam.

Figure 16.3: Simulation of 10 trials of 10 coin tosses.Estimating the first quartile for the 1996 SAT exam.

Given a normal distribution
with mean
and standard deviation ,
Microsoft Excel has a built in function which computes (approximately) the percent of data having a standardized value
(sometimes called a z-score)
less than x. For example, for a normal distribution
with mean =0
and standard deviation =1, the percent of data having a standardized value less than 0.675 is approximately 75.0162. The formula used to calculate this is

=NORMDIST(0.675,0,1,TRUE)

As an example of the usefulness of the NORMDIST
function, consider the verbal part of the 1996 SAT exam. The scores on the exam fit an approximately normal distibution
with mean =505
and standard deviation =110.
Suppose we would like to estimate the percentage of scores which fell between 500 and 650. The process we would use in Excel would be to first compute the standardized values of 500 and 650. For this, we could use the STANDARDIZE command. To compute the standardized value
of 500 we could use the formula

=STANDARDIZE(500,505,110)

in cell A1 which would output -0.045455. Similarly, in cell A2 we could compute the standardized value of 650 to be 1.318182. To compute the percent of standardized values falling between -0.045455 and 1.318182, we could use the formula

=NORMDIST(A2,0,1,TRUE)-NORMDIST(A1,0,1,TRUE)

which would output 0.424406. This would mean about 42.4\% of the scores fell between 500 and 650.

Walking

Exercise 16.1 An honest coin is tossed N times. Let the random variable X represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs N in one cell and the spreadsheet outputs to two other cells the mean
and standard deviation of the random variable X.

Exercise 16.2 A dishonest coin with probability of heads being tossed p is tossed N times. Let the random variable X represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs the number N in one cell and the number p in another cell and outputs to two other cells the mean
and standard deviation of the random variable X.

Exercise 16.3 Suppose N students take a college entrance exam. Further, suppose the scores on the exam have an approximately normal distribution with mean points and standard deviation points.

(a) Design a spreadsheet which allows the user to input N, , and , and will output to a fourth cell an estimate of the third-quartile score S for the exam.
(b) Design a spreadsheet which allows the user to input N, , and , and will output to a fourth cell an estimate of the inter-quartile range for the exam.
(c) Design a spreadsheet which allows the user to input N, , and , and will output to two other cells estimates of the maximum and minimum for the exam.

Exercise 16.4 The scores on the verbal part of the 1996 SAT exam fit an approximately normal distribution
and standard deviation =110.
Use a spreadsheet to compute the percent of scores falling between 290 and 720.

Jogging

Exercise 16.5 [Excursions, Chapter 16, Exercises 13, 14]
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input , , and a number x and will output to a fourth cell the standardized value
of x. Note: In Excel, the command

STANDARDIZE(x,mean,standard deviation)

can be used to simplify the process. However, this exercise is for you to design a spreadsheet that acts like the STANDARDIZE function.

Exercise 16.6 [Excursions, Chapter 16, Exercise 17]
Suppose that a normal distribution has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input , , and a standardized value z and will output to a fourth cell the data value having standardized value z.
Test your design on input of =183.5, =31.2, and z=-1.

Exercise 16.7 [Excursions, Chapter 16, Exercise 19]
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input and a standardized value z for another input data value x and will output to a fourth cell the standard deviation .
Test your design on input of =50, x=84, and z=2.

Exercise 16.8 [Excursions, Chapter 16, Exercise 21]
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input and a standardized value z for another input data value x and will output to a fourth cell the mean .
Test your design on input of =15, x=50, and z=3.

Exercise 16.9 Suppose that a normal distribution
has a standard deviation
of . Design a spreadsheet that allows the user to input and two numbers x and y (y>x) and will output to a fourth cell how far apart x and y are (as measured in standard deviations).

Exercise 16.10 A dishonest coin with probability of heads being tossed p is tossed N times. Let the random variable X represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs the number N in one cell and the number p in another cell and outputs to two other cells the first
and third quartile
of the random variable X.

Running

Exercise 16.11 An assembly line produces 100,000 light bulbs a day, 20% of which generally turn out to be defective. Suppose we draw a random sample of n=100 light bulbs. Let X represent the number of defective light bulbs in the sample.

(a) Use a spreadsheet to compute the percentage chance that the number of defective light bulbs in the sample will be less than 25.
(b) Use a spreadsheet to compute the percentage chance that the number of defective light bulbs in the sample will be greater than 17.

Exercise 16.12 [Excursions, Chapter 16, Exercise 67]
An urn contains 10,000 beads of which 20% are red and the rest white. Suppose that we draw a sample of size 400 (drawing a bead and each time replacing it in the urn before drawing again). What are the approximate chances that the number of white beads in the sample will be 310?