If you were the owner and wanted to show how well you paid, you would say your plant paid an average salary of $20,333. If you were a worker who wanted an increase, you would say that the average wage was $12,000.
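The two "averages" differ because they are different measures: the owner's figure is presumably the mean and the worker's the median. A minimal sketch in Python, assuming a hypothetical list of nine salaries chosen only so that it reproduces the figures quoted in this section (the actual XYZ Plant table is not reproduced here):

```python
from statistics import mean, median

# Hypothetical salaries (an assumption), chosen to match the quoted figures.
salaries = [8000, 9000, 12000, 12000, 12000, 15000, 15000, 40000, 60000]

owner_average = round(mean(salaries))   # the mean, pulled up by the two large salaries
worker_average = median(salaries)       # the median, the middle value of the sorted list

print("mean:", owner_average)     # 20333
print("median:", worker_average)  # 12000
```

Both statements are true of the same data; only the choice of measure changes.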

To look at the variation in this set of data we want to find the range and make a box plot. The range of a set of data is the difference between the largest value and the smallest. Using the data from the XYZ Plant above, the range equals $60,000 - $8,000 = $52,000.

Range = largest value - smallest value

The range gives us an indication of how far the data is spread. It is an indicator of variation, but it does not give us any information about how the individual values are distributed or how they vary. For this a box plot can be helpful.

(figure available in print form)

The box plot is a quick way of seeing how the data is distributed. There is a lot of information presented. To construct a box plot, make a line and impose a scale on it that will include your lowest and highest values. Next plot your values by putting an x above the proper number. If there is more than one of the same value, stack them.

Next draw a light dotted line down indicating the median value. For our data it is $12,000. Next put a dot indicating the lowest and highest values. There are only two more numbers we have to find, the upper and lower quartiles. The upper quartile is the median of the data above the set median. The lower quartile is the median of the data below the set median. For our example, the set median is $12,000. There are four values above it and four below. The median of the upper four values is (40,000 + 15,000)/2 = 27,500; this is called the upper quartile. Doing similarly for the lower quartile we get (12,000 + 9,000)/2 = 10,500. Mark lines indicating the quartiles and draw a box going from the upper quartile to the lower quartile as shown. Finally draw a line going from the upper quartile end of the box to the highest value, and similarly a line going from the lower quartile end to the lowest value. These end lines are called the whiskers of the box plot. This is quick and easy once you have learned the pattern.
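The five numbers needed for the box plot (lowest value, lower quartile, median, upper quartile, highest value) can be checked with a short Python sketch. It follows the rule described above, taking each quartile as the median of the values on one side of the set median. The salary list is a hypothetical one, chosen only to agree with the figures quoted in this section; note that the mean of 15,000 and 40,000 works out to 27,500.

```python
from statistics import median

def five_number_summary(values):
    """Return (low, lower quartile, median, upper quartile, high).

    Quartiles are the medians of the values below and above the
    middle position of the sorted data, as described in the text.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                # values below the median position
    upper = data[len(data) - half:]    # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical salaries (an assumption), not the original plant table.
salaries = [8000, 9000, 12000, 12000, 12000, 15000, 15000, 40000, 60000]

low, q1, med, q3, high = five_number_summary(salaries)
print("range:", high - low)
print("box from", q1, "to", q3, "with the median line at", med)
print("whiskers reach down to", low, "and up to", high)
```

Other textbooks and software define quartiles in slightly different ways, so a calculator or spreadsheet may give somewhat different box ends for the same data.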

Taking a look at some of the information presented in our box plot, notice that the line in the center area of the box is the median, which tells us that half the values are greater than or equal to it and half the values are less than or equal to it. The upper end of the box divides the upper-valued data in half, and the lower end of the box does the same for the lower-valued data. The shape of the box will change as the set of data changes. It is a model that makes it easy to talk about distributions. One can easily say and understand things like, “There are 2 values on the lower whisker of this box whereas the last one we did had 6.”

However we read it, the first number that is from 01 to 20 inclusive gives us the first member of the committee, and we keep going, discarding numbers that do not have meaning for our task. If we are selecting a committee of three, we keep going until we get three numbers from 01 to 20 inclusive, and the names that correspond to these numbers are the ones on our committee. If we use the table above and keep counting the way we originally began, we would read the numbers 78, 85, 53, 32, 21, 12, 22, 22, 21, 11, 17. That would give us 12, 11 and 17. The names corresponding to 12, 11 and 17 would make up our class’s randomly selected committee. The task is now completed.
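The keep-or-discard reading can be written as a short Python sketch. The function name `select_from_table` is hypothetical, and skipping a number that has already been chosen is an added assumption (a committee needs three different people); the list of numbers is the one read from the table above.

```python
def select_from_table(numbers, low, high, count):
    """Read numbers in order and keep the first `count` distinct
    values that fall between `low` and `high` inclusive."""
    chosen = []
    for n in numbers:
        # Discard numbers outside the range and (by assumption) repeats.
        if low <= n <= high and n not in chosen:
            chosen.append(n)
        if len(chosen) == count:
            break
    return chosen

reading = [78, 85, 53, 32, 21, 12, 22, 22, 21, 11, 17]
committee = select_from_table(reading, 1, 20, 3)
print(committee)  # [12, 11, 17]
```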

Once the class has selected a committee of three together and each student has then selected his or her own committee (#2 on Form E), most students will probably be able to do the random sample problem, #3 on Form E.

#3 Form E. A batch of 200 new cars has just been completed. Your job is to randomly select 15 of the cars for a special safety check.

a. Describe how to do this.

b. Select the 15 cars. Use the random number table on the handout.

c. List the 15 numbers selected.

For this problem you will want a larger random digit table than the ones generated (Form F). A classroom set of copies of the random digit table is needed. Solution:

a. Number the new cars 001 to 200 inclusive, so that each of the 200 cars has exactly one number. Arbitrarily select a place to start on the larger table, decide whether to read across, down or diagonally, and begin reading in groups of three digits. Any three-digit number from 001 to 200 inclusive we keep, and any others we discard, along with any number that has already been chosen. Continue until we have 15 useful numbers. The cars with these numbers will be used for the special check.
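For a class with computer access, the same procedure can be sketched in Python. This is only an illustration under stated assumptions: the cars are numbered 001 to 200 so that each car has exactly one number, the "table" is a list of digits produced by Python's `random` module rather than the Form F handout, repeated numbers are discarded, and the function names are hypothetical.

```python
import random

def random_digit_table(n_digits, seed):
    """Stand-in for a printed random digit table (an assumption)."""
    rng = random.Random(seed)
    return [rng.randrange(10) for _ in range(n_digits)]

def pick_cars(digits, n_cars=200, sample_size=15):
    """Read the digits in groups of three, keeping numbers 001..n_cars."""
    chosen = []
    for i in range(0, len(digits) - 2, 3):
        number = digits[i] * 100 + digits[i + 1] * 10 + digits[i + 2]
        # Keep only usable car numbers that have not been picked already.
        if 1 <= number <= n_cars and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample_size:
                break
    return chosen

table = random_digit_table(3000, seed=1)
cars = pick_cars(table)
print(sorted(cars))
```

Roughly four out of every five groups are discarded, which is why a larger table is needed for this problem than for the committee of three.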

Two possible extensions might be to use this method to take a survey or to do a simulation problem. To take a survey of the student body requires that several decisions be made. One decision is what question or questions do you want to ask? Since this unit deals with numerical values, you’ll want numerical data back so you can evaluate it using the techniques from Section I. Possible questions could be “How much soda do you drink in a week?” or “What do you expect your annual income to be ten years from now?”

Another decision is how large a sample do you want? What is an adequate size? Too small and it may not be valid. Too large a sample may be too much work to do. Thirty seems to be a good size with which to work. Once you have the size of your sample, how will you go about getting a random sample, gathering the data, analyzing the results? Can you publish the results in the school newspaper?

The second extension could be this simulation problem from Understandable Statistics, Brase/Brase, p. 13.

A single pollen grain floating on the surface of water will move randomly from the impact of the water molecules. The task is to chart the course of a pollen grain as it moves on a drop of water for seven position changes. A problem, however, is that the pollen grain is so small and its movements are so fast that you would need to use a microscope and slow motion camera to see the changes. Since you do not have this equipment, you will have to use a random number table to simulate the observed direction of the pollen grain for seven position changes.
Instructions. Allow that for each position change, the pollen grain is in the center of a circle marked in degrees as shown below. 0 degrees indicates east, 90 degrees indicates north, 180 degrees indicates west, and 270 degrees indicates south. The arrow points to the direction of change.

(figure available in print form)

Solution. Using a random number table, arbitrarily decide where to begin and in which direction you will read. Then, since there are 360 possible directions (000 to 359), begin reading in groups of three digits. Keep the numbers that are between 000 and 359 inclusive and discard those that are not. When you have seven such numbers, chart the position changes according to the instructions above. A possible solution might look as follows.
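The charting step can also be sketched in Python. The assumptions here are not part of the original problem: every position change moves the grain the same distance (one unit), the grain starts at the origin, and the three-digit groups come from Python's `random` module instead of a printed table. Directions may repeat, so repeated numbers are kept.

```python
import math
import random

def simulate_pollen(angles_deg, step=1.0):
    """Return the list of (x, y) positions, starting at the origin.

    0 degrees is east (+x) and 90 degrees is north (+y), as in the
    instructions; the equal step length is an assumption.
    """
    x, y = 0.0, 0.0
    path = [(x, y)]
    for angle in angles_deg:
        rad = math.radians(angle)
        x += step * math.cos(rad)
        y += step * math.sin(rad)
        path.append((x, y))
    return path

rng = random.Random(7)
angles = []
while len(angles) < 7:
    group = rng.randrange(1000)   # stands in for three digits read from the table
    if group <= 359:              # keep 000..359, discard the rest
        angles.append(group)

path = simulate_pollen(angles)
for angle, (x, y) in zip(angles, path[1:]):
    print(f"direction {angle:3d} degrees -> position ({x:6.2f}, {y:6.2f})")
```

Students can plot the printed positions on graph paper and connect them in order to see the grain's path.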

This can be written as P(E) = n(E)/n(S), where P(E) means the probability of event E happening, n(E) means the number of times E could happen, and n(S) means the number in the sample set, which is the total number of possible outcomes.

Consider a die. It has six surfaces, and each surface has a set of 1, 2, 3, 4, 5, or 6 dots on it. If I roll a die, the only possible outcomes are 1, 2, 3, 4, 5 or 6. These six elements make up the sample set for the rolling of the die.

If I roll a die, I can ask for the probability of different events happening. What is the probability of each of the following?

a. P(1) = ___

b. P(even number) = ___

c. P(8) = ___

d. P(n > 5) = ___ where n means the number on the die

e. P(odd number) = ___

f. P(n < 7) = ___

Since each of these is answered by P(E) = n(E)/n(S), the answers are as follows.

(figure available in print form)

Notice 0 means there is no possibility the event will happen. 1 means it will always happen. The probability of an event will always be between 0 and 1 or equal to one of them.
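Since the printed answers are not reproduced here, the six exercises can be checked by counting with a short Python sketch of P(E) = n(E)/n(S). Two things are assumptions: the helper name `probability` is hypothetical, and item f is read as P(n < 7), which is the reading that gives an event that always happens.

```python
from fractions import Fraction

sample_set = [1, 2, 3, 4, 5, 6]   # the six possible outcomes of one roll

def probability(event):
    """P(E) = n(E)/n(S); `event` is a test applied to each outcome."""
    favorable = [n for n in sample_set if event(n)]
    return Fraction(len(favorable), len(sample_set))

answers = {
    "a. P(1)": probability(lambda n: n == 1),
    "b. P(even number)": probability(lambda n: n % 2 == 0),
    "c. P(8)": probability(lambda n: n == 8),
    "d. P(n > 5)": probability(lambda n: n > 5),
    "e. P(odd number)": probability(lambda n: n % 2 == 1),
    "f. P(n < 7)": probability(lambda n: n < 7),   # assumed reading of item f
}

for label, p in answers.items():
    print(label, "=", p)
```

Every answer falls between 0 and 1 inclusive, with item c showing an impossible event and item f a certain one.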

(figure available in print form)

If I roll a die, P(2) = 1/6. This could be written as 1/6 or as approximately 0.167. P(n < 4) = 1/2 or 0.5. This may be an interesting way to review students’ basic skills in fractions and decimals. We’ll use both below.

I want to roll a die 12 times to see if the probability of getting 4 really is 1/6 as indicated by the definition.

Theoretical probability is what we have been talking about up to this point. Now we want to move out of the theoretical into the real world and try out that probability with a real die. I’ll now roll a die 12 times.

results: theoretical probability P(4) = 1/6

experimental probability EP(4) = 4/12 = 1/3

In class one student could roll the die, another could tally it on the board. If we’re lucky there will be a discrepancy to point out the difference between theoretical and experimental probability with a small sample of 12.

At this point, students can roll dice themselves and record how many times 4 comes up. Their results could be organized as follows.

Times Die Rolled    Number of 4’s    P(fraction)    P(decimal)
12                  ___              ___            ___
20                  ___              ___            ___
30                  ___              ___            ___

For each of the above three experiments, have students calculate the experimental probabilities they find using both fractional and decimal forms.

Notice that by using the basic definition of probability we can find simple probabilities, both theoretical (the probabilities that you might expect) and experimental (the probabilities you get in the real world by doing experiments like rolling a die, flipping a coin, or drawing a card from a deck). Further, with small samples the experimental probability might be quite different from the theoretical, but as we increase the number of tries, that is, as the number in our sample increases, the experimental probability tends to move closer and closer to the theoretical. How large a sample is needed? Again, 30 is usually considered to be a fairly reliable sample.
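Where rolling real dice many times is impractical, the experiment can be imitated in Python. This is a sketch under assumptions: the die is simulated with `random.randint`, the function name `experimental_probability` is hypothetical, and the larger sample sizes (600 and 60,000) are added here only to show the trend. The 12, 20 and 30 rows match the classroom table.

```python
import random
from fractions import Fraction

def experimental_probability(rolls, target=4, seed=None):
    """Roll a simulated die `rolls` times; return (hits, hits/rolls)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(rolls) if rng.randint(1, 6) == target)
    return hits, Fraction(hits, rolls)

theoretical = Fraction(1, 6)
print("theoretical P(4) =", theoretical, "=", round(float(theoretical), 3))

for rolls in (12, 20, 30, 600, 60000):
    hits, ep = experimental_probability(rolls, seed=rolls)
    # Show the experimental probability in both fraction and decimal form.
    print(rolls, "rolls:", hits, "fours, EP(4) =", ep, "=", round(float(ep), 3))
```

Changing the `seed` values gives a different set of rolls each time, much as different students get different tallies.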

Other easy activities done in the same or similar way are to ask how the theoretical and experimental probabilities compare for P(n > 1) in rolling a die, or for P(T) the probability of getting tails when flipping a coin, or P(2) the probability of drawing a two from a pack of cards.

In summary, with simple probability problems we can use the basic definition of probability to experiment with the difference between theoretical and experimental probabilities. The “simple” here means problems where it is easy to count the numbers you need as opposed to more difficult probability problems where the basic idea is the same but the counting of needed numbers becomes more difficult.