MM207
Statistics
Welcome to the Unit 8
Seminar
Prof. Charles Whiffen
Sampling Distributions
• Sampling Distributions: A sampling distribution is a
distribution of statistics obtained by selecting all the
possible samples of a specific size from a population.
• Distribution of Sample Means: A sampling distribution
of the mean gives all the values the mean can take, along
with the probability of getting each value if sampling is
random from the null-hypothesis population.
• Distribution of Sample Proportions: The distribution
that results when we find the proportions (p̂) in all possible
samples of a given size.
Sampling Error
• Sampling Error: The discrepancy between the statistic
obtained from the sample and the parameter for the
population from which the sample was obtained. For
example, the mean (x̄) calculated from a sample will not
always equal the population mean (μ).
Central Limit Theorem*
• Central Limit Theorem: For any population
with mean μ and standard deviation σ, the
distribution of sample means x̄ for sample size
n will have a mean of μ and a standard
deviation of σ/√n, and will approach a normal
distribution as n approaches infinity (n > 30 is
the general rule).
* See Page 217
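The theorem can be checked empirically. The sketch below is an illustration only: the skewed (exponential) population, the sample size n = 40, the number of samples, and the seed are all assumptions chosen for the demonstration, not values from the seminar.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, rate=1.0, seed=207):
    """Draw num_samples random samples of size n from an exponential
    population and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(num_samples)
    ]


# Exponential population with rate 1: mu = 1 and sigma = 1 (strongly skewed).
mu, sigma = 1.0, 1.0
n = 40  # above the n > 30 rule of thumb
means = simulate_sample_means(n, num_samples=20_000)

print("mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:  ", round(statistics.pstdev(means), 3))
print("sigma / sqrt(n):     ", round(sigma / n ** 0.5, 3))
```

Even though the population is far from normal, the mean of the sample means should land close to μ and their standard deviation close to σ/√n.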
Distribution of Sample Means Example
• Consider the following data as a Population
2, 4, 6, 8
• The population mean is 5
• The population standard deviation is 2.236
• Now we are going to take ALL possible samples
of n = 2 from this population.
• We will calculate the mean for each sample
Sampling Distribution of Means for Samples of n = 2
Pick 1   Pick 2   Mean   Mean²   Variance   Standard Deviation
2        2        2      4       0          0.000
2        4        3      9       2          1.414
2        6        4      16      8          2.828
2        8        5      25      18         4.243
4        2        3      9       2          1.414
4        4        4      16      0          0.000
4        6        5      25      2          1.414
4        8        6      36      8          2.828
6        2        4      16      8          2.828
6        4        5      25      2          1.414
6        6        6      36      0          0.000
6        8        7      49      2          1.414
8        2        5      25      18         4.243
8        4        6      36      8          2.828
8        6        7      49      2          1.414
8        8        8      64      0          0.000
Total             80     440
Central Limit Theorem Applied
• x̄ = 80/16 = 5, which equals the population mean. So we have shown that the
mean of the sample means is equal to μ, the population mean.
• The standard deviation of the sample means is
σx̄ = √[(ΣX² – (ΣX)²/N) / N]
= √[(440 – (80)²/16) / 16] (notice we divide by N since this is a population)
= √(40/16)
= √2.5
= 1.58
• Now we will calculate what the Central Limit Theorem tells us the standard
deviation will be. It is
σx̄ = σ/√n
= 2.236 / √2
= 2.236 / 1.4142
= 1.58
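The same check can be run in a few lines of Python. This is a minimal sketch, assuming (as the 16-row table does) that the two picks are made with replacement and that order matters; the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]
n = 2

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)      # 80 / 16
sd_of_means = pstdev(sample_means)      # divides by N (population formula)
clt_sd = pstdev(population) / sqrt(n)   # sigma / sqrt(n)

print(len(samples), mean_of_means, round(sd_of_means, 2), round(clt_sd, 2))
```

Both standard deviations round to 1.58, matching the hand calculation.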
Distribution of Sample Proportions
The distribution of sample proportions is the distribution
that results when we find the proportions (p̂) in all possible
samples of a given size.
The larger the sample size, the more closely this distribution
approximates a normal distribution.
In all cases, the mean of the distribution of sample
proportions equals the population proportion.
If only one sample is available, its sample proportion, p̂, is
the best estimate for the population proportion, p.
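The statement that the mean of the distribution of sample proportions equals the population proportion can be checked exactly for small cases. The sketch below assumes independent draws (a binomial model); the value p = 0.3 and the sample sizes are hypothetical.

```python
from math import comb


def mean_of_sample_proportions(p, n):
    """Exact mean of p-hat over all possible samples of size n,
    weighting each success count k by its binomial probability."""
    return sum(
        (k / n) * comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


p = 0.3  # hypothetical population proportion
for n in (5, 20, 100):
    print(n, round(mean_of_sample_proportions(p, n), 6))
```

For every sample size the result agrees with p up to floating-point rounding.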
Margin of Error
The margin of error for the 95% confidence interval is
margin of error = E ≈ 1.96s/√n
where s is the standard deviation of the sample.
Note: Our text uses 2, but use 1.96 for 95% confidence instead.
We find the 95% confidence interval by adding and subtracting the margin of
error from the sample mean. That is, the 95% confidence interval ranges
from (x̄ – margin of error) to (x̄ + margin of error)
We can write this confidence interval more formally as
x̄ – E < μ < x̄ + E
or more briefly as
x̄ ± E
95% Confidence Interval
Constructing a Confidence Interval
• A study finds that the average time spent by eighth-graders
watching television is 6.7 hours per week, with a margin of
error of 0.4 hour (for 95% confidence). Construct and interpret
the 95% confidence interval.
• The best estimate of the population mean is the sample mean,
x̄ = 6.7 hours.
• We find the confidence interval by adding and subtracting the
margin of error from the sample mean, so the interval extends
from 6.7 – 0.4 = 6.3 hours to 6.7 + 0.4 = 7.1 hours.
Interpreting the Confidence Interval
Figure 8.10 This figure illustrates the idea behind confidence intervals. The
central vertical line represents the true population mean, μ. Each of the 20
horizontal lines represents the 95% confidence interval for a
particular sample, with the sample mean marked by the dot in the
center of the confidence interval. With a 95% confidence interval, we
expect that 95% of all samples will give a confidence interval that
contains the population mean, as is the case in this figure, for 19 of
the 20 confidence intervals do indeed contain the population mean.
We expect that the population mean will not be within the confidence
interval in 5% of the cases; here, 1 of the 20 confidence intervals
(the sixth from the top) does not contain the population mean.
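The "95% of all samples" reading can be illustrated with a small simulation. This is a sketch under stated assumptions: the normal population (mean 100, standard deviation 15), the sample size of 50, the number of trials, and the seed are all hypothetical choices, not values from the text.

```python
import random
import statistics


def coverage(mu, sigma, n, trials, seed=8):
    """Fraction of simulated 95% confidence intervals that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        s = statistics.stdev(sample)   # sample standard deviation
        e = 1.96 * s / n ** 0.5        # margin of error
        if x_bar - e < mu < x_bar + e:
            hits += 1
    return hits / trials


# Hypothetical population: mean 100, standard deviation 15, samples of 50.
rate = coverage(mu=100, sigma=15, n=50, trials=2000)
print(f"{rate:.1%} of intervals contain the population mean")
```

The fraction should come out near 95%, with some run-to-run variation if the seed is changed.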
Determine Minimum Sample Size
• Solve the margin of error formula [E = 1.96s/√n] for n:
n = (1.96s / E)²
Note: Our text uses 2, but use 1.96 for 95% confidence instead.
• You want to study housing costs in the country by sampling recent house
sales in various (representative) regions. Your goal is to provide a 95%
confidence interval estimate of the housing cost. Previous studies suggest
that the population standard deviation is about $7,200. What sample size
(at a minimum) should be used to ensure that the sample mean is within
• a. $500 of the true population mean?
n = (1.96σ / E)² = (1.96 × 7,200 / 500)² = (28.2)² = 796.6
Since n must be a whole number, round up: a sample of at least 797 house
sales is needed.
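The sample-size calculation is easy to script. The function below is a minimal sketch; its name and parameters are illustrative, and it makes the round-up step explicit because a fractional sample size is not possible.

```python
from math import ceil


def min_sample_size(sigma, margin, z=1.96):
    """Smallest whole n for which z * sigma / sqrt(n) <= margin."""
    return ceil((z * sigma / margin) ** 2)


# Housing example: sigma is about $7,200, desired margin of error $500.
n = min_sample_size(sigma=7200, margin=500)
print(n)
```

Passing a smaller margin shows how quickly the required sample grows, since n scales with 1/E².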
Core Logic of Hypothesis Testing
• Considers the probability that the
result of a study could have come
about if the experimental procedure
had no effect
• If this probability is low, scenario of no
effect is rejected and the theory behind
the experimental procedure is
supported
Hypothesis Testing using
Confidence Intervals
• State the claim about the population mean
• Determine desired confidence level
• Select a random sample from the population
• Calculate the confidence interval for the desired
level of confidence.
• If the claim is contained within the interval, the
claim is reasonable; if it is not within the interval,
the claim is not reasonable, at the given level of
confidence.
• See Testing a Claim document in Doc Sharing
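The decision rule in the last step can be written as a one-line check. This sketch reuses the interval from the television example (6.7 ± 0.4 hours); the claimed values 7.0 and 7.5 are hypothetical, and the function name is illustrative.

```python
def claim_is_reasonable(claimed_mean, x_bar, margin):
    """True when the claimed population mean lies inside x_bar ± margin."""
    return x_bar - margin < claimed_mean < x_bar + margin


# Interval from the television example: 6.7 ± 0.4 hours.
# The claimed values 7.0 and 7.5 are hypothetical.
print(claim_is_reasonable(7.0, x_bar=6.7, margin=0.4))
print(claim_is_reasonable(7.5, x_bar=6.7, margin=0.4))
```

A claim of 7.0 hours falls inside the interval and is reasonable at the 95% level; a claim of 7.5 hours falls outside it and is not.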
Using StatCrunch
CI for Mean Example – IQ
Results from an IQ test administered to 45
randomly selected high school seniors showed a
mean IQ of 100 with a standard deviation of 16.
1. Find the point estimate for the mean IQ for all
high school seniors.
2. Find a 95% CI (confidence interval) for the mean
IQ for all high school seniors.
3. Find the MOE (margin of error).
Stat > Z statistics > One sample >
with summary
One sample Z stats with summary
Note: StatCrunch will use 1.96 instead of 2 for
the 95% confidence level.
(1) 100; (2) 95.3 < µ < 104.7; (3) MOE = 4.7
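These results can be reproduced by hand with the margin of error formula E = 1.96s/√n. The snippet below is a sketch of that arithmetic, not a description of StatCrunch internals; the function name is illustrative.

```python
from math import sqrt


def z_interval(x_bar, s, n, z=1.96):
    """Return (margin of error, lower bound, upper bound) for a mean."""
    e = z * s / sqrt(n)
    return e, x_bar - e, x_bar + e


# IQ example: n = 45, sample mean 100, standard deviation 16.
e, low, high = z_interval(x_bar=100, s=16, n=45)
print(f"MOE = {e:.1f}; {low:.1f} < mu < {high:.1f}")
```

Rounded to one decimal place, the margin of error and both bounds agree with the values above.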
Stat > Z statistics > One sample >
with data (similar process)
CI for Proportion Example – Menus
A study of 105 randomly selected restaurants in
Springfield found that 67 have a kids’ menu.
1. Find the point estimate for the proportion of all
restaurants which have a kids’ menu.
2. Find a 90% CI (confidence interval) for the
proportion of all restaurants in Springfield which
have a kids’ menu.
3. Find the MOE (margin of error).
Stat > Proportions > One sample >
with summary
One sample Proportion w/ summary
(1) 0.638; (2) 0.561 < p < 0.715; (3) MOE = 0.077
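The proportion results can also be checked by hand. The slides do not state a formula for the margin of error of a proportion, so the sketch below assumes the standard (Wald) form E = z·√(p̂(1 – p̂)/n) with z = 1.645 for 90% confidence; the function name is illustrative.

```python
from math import sqrt


def proportion_interval(successes, n, z):
    """Standard (Wald) interval: p-hat ± z * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    e = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, e, p_hat - e, p_hat + e


# z = 1.645 is the usual critical value for 90% confidence (assumed here).
p_hat, e, low, high = proportion_interval(successes=67, n=105, z=1.645)
print(f"p-hat = {p_hat:.3f}; MOE = {e:.3f}; {low:.3f} < p < {high:.3f}")
```

Under that assumption the point estimate, margin of error, and interval bounds round to the values shown above.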
QUESTIONS?
Review of Unit 8 Work
By Tuesday at Midnight
you must complete:
• Initial post to one
discussion question
• Two responses to
other student posts
to discussion
questions
• Live Binder
• MSL HW
• MSL Quiz