Stats without Tears
Solutions for Chapter 11

(a) Use MATH200A part 5 and select 2-pop binomial. You have no
prior estimates, so enter 0.5 for p̂1 and p̂2. E is 0.03,
and C-Level is 0.95. Answer: you need
at least 2135 per sample: 2135 people under 30 and 2135
people aged 30 and older.

Caution! Even if you don’t identify
the groups, at least you must say “per sample”.
Plain “2135” makes it look like you need only that many
people in the two groups combined, or around 1068 per group, and that
is very wrong.

Caution! You must compute this as a
two-population case. If you compute a sample size for just one group
or the other, you get 1068, which is just about half of the correct
value.

If you don’t have the program, you have to use the
formula:
[p̂1(1−p̂1)+p̂2(1−p̂2)]·(zα/2/E)².
You don’t have any prior estimates, so p̂1
and p̂2 are both equal
to 0.5. Add
p̂1 × (1−p̂1) + p̂2 ×
(1−p̂2) to get 0.25 + 0.25 = .5.

Next,
1−α = 0.95, so α = 0.05
and α/2 = 0.025. zα/2 =
z0.025 = invNorm(1−0.025). Divide that by E
(.03), square, and multiply by the result of the computation with the
p̂’s.
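
For readers who prefer to check the arithmetic in code, here is a minimal Python sketch of the formula above. The function name `two_pop_sample_size` and its parameter names are made up for this illustration; `NormalDist().inv_cdf` from the standard library plays the role of invNorm.

```python
import math
from statistics import NormalDist

def two_pop_sample_size(p1, p2, margin, conf_level):
    """Minimum size of EACH sample for estimating p1 - p2 within `margin`.

    Hypothetical helper for illustration; it applies
    [p1(1-p1) + p2(1-p2)] * (z_{alpha/2} / E)^2 and rounds up.
    """
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)      # same idea as invNorm(1 - alpha/2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)  # 0.25 + 0.25 = 0.5 with no prior estimates
    return math.ceil(variance_sum * (z / margin) ** 2)

n_per_sample = two_pop_sample_size(0.5, 0.5, 0.03, 0.95)
print(n_per_sample)  # 2135, per sample
```

The result is always rounded up, because rounding down would leave the margin of error slightly larger than requested.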

Alternative solution:
Using the formula, .3(1−.3)+.45(1−.45) =
.4575. Multiply by (invNorm(1−.05/2)/.03)² as before to
get 1952.74157 → 1953 per sample.

Again, you must do this as two-population binomial. If you do
the under-30 group and the 30+ group separately, you get sample sizes
of 897 and 1057, which are way too small. If your samples are that
size, the margins of error for under-30 and 30+ will each be 3%, but
the margin of error for the difference, which is what you
care about, will be around 4.2%, and
that’s greater than the desired 3%.
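
The 4.2% figure can be checked with a short Python sketch. It assumes the prior estimates 0.30 and 0.45 go with the sample sizes 897 and 1057 in that order; the function name `margin_of_error_diff` is invented for this example.

```python
import math
from statistics import NormalDist

def margin_of_error_diff(p1, n1, p2, n2, conf_level):
    """Margin of error for the difference p1 - p2 (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z * std_error

# Sample sizes from treating each group as a separate one-population problem
moe_separate = margin_of_error_diff(0.30, 897, 0.45, 1057, 0.95)

# Sample sizes from the two-population computation
moe_two_pop = margin_of_error_diff(0.30, 1953, 0.45, 1953, 0.95)

print(round(moe_separate, 3))  # 0.042
print(moe_two_pop <= 0.03)     # True
```

The two variances add when you subtract the proportions, which is why samples sized for a 3% margin on each group separately give a wider margin on the difference.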

2

(a) You have numeric data in two independent samples.
You’re testing the difference between the means of two
populations,
Case 4 in Inferential Statistics: Basic Cases.
(The data aren’t paired because you have no reason to associate
any particular Englishman with any particular Scot.)

The problem states that samples were random.
For English, r=.9734 and crit=.9054; for Scots, r=.9772 and
crit=.9054. Both r’s are greater than crit, so both are
nearly normally distributed. The stacked boxplot shows no outliers.
And obviously the samples of 8 are far less than 10% of the
populations of England and Scotland.

At the 0.05 level of significance,
we can’t say whether English or Scots have a stronger liking for soccer.
Or: We can’t say whether English or Scots have a stronger liking for soccer (p = 0.0690).

(b) Requirements are already covered.
2-SampTInt, C-Level=.90
Results: (−.2025, 3.5775)
We’re 90% confident that, on a scale from 1=hate to 10=love, the average Englishman likes soccer between 0.2 points less and 3.6 points more than the average Scot.

The English and Scots are not equally likely to be soccer fans, at the 0.05 level of significance;
in fact the English are less likely to be soccer fans.
Or: The English and Scots are not equally likely to be soccer fans (p = .0308);
in fact the English are less likely to be soccer fans.

(b) Requirements already checked.

2-PropZInt with C-Level = .95 →
(−.1919, −.0081)

That’s the estimate for p1−p2, English
minus Scots. Since that’s negative, English like soccer less
than Scots do.
With 95% confidence, Scots are more likely than English to be soccer fans, by 0.8 to 19.2 percentage points.

Remark:
If this were a research study, they would probably test for a
difference in HDL, not just an increase. Maybe this
study was done by a fitness center or a running-shoe company. They
would want to find an increase, and HDL
decreasing or staying the same
would be equally uninteresting to them.

At the 0.05 level of significance, running 4 miles daily for six months raises HDL level.
Or: Running 4 miles daily for six months raises HDL level (p = 0.0188).

(b) TInterval with C-Level .9 gives (1.3951, 7.8049).

Interpretation:
You are 90% confident that running an average of four miles a day for six months will raise HDL by 1.4 to 7.8 points for the average woman.

Caution! Don’t write something like
“I’m 90% confident that HDL will be 1.4 to 7.8”. The
confidence interval is not about the HDL level; it’s about the
change in HDL level.

Remark:
Notice the correspondence between hypothesis test and
confidence interval. The one-tailed HT at α = 0.05
is equivalent to a two-tailed HT at α = 0.10, and
the complement of that is a CI at 1−α = 0.90
or a 90% confidence level. Since the HT did find a statistically
significant effect, you know that the CI will not include 0. If the HT
had failed to find a significant effect, then the CI would have
included 0. See Confidence Interval and Hypothesis Test.
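
The correspondence can be illustrated with a small Python sketch. Two simplifications are assumed here: it uses the normal distribution rather than the t distribution that this problem actually calls for, and the (estimate, standard error) pairs are hypothetical numbers, not data from the problem. The same logic holds with t critical values.

```python
from statistics import NormalDist

alpha = 0.05

# Critical value for a right-tailed test at alpha = 0.05
z_one_tailed = NormalDist().inv_cdf(1 - alpha)

# Critical value for a 90% confidence interval (two-tailed alpha = 2 * 0.05 = 0.10)
z_interval = NormalDist().inv_cdf(1 - (2 * alpha) / 2)

same_critical_value = (z_one_tailed == z_interval)

def agree(estimate, std_error):
    """True when the right-tailed test and the 90% CI reach the same verdict."""
    rejects_h0 = estimate / std_error > z_one_tailed
    ci_low = estimate - z_interval * std_error
    return rejects_h0 == (ci_low > 0)

# Hypothetical estimates with a hypothetical standard error of 2.0
checks = [agree(est, 2.0) for est in (0.5, 2.0, 3.2, 3.4, 6.0)]
print(same_critical_value, all(checks))
```

Because both procedures use the same critical value, the test finds a significant increase exactly when the lower end of the interval is above 0.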

5

(a) Each participant either had a heart attack or didn’t,
and the doctors were all independent in that
respect. This is
binomial data. You’re testing the difference in proportions
between two populations,
Case 5 in Inferential Statistics: Basic Cases.

10n1 = 10×11,037 = 110,370.
According to
A Census of Actively Licensed Physicians in the United
States, 2010
(Young (2011) [see “Sources Used” at end of book]),
in that year there were 850,085 actively licensed physicians in the US.
Even if we assume half were women and there were fewer
doctors in 1982 when the study began, 10n1 is still well below the number of physicians.
10n2 = 10×11,034 = 110,340, also within the limit.
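
As a quick check of the 10n ≤ N requirement, here is a Python sketch. The helper name `within_ten_percent` is invented for this example, and halving the 2010 count is the same deliberately cautious assumption made in the text, not a real population figure.

```python
def within_ten_percent(n, population):
    """True when the sample is no more than 10% of the population (10n <= N)."""
    return 10 * n <= population

licensed_2010 = 850_085                   # actively licensed US physicians, 2010
assumed_population = licensed_2010 // 2   # cautious assumption: only half of them

sample1_ok = within_ten_percent(11_037, assumed_population)
sample2_ok = within_ten_percent(11_034, assumed_population)
print(sample1_ok, sample2_ok)  # True True
```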

At the 0.001 level of significance, aspirin does make a difference to the likelihood of heart attack.
In fact it reduces it.
Or: Aspirin makes a difference to the likelihood of heart attack (p < 0.0001).
In fact, aspirin reduces the risk.

Remark:
The study was conducted from 1982 to 1988 and was stopped early
because the results were so dramatic. For a non-technical summary,
see
Physicians’ Health Study (2009) [see “Sources Used” at end of book].
More details are in the
original article from the New England Journal of Medicine
(Steering Committee 1989 [see “Sources Used” at end of book]).

(b) 2-PropZInt with C-Level .95 gives (−.0125,
−.0056).
We’re 95% confident that 325 mg of aspirin every other day reduces the chance of heart attack by 0.56 to 1.25 percentage points.

Caution! You’re estimating the
change in heart-attack risk, not the risk of heart attack.
Saying something like “with aspirin, the risk of heart attack is
0.56 to 1.25%” would be very wrong.

June is 95% confident that the average house in Cortland County costs $20,004 less to $34,318 more than the average house in Broome County.

(b) A 95% confidence interval is the complement of a
significance test for ≠ at α = 0.05. Since 0
is in the interval, you know the p-value would be >0.05 and
therefore
June can’t tell, at the 0.05 significance level, whether there is any difference in average house price in the two counties or not.

If both ends of the interval were positive, that would
indicate a difference in averages at the 0.05 level, and you could say
Cortland’s average is higher than Broome’s. Similarly, if
both ends were negative you could say Cortland’s average is
lower than Broome’s. But as it is, nada.
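
The three cases can be written as a small decision rule. This Python sketch is only an illustration; the function name `read_interval` and the wording of its return values are invented here, and the interval is assumed to estimate "first mean minus second mean" (Cortland minus Broome).

```python
def read_interval(low, high):
    """Interpret a confidence interval for (first mean - second mean)."""
    if low > 0:
        return "first is higher"    # both ends positive
    if high < 0:
        return "first is lower"     # both ends negative
    return "can't tell"             # 0 is inside the interval

# June's interval for Cortland minus Broome, in dollars
verdict = read_interval(-20004, 34318)
print(verdict)  # can't tell
```

The same rule applies to any two-sided interval for a difference, as long as you match the confidence level to the significance level (95% with α = 0.05).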

Remark: Obviously Broome County is cheaper in the
sample. But the difference is not great enough to be
statistically significant. Maybe the true mean in Broome really is
less than in Cortland; maybe they’re equal; maybe
Broome is more expensive. You simply can’t tell from these
samples.

7

The immediate answer is that
those are proportions in the sample, not the proportions among all voters.
This is two-population binomial data, Case 5 in Inferential Statistics: Basic Cases.
Requirements check:

Random samples, OK.

For each sample, 10n = 10×1000 = 10,000. There are
far more than 10,000 voters nationally; OK.

With 95% confidence, the Red candidate is somewhere between 0.4 percentage points behind Blue and 8.4 ahead of Blue.
The confidence interval contains 0, and so
it’s impossible to say whether either one is leading.

Remark:
Newspapers often report the sample proportions p̂1 and
p̂2 as though they were population proportions, but now you know
that they aren’t. A different poll might have similar results,
or it might have samples going the other way and showing Blue ahead of
Red.

8
(a) For a confidence interval, each sample must have at least 10
successes and at least 10 failures. Sample 1 has only 7
successes. Requirements are not met, and
you cannot compute a confidence interval with 2-PropZInt.

(b) For a hypothesis test, we often use “at least 10
successes and 10 failures in each sample” as a shortcut
requirements test, but the real requirement is at least 10
successes and 10 failures expected in each sample, using the
blended proportion p̂. If the shortcut procedure fails, you
must check the real requirement. In this problem, the blended
proportion is

p̂ =
(x1+x2)/(n1+n2) = (7+18)/(28+32) =25/60,
about 42%.

For sample 1, with
n1 = 28, you would expect 28×25/60 ≈
11.7 successes and 28−11.7 = 16.3 failures. For
sample 2, with n2 = 32, you would expect
32×25/60 ≈ 13.3 successes and
32−13.3 = 18.7
failures. Because all four of these expected numbers are at least 10,
it’s valid to compute a p-value using 2-PropZTest.
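
The expected-count check can be sketched in Python as follows. The function name `expected_counts` is made up for this illustration; the inputs are the counts given in the problem.

```python
def expected_counts(x1, n1, x2, n2):
    """Expected successes and failures in each sample under the blended proportion."""
    blended = (x1 + x2) / (n1 + n2)   # pooled p-hat = 25/60
    return [
        n1 * blended,        # expected successes, sample 1
        n1 * (1 - blended),  # expected failures, sample 1
        n2 * blended,        # expected successes, sample 2
        n2 * (1 - blended),  # expected failures, sample 2
    ]

counts = expected_counts(7, 28, 18, 32)
print([round(c, 1) for c in counts])   # [11.7, 16.3, 13.3, 18.7]
print(all(c >= 10 for c in counts))    # True
```

Note that the actual count of 7 successes in sample 1 fails the shortcut test, while the expected count of about 11.7 passes the real requirement.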