Math 445 – Assignment 4 Solutions
CH8.20 (WA; 5 points)
We want to
estimate the proportion of all American young people (ages 6 to 19 years old)
who are seriously overweight. Furthermore, we’ll do this estimation with a 99%
confidence interval. Because this is a confidence interval for a population
proportion, we must be careful with the particular CI we choose (recall, the
usual Wald interval has coverage problems and the score interval is much more
accurate). But, in this case, the sample size is very large (n = 4722), so the
Wald and Score intervals will probably be very close to each other (see the
argument on page 388 of the textbook about why this happens for large n). We’ll
determine both intervals.
Score Interval
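For 99% confidence, $z_{\alpha/2} = z_{0.005} = 2.575$. Also, for this particular sample, $n = 4722$ and $\hat{p} \approx 0.15$ (the value consistent with the interval reported below).
Then, using the formula for the score interval, the lower endpoint of the confidence interval is
$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + \frac{z_{\alpha/2}^2}{n}} \approx 0.137.$$
(Within these calculations, you can see that in the very-large n case, the score interval (practically speaking) reduces to the Wald interval.)
And the upper endpoint is
$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + \frac{z_{\alpha/2}^2}{n}} \approx 0.164.$$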
So the Score
confidence interval is (0.137, 0.164).
Wald Interval
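The Wald 99% confidence interval is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ with $z_{\alpha/2} = 2.575$, which is approximately (0.14, 0.16).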
Note these intervals are essentially the
same, and to two decimal places they are exactly the same.
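The comparison of the two intervals can be checked numerically. The sketch below is only an illustration: it assumes $\hat{p} = 0.15$ (a value inferred because it reproduces the stated score interval, not one quoted from the problem), and the function names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, conf):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(p_hat, n, conf):
    """Score interval, using the same formula as in the solution above."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    center = p_hat + z**2 / (2 * n)
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - half) / denom, (center + half) / denom


# Assumption: p_hat = 0.15 is inferred from the reported interval.
score = score_interval(0.15, 4722, 0.99)
wald = wald_interval(0.15, 4722, 0.99)
print("score:", score)
print("wald: ", wald)
```

With n this large, the correction terms $z^2/(2n)$ and $z^2/(4n^2)$ are tiny, which is why the two intervals nearly coincide.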
Interpretation:
We are 99%
confident that the proportion of all American kids who are seriously overweight
is between 0.14 and 0.16. Our confidence is in the method we used to create
this interval. That is, if the sampling were (hypothetically) done repeatedly,
then 99% of the confidence intervals created would contain the true proportion
of seriously overweight kids. (We hope this is one of those times!)
CH8.28 (WT; 10 points)
We
are 99% confident that the mean weight of backpacks carried by all sixth
graders is between 12.69 pounds and 14.97 pounds. Our confidence is in the method
we used to create this interval. That is, if the sampling were (hypothetically)
done repeatedly, then 99% of the confidence intervals created would contain the
true mean backpack weight of all sixth graders. (We hope this is one of those
times!)
For
a 99% confidence interval, the only thing that changes is the z value (from
1.96 to 2.575). Hence, the 99% confidence interval is
(13.26%, 16.25%).
CH8.34 (WA; 5 points)

The
histogram shows a mostly mound-shaped distribution (with the exception of the
interval at 28), and the normality plot doesn’t show any big deviations from
normality. Hence, it’s plausible that these data come from a normal
distribution. (The t confidence interval has a condition of normality, so it’s
important that we check.)
Variable    N   Mean    StDev  Minimum  Q1      Median  Q3      Maximum
ACT score   20  25.050  2.690  19.750   23.313  24.750  27.563  30.000
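For 95% confidence with $n - 1 = 19$ degrees of freedom, $t_{0.025,19} = 2.093$. So a 95% confidence interval for the average ACT score of all college freshmen in calculus is $\bar{x} \pm t_{0.025,19}\, s/\sqrt{n} = 25.05 \pm 2.093(2.690/\sqrt{20})$ = (23.79, 26.31).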
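As a quick arithmetic check, the interval can be recomputed from the Minitab summary. This is a minimal sketch: the critical value 2.093 is hard-coded on the assumption that it is the t-table entry for 19 degrees of freedom and upper-tail area 0.025, and the function name is illustrative.

```python
from math import sqrt


def t_interval(mean, sd, n, t_crit):
    """Return (low, high) for mean +/- t_crit * sd / sqrt(n)."""
    half = t_crit * sd / sqrt(n)
    return mean - half, mean + half


# Assumed table value t_{0.025,19}; it is not computed here.
T_CRIT_19_DF = 2.093

low, high = t_interval(25.050, 2.690, 20, T_CRIT_19_DF)
print(round(low, 2), round(high, 2))
```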
CH8.43 (WT; 5 points)
Assume that $X_1, \ldots, X_n$ is a random sample from a $N(\mu, \sigma^2)$ distribution, and let $X_{n+1}$ be a new observation from the same distribution. We showed in class that $\frac{\bar{X} - X_{n+1}}{\sigma\sqrt{1 + 1/n}}$ has a standard normal distribution. Furthermore, we’ve previously shown that $\frac{(n-1)S^2}{\sigma^2}$ has a Chi-squared distribution with (n-1) degrees of freedom. Finally, we know that $\bar{X}$ and $S^2$ are independent random variables (and since $X_{n+1}$ is a new observation, it’s also independent of $S^2$). Then we know the following quantity has a t distribution with (n-1) degrees of freedom (since it’s the ratio of independent random variables: a N(0,1) r.v. divided by the square root of a Chi-squared r.v. divided by its degrees of freedom):
$$T = \frac{(\bar{X} - X_{n+1}) \big/ \left(\sigma\sqrt{1 + 1/n}\right)}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}}.$$
There is much cancellation in this quantity (the (n-1)s cancel, the sigmas cancel), and it reduces to $T = \frac{\bar{X} - X_{n+1}}{S\sqrt{1 + 1/n}}$. Hence, T (on which the prediction interval is based) has a t distribution with (n-1) degrees of freedom.
CH8.50 (WT; 5 points)


CH8.76 (WA; 10 points)
Variable            N   Mean   StDev  Minimum  Q1     Median  Q3     Maximum
Time (in minutes)   10  34.45  4.29   28.70    30.88  34.35   37.73  42.00
For 99.8% confidence and 9 degrees of freedom, the corresponding t value is $t_{0.001,9} = 4.297$. Then a 99.8% t confidence interval for the mean/median is $34.45 \pm 4.297(4.29/\sqrt{10})$ = (28.62 minutes, 40.28 minutes).
Important Notes: The t interval is slightly narrower.
This is because the t interval makes use of the normality condition (which
seems reasonable in this case). If the normality condition isn’t met, though,
the non-parametric approach in part c
would be better (since the t interval is no longer a confidence interval
for the median).
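The unusual 99.8% level can be illustrated with a short calculation. The sketch below rests on an assumption that is not stated in the surviving text: that the non-parametric interval in part c runs from the sample minimum to the sample maximum. For a continuous distribution, that interval misses the median only when all n observations fall on the same side of it.

```python
def minmax_median_coverage(n):
    """P(min < median < max) for n i.i.d. draws from a continuous distribution.

    The interval fails only if all n values land above the median or all
    land below it; each of those events has probability 0.5**n.
    """
    return 1 - 2 * 0.5**n


coverage = minmax_median_coverage(10)

# Widths of the two intervals, using the minimum and maximum from the
# Minitab summary and the t interval reported above.
nonparametric_width = 42.00 - 28.70
t_width = 40.28 - 28.62

print(coverage, nonparametric_width, t_width)
```

Under that assumption the coverage for n = 10 is about 0.998, and the t interval is narrower than the min-to-max interval, which matches the note above.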
CH9.7 (WT; 5 points)
A Type I error
occurs if we conclude the power plant is non-compliant with regulations, when,
in fact, the plant really is compliant. A Type II error occurs if we do not
have evidence of non-compliance, yet the plant really is breaking the
regulations. A reasonable argument can be made for either error being the more serious one (depending on your personal views). If you think the Type II error is more serious, then you can reconstruct the test by swapping the roles of the two hypotheses: $H_0$: the plant is not compliant and $H_a$: the plant is compliant. With this reconstructed test, a Type I error occurs if we conclude the power plant is compliant, when, in fact, the plant really isn’t. Remember we can then “control” this error rate by using a small significance level.
CH9.25 (WA; 5 points)
Statement of hypotheses
Suppose $\mu$ is the mean IQ score for all first-graders in this school. We want to test the hypotheses $H_0: \mu = 100$ and $H_a: \mu \neq 100$.
Check of conditions
For this problem, we are told the distribution of first-grade IQ scores for this school follows a normal distribution with known population standard deviation $\sigma$. Then we can use a z-test (since we know the distribution of the sample mean is exactly normal).
Calculation of the Test Statistic
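For this sample of $n = 10$ IQ scores, we first compute the sample average $\bar{x}$.
So our test statistic is $z = \frac{\bar{x} - 100}{\sigma/\sqrt{10}} = 3.37$. That is, our particular sample average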
is 3.37 standard errors above the null-hypothesized mean. Note: A z-distribution picture should be included with this solution
(the only reason it isn’t is because Word cannot draw it).
Calculation of the P-value
This is a two-sided test, so “more extreme” than our test statistic is both above 3.37 and below -3.37 on a standard-normal distribution. Hence, $P\text{-value} = 2P(Z > 3.37) = 2(0.0004) = 0.0008$.
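The two-sided tail area can be checked without a table. This is a small sketch using Python's standard-library normal distribution; the function name is illustrative.

```python
from statistics import NormalDist


def two_sided_z_p_value(z):
    """Area in both tails of the standard normal beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_value = two_sided_z_p_value(3.37)
print(round(p_value, 4))
```

The unrounded value is slightly below 0.0008 because Table A.3 rounds each tail area to four decimal places before doubling.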
Interpretation of the Results in the
Context of the Problem
Assuming the
average IQ score for first-graders at this school is 100, there is only a
0.0008 chance of getting our particular sample average IQ or a more extreme
average IQ. This is very surprising and provides very strong evidence that the
average IQ score for this school is different from the national average. Clearly,
these results are statistically significant at the 0.05 significance level
(since our p-value is so much smaller than 0.05).
But are these results practically significant? A 95% confidence interval for the average IQ score for first graders at this school is $\bar{x} \pm 1.96\,\sigma/\sqrt{10}$. The IQ test is standardized, so the
scores include no units. This interval seems, practically speaking,
substantially higher than 100. But an IQ-test expert should be consulted to
verify that these results are of practical importance.
CH9.27 (WA; 10 points)
Statement of hypotheses
Suppose $\mu$ is the population mean weight for all Pepperidge Farm bagels. We want to test the hypotheses $H_0: \mu = 113$ and $H_a: \mu < 113$.
Check of conditions
Since the
population standard deviation is unknown, we must use a t-test, but this test
has the condition that the population from which we sampled follows a normal
distribution. For such a small sample size, it’s especially important for us to
check this condition. That said, because there are only 6 bagel weights, it’s very
difficult to tell if they seem to follow a normal distribution (such is the
life of a practicing statistician!). Included below are a dotplot and
normal-probability plot of the sample bagel weights. The dotplot indicates
some deviation from normality (two of the observations seem separated from the
others), but the normality-plot doesn’t indicate a significant deviation from
normality. Hence, we can tentatively proceed with a t-test and feel reasonably
good about the conclusions.

Calculation of the Test Statistic
Numerical
summaries provided by Minitab:
Variable                   N  Mean    StDev
Bagel Weights (in grams)   6  112.97  4.29
From our sample, the test statistic is $t = \frac{\bar{x} - 113}{s/\sqrt{n}} = \frac{112.97 - 113}{4.29/\sqrt{6}} = -0.02$. That is, our particular sample average
is only 0.02 estimated standard errors below the null-hypothesized mean. Even
for a t-distribution (with “fatter tails” than the standard normal), this does
not seem surprising. Note: A
t-distribution picture should be included with this solution (the only reason
it isn’t is because Word cannot draw it).
Calculation of the P-value
We know our test statistic has a t-distribution with $n - 1 = 5$ degrees of freedom. This is a one-sided test, so “more extreme” than our test statistic is only below -0.02 on a t-distribution with 5 df. Hence, $P\text{-value} = P(T_5 < -0.02)$. Note that 0.02 is not listed on Table
A.5, but from Table A.5, we can say this p-value is much greater than 0.10 (Minitab can give us the exact p-value: 0.49).
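The exact p-value can also be approximated without Minitab. The sketch below is one possible check, not the method used in the solution: it integrates the t density numerically with Simpson's rule, and the helper names `t_pdf` and `t_cdf` are made up for this example.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] combined with symmetry.

    steps must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary statistics from the Minitab output above.
n, mean, sd, mu0 = 6, 112.97, 4.29, 113.0
t_stat = (mean - mu0) / (sd / sqrt(n))
p_value = t_cdf(t_stat, n - 1)  # lower-tail area for H_a: mu < 113
print(round(t_stat, 2), round(p_value, 2))
```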
Interpretation of the Results in the
Context of the Problem
Assuming the
average weight of all Pepperidge Farm bagels is 113 grams, there is a 49%
chance of getting our particular sample average weight or a more extreme average
weight. This is not at all surprising and provides no evidence that the average
weight is smaller than 113 grams. The results are not statistically significant
at any reasonable significance level. (Since the results are not statistically
significant, we don’t need to explore the practical importance.)
Part b
Now suppose we know the population of bagel weights follows a normal distribution with $\sigma = 4$ grams. Then we can use a z-test, not a t-test. We want to perform a one-sided test: $H_0: \mu = 113$ and $H_a: \mu < 113$. We want to test at the 0.05
significance level. Assuming the true mean bagel weight is 110 grams, what is
the probability that our test rejects the null hypothesis, if our test is based
on only 6 observations?
Note: Normal curve pictures should be
included with this solution (the only reason they aren’t is because Word cannot
draw them).
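First we must quantify what it means to “reject $H_0$” (recall this depends on the significance level and the alternative hypothesis). We reject only for small values of the sample average (and our significance level is 0.05). Then we “reject $H_0$” when our test statistic, z, is less than -1.645.
In terms of $\bar{x}$, we “reject $H_0$” when $\bar{x} < 113 - 1.645\,\sigma/\sqrt{n} = 113 - 1.645(4/\sqrt{6}) = 110.31$.
Then, $\text{Power} = P(\bar{X} < 110.31 \mid \mu = 110) = P\left(Z < \frac{110.31 - 110}{4/\sqrt{6}}\right) = P(Z < 0.19) = 0.5753$.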
Note: This power, 0.5753, isn’t very
high—mainly because there are so few observations in the sample.
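The power calculation can be reproduced in a few lines. This is a sketch under one assumption: $\sigma = 4$ grams, the value that reproduces the stated power of 0.5753. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_power(mu0, mu_true, sigma, n, alpha):
    """Power of the one-sided z-test of H0: mu = mu0 against Ha: mu < mu0."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean falls below this cutoff.
    cutoff = mu0 + std_normal.inv_cdf(alpha) * se
    # Probability of landing below the cutoff when the true mean is mu_true.
    return std_normal.cdf((cutoff - mu_true) / se)


# Assumption: sigma = 4 is inferred from the stated power.
power = lower_tail_power(113, 110, 4, 6, 0.05)
print(round(power, 3))
```

The result differs from 0.5753 in the third decimal place because the table lookup rounds the z value to 0.19.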
Part c
Same conditions
as in part b, but now we want to determine how large our sample must be in
order to bring the power up to 0.95.
Note: Normal curve pictures should be
included with this solution (the only reason they aren’t is because Word cannot
draw them).
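First we must quantify what it means to “reject $H_0$” (recall this depends on the significance level and the alternative hypothesis). We reject only for small values of the sample average (and our significance level is 0.05). Then we “reject $H_0$” when our test statistic, z, is less than -1.645.
In terms of $\bar{x}$, we “reject $H_0$” when $\bar{x} < 113 - 1.645(4/\sqrt{n})$.
From this we can determine the power:
$$\text{Power} = P\left(\bar{X} < 113 - 1.645\,\frac{4}{\sqrt{n}} \,\Big|\, \mu = 110\right) = P\left(Z < \frac{3\sqrt{n}}{4} - 1.645\right).$$
We want this power to be 0.95. From the standard normal table (Table A.3), we know $P(Z < 1.645) = 0.95$.
Hence, we must find n such that $\frac{3\sqrt{n}}{4} - 1.645 = 1.645$, which gives $n = \left(\frac{4(3.29)}{3}\right)^2 \approx 19.24$. Rounding up, $n = 20$.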
So they only need to weigh a sample of
20 bagels in order to have a power of 0.95 to detect a true mean weight of 110
grams. (So they need 14 additional bagels in order to raise the power from 0.58
to 0.95.)
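The sample-size step generalizes to the closed form $n = \left(\sigma(z_\alpha + z_\beta)/(\mu_0 - \mu')\right)^2$, rounded up. The sketch below evaluates it, again assuming $\sigma = 4$ grams (a value inferred from the stated results). The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_for_power(mu0, mu_true, sigma, alpha, power):
    """Smallest n giving at least `power` for a one-sided z-test at level alpha."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(power)
    raw = (sigma * (z_alpha + z_beta) / abs(mu0 - mu_true)) ** 2
    return raw, ceil(raw)


# Assumption: sigma = 4 is inferred, as in the power calculation.
raw_n, n_needed = n_for_power(113, 110, 4, 0.05, 0.95)
print(round(raw_n, 2), n_needed)
```

Rounding up rather than to the nearest integer matters here: a sample of 19 would fall just short of the target power.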
CH9.32 (WT; 10 points)
Statement of hypotheses
Suppose $\mu$ is the population mean reading for all radon detectors of this type. We want to test the hypotheses $H_0: \mu = 100$ and $H_a: \mu \neq 100$.
Check of conditions
Since the
population standard deviation is unknown, we must use a t-test, but this test
has the condition that the population from which we sampled follows a normal
distribution. For such a small sample size, it’s especially important for us to
check this condition. That said, because there are only 12 radon readings, it’s
difficult to tell if they seem to follow a normal distribution (such is the
life of a practicing statistician!). Included below are a dotplot and
normal-probability plot of the sample radon readings. Neither indicates a
deviation from normality. Hence, we can proceed with a t-test and feel
reasonably good about the conclusions.


Calculation of the Test Statistic
Numerical
summaries provided by Minitab:
Variable                N   Mean   StDev
Radon Reading (pCi/L)   12  98.37  6.11
From our sample, the test statistic is $t = \frac{\bar{x} - 100}{s/\sqrt{n}} = \frac{98.37 - 100}{6.11/\sqrt{12}} = -0.924$. That is, our particular sample average
is 0.924 estimated standard errors below the null-hypothesized mean. Even for a
t-distribution (with “fatter tails” than the standard normal), this does not
seem surprising. Note: A t-distribution
picture should be included with this solution (the only reason it isn’t is
because Word cannot draw it).
Calculation of the P-value
We know our test statistic has a t-distribution with $n - 1 = 11$ degrees of freedom. This is a two-sided test, so “more extreme” than our test statistic is both below -0.924 and above 0.924 on a t-distribution with 11 df. Hence, $P\text{-value} = 2P(T_{11} > 0.924)$. Note that 0.924 is not listed on Table
A.5, but from Table A.5, we can say this p-value is greater than 2(0.10)=0.20
(Minitab can give us the exact p-value: 2(0.187)=0.374).
Interpretation of the Results in the
Context of the Problem
Assuming the
average radon reading for all detectors of this type is 100 pCi/L, there is a
37.4% chance of getting our particular sample average reading or a more extreme
average reading. This is not at all surprising and provides no evidence that
the average reading is different from 100 pCi/L. The results are not
statistically significant at any reasonable significance level. (Since the
results are not statistically significant, we don’t need to explore the
practical importance.)
Part b
Now suppose we know the population of radon readings follows a normal distribution with known standard deviation $\sigma$. Then we can use a z-test, not a t-test. Furthermore, suppose we want to perform a one-sided test: $H_0: \mu = 100$ and $H_a: \mu < 100$. And we want the Type II error rate to
be only 0.10 when the true mean radon reading is 95 pCi/L (that is, we want a
high power of 0.90 to detect this particular deviation from the null
hypothesis).
Note: Normal curve pictures should be
included with this solution (the only reason they aren’t is because Word cannot
draw them).
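First we must quantify what it means to “fail to reject $H_0$” (recall this depends on the significance level and the alternative hypothesis). We reject only for small values of the sample average (and our significance level is 0.05). Then we “fail to reject $H_0$” when our test statistic, z, is larger than -1.645.
In terms of $\bar{x}$, we “fail to reject $H_0$” when $\bar{x} > 100 - 1.645\,\sigma/\sqrt{n}$.
From this we can determine the Type II error rate, $\beta$:
$$\beta = P\left(\bar{X} > 100 - 1.645\,\frac{\sigma}{\sqrt{n}} \,\Big|\, \mu = 95\right) = P\left(Z > \frac{5\sqrt{n}}{\sigma} - 1.645\right).$$
We want this error rate to be only 0.10. From the standard normal table (Table A.3), we know $P(Z > 1.28) = 0.10$.
Hence, we must find n such that $\frac{5\sqrt{n}}{\sigma} - 1.645 = 1.28$, that is, $n = \left(\frac{\sigma(1.645 + 1.28)}{5}\right)^2$, rounded up to the next whole number.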
So they only need to test 20 radon
detectors in order to have a power of 0.9 to detect a true mean reading of 95
pCi/L.
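The same answer can be reached by a direct search over n rather than by solving the equation. In the sketch below, the population standard deviation is a hypothetical value: 7.5 pCi/L is used only because it is consistent with the stated answer of 20 detectors, not because it is quoted from the problem. The function name is also illustrative.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for the one-sided z-test of H0: mu = mu0 against Ha: mu < mu0."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    # We fail to reject H0 when the sample mean is above this cutoff.
    cutoff = mu0 + std_normal.inv_cdf(alpha) * se
    return 1 - std_normal.cdf((cutoff - mu_true) / se)


# Hypothetical sigma, chosen only to be consistent with n = 20.
SIGMA_ASSUMED = 7.5

n = 1
while type_ii_error(100, 95, SIGMA_ASSUMED, n, 0.05) > 0.10:
    n += 1
print(n, round(type_ii_error(100, 95, SIGMA_ASSUMED, n, 0.05), 3))
```

A search like this is handy when the rejection rule has no closed-form inverse, for example if the test were two-sided.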