7.53
a. These data are discrete (only 6 possible
values), so they cannot follow an exact normal curve (which is for continuous
data).
b. The combined sample size is very large, so by
our rough rule, we can use the t-procedures even if the sample data
distributions are strongly skewed. In this case, the only thing stopping us
from using the t-procedures would be extreme outliers in the sample data –
outliers are prevented, though, because there are only 7 possible values.
c. Let $\mu_1$ be the mean score for
the intervention population, and let $\mu_2$ be the mean score for
the control population. The intervention program was designed to improve
dietary behavior. If the program was designed well, then the research
hypothesis should be one-sided. (If, for some reason, the program was not
designed well, then it’s possible for the program to have a negative effect, so
the one-sided alternative shouldn’t be used.) Assuming the program was designed
well, we will test the hypotheses
$H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$.
d. The test statistic is $t = 6.258$. To find the p-value
for the test, we need to use the t-distribution with 165 – 1 = 164 degrees of
freedom (there is no entry in Table D for 164 degrees of freedom, so we can use
the closest value, 100). From Table D, we know p-value = $P(T \ge 6.258)$
is less than 0.0005.
In fact, since 6.258 is much greater than 3.390, we know our p-value is much less than 0.0005
(practically speaking, we can think of the p-value
as being close to 0). Note: Although
a t-curve picture isn’t included with this solution, I encourage you to draw
one.
Assuming the mean scores are the same for the
intervention and control populations, there is much less than a 0.0005 chance
of observing our difference in sample means or a greater difference. Hence, we
have very strong evidence that the mean score for the intervention population is,
in fact, greater than the mean score for the control population (our results
are significant at any reasonable significance level).
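If software is handy, the Table D bound in part d can be checked numerically. The following is a minimal sketch using only Python's standard library: it approximates the upper-tail area of the t-distribution with Simpson's rule. The function names `t_pdf` and `t_upper_tail` and the step count are my own choices for illustration, not part of the textbook's method.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=20000):
    """Approximate P(T >= t) for t >= 0 as 0.5 minus the area on [0, t].

    The area is computed with Simpson's rule; steps must be even.
    """
    h = t / steps
    terms = [t_pdf(0.0, df), t_pdf(t, df)]
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights for interior points
        terms.append(weight * t_pdf(i * h, df))
    return 0.5 - math.fsum(terms) * h / 3

# Test statistic from part d, with the full 164 degrees of freedom.
p_value = t_upper_tail(6.258, 164)
print(p_value < 0.0005)
```

Using 164 degrees of freedom instead of the table's 100 gives an even smaller tail area, so the conclusion from Table D is unchanged.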
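e. Using 100 degrees of freedom (since 164
degrees of freedom isn’t included in the table), $t^* = 1.984$.
Then a 95% confidence interval for $\mu_1 - \mu_2$ (intervention minus control)
is
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (0.51, 0.99). We are 95% confident that the mean score for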
the intervention population is between 0.51 and 0.99 points higher than the
mean score for the control population. Our confidence is in the method we use.
If the sampling were (hypothetically) done repeatedly, then 95% of the
intervals created would contain the true difference in population means.
Note that 0 is not included in this
confidence interval. This tells us our results are statistically significant at
the 1 – 0.95 = 0.05 level, if we test against the two-sided alternative.
f.
The low-income
women of
7.59
a. The hypotheses should be statements about the
two populations, expressed in terms of population parameters – in this case,
$\mu_1$ and $\mu_2$. The given hypotheses are incorrect, because they are
statements about the sample means, $\bar{x}_1$ and $\bar{x}_2$,
not the population means.
b. The two-sample t test compares independent samples from two different populations.
In this case, the samples being compared are not independent.
c. A p-value
of 0.96 is very high, which gives essentially no evidence against the null
hypothesis. A p-value this large means that data
like ours would be very likely if the null hypothesis were true, so we cannot
reject the null hypothesis.
7.60
a. The 95% confidence interval does not contain
the null-hypothesized value of 0. Hence, these results are significant at the 1
– 0.95 = 0.05 level (i.e., we have
enough evidence to conclude the population means are actually different).
b. The sample sizes are included in the
denominator of the standard error. Hence, increasing the sample sizes decreases
the margin of error. (Intuitively, it makes sense that larger sample sizes
provide more information and therefore provide more precise results.)
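To make the point in part b concrete, here is a small sketch of how the two-sample standard error responds to sample size. The standard deviations and sample sizes are hypothetical values chosen for illustration; they do not come from the exercise.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Hypothetical standard deviations (2.0 in each group); only n changes.
for n in (10, 40, 160):
    print(n, round(standard_error(2.0, n, 2.0, n), 3))
```

With equal group sizes, quadrupling n cuts the standard error (and therefore the margin of error) in half.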
7.66
a.
The
employees are labeled 01 – 20. Beginning at line 151 of the table of random
digits (Table B), I get the following valid numbers: 03, 01, 12, 11, 09, 07,
20, 06, 05, and 16. These 10 employees receive flat screens.
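The table-reading procedure in part a can be sketched in code: read the digits two at a time, keep any pair that is a valid label, and skip repeats until enough labels are chosen. The digit string below is made up for illustration and is not line 151 of Table B; the function name `select_labels` is also my own.

```python
def select_labels(digits, n_labels, n_select):
    """Read two-digit groups left to right and keep labels 01..n_labels.

    Pairs outside the label range and repeated labels are skipped.
    """
    clean = [c for c in digits if c.isdigit()]  # ignore spacing in the table
    chosen = []
    for i in range(0, len(clean) - 1, 2):
        label = int(clean[i] + clean[i + 1])
        if 1 <= label <= n_labels and label not in chosen:
            chosen.append(label)
            if len(chosen) == n_select:
                break
    return chosen

# Hypothetical digits, NOT taken from Table B.
HYPOTHETICAL_DIGITS = "07311 58802 15642 01195 03"
print(select_labels(HYPOTHETICAL_DIGITS, 20, 5))  # [7, 15, 2, 20, 11]
```

Here the pairs 31, 88, 64, and 95 are skipped because they are not labels, and the second 15 is skipped because it is a repeat.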
b.
The combined
sample size is 20. The ratings are expressed on a 5-point scale, and it’s
unlikely that the sample distributions are strongly skewed or have any
outliers. Hence, by our rough rule, we should be able to use the t procedures.
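For 10 – 1 = 9 degrees of freedom, $t^* = 2.262$
(from Table D). Then a
95% confidence interval for $\mu_1 - \mu_2$ (flat screen minus standard monitor)
is
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (0.6, 3.0).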
We are 95% confident that the mean rating of
employees with flat screens is between 0.6 and 3.0 higher than the mean rating
of employees with standard monitors.
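The interval calculation in part b follows the usual two-sample recipe: difference in sample means, plus or minus the critical value times the standard error. Here is a minimal sketch of that recipe. The summary statistics in the example call are placeholders, since the exercise's sample means and standard deviations are not reproduced in this solution; only the critical value 2.262 (9 degrees of freedom) comes from Table D.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 from summary statistics.

    t_star is the critical value looked up for the chosen confidence
    level and degrees of freedom (here, the smaller sample size minus 1).
    """
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    margin = t_star * se
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Placeholder means and standard deviations, NOT the exercise's data.
low, high = two_sample_t_interval(5.0, 1.0, 10, 3.0, 1.0, 10, t_star=2.262)
print(round(low, 2), round(high, 2))
```

If the resulting interval excludes 0, the two-sided test at the matching significance level rejects the hypothesis of equal means, which is the reasoning used in part c.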
c.
The null-hypothesized
value of 0 is not included in the 95% confidence interval. Hence, the results
are significant at the 1 – 0.95 = 0.05 level, and we can reject the hypothesis
that the mean ratings are the same.
7.72
The combined sample size is large, so we can use the t-procedures even if the sample data distributions are strongly
skewed. The only other concern is extreme outliers, which probably won’t occur,
since the responses are limited to a 7-point scale. Hence, we can use the t-procedures.
a.
Let $\mu_1$
be the population mean
response to the Wall Street Journal
ad, and let $\mu_2$
be the population mean
response to the National Enquirer ad.
Since we are given no further information about the research questions, we’ll
perform a two-sided test. Then the hypotheses are
$H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \ne \mu_2$.
The test statistic is $t = 8.369$. To find the p-value
for the test, we need to use the t-distribution with 61 – 1 = 60 degrees of
freedom. From Table D, we know p-value
= $2P(T \ge 8.369)$
is less than 0.001
(since the area to the right of 8.369 is less than 0.0005). In fact, since 8.369
is much greater than 3.460, we know our p-value
is much less than 0.001. Note:
Although a t-curve picture isn’t included with this solution, I encourage you
to draw one.
Assuming the mean responses to the Wall Street Journal and National Enquirer ads are the same,
there is much less than a 0.001 chance of observing our difference in sample
means or a more extreme difference. Hence, we have very strong evidence that
the mean responses are different.
b.
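For 61 – 1 =
60 degrees of freedom, $t^* = 2.000$
(from Table D). Then a
95% confidence interval for $\mu_1 - \mu_2$ (Wall Street Journal minus National Enquirer)
is
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (1.78, 2.90). Hence,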
we are 95% confident that the mean response to the Wall Street Journal ad is between 1.78 and 2.90 points higher than
the mean response to the National
Enquirer ad.
c.
Both the
significance test and the confidence interval indicate that people, on average,
think an ad in the Wall Street Journal
is more trustworthy than an ad in the National
Enquirer.
7.81
Note the combined sample size is large, so by our rough rule, we can use the t-procedures even if there is strong skewness in the sample data. (We should make sure there are no extreme outliers, though.)
a.
For 50
degrees of freedom (the closest value below 54 in the table), $t^* = 2.009$
(from Table D). Then
the 95% confidence interval for the difference in mean number of units sold at
all retail stores is
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (-3.06, 9.06).
We are 95% confident that the mean number of units sold this month is between 3.06 units smaller and 9.06 units larger than the mean number of units sold in the same month last year.
b. The 6% increase is based on sample data and these sample data are variable. It’s possible that the populations have the same mean, yet the sample means show a 6% increase. In fact, at the 5% level, we would not reject the null hypothesis that the population means are the same (based on the confidence interval in part a).
7.83
Let $\mu_1$
be the mean hemoglobin
level for all breast-fed babies and let $\mu_2$
be the mean hemoglobin
level for all formula-fed babies.
a.
The hypotheses are
$H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$. Then the test statistic is
$t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$. To find the p-value for the test, we need to use the
t-distribution with 19 – 1 = 18 degrees of freedom. From Table D, we know p-value = $P(T \ge t)$
is between 0.05 and
0.10. Note: Although a t-curve
picture isn’t included with this solution, I encourage you to draw one.
Assuming the mean hemoglobin level is the same for breast-fed and formula-fed babies, there’s between a 5% and 10% chance of getting our difference in sample means or a greater difference. Although these data provide some evidence that the mean level is higher for breast-fed babies, the results are not significant at the 0.05 level.
b.
For 19 – 1 =
18 degrees of freedom, $t^* = 2.101$
(from Table D). Then the 95% confidence
interval for the mean difference in hemoglobin level is
(-0.24, 2.04).
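As a cross-check between parts a and b, the test statistic can be backed out of the confidence interval and then bracketed with the table, the same way we read Table D by hand. This is a sketch under stated assumptions: the critical values listed are the standard upper-tail t-table entries for 18 degrees of freedom, the interval endpoints are rounded (so the recovered statistic is approximate), and the helper name `bracket_upper_tail` is my own.

```python
# Upper-tail probabilities and critical values for 18 degrees of freedom.
T_TABLE_DF18 = [(0.10, 1.330), (0.05, 1.734), (0.025, 2.101),
                (0.01, 2.552), (0.005, 2.878)]

def bracket_upper_tail(t, table):
    """Return (low, high) bounds on P(T >= t) from tabled critical values.

    The table must list probabilities in decreasing order. A high bound of
    1.0 means t falls below the first tabled critical value.
    """
    low, high = 0.0, 1.0
    for prob, crit in table:
        if t >= crit:
            high = prob   # tail area is at most this column's probability
        else:
            low = prob    # tail area is more than this column's probability
            break
    return low, high

# Recover the standard error and t from the rounded interval in part b.
lower, upper, t_star = -0.24, 2.04, 2.101
diff = (lower + upper) / 2        # difference in sample means
se = (upper - lower) / 2 / t_star  # margin of error divided by t*
t_stat = diff / se

print(bracket_upper_tail(t_stat, T_TABLE_DF18))  # (0.05, 0.1)
```

The recovered statistic falls between the 0.10 and 0.05 critical values, which agrees with the one-sided p-value range reported in part a.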
c. We are assuming that we have independent simple random samples from the populations, and that the populations are normal. To check the normality assumption, we can use our rough rule. Since the combined sample size is large, we can use the t-procedures even in the presence of heavy skewness in the sample data distributions. Still, we should look at the sample distributions to ensure there aren’t extreme outliers.