Section 7.2 Solutions

 

7.53

a.      These data are discrete (only 6 possible values), so they cannot follow an exact normal curve (which is for continuous data).

 

b.      The combined sample size is very large, so by our rough rule, we can use the t-procedures even if the sample data distributions are strongly skewed. In this case, the only thing stopping us from using the t-procedures would be extreme outliers in the sample data – outliers are prevented, though, because there are only 7 possible values.

 

c.       Let $\mu_1$ be the mean score for the intervention population, and let $\mu_2$ be the mean score for the control population. The intervention program was designed to improve dietary behavior. If the program was designed well, then the research hypothesis should be one-sided. (If, for some reason, the program was not designed well, then it’s possible for the program to have a negative effect, so the one-sided alternative shouldn’t be used.) Assuming the program was designed well, we will test the hypotheses

$H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$.

d.      The test statistic is $t = 6.258$. To find the p-value for the test, we need to use the t-distribution with 165 – 1 = 164 degrees of freedom (there is no entry in Table D for 164 degrees of freedom, so we can use the closest value, 100). From Table D, we know p-value = $P(T \ge 6.258)$ is less than 0.0005. In fact, since 6.258 is much greater than 3.390, we know our p-value is much less than 0.0005 (practically speaking, we can think of the p-value as being close to 0). Note: Although a t-curve picture isn’t included with this solution, I encourage you to draw one.

 

Assuming the mean scores are the same for the intervention and control populations, there is much less than a 0.0005 chance of observing our difference in sample means or a greater difference. Hence, we have very strong evidence that the mean score for the intervention population is, in fact, greater than the mean score for the control population (our results are significant at any reasonable significance level).
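
The Table D lookup in part d can be checked numerically. The sketch below is not part of the textbook method: it assumes we may approximate the upper-tail area of a t-distribution by integrating its density with Simpson's rule, and the helper names `t_density` and `t_upper_tail` are made up for this illustration.

```python
import math


def t_density(x, df):
    """Density of the t-distribution with df degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 as 0.5 minus the area on [0, t].

    The area is computed with Simpson's rule, so steps must be even.
    """
    if t < 0 or steps % 2 != 0:
        raise ValueError("need t >= 0 and an even number of steps")
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    return 0.5 - total * h / 3


# Table D critical value for df = 100 and upper-tail area 0.0005
print(t_upper_tail(3.390, 100))
# Observed test statistic from part d, using df = 100 as in the solution
print(t_upper_tail(6.258, 100))
```

The first printed value should sit very close to 0.0005, and the second should be far smaller, which matches the conclusion that the p-value is practically 0.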

 

e.      Using 100 degrees of freedom (since 164 degrees of freedom isn’t included in the table), $t^* = 1.984$. Then a 95% confidence interval for $\mu_1 - \mu_2$ (intervention minus control) is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (0.51, 0.99). We are 95% confident that the mean score for the intervention population is between 0.51 and 0.99 points higher than the mean score for the control population. Our confidence is in the method we use. If the sampling were (hypothetically) done repeatedly, then 95% of the intervals created would contain the true difference in population means.

 

Note that 0 is not included in this confidence interval. This tells us our results are statistically significant at the 1 – 0.95 = 0.05 level, if we test against the two-sided alternative.
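
As a rough consistency check between parts d and e, we can work backward from the rounded interval endpoints. This is only a sketch: it assumes $t^* = 1.984$ (Table D, 100 degrees of freedom), and the symbols $m$ (margin of error) and $SE$ (standard error) are introduced here for illustration.

$$
\bar{x}_1 - \bar{x}_2 \approx \frac{0.51 + 0.99}{2} = 0.75, \qquad m \approx \frac{0.99 - 0.51}{2} = 0.24
$$

$$
SE \approx \frac{m}{t^*} = \frac{0.24}{1.984} \approx 0.121, \qquad t \approx \frac{0.75}{0.121} \approx 6.2
$$

Up to rounding in the endpoints, this agrees with the test statistic of 6.258 reported in part d.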

 

f.        The low-income women of Durham, North Carolina may be very different from the low-income women of Chicago, Illinois. Hence, we should think carefully about what population these results generalize to (e.g., perhaps the results generalize to the population of low-income women in mid-sized cities in the South).

 

 

7.59

a.      The hypotheses should be statements about the two populations, expressed in terms of population parameters – in this case, $\mu_1$ and $\mu_2$. The given hypotheses are incorrect, because they are statements about the sample means, not the population means.

 

b.      The two-sample t test compares independent samples from two different populations. In this case, the samples being compared are not independent.

 

c.       A p-value of 0.96 is very high, so the data give essentially no evidence against the null hypothesis. Data like ours are very likely to occur when the null hypothesis is true, so we cannot reject the null hypothesis.

 

 

7.60

a.      The 95% confidence interval does not contain the null-hypothesized value of 0. Hence, these results are significant at the 1 – 0.95 = 0.05 level (i.e., we have enough evidence to conclude the population means are actually different).

 

b.      The sample sizes are included in the denominator of the standard error. Hence, increasing the sample sizes decreases the margin of error. (Intuitively, it makes sense that larger sample sizes provide more information and therefore provide more precise results.)
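
The effect of sample size on the margin of error can be sketched numerically. All the numbers below are hypothetical (they do not come from the exercise), the function name `margin_of_error` is made up for this illustration, and $t^*$ is held fixed at 2.0 for simplicity even though in practice it also shrinks slightly as the degrees of freedom grow.

```python
import math


def margin_of_error(t_star, s1, n1, s2, n2):
    """Two-sample margin of error: t* times sqrt(s1^2/n1 + s2^2/n2)."""
    return t_star * math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Hypothetical standard deviations of 5.0 in both groups, equal sample sizes.
for n in (10, 40, 160):
    m = margin_of_error(2.0, 5.0, n, 5.0, n)
    print(f"n1 = n2 = {n:3d}: margin of error = {m:.3f}")
```

With the critical value held fixed, quadrupling both sample sizes cuts the margin of error in half, because the sample sizes enter through a square root.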

 

 


7.66

a.            The employees are labeled 01 – 20. Beginning at line 151 of the table of random digits (Table B), I get the following valid numbers: 03, 01, 12, 11, 09, 07, 20, 06, 05, and 16. These 10 employees receive flat screens.
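
Software can stand in for the table of random digits. The sketch below assumes Python's standard `random` module is an acceptable substitute for Table B; the function name `assign_flat_screens` and the seed value are arbitrary choices for illustration, so the labels it selects will not match the ones listed above.

```python
import random


def assign_flat_screens(n_employees=20, n_treated=10, seed=151):
    """Randomly pick n_treated of the labels 1..n_employees without replacement."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    labels = list(range(1, n_employees + 1))
    treated = sorted(rng.sample(labels, n_treated))
    control = [label for label in labels if label not in treated]
    return treated, control


treated, control = assign_flat_screens()
print("flat screens:     ", treated)
print("standard monitors:", control)
```

As with Table B, every group of 10 employees has the same chance of being the flat-screen group.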

 

b.            The combined sample size is 20. The ratings are expressed on a 5-point scale, and it’s unlikely that the sample distributions are strongly skewed or have any outliers. Hence, by our rough rule, we should be able to use the t procedures.

 

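For 10 – 1 = 9 degrees of freedom, $t^* = 2.262$ (from Table D). Then a 95% confidence interval for $\mu_1 - \mu_2$ (flat screen minus standard monitor) is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (0.6, 3.0).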

 

We are 95% confident that the mean rating of employees with flat screens is between 0.6 and 3.0 higher than the mean rating of employees with standard monitors.

 

c.             The null hypothesized value of 0 is not included in the 95% confidence interval. Hence, the results are significant at the 1 – 0.95 = 0.05 level, and we can reject the hypothesis that the mean ratings are the same.

 

 

7.72

The combined sample size is large, so we can use the t-procedures even if the sample data distributions are strongly skewed. The only other concern is extreme outliers, which probably won’t occur, since the responses are limited to a 7-point scale. Hence, we can use the t-procedures.

 

a.            Let $\mu_1$ be the population mean response to the Wall Street Journal ad, and let $\mu_2$ be the population mean response to the National Enquirer ad. Since we are given no further information about the research questions, we’ll perform a two-sided test. Then the hypotheses are

$H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \ne \mu_2$.

The test statistic is $t = 8.369$. To find the p-value for the test, we need to use the t-distribution with 61 – 1 = 60 degrees of freedom. From Table D, we know p-value = $2P(T \ge 8.369)$ is less than 0.001 (since the area to the right of 8.369 is less than 0.0005). In fact, since 8.369 is much greater than 3.460, we know our p-value is much less than 0.001. Note: Although a t-curve picture isn’t included with this solution, I encourage you to draw one.

 

Assuming the mean responses to the Wall Street Journal and National Enquirer ads are the same, there is much less than a 0.001 chance of observing our difference in sample means or a more extreme difference. Hence, we have very strong evidence that the mean responses are different.

 

b.            For 61 – 1 = 60 degrees of freedom, $t^* = 2.000$ (from Table D). Then a 95% confidence interval for $\mu_1 - \mu_2$ is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (1.78, 2.90). Hence, we are 95% confident that the mean response to the Wall Street Journal ad is between 1.78 and 2.90 points higher than the mean response to the National Enquirer ad.

 

c.             Both the significance test and the confidence interval indicate that people, on average, think an ad in the Wall Street Journal is more trustworthy than an ad in the National Enquirer.

 

 

 

 

7.81

Note the combined sample size is large, so by our rough rule, we can use the t-procedures even if there is strong skewness in the sample data. (We should make sure there are no extreme outliers, though.)

 

a.      For 50 degrees of freedom (the closest thing to 54 in the table), $t^* = 2.009$ (from Table D). Then the 95% confidence interval for the difference in mean number of units sold at all retail stores is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (-3.06, 9.06).

 

      We are 95% confident that the mean number of units sold this month is between 3.06 units smaller and 9.06 units larger than the mean number of units sold in the same month last year.

 

b.      The 6% increase is based on sample data and these sample data are variable. It’s possible that the populations have the same mean, yet the sample means show a 6% increase. In fact, at the 5% level, we would not reject the null hypothesis that the population means are the same (based on the confidence interval in part a).

 

 

7.83

Let $\mu_1$ be the mean hemoglobin level for all breast-fed babies and let $\mu_2$ be the mean hemoglobin level for all formula-fed babies.

a.            The hypotheses are $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$. Then the test statistic is $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$. To find the p-value for the test, we need to use the t-distribution with 19 – 1 = 18 degrees of freedom. From Table D, we know p-value = $P(T \ge t)$ is between 0.05 and 0.10. Note: Although a t-curve picture isn’t included with this solution, I encourage you to draw one.

 

Assuming the mean hemoglobin level is the same for breast-fed and formula-fed babies, there’s between a 5% and 10% chance of getting our difference in sample means or a greater difference. Although these data provide some evidence that the mean hemoglobin level is higher for breast-fed babies, the results are not significant at the 0.05 level.

 

b.            For 19 – 1 = 18 degrees of freedom, $t^* = 2.101$ (from Table D). Then the 95% confidence interval for the mean difference in hemoglobin level is (-0.24, 2.04).
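
The interval in part b can be used to cross-check the p-value range in part a. The sketch below works only from the rounded endpoints, so the implied values are approximate; it assumes the Table D critical values for 18 degrees of freedom (1.330 for upper-tail area 0.10, 1.734 for 0.05, and 2.101 for 0.025), and the variable names are made up for this illustration.

```python
# Reported 95% interval from part b and Table D critical values for df = 18.
lower, upper = -0.24, 2.04
t_star_95 = 2.101                    # upper-tail area 0.025
t_crit_10, t_crit_05 = 1.330, 1.734  # upper-tail areas 0.10 and 0.05

center = (lower + upper) / 2         # implied difference in sample means
margin = (upper - lower) / 2         # implied margin of error
standard_error = margin / t_star_95  # implied standard error
t_implied = center / standard_error  # implied test statistic

print(f"difference in sample means is about {center:.2f}")
print(f"standard error is about {standard_error:.3f}")
print(f"implied t statistic is about {t_implied:.2f}")
print("between the 0.10 and 0.05 critical values:",
      t_crit_10 < t_implied < t_crit_05)
```

An implied statistic that falls between 1.330 and 1.734 corresponds to a one-sided p-value between 0.05 and 0.10, in line with part a.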

 

c.             We are assuming that we have independent simple random samples from the populations, and that the populations are normal. To check the normality assumption, we can use our rough rule. Since the combined sample size is large, we can use the t-procedures even in the presence of heavy skewness in the sample data distributions. Still, we should look at the sample distributions to ensure there aren’t extreme outliers.