Elementary Statistics – Solution to Paired t-test Example 3

 

Example 3 (Paired t-test)

A study was conducted on the effect of a special class designed to improve children’s verbal skills. Each of 41 children took a verbal skills test twice, once before and once after a 3-week period in the class. In the sample, the differences (after score minus before score) have mean 0.645 and standard deviation 1.527.

 

Hypotheses

Note this is not a two-sample problem; it’s a paired-data problem. We have one sample of 41 children, with two measurements on each of them. The sample data have already been differenced and we’re provided with the sample mean and standard deviation of the differences.
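As a small illustration of what “differencing” means here, the sketch below pairs before and after scores child by child and then summarizes the differences. The five pairs of scores are invented purely for illustration (the raw study data are not given), and the variable names are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical scores for 5 children (NOT the study's data, which we don't have).
before = [12, 15, 9, 14, 11]
after = [13, 15, 11, 14, 13]

# One difference per child: after score minus before score.
diffs = [a - b for a, b in zip(after, before)]

mean_diff = mean(diffs)  # sample mean of the differences
sd_diff = stdev(diffs)   # sample standard deviation (divides by n - 1)

print(diffs, mean_diff, sd_diff)
```

Once the data are reduced to one difference per child, the rest of the analysis is an ordinary one-sample t procedure applied to those differences.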

 

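Let $\mu$ be the average verbal improvement (after score minus before score) in the population of all children, if they were to take the class along with the pre- and post-class tests. Then we want to test the hypotheses

$H_0: \mu = 0$ versus $H_a: \mu > 0$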

[Note: In a paired t-test, the null hypothesis is always $H_0: \mu = 0$ (no average change in the population). In this case, the alternative is one-sided, because the special class was designed to improve children’s verbal skills.]

 

Check Conditions of the Test

We do not know the population standard deviation of the differences, so we should use a paired t-test (not a z-test). Hence, we need to check the normality condition of the t-test. We aren’t given any information about the distribution of our sample data. Because the sample size is large (41, which is more than 40), we can relax the normality condition and use the t-test even if the sample data are skewed. Still, before we provide a final conclusion, we should find out whether the sample distribution of differences shows extreme outliers or any other odd features.

 

Test Statistic

The test statistic is $t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{0.645 - 0}{1.527/\sqrt{41}} = 2.705$. (That is, our sample average improvement is 2.705 standard errors above the null-hypothesized value of the population average.)
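The arithmetic behind the test statistic can be checked with a short Python sketch. It uses only the summary numbers given in the example; the variable names are our own.

```python
from math import sqrt

n = 41             # number of children (pairs)
mean_diff = 0.645  # sample mean of the differences
sd_diff = 1.527    # sample standard deviation of the differences
mu_0 = 0.0         # null-hypothesized population mean difference

# Standard error of the sample mean difference.
standard_error = sd_diff / sqrt(n)

# Number of standard errors the sample mean lies above mu_0.
t_stat = (mean_diff - mu_0) / standard_error

print(round(standard_error, 4), round(t_stat, 3))
```

The standard error is about 0.2385, and the statistic rounds to 2.705.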

 

P-value

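We must use the t-distribution with (41 – 1) = 40 degrees of freedom to find the P-value. Because the test is one-sided, the P-value is the upper-tail probability beyond t = 2.705 only. From Table D, the P-value is approximately 0.005: with 40 degrees of freedom, the critical value for an upper-tail probability of 0.005 is 2.704, almost exactly our test statistic.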
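For readers without Table D at hand, the one-sided P-value can be approximated numerically. The sketch below is one possible way to do it, not the method used in the example: it integrates the t density with Simpson's rule using only the Python standard library. The helper names `t_pdf` and `upper_tail` are hypothetical.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=1000):
    """Approximate P(T >= t) for t >= 0 via Simpson's rule on [0, t].

    steps must be even. By symmetry P(T >= 0) = 0.5, so the upper tail
    is 0.5 minus the area under the density between 0 and t.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 on odd, 2 on even nodes
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat = 0.645 / (1.527 / sqrt(41))  # test statistic from the summary data
p_value = upper_tail(t_stat, df=40)
print(p_value)
```

The result should land very close to 0.005, in line with the table lookup. With statistical software available, a built-in t-distribution tail function would normally replace the hand-rolled integration.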

 

Definition of P-value and Conclusion

Definition of P-value: If the average verbal improvement of the population of all children is 0, then there’s only about a 0.005 chance of getting a sample average improvement of 0.645 or larger. Conclusion: Because data like ours would be so unlikely if the null hypothesis were true, we have strong evidence that the population average improvement is greater than 0. That is, these results are statistically significant even at the 0.01 significance level.

 

Practical Significance

These results are strongly statistically significant, but are they practically significant? The 95% t confidence interval for the population mean improvement is (0.163, 1.127). There are no units given, since the test is standardized, so it’s hard to gauge the practical importance, but this range of plausible average improvements seems quite small (especially if the test has, say, 20 or more questions). We need to talk to the teachers, but it seems like this might be a case of statistical significance without practical significance.
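The confidence interval quoted above can be reproduced from the summary statistics. In the sketch below, the critical value 2.021 is an assumption on our part: it is the standard tabled t value for 95% confidence with 40 degrees of freedom, which the example does not state explicitly.

```python
from math import sqrt

n = 41
mean_diff = 0.645  # sample mean of the differences
sd_diff = 1.527    # sample standard deviation of the differences

# Assumed critical value: tabled t* for 95% confidence, df = 40.
t_star = 2.021

standard_error = sd_diff / sqrt(n)
margin = t_star * standard_error  # margin of error

lower = mean_diff - margin
upper = mean_diff + margin

print(round(lower, 3), round(upper, 3))
```

Rounded to three decimals, the endpoints agree with the interval (0.163, 1.127).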

 

Causality?

Can the significant improvement be solely attributed to the special class? No, because the experimental design did not include a control group (so there are many potential confounding variables, including the possible improvement simply based on taking the test a second time). This brings us back full-circle to data collection. How cool!