Elementary Statistics—Inference for a Difference in Population Means

We spent class time developing the theory behind two-sample inference (and wasn’t it cool how it brought together so many of the things we learned all term?). To save time (and spare you some boredom), this handout provides the general format of two-sample inference for means.

 

Setting

Suppose we have two distinct normal populations with unknown means, $\mu_1$ and $\mu_2$, and unknown standard deviations. Furthermore, suppose independent, random samples (of size $n_1$ and $n_2$) are taken from the two populations.

 

Confidence Interval

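A level C confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $\bar{x}_1$, $\bar{x}_2$ are the sample means and $s_1$, $s_2$ are the sample standard deviations.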

Recall our estimate for the degrees of freedom is the minimum of ($n_1 - 1$) and ($n_2 - 1$); let’s just call this k for simplicity. So $t^*$ is the value on the T-distribution (with k degrees of freedom) that has area C between $-t^*$ and $t^*$.
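If you would like to check an interval by computer, here is a minimal Python sketch of the calculation above. The function names and the summary statistics are made up for illustration, and $t^*$ is passed in by hand from a T-table rather than computed.

```python
import math

def conservative_df(n1, n2):
    """Conservative degrees of freedom: k = min(n1 - 1, n2 - 1)."""
    return min(n1 - 1, n2 - 1)

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Interval for mu1 - mu2: (xbar1 - xbar2) +/- t_star * SE."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    margin = t_star * se                     # margin of error
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Made-up summary statistics, for illustration only.
k = conservative_df(12, 16)   # min(11, 15) = 11
t_star = 2.201                # T-table value for 95% confidence with 11 df
low, high = two_sample_ci(50, 6, 12, 45, 4, 16, t_star)
print(f"k = {k}, 95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

With these invented numbers the standard error works out to exactly 2, so the interval is 5 plus or minus 4.402, or roughly (0.60, 9.40).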

 

Per usual, our confidence is in the method we use, not in our one particular interval. Also, it does not matter which way you do the differencing in the confidence interval as long as you interpret it correctly.
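To see why the direction of the differencing does not matter, here is the algebra sketched out. The letter $m$ is shorthand introduced here for the margin of error, and $a$, $b$ stand for the endpoints of the interval; $\bar{x}_1$, $\bar{x}_2$ are the sample means and $s_1$, $s_2$ the sample standard deviations.

```latex
m = t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}

\begin{aligned}
\text{interval for } \mu_1 - \mu_2 &: \quad (\bar{x}_1 - \bar{x}_2) \pm m = (a,\ b) \\
\text{interval for } \mu_2 - \mu_1 &: \quad (\bar{x}_2 - \bar{x}_1) \pm m = (-b,\ -a)
\end{aligned}
```

For example, an interval of (2, 6) for $\mu_1 - \mu_2$ and an interval of (-6, -2) for $\mu_2 - \mu_1$ say the same thing: the mean of population 1 is estimated to be between 2 and 6 units larger than the mean of population 2.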

 

Significance Test

The null hypothesis is always $H_0: \mu_1 - \mu_2 = 0$ (that is, $\mu_1 = \mu_2$). To test this hypothesis, first calculate the test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

[Note this is simply a standardized value. Subtracting 0 has no impact numerically on the test statistic, but I included it for completeness’ sake, so you can see the similarity to other test statistics.]

 

Then determine the P-value using the T-distribution with k degrees of freedom. (Remember k is simply the minimum of ($n_1 - 1$) and ($n_2 - 1$).) Also recall that the P-value depends on the direction of the alternative hypothesis.

 

Finally, define the P-value in the words of the problem and provide a conclusion (which might depend on a given significance level, $\alpha$). If you find statistical significance, then it’s a good idea to create a confidence interval to assess the practical significance.
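The steps of the test can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `alternative` labels, and the summary statistics are invented here, and the P-value is approximated by numerically integrating the T density (Simpson's rule) instead of reading a table.

```python
import math

def t_pdf(x, k):
    """Density of the T-distribution with k degrees of freedom."""
    log_c = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    return math.exp(log_c) / math.sqrt(k * math.pi) * (1 + x * x / k) ** (-(k + 1) / 2)

def t_upper_tail(t, k, steps=10_000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, k) + t_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 - total * h / 3  # half the area lies above 0 by symmetry

def two_sample_t_test(xbar1, s1, n1, xbar2, s2, n2, alternative="two-sided"):
    """Return (t, k, p) for H0: mu1 - mu2 = 0 with conservative df k."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((xbar1 - xbar2) - 0) / se           # standardized difference
    k = min(n1 - 1, n2 - 1)
    tail = t_upper_tail(abs(t), k)           # P(T >= |t|)
    if alternative == "two-sided":           # Ha: mu1 != mu2
        p = 2 * tail
    elif alternative == "greater":           # Ha: mu1 > mu2
        p = tail if t >= 0 else 1 - tail
    elif alternative == "less":              # Ha: mu1 < mu2
        p = tail if t <= 0 else 1 - tail
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t, k, p

# Made-up summary statistics, for illustration only.
t, k, p = two_sample_t_test(50, 6, 12, 45, 4, 16)
alpha = 0.05
print(f"t = {t:.3f}, k = {k}, P-value = {p:.4f}")
print("statistically significant" if p < alpha else "not statistically significant")
```

Notice how the direction of the alternative hypothesis only changes which tail area is reported; the test statistic and k stay the same.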

 

Important Notes: