Elementary Statistics: Inference for a Difference in Population Means
We spent class time developing the theory behind two-sample inference (and wasn’t it cool how it brought together many different things we learned all term?). To save time (and boredom), this handout provides the general format of two-sample inference for means.
Setting
Suppose we have two distinct normal populations with unknown means, $\mu_1$ and $\mu_2$, and unknown standard deviations. Furthermore, suppose independent, random samples (of size $n_1$ and $n_2$) are taken from the two populations.
Confidence Interval
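A level C confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $\bar{x}_1$, $\bar{x}_2$ are the sample means and $s_1$, $s_2$ are the sample standard deviations.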
Recall our estimate for the degrees of freedom is the minimum of ($n_1 - 1$) and ($n_2 - 1$); let’s just call this k for simplicity. So $t^*$ is the value on the T-distribution (with k degrees of freedom) that has area C between $-t^*$ and $t^*$.
Per usual, our confidence is in the method we use,
not in our one particular interval. Also, it does not matter which way you do
the differencing in the confidence interval as long as you interpret it
correctly.
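The interval and the note about the direction of differencing can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_ci` and the summary statistics are made up for this example, and the critical value is assumed to come from a t-table (2.262 is the table value for C = 95% with k = 9).

```python
import math

def two_sample_ci(mean1, s1, n1, mean2, s2, n2, t_star):
    """Level-C confidence interval for mu1 - mu2.

    t_star is the critical value for the chosen confidence level with
    k = min(n1 - 1, n2 - 1) degrees of freedom, taken from a t-table.
    """
    diff = mean1 - mean2
    std_err = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_star * std_err
    return diff - margin, diff + margin

# Hypothetical summary statistics (invented for illustration only).
n1, n2 = 10, 12
k = min(n1 - 1, n2 - 1)   # conservative degrees of freedom
t_star = 2.262            # t-table value for C = 95%, k = 9

# Interval for mu1 - mu2.
low, high = two_sample_ci(20.0, 3.0, n1, 17.0, 4.0, n2, t_star)

# Same data, differencing the other way: interval for mu2 - mu1.
rev_low, rev_high = two_sample_ci(17.0, 4.0, n2, 20.0, 3.0, n1, t_star)

print(f"mu1 - mu2: ({low:.3f}, {high:.3f})")
print(f"mu2 - mu1: ({rev_low:.3f}, {rev_high:.3f})")
```

Reversing the order only flips the signs of the endpoints, so both intervals carry the same information as long as the interpretation names the right difference.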
Significance Test
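The null hypothesis is always $H_0: \mu_1 - \mu_2 = 0$. To test this hypothesis, first calculate the test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
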
[Note that this is simply a standardized value. Subtracting 0 has no impact numerically on the test statistic, but I included it for completeness’ sake, so you can see the similarity to other test statistics.]
Then determine the P-value using the T-distribution with k degrees of freedom. (Remember k is simply the minimum of ($n_1 - 1$) and ($n_2 - 1$).) Also recall that the P-value depends on the direction of the alternative hypothesis.
Finally, interpret the P-value in the words of the problem and provide a conclusion (which might depend on a given significance level, $\alpha$). If you find statistical significance, then it’s a good idea to create a confidence interval to assess the practical significance.
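As a rough sketch of the testing steps, the following Python computes the test statistic, the conservative degrees of freedom k, and a P-value for each direction of the alternative. Everything here is an assumption made for illustration: the function names are invented, the summary statistics are made up, and the tail area is approximated by integrating the t density with Simpson's rule rather than read from a table or software. It relies on `math.gamma`, so it is only meant for small-to-moderate k.

```python
import math

def t_pdf(x, k):
    """Density of the t-distribution with k degrees of freedom."""
    coef = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return coef * (1 + x * x / k) ** (-(k + 1) / 2)

def upper_tail(t, k, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, k) + t_pdf(t, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 - total * h / 3

def two_sample_t_test(mean1, s1, n1, mean2, s2, n2, alternative="two-sided"):
    """Return (t, k, P-value) for H0: mu1 - mu2 = 0."""
    k = min(n1 - 1, n2 - 1)
    t = ((mean1 - mean2) - 0) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    tail = upper_tail(abs(t), k)
    if alternative == "two-sided":    # Ha: mu1 - mu2 != 0
        p = 2 * tail
    elif alternative == "greater":    # Ha: mu1 - mu2 > 0
        p = tail if t >= 0 else 1 - tail
    elif alternative == "less":       # Ha: mu1 - mu2 < 0
        p = tail if t <= 0 else 1 - tail
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t, k, p

# Hypothetical summary statistics (invented for illustration only).
stats = (20.0, 3.0, 10, 17.0, 4.0, 12)
t, k, p_two = two_sample_t_test(*stats)
_, _, p_greater = two_sample_t_test(*stats, alternative="greater")
_, _, p_less = two_sample_t_test(*stats, alternative="less")

alpha = 0.05  # significance level, chosen before looking at the data
print(f"t = {t:.3f}, k = {k}")
print(f"two-sided P = {p_two:.4f}, reject H0 at alpha: {p_two < alpha}")
print(f"one-sided P (greater) = {p_greater:.4f}, (less) = {p_less:.4f}")
```

The same data give three different P-values depending on the alternative, which is why the direction of the alternative has to be fixed before the P-value is computed.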
Important Notes: