Math 207—Summary of
Large-Sample Confidence Intervals
Suppose we want to estimate some parameter of a population. If we take a random
sample, then we can use a statistic from the sample as an estimate of the
population parameter (and it should be a good estimator). In general, confidence
intervals take the form

$$\text{estimate} \pm (\text{multiplier}) \times (\text{standard deviation of the estimator}).$$

Because of the Central Limit Theorem (yeah!), for “large” samples the multiplier
(for means or proportions) comes from the standard normal distribution.
General Concepts that Apply to All Confidence Intervals on this Handout
· The “(multiplier) $\times$ (standard deviation of the estimator)” piece is
always called the margin of error. This margin of error takes into account only
sampling variability. It does not take into account error due to, for example,
non-response or the wording of survey questions; these are things you must
consider during the data collection process.
· All else being equal, the width of a confidence interval increases as the
confidence level increases, and the width of a confidence interval decreases as
the sample size(s) increase.
· The confidence is always in the method used (not in the interval from our
particular sample). That is, if the sampling were (hypothetically) done
repeatedly, then 95% (or whatever your confidence percentage is) of the samples
would produce confidence intervals that contain the true value of the parameter
being estimated. Put another way, the confidence-interval-creation method is
correct 95% (or whatever your confidence percentage is) of the time.
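The "confidence is in the method" idea can be checked with a small simulation. The following is a minimal sketch, not part of the handout: the function name `coverage_rate`, the normal population, and all parameter values (true mean 10, standard deviation 2, samples of size 50, 2000 repetitions) are assumptions chosen only for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage_rate(true_mean=10.0, true_sd=2.0, n=50, reps=2000,
                  confidence=0.95, seed=207):
    """Fraction of simulated samples whose interval contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Multiplier from the standard normal distribution (large-sample case).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(reps):
        # Draw one hypothetical random sample from the population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = mean(sample)
        margin = z * stdev(sample) / n ** 0.5
        # Count the sample as a "hit" if its interval captures the truth.
        if center - margin <= true_mean <= center + margin:
            hits += 1
    return hits / reps


print(coverage_rate())
```

The printed fraction should land close to the stated confidence level. Any single interval either contains the true mean or it does not; the percentage describes the long-run behavior of the procedure.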
Large-Sample Confidence Interval for a Population Mean
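Suppose we have a “large” (a common rule of thumb is $n \ge 30$), random sample
from a population with unknown mean $\mu$ and known standard deviation $\sigma$.
Then an approximate $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\bar{x} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}},$$

where $\bar{x}$ is the sample mean, $n$ is the sample size, and $z_{\alpha/2}$
is the z-value corresponding to an area $\alpha/2$ in the upper tail of the
standard normal distribution.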
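For reference, the z-value can be computed rather than looked up in a table. This is a small sketch that assumes the confidence level is written as $1-\alpha$, so the upper-tail area is $\alpha/2$; the helper name `z_multiplier` is hypothetical.

```python
from statistics import NormalDist


def z_multiplier(confidence):
    """z-value with upper-tail area (1 - confidence) / 2 under N(0, 1)."""
    alpha = 1 - confidence
    # inv_cdf gives the point with the requested area to its LEFT,
    # so an upper-tail area of alpha/2 means a left area of 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_multiplier(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```

The multiplier grows with the confidence level, which is why higher confidence gives a wider interval when everything else is held fixed.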
Typically (always?) the population standard deviation, $\sigma$, is unknown. We
can estimate $\sigma$ using $s$, the sample standard deviation. Since this
estimation is based on a large sample, it should be good and we needn’t make any
other modifications. (Important note: Later we will discuss the small-sample
case where we must consider the estimation error of replacing $\sigma$ with
$s$.) Hence, the approximate $100(1-\alpha)\%$ confidence interval for $\mu$ is
typically

$$\bar{x} \pm z_{\alpha/2} \, \frac{s}{\sqrt{n}}.$$
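As an illustration of the interval with the sample standard deviation plugged in, here is a minimal sketch. The function name `mean_ci` and the data are made up for the example, and the code assumes the sample is large enough for the normal multiplier to be reasonable.

```python
from statistics import NormalDist, mean, stdev


def mean_ci(sample, confidence=0.95):
    """Large-sample z interval for a population mean, using s for sigma."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    center = mean(sample)
    # Margin of error: multiplier times the estimated standard deviation
    # of the sample mean, s / sqrt(n).
    margin = z * stdev(sample) / n ** 0.5
    return center - margin, center + margin


# Hypothetical data: 50 observations, purely for demonstration.
data = [1, 2, 3, 4, 5] * 10
low, high = mean_ci(data)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Calling `mean_ci(data, confidence=0.99)` on the same data returns a wider interval, and a larger sample with the same spread returns a narrower one.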
Large-Sample Confidence Interval for a Population Proportion
Suppose we have a “large” (a common rule of thumb is $n\hat{p} \ge 10$ and
$n(1-\hat{p}) \ge 10$) random sample of size $n$ from a population with unknown
“success” proportion, $p$. (That is, we have a binomial, or approximately
binomial, setting, and the success proportion is unknown.) Recall that the
sampling distribution of the sample proportion of successes, $\hat{p}$, is
approximately normal with mean $p$ and standard deviation $\sqrt{p(1-p)/n}$.
Since $p$ is unknown in the standard deviation, we must estimate it using
$\hat{p}$ (again, based on a large sample, this should be a good estimator and
we need not make other adjustments). Then an approximate $100(1-\alpha)\%$
confidence interval for $p$ is

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$

where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the
upper tail of the standard normal distribution.
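A short sketch of this interval in code, assuming the counts come from a single large random sample; the function name `proportion_ci` and the counts (60 successes in 100 trials) are hypothetical.

```python
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard deviation of p_hat, with p replaced by p_hat.
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 60 "successes" out of 100 respondents.
low, high = proportion_ci(60, 100)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
# 95% CI for p: (0.504, 0.696)
```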
Important Note: It has been shown in the literature that this confidence
interval sometimes has “coverage” problems (e.g., a nominal 95% confidence
interval is created, but the actual confidence level is only 90%). We will not
discuss these details in class, but know that this particular
confidence-interval method has potential problems (don’t stress too much,
though).
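The coverage issue can be seen directly by computing the exact coverage probability for a given sample size and true proportion. This is a sketch under the assumption of an exactly binomial count; the function name `wald_coverage` and the values of `n` and `p` are chosen only for illustration and do not come from the handout.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_coverage(n, p, confidence=0.95):
    """Exact probability that the large-sample interval contains p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    covered = 0.0
    # Walk through every possible success count x = 0, 1, ..., n.
    for x in range(n + 1):
        p_hat = x / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            # Add the binomial probability of observing this count.
            covered += comb(n, x) * p ** x * (1 - p) ** (n - x)
    return covered


for p in (0.5, 0.05):
    print(f"n = 50, p = {p}: coverage = {wald_coverage(50, p):.3f}")
```

For both illustrative settings the computed coverage comes out below the nominal 0.95, and the shortfall is larger when $p$ is close to 0, where a sample with zero successes produces an interval of zero width.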
Large-Sample Confidence Interval for a Difference in Population Means
Suppose we have two distinct populations with unknown means, $\mu_1$ and
$\mu_2$. Furthermore, suppose we have “large” (a common rule of thumb is
$n_1 \ge 30$ and $n_2 \ge 30$), independent, random samples from each
population.
When comparing two populations, we often want to compare the population means
(that is, estimate the difference in population averages). It makes sense to
use the difference in sample averages as our estimator. By the Central Limit
Theorem, we know $\bar{x}_1$ has an approximate normal distribution with mean
$\mu_1$ and standard deviation $\sigma_1/\sqrt{n_1}$, and $\bar{x}_2$ has an
approximate normal distribution with mean $\mu_2$ and standard deviation
$\sigma_2/\sqrt{n_2}$. Also, $\bar{x}_1$ and $\bar{x}_2$ are independent, since
they come from independent random samples (from different populations). A nice
property of the normal distribution is that sums and differences of independent
normal random variables are also normal. Yeah! Also, recall from our previous
discussion of means and variances that we know
$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$ and
$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \sigma_1^2/n_1 + \sigma_2^2/n_2$. As in
the one-sample case, we can use the sample standard deviations, $s_1$ and
$s_2$, as estimators of the unknown population standard deviations, $\sigma_1$
and $\sigma_2$.
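Putting this all together, an approximate $100(1-\alpha)\%$ confidence interval
for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},$$

where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the
upper tail of the standard normal distribution.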
(Note that it doesn’t matter which way you do the differencing, that is, it
doesn’t matter which population you call “1” and which one you call “2”, as
long as you interpret the results appropriately.)
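A minimal sketch of the two-sample interval for a difference in means, assuming two large, independent samples; the function name `diff_means_ci` and both data sets are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_means_ci(sample1, sample2, confidence=0.95):
    """Large-sample z interval for (mean of population 1) - (mean of 2)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean(sample1) - mean(sample2)
    # Variances of independent sample means add, even for a difference.
    se = sqrt(variance(sample1) / len(sample1)
              + variance(sample2) / len(sample2))
    return diff - z * se, diff + z * se


# Hypothetical data: 50 observations from each population.
group1 = [1, 2, 3, 4, 5] * 10
group2 = [2, 3, 4, 5, 6] * 10
print(diff_means_ci(group1, group2))  # interval for mu_1 - mu_2
print(diff_means_ci(group2, group1))  # same interval with signs flipped
```

Swapping the two samples negates both endpoints, which is the sense in which the labeling of the populations does not matter as long as the interpretation follows the order used.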
Large-Sample Confidence Interval for a Difference in Population Proportions
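Suppose we have two distinct populations with unknown “success” proportions,
$p_1$ and $p_2$. Furthermore, suppose we have large (a common rule of thumb is
at least 10 successes and 10 failures in each sample), independent, random
samples from each population. Then, reasoning in a similar way to the
confidence interval for a difference in means, an approximate
$100(1-\alpha)\%$ confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},$$

where $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions, $n_1$ and $n_2$
are the sample sizes, and $z_{\alpha/2}$ is the z-value corresponding to an
area $\alpha/2$ in the upper tail of the standard normal distribution.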
(Note that it doesn’t matter which way you do the differencing, that is, it
doesn’t matter which population you call “1” and which one you call “2”, as
long as you interpret the results appropriately.)
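The same pattern gives a sketch for a difference in proportions, again assuming two large, independent samples; the function name `diff_proportions_ci` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample z interval for p_1 - p_2 from success counts x1, x2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Estimated standard deviation of the difference in sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Hypothetical counts: 60 of 100 successes versus 50 of 100 successes.
low, high = diff_proportions_ci(60, 100, 50, 100)
print(f"95% CI for p_1 - p_2: ({low:.3f}, {high:.3f})")
# 95% CI for p_1 - p_2: (-0.037, 0.237)
```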