Math 207—Summary of Large-Sample Confidence Intervals

 

Suppose we want to estimate some parameter of a population. If we take a random sample, then we can use a statistic from the sample as an estimate of the population parameter (provided the statistic is a good estimator). In general, confidence intervals take the form $\text{estimate} \pm (\text{multiplier}) \times (\text{standard error of the estimate})$. Because of the Central Limit Theorem (yeah!), for “large” samples the multiplier (for means or proportions) comes from the standard normal distribution.

 

General Concepts that Apply to all Confidence Intervals on this Handout

·         The $\pm$ piece (the multiplier times the standard error) is always called the margin of error. This margin of error takes into account only sampling variability. It does not take into account error due to, for example, non-response or the wording of survey questions; these are things you must consider during the data collection process.

 

·         All else being equal, the width of a confidence interval increases as the confidence level increases, and the width of a confidence interval decreases as the sample size(s) increase.

 

·         The confidence is always in the method used (not in the interval from our particular sample). That is, if the sampling were (hypothetically) done repeatedly, then 95% (or whatever your confidence percentage is) of the samples would produce confidence intervals that contain the true value of the parameter being estimated. Put another way, the confidence-interval-creation method is correct 95% (or whatever your confidence percentage is) of the time.
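The repeated-sampling interpretation above can be checked with a small simulation. The sketch below is only an illustration: the population (normal with mean 50 and standard deviation 10), the sample size, the number of repetitions, and the function name `coverage_rate` are all assumptions made for this example and are not part of the handout.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(mu=50.0, sigma=10.0, n=40, reps=2000, conf=0.95, seed=1):
    """Fraction of simulated z-intervals (known sigma) that contain mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # multiplier z_{alpha/2}
    margin = z * sigma / n ** 0.5                  # margin of error
    hits = 0
    for _ in range(reps):
        # Draw one hypothetical sample and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to the stated confidence level. Any single interval either contains the true mean or it does not; the percentage describes the long-run behavior of the method.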

 

Large-Sample Confidence Interval for a Population Mean

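Suppose we have a “large” (as a common rule of thumb, $n \ge 30$), random sample from a population with unknown mean $\mu$ and known standard deviation $\sigma$. Then an approximate $100(1-\alpha)\%$ confidence interval for $\mu$ is $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$, where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the upper tail of the standard normal distribution.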

 

Typically (always?) the population standard deviation, $\sigma$, is unknown. We can estimate $\sigma$ using $s$, the sample standard deviation. Since this estimate is based on a large sample, it should be good and we needn’t make any other modifications. (Important note: Later we will discuss the small-sample case where we must consider the estimation error of replacing $\sigma$ with $s$.)

 

Hence, the approximate $100(1-\alpha)\%$ confidence interval for $\mu$ is typically $\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$.
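As a rough illustration of the large-sample interval for a mean (with the sample standard deviation standing in for the population value), here is a minimal Python sketch. The function name `mean_ci` and the data are made up for the example; only the standard library is used.

```python
from statistics import NormalDist, mean, stdev


def mean_ci(sample, conf=0.95):
    """Large-sample z-interval for a population mean, using s in place of sigma."""
    n = len(sample)
    xbar = mean(sample)                            # point estimate
    s = stdev(sample)                              # sample standard deviation
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # multiplier z_{alpha/2}
    margin = z * s / n ** 0.5                      # margin of error
    return xbar - margin, xbar + margin


# Hypothetical data: 35 measurements cycling through the values 20..26.
data = [20 + (i % 7) for i in range(35)]

lo95, hi95 = mean_ci(data, conf=0.95)
lo99, hi99 = mean_ci(data, conf=0.99)
print(f"95% interval: ({lo95:.2f}, {hi95:.2f})")
print(f"99% interval: ({lo99:.2f}, {hi99:.2f})")
```

Running it with two confidence levels on the same data shows the 99% interval coming out wider than the 95% interval.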

 

Large-Sample Confidence Interval for a Population Proportion

Suppose we have a “large” (as a common rule of thumb, at least 10 successes and 10 failures in the sample) random sample from a population with unknown “success” proportion, $p$. (That is, we have a binomial, or approximately binomial, setting, and the success proportion is unknown.) Recall that the sampling distribution of the sample proportion of successes, $\hat{p}$, is approximately normal with mean $p$ and standard deviation $\sqrt{p(1-p)/n}$. Since $p$ is unknown in the standard deviation, we must estimate it using $\hat{p}$ (again, based on a large sample, this should be a good estimator and we need not make other adjustments).

 

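Then an approximate $100(1-\alpha)\%$ confidence interval for $p$ is $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the upper tail of the standard normal distribution.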
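The same recipe for a proportion might be sketched in Python as follows. The function name `proportion_ci` and the counts (120 successes out of 400) are hypothetical and chosen only to show the arithmetic.

```python
from statistics import NormalDist


def proportion_ci(successes, n, conf=0.95):
    """Large-sample z-interval for a population proportion."""
    p_hat = successes / n                              # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)       # multiplier z_{alpha/2}
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5      # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 120 "successes" among 400 respondents.
lo, hi = proportion_ci(successes=120, n=400)
print(f"Approximate 95% interval for p: ({lo:.3f}, {hi:.3f})")
```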

 

Important Note: It has been shown in the literature that this confidence interval sometimes has “coverage” problems (e.g., a 95% confidence interval is created, but the confidence level is actually only 90%). We will not discuss these details in class, but know this particular confidence-interval method has potential problems (don’t stress too much, though).

Large-Sample Confidence Interval for a Difference in Population Means

Suppose we have two distinct populations with unknown means, $\mu_1$ and $\mu_2$. Furthermore, suppose we have “large” (as a common rule of thumb, $n_1 \ge 30$ and $n_2 \ge 30$), independent, random samples from each population.

 

When comparing two populations, we often want to compare the population means, that is, estimate the difference in population averages. It makes sense to use the difference in sample averages as our estimator. By the Central Limit Theorem, we know $\bar{x}_1$ is approximately normal with mean $\mu_1$ and standard deviation $\sigma_1/\sqrt{n_1}$, and $\bar{x}_2$ is approximately normal with mean $\mu_2$ and standard deviation $\sigma_2/\sqrt{n_2}$. Also $\bar{x}_1$ and $\bar{x}_2$ are independent, since they come from independent random samples (from different populations). A nice property of the normal distribution is that sums and differences of independent normal random variables are also normal. Yeah! Also, recall from our previous discussion of means and variances that we know $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$ and $\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$. As in the one-sample case, we can use the sample standard deviations as estimators of the unknown population standard deviations.

 

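Putting this all together, an approximate $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},$$

where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the upper tail of the standard normal distribution.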
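For the two-sample case, the following Python sketch computes the interval for a difference in means from two independent samples. The function name `diff_means_ci` and both data sets are invented for illustration.

```python
from statistics import NormalDist, mean, variance


def diff_means_ci(sample1, sample2, conf=0.95):
    """Large-sample z-interval for mu_1 - mu_2 from independent samples."""
    n1, n2 = len(sample1), len(sample2)
    diff = mean(sample1) - mean(sample2)               # point estimate
    # Variances of independent sample means add, even for a difference.
    se = (variance(sample1) / n1 + variance(sample2) / n2) ** 0.5
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)       # multiplier z_{alpha/2}
    return diff - z * se, diff + z * se


# Hypothetical samples of size 40 from each population.
group1 = [30 + (i % 5) for i in range(40)]
group2 = [28 + (i % 5) for i in range(40)]

lo, hi = diff_means_ci(group1, group2)
print(f"Approximate 95% interval for mu_1 - mu_2: ({lo:.2f}, {hi:.2f})")

# Swapping the labels flips the signs of the endpoints.
lo_rev, hi_rev = diff_means_ci(group2, group1)
print(f"Approximate 95% interval for mu_2 - mu_1: ({lo_rev:.2f}, {hi_rev:.2f})")
```

Swapping the two samples simply negates and reverses the endpoints, so the choice of which population is “1” affects only the interpretation.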

 

(Note that it doesn’t matter which way you do the differencing, that is, which population you call “1” and which one you call “2”, as long as you interpret the results appropriately.)

Large-Sample Confidence Interval for a Difference in Population Proportions

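Suppose we have two distinct populations with unknown “success” proportions, $p_1$ and $p_2$. Furthermore, suppose we have “large” (as a common rule of thumb, at least 10 successes and 10 failures in each sample), independent, random samples from each population. Then, reasoning in a similar way to the confidence interval for a difference in means, an approximate $100(1-\alpha)\%$ confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},$$

where $z_{\alpha/2}$ is the z-value corresponding to an area $\alpha/2$ in the upper tail of the standard normal distribution.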
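A comparable Python sketch for a difference in proportions is shown below. The function name `diff_proportions_ci` and the counts are hypothetical.

```python
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample z-interval for p_1 - p_2 from independent samples."""
    p1, p2 = x1 / n1, x2 / n2                          # sample proportions
    # Estimated variances of the two sample proportions add.
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)       # multiplier z_{alpha/2}
    diff = p1 - p2                                     # point estimate
    return diff - z * se, diff + z * se


# Hypothetical counts: 90 of 200 successes in sample 1, 60 of 200 in sample 2.
lo, hi = diff_proportions_ci(x1=90, n1=200, x2=60, n2=200)
print(f"Approximate 95% interval for p_1 - p_2: ({lo:.3f}, {hi:.3f})")
```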

 

(Note that it doesn’t matter which way you do the differencing, that is, which population you call “1” and which one you call “2”, as long as you interpret the results appropriately.)