Elementary Statistics – Review Examples
Read each scenario, and then decide on the proper statistical analysis. (Be sure
to check any assumptions of the analysis.)
- Ozone, a prevalent photochemical oxidant, has
been linked to forest decline and severe crop loss. A random sample of 50
ozone concentrations was obtained on Mt.
Mitchell in Yancey County, North Carolina.
An environmentalist is interested in estimating the mean ozone
concentration for this area.
- An Appleton
resident is interested in predicting the amount of energy consumed in a
home, based on the size of the home. He takes a sample of 30 homes, and
for each he measures the size (in square feet) and the energy consumed (in
kilowatt-hours per month).
- A fair coin is to be flipped 10 times. A student
wonders what the probability is of getting at least 8 heads.
- A group of 40 women in a large city were
given instructions on self-defense. Prior to the course, they were tested
to determine their self-confidence. After the course they were given the
same self-confidence test. The self-defense teacher wonders if
self-confidence scores are higher on average after taking the self-defense
course.
- A consumer research group sampled 100
hand-held video games, all of the same make and model. The group wants to
estimate the mean life span of the video games. Based on their previous
work, they know the standard deviation of the population of life spans is
35 hours.
- A doctor looks at 50 two-child families, each with one
boy and one girl. For each of the families she measures the heights of the
sister and the brother. She is interested in the association between the
two sets of heights.
- A fair coin is to be flipped 500 times. A
student wonders what the probability is of getting no more than 240 heads.
- A researcher has 50 rats. He randomly divides
them into two groups of 25: group 1 rats receive a drug and group 2 rats
receive a placebo. Then he measures the number of errors the rats make in
a run through a maze. He wonders if there is a difference between the two
groups with regard to the average number of maze errors made.
Elementary Statistics – Answers to Review Examples
- We can estimate the mean ozone concentration
with a confidence interval. Because we don’t know the population standard
deviation, we must use the one-sample t confidence interval. The sample size
is large (n = 50 > 40), so even if the distribution of ozone concentrations
is severely skewed, we can still use the t confidence interval (we should
graph the sample data to make sure there are no outliers, though). There are
49 degrees of freedom (n - 1) for the t distribution.
- If we are interested in predicting one
variable from another, we can use a regression line (if appropriate).
Hence, we can create a scatterplot (y-variable: energy consumed; x-variable: size of house) and, if
appropriate, fit a regression line to the data.
- The number of heads in 10 flips of a fair
coin follows a binomial distribution with n = 10 and p = 0.5.
Hence, we can use Table C to find the probability of at least 8 heads (add
up the probabilities of 8, 9, and 10 heads).
- This is a matched pairs design and the
population standard deviation is unknown, so we should use a paired t-test
on the differences in scores (after minus before). The sample size is large
(n = 40), so even if the distribution of score differences is severely
skewed, we can still use the paired t-test (we should graph the differences
to make sure there are no outliers, though). There are 39 degrees of
freedom, and the alternative is one-sided (the mean difference is greater
than 0).
- We can estimate the mean life span with a
confidence interval. Since the population standard deviation is known, we
should use the one-sample z confidence interval. The sample size is large
(n = 100), so the confidence
interval will be approximately correct, even if the original population
isn’t normal (since the central limit theorem tells us the distribution of
the sample mean is approximately normal).
- To determine the association between the
variables, we can create a scatterplot and
calculate the correlation (to measure the strength and direction of the
linear relationship).
- The number of heads in 500 flips of a fair
coin is binomial with n = 500 and p = 0.5. We can’t use Table C to find
probabilities (since n > 20, and Table C doesn’t go beyond n = 20), but we
can use the normal approximation. Since np = 250 and n(1-p) = 250 (which are
both at least 10), the normal approximation should be good. Then the number
of heads in 500 flips is approximately normal with mean np = 250 and
standard deviation sqrt(np(1-p)) = sqrt(125), or about 11.18, and we can use
this distribution to calculate the probability of no more than 240 heads.
- This is a two-sample (not a paired) data
problem. The population standard deviations are unknown, so we should use
a two-sample t-test. The combined sample size (50) is large, so even if the
data distributions of maze errors are severely skewed, we can still use the
t-test (we should graph the sample data to make sure there are no outliers,
though). Using the conservative rule, there are 24 degrees of freedom
(the smaller sample size minus one, 25 - 1), and the alternative is
two-sided.
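
The two coin-flip answers above can also be checked numerically. The following is a minimal Python sketch, not part of the original handout: it uses only the standard library, and the helper name binomial_pmf is made up for illustration. It computes the exact binomial probability of at least 8 heads in 10 flips (the same sum Table C would give), then the normal approximation for no more than 240 heads in 500 flips. The continuity-corrected version (evaluating at 240.5) and the exact binomial sum are additions for comparison; the answer key itself only calls for the normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p). Hypothetical helper name."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 10 flips of a fair coin: P(X >= 8) = P(8) + P(9) + P(10).
p_at_least_8 = sum(binomial_pmf(k, 10, 0.5) for k in range(8, 11))

# 500 flips of a fair coin: approximate the count of heads by a normal
# distribution with mean np and standard deviation sqrt(np(1-p)).
n, p = 500, 0.5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Plain normal approximation to P(X <= 240).
p_normal = approx.cdf(240)

# Assumption (not in the handout): continuity correction, evaluating at 240.5.
p_normal_cc = approx.cdf(240.5)

# Exact binomial sum, for comparison with the two approximations.
p_exact = sum(binomial_pmf(k, n, p) for k in range(0, 241))

print(f"P(at least 8 heads in 10 flips)      = {p_at_least_8:.4f}")
print(f"P(X <= 240), normal approximation    = {p_normal:.4f}")
print(f"P(X <= 240), continuity-corrected    = {p_normal_cc:.4f}")
print(f"P(X <= 240), exact binomial          = {p_exact:.4f}")
```

Running the sketch shows how close the normal approximation comes to the exact binomial value when np and n(1-p) are both far above 10.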