Recall that there are two types of error in a significance test: 1) rejecting
the null hypothesis when, in fact, it is true (Type I error), and 2) accepting
the null hypothesis when, in fact, it is false (Type II error). The probability
of a Type I error is denoted by $\alpha$, and the probability of a Type II error
is denoted by $\beta$. [These are actually conditional probabilities:
$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$ and
$\beta = P(\text{accept } H_0 \mid H_0 \text{ is false})$.]
The power of a test is defined as $1 - \beta$. That is, the power is the
probability that we reject the null hypothesis when, in fact, it is false (this
is a good thing). Hence, we’d like the power to be high. (A power of 0.8 is a
commonly used benchmark.)
__________________________________________________________________________________________
To find the power:
__________________________________________________________________________________________
Application of these steps to a previous class example:
The mean yield
of corn in the
We will reject the null hypothesis if the test statistic is smaller than –1.96
or greater than 1.96. Therefore, the “acceptance” region is
$-1.96 \le z \le 1.96$, where $z$ is the test statistic.
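
As an illustration of how an acceptance region of this kind turns into a power value, here is a minimal Python sketch for a two-sided z-test. It assumes a known population standard deviation and a normally distributed sample mean. The function name `two_sided_power` and the numbers in the demonstration are hypothetical and are not taken from the class example.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided z-test of H0: mu = mu0 when the true mean is mu1.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    # Critical value: about 1.96 when alpha = 0.05.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean mu1, the test statistic is normal with this mean
    # and standard deviation 1.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # beta = P(test statistic lands in the acceptance region | true mean is mu1)
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the call.
    print(round(two_sided_power(mu0=100, mu1=104, sigma=10, n=25), 4))
```

When `mu1` equals `mu0`, the function returns `alpha`, which is a quick sanity check: the chance of rejecting a true null hypothesis is the significance level.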


Example 1
What is the normal body temperature? A 1992 JAMA article suggests that the average body
temperature may be less than 98.6 degrees Fahrenheit. A doctor plans to collect
a random sample of 40 body temperatures. (From past experience, she thinks the
standard deviation of body temperatures is
degrees.) She wants to test at the 0.05
significance level. Determine the power of the test for detecting a true mean of
98.4 degrees.
What are the powers of detecting true means of 98.3, 98.2, and 98.1, respectively? We can
use these values to sketch the power curve. (Note that, not surprisingly, the
power increases as the specified value of the mean moves farther from the null
hypothesis.) What two other ways (besides considering an average body
temperature farther into the alternative hypothesis) are there to increase the
power of a test?
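
The calculations in Example 1 can be sketched in Python as follows. This is one possible approach, assuming a lower-tailed z-test (alternative hypothesis: mean less than 98.6) with known standard deviation. The names `lower_tail_power` and `power_curve` are hypothetical, and the standard deviation is left as an argument, `sigma`, to be supplied from the problem statement.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a lower-tailed z-test of H0: mu = mu0 vs Ha: mu < mu0.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    se = sigma / sqrt(n)                     # standard error of the sample mean
    z_alpha = std_normal.inv_cdf(1 - alpha)  # about 1.645 when alpha = 0.05
    x_crit = mu0 - z_alpha * se              # reject H0 when the sample mean < x_crit
    # Power = P(sample mean < x_crit | true mean is mu1)
    return std_normal.cdf((x_crit - mu1) / se)


def power_curve(sigma, true_means=(98.4, 98.3, 98.2, 98.1),
                mu0=98.6, n=40, alpha=0.05):
    """Return {true mean: power} for the alternatives listed in Example 1."""
    return {mu1: lower_tail_power(mu0, mu1, sigma, n, alpha)
            for mu1 in true_means}
```

Calling `power_curve(sigma)` with the doctor's standard deviation returns the four powers needed to sketch the power curve.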
Example 2 (Sample-size determination)
Suppose the average weekly salary for women in managerial and professional positions is
$670. A researcher thinks that men in the same types of positions have average
weekly salaries that are higher than $670. The researcher plans to randomly
sample men in managerial and professional positions and record their weekly
earnings. He plans to test at the 0.01 significance level, and he would like
power 0.85 for detecting a true mean of $685. How large a sample
should he take? (Assume the standard deviation of the salaries for all men in
managerial and professional positions is $100.)
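
A sample-size calculation of this kind can be sketched in Python as below. The sketch assumes an upper-tailed z-test with known standard deviation and uses the standard normal-theory formula $n = \left( (z_\alpha + z_\beta)\,\sigma / (\mu_1 - \mu_0) \right)^2$, rounded up to the next whole number. The function name `sample_size_one_sided` is hypothetical, and this may not be the exact sequence of steps used in class.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_sided(mu0, mu1, sigma, alpha, power):
    """Smallest n giving at least the requested power for a one-sided z-test.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)       # quantile matching the target power
    n_exact = ((z_alpha + z_beta) * sigma / abs(mu1 - mu0)) ** 2
    return ceil(n_exact)                     # round up so the power is at least the target


if __name__ == "__main__":
    # Values from Example 2: H0 mean $670, true mean $685, sigma $100,
    # significance level 0.01, power 0.85.
    print(sample_size_one_sided(mu0=670, mu1=685, sigma=100,
                                alpha=0.01, power=0.85))
```

Rounding up rather than to the nearest integer is deliberate: a smaller sample would fall short of the requested power.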