Math 445—Other Types of Analyses
In ten short weeks it’s difficult to cover the full range of standard statistical
analyses (particularly because we went through the theoretical details of many
of the procedures). Please realize, though, that the field of statistical analysis is vast.
More importantly, you’re now equipped to handle this vastness. Based on your
current knowledge, you can easily understand the basics of new statistical
methods. In each setting, 1) think about how the data are collected, 2) identify which
research questions are important, 3) choose an appropriate statistical analysis
(there may be multiple possibilities), 4) always look at your data graphically and
numerically first (and check any conditions of the analysis), and 5) be able to
explain the results (which are typically given to you by a statistical-software
package) capably, in layperson’s terms.
Two types of analyses are covered in our textbook, but we don’t have
time to discuss them in detail:
Non-Parametric Tests
Many
procedures we’ve discussed have “parametric” conditions (e.g., the population
being sampled from is normal). The bootstrap is a non-parametric method, and
there are many other non-parametric procedures (entire books are written on
this subject).
The
simplest example of a non-parametric test is called the Sign Test. We’ll do an
example of this test (not because it’s used particularly often, but because
it’s easy to understand and conveys the general idea of non-parametric tests).
Suppose
we have pre- and post-weight information of 10 people on a special diet. The
weight losses (in pounds) are 8, 5, 3, 5, -2, 25, 10, -3, 6, and 13. The
normality condition of the t-test is questionable in this case. We’ll use the
sign test:
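The sign test for these data can be sketched in a few lines of Python. This is a minimal illustration, not necessarily the version worked in class: it assumes the hypotheses are H0: median weight loss = 0 versus Ha: median weight loss > 0, and it drops any zero differences (there are none here). The function name `sign_test_upper_p` is made up for this sketch. Under H0, each nonzero difference is equally likely to be positive or negative, so the number of positive differences follows a Binomial(n, 0.5) distribution.

```python
from math import comb

# Weight losses (pounds) from the example above; positive = weight lost.
weight_losses = [8, 5, 3, 5, -2, 25, 10, -3, 6, 13]

def sign_test_upper_p(diffs):
    """One-sided sign test (assumed Ha: median > 0).

    Returns the number of positive differences, the number of nonzero
    differences n, and P(X >= positives) for X ~ Binomial(n, 0.5).
    """
    nonzero = [d for d in diffs if d != 0]  # ties (zeros) carry no sign
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    # Upper-tail binomial probability with success probability 0.5
    p_value = sum(comb(n, k) for k in range(positives, n + 1)) / 2 ** n
    return positives, n, p_value

positives, n, p_value = sign_test_upper_p(weight_losses)
print(f"{positives} of {n} differences are positive; one-sided p-value = {p_value:.4f}")
# prints: 8 of 10 differences are positive; one-sided p-value = 0.0547
```

Note that the test uses only the signs of the differences, not their sizes, which is why no normality condition is needed and also why it tends to be less powerful than a t-test.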
Important Notes: The drawbacks of non-parametric tests are
that they are generally less powerful and that you must often modify the statement
of your hypotheses to use them. The advantage is that you have a statistical test to use when, for
example, the conditions of a t-test aren’t met. Often, practicing statisticians
use a non-parametric test as a double-check of a parametric procedure. For
example, if the normality condition of the t-test is questionable, run the
corresponding non-parametric method as well. If the two answers agree, then you can
feel better about the t-test results. If they don’t agree, then you should feel
especially uneasy using the t-test.
Chi-Square Tests on Categorical Data
There are
many interesting research questions involving categorical data. For example, 1) are two variables associated, or 2) do the collected data fit a certain
“model” well?
In a Chi-square test, the test statistic often takes the form
$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, where the summation
is done over all “cells” in the corresponding two-way table and the “expected
value” is what is expected if the null hypothesis is true. Large values of this
test statistic indicate evidence against the null hypothesis. The test
statistic often has a chi-squared distribution assuming the null hypothesis is
true (when certain sample size conditions are met).
Example: One possible effect of
air pollution is genetic damage. A study designed to examine this problem
exposed one group of mice to air near a steel mill and another group to air in
a rural area and compared the numbers of mutations in each group. Included
below are the data for a mutation at the Hm-2 gene locus:
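Columns "Steel Mill Air" and "Rural Air" are the two levels of Location.

| Mutation    | Steel Mill Air | Rural Air       | Total | Marginal Proportion |
|-------------|----------------|-----------------|-------|---------------------|
| Yes         | 30             | 23              | 53    | 53/246 = 0.215      |
| No          | 66             | 127             | 193   | 193/246 = 0.785     |
| Total       | 96             | 150             | 246   |                     |
| Marg. Prop. | 96/246 = 0.390 | 150/246 = 0.610 |       |                     |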
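As a rough sketch of how the test statistic described above could be computed for this table, the Python below finds each cell’s expected count as (row total × column total) / grand total, which is what we expect if mutation and location are not associated. Several things here are assumptions of the sketch rather than part of the handout: the function name `chi_square_2x2`, the choice of 1 degree of freedom (the usual value for a 2×2 table), and the omission of a continuity correction. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal.

```python
from math import erfc, sqrt

# Observed counts from the table: rows = mutation (Yes, No),
# columns = location (Steel Mill Air, Rural Air).
observed = [[30, 23],
            [66, 127]]

def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables are not associated
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.4f}")
```

For these data the statistic comes out at about 8.8 with a p-value of about 0.003. The smallest expected count is about 20.7, so the usual sample size condition (all expected counts at least 5) appears to be met.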