Math 445—Other Types of Analyses

In ten short weeks it’s difficult to cover the full range of standard statistical analyses (particularly because we went through the theoretical details of many of the procedures). Please realize, though, that the field of statistical analysis is vast. More importantly, you’re now equipped to handle this vastness: based on your current knowledge, you can readily understand the basics of new statistical methods. In each setting, 1) think about how the data were collected, 2) decide what research questions are important, 3) choose an appropriate statistical analysis (there may be multiple possibilities), 4) always look at your data graphically and numerically first (and check any conditions of the analysis), and 5) be able to explain the results (which are typically given to you by a statistical-software package) capably and in layperson’s terms.

 

Two types of analyses are covered in our textbook, but we don’t have time to discuss them in detail:

 

Non-Parametric Tests

Many procedures we’ve discussed have “parametric” conditions (e.g., the population being sampled from is normal). The bootstrap is a non-parametric method, and there are many other non-parametric procedures (entire books are written on this subject).

 

The simplest example of a non-parametric test is called the Sign Test. We’ll do an example of this test (not because it’s used particularly often, but because it’s easy to understand and conveys the general idea of non-parametric tests).

 

Suppose we have pre- and post-weight information of 10 people on a special diet. The weight losses (in pounds) are 8, 5, 3, 5, -2, 25, 10, -3, 6, and 13. The normality condition of the t-test is questionable in this case. We’ll use the sign test:
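The sign test for this example can be sketched in a few lines of Python. This is only an illustration, not necessarily the calculation done in class: it assumes the null hypothesis is that the median weight loss is 0 (so each person is equally likely to gain or lose), drops any zero differences, and reports both a one-sided and a two-sided p-value because the handout does not say which alternative is intended. The function name `sign_test` is made up for this sketch.

```python
from math import comb

# Weight losses (pounds) from the example; a positive value means weight was lost.
losses = [8, 5, 3, 5, -2, 25, 10, -3, 6, 13]


def sign_test(diffs):
    """Sign test of H0: median difference = 0.

    Returns (n_pos, n, p_upper, p_two_sided), where p_upper is the
    one-sided p-value for the alternative "median difference > 0".
    """
    nonzero = [d for d in diffs if d != 0]  # zeros carry no sign, so drop them
    n = len(nonzero)
    n_pos = sum(d > 0 for d in nonzero)

    # Under H0 the number of positive signs is Binomial(n, 1/2).
    p_upper = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2**n
    p_lower = sum(comb(n, k) for k in range(0, n_pos + 1)) / 2**n
    p_two_sided = min(1.0, 2 * min(p_upper, p_lower))
    return n_pos, n, p_upper, p_two_sided


n_pos, n, p_upper, p_two_sided = sign_test(losses)
print(f"{n_pos} of {n} differences are positive")
print(f"one-sided p-value = {p_upper:.4f}, two-sided p-value = {p_two_sided:.4f}")
```

With these data, 8 of the 10 differences are positive, so the one-sided p-value is P(X ≥ 8) = 56/1024, or about 0.055, and the two-sided p-value is about 0.109. Notice that the test uses only the signs of the weight losses, not their sizes, which is why no normality condition is needed.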

 

 

 

 

 

 

 

Important Notes: The drawbacks of non-parametric tests are that they are generally less powerful than their parametric counterparts and that you must often modify the statement of your hypotheses to use them (for example, testing a median rather than a mean). The advantage is that you have a statistical test to use when, for example, the conditions of a t-test aren’t met. Practicing statisticians often use a non-parametric test as a double-check of a parametric procedure. For example, if the normality condition of the t-test is questionable, run the corresponding non-parametric method as well. If the two answers agree, then you can feel better about the t-test results. If they don’t agree, then you should feel especially uneasy about using the t-test.

 

Chi-Square Tests on Categorical Data

There are many interesting research questions involving categorical data. For example, 1) are two variables associated, or 2) do the collected data fit a certain “model” well?

 

In a chi-square test, the test statistic often takes the form $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, where the summation is done over all “cells” in the corresponding two-way table and the “expected” value is the count expected in that cell if the null hypothesis is true. Big values of this test statistic indicate evidence against the null hypothesis. When certain sample size conditions are met, the test statistic has approximately a chi-square distribution, assuming the null hypothesis is true.

 

Example: One possible effect of air pollution is genetic damage. A study designed to examine this problem exposed one group of mice to air near a steel mill and another group to air in a rural area and compared the numbers of mutations in each group. Included below are the data for a mutation at the Hm-2 gene locus:

 

                        Location
Mutation       Steel Mill Air      Rural Air           Total    Marginal Proportion
Yes            30                  23                  53       53/246 = 0.215
No             66                  127                 193      193/246 = 0.785
Total          96                  150                 246
Marg. Prop.    96/246 = 0.390      150/246 = 0.610
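As a rough illustration of how a chi-square statistic could be computed from this table, the Python sketch below finds the expected counts under the null hypothesis of no association between location and mutation (row total × column total / grand total) and then sums (observed − expected)² / expected over the four cells. It assumes the plain statistic with no continuity correction, and it gets the p-value from the fact that a 2×2 table has 1 degree of freedom, where a chi-square variable is the square of a standard normal. A statistical-software package may report slightly different numbers if it applies a correction.

```python
from math import erfc, sqrt

# Observed counts: rows are mutation (Yes, No); columns are (Steel Mill Air, Rural Air).
observed = [[30, 23],
            [66, 127]]

row_totals = [sum(row) for row in observed]        # [53, 193]
col_totals = [sum(col) for col in zip(*observed)]  # [96, 150]
n = sum(row_totals)                                # 246

# Expected count in each cell if mutation and location are not associated.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# With 1 degree of freedom, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, p-value = {p_value:.4f}")
```

For these data the statistic comes out to roughly 8.8 with a p-value of about 0.003, which would be fairly strong evidence that the mutation rate differs between the two locations. All four expected counts are well above 5, so the usual sample size condition looks reasonable here.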