Math 445—Two-Factor Analysis of Variance (ANOVA)

 

Recall our model-based expression of the one-factor ANOVA set-up:

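$X_{ij} = \mu_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables (here $i = 1, \ldots, I$ indexes the treatment and $j$ indexes the observation within that treatment)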

 

Now we’ll re-parameterize this model. Let $\mu = \frac{1}{I}\sum_{i=1}^{I} \mu_i$ (this is the average overall response), and let $\alpha_i = \mu_i - \mu$ (measures the effect of the $i$th treatment, as a departure from average overall response, $\mu$). Note that by the definition of the average overall response, $\sum_{i=1}^{I} \alpha_i = 0$.

 

Then $\mu_i = \mu + \alpha_i$, so our re-parameterized model is $X_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables. [The first model expression has I parameters: the I $\mu_i$ values. Yet the new model statement has (I+1) parameters: one $\mu$ and I $\alpha_i$ values. But $\sum_{i=1}^{I} \alpha_i = 0$, so only (I-1) of these $\alpha_i$ values are independently determined. Hence, the second model still has only I parameters that are independently determined.]
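As a small numerical illustration of the re-parameterization, the sketch below starts from treatment means and recovers the overall mean and the treatment effects. The three mean values are made up for illustration; they do not come from any data set in these notes.

```python
# Hypothetical treatment means mu_i for I = 3 treatments (illustrative values only).
mu_i = [10.0, 12.0, 17.0]

I = len(mu_i)
mu = sum(mu_i) / I                  # average overall response
alpha = [m - mu for m in mu_i]      # treatment effects: alpha_i = mu_i - mu

print(mu)           # overall mean
print(alpha)        # departures from the overall mean
print(sum(alpha))   # the effects sum to zero by construction
```

Because the effects must sum to zero, knowing any two of the three effects here determines the third, which is the parameter-counting point made above.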

 

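Based on this new parameterization (which is equivalent to the first parameterization), $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ becomes $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I$, or, more simply, $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$, since if all the $\alpha_i$ values are the same, they must be zero (recall $\sum_{i=1}^{I} \alpha_i = 0$). What this null hypothesis says is that there is only an overall average effect, but no treatment effect.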

 

This type of model parameterization is especially helpful when thinking about a two-factor ANOVA. In a two-factor experiment, there is one response variable (the variable of interest), but now two factor/explanatory variables. (For example, the response variable might be decrease in blood-pressure and the two factors might be drug and diet, both of which might have multiple levels—e.g., drug/placebo, DietA/DietB.) Experimental units are randomly assigned to all treatments (to all combinations of the different factor levels). This is the simplest kind of two-factor experimental design (there are actually many types of experimental design).

 

Two-factor/Two-way Analysis-Of-Variance (ANOVA) Model (No interaction, no replications)

Let’s first consider a very simplistic two-factor model, where there is only one observation for each combination of factor levels (no replications) and there is no interaction between the factors. (Note: In practice, this is not at all realistic, since we always have replications and we typically want to check for an interaction. This is simply an easy way to discuss the general two-factor idea before moving to more complicated modeling.) The two-factor model is

 

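$X_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables, and $\sum_{i=1}^{I} \alpha_i = 0$ and $\sum_{j=1}^{J} \beta_j = 0$. (I is the number of levels of Factor A, and J is the number of levels of Factor B.)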

 

Note that $\mu$ is the overall mean response, the $\alpha_i$ values indicate the effect of Factor A (measured as a deviation from $\mu$), and the $\beta_j$ values indicate the effect of Factor B (measured as a deviation from $\mu$).

 

Estimation

How can we estimate these model parameters from our data? We can simply use appropriate sample averages:

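$\hat{\mu} = \bar{X}_{\cdot\cdot}$, $\hat{\alpha}_i = \bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}$, and $\hat{\beta}_j = \bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot}$, where $\bar{X}_{\cdot\cdot}$ is the average of all IJ observations, $\bar{X}_{i\cdot}$ is the average of the observations at level i of Factor A, and $\bar{X}_{\cdot j}$ is the average of the observations at level j of Factor B. Then the predicted response variable (based on our model) is $\hat{X}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j = \bar{X}_{i\cdot} + \bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot}$. Based on this predicted value, the residual for a specific observation is $e_{ij} = X_{ij} - \hat{X}_{ij}$. Recall we can use the residuals all together to check conditions (normality and constant-variance) of our model, since the residuals are our estimates of the model errors.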
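The estimation step can be sketched in a few lines of plain Python. The 3 x 2 data table below is hypothetical (rows are levels of Factor A, columns are levels of Factor B); it is only meant to show how the sample averages turn into estimates, fitted values, and residuals.

```python
# Hypothetical 3 x 2 table: one observation per combination of factor levels.
x = [[4.0, 6.0],
     [5.0, 9.0],
     [9.0, 9.0]]
I, J = len(x), len(x[0])

grand = sum(sum(row) for row in x) / (I * J)                          # overall average
row_means = [sum(row) / J for row in x]                               # Factor A level averages
col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]    # Factor B level averages

mu_hat = grand
alpha_hat = [m - grand for m in row_means]   # estimated Factor A effects
beta_hat = [m - grand for m in col_means]    # estimated Factor B effects

# Fitted value = overall mean + row effect + column effect; residual = observed - fitted.
fitted = [[mu_hat + alpha_hat[i] + beta_hat[j] for j in range(J)] for i in range(I)]
resid = [[x[i][j] - fitted[i][j] for j in range(J)] for i in range(I)]

print(alpha_hat, beta_hat)
print(resid)
```

In this sketch the residuals sum to zero across every row and every column, which is a consequence of estimating the effects with sample averages.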

 

Hypotheses

Now we have two sets of hypotheses (for Factor A and Factor B):

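$H_{0A}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$ against ($H_{aA}$: at least one of the $\alpha_i$ is different from 0)

$H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_J = 0$ against ($H_{aB}$: at least one of the $\beta_j$ is different from 0)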

 


 

ANOVA Table

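Source      df                Sum of Squares   Mean Square                   f               P-value
Factor A    I – 1             SSA              MSA = SSA/(I – 1)             fA = MSA/MSE    P(F ≥ fA)
Factor B    J – 1             SSB              MSB = SSB/(J – 1)             fB = MSB/MSE    P(F ≥ fB)
Error       (I – 1)(J – 1)    SSE              MSE = SSE/[(I – 1)(J – 1)]
Total       IJ – 1            SST

(In each P-value, F has an F distribution with the factor's df in the numerator and the error df in the denominator.)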

 

(The distributional results are grungier in this two-factor model, but the general idea is exactly the same as for the one-factor model—that’s why we went through the one-factor derivations in such detail. Apply the general concepts from the one-factor model to this two-factor model.)
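To make the table concrete, here is a minimal sketch of the sums of squares and f statistics for the additive model with one observation per cell. The formulas used are the standard ones for this design; the function name and the data are hypothetical, and P-values are left out because they need an F-distribution routine.

```python
def additive_anova(x):
    """Sums of squares and f statistics for an I x J table with one
    observation per cell (additive model, no interaction)."""
    I, J = len(x), len(x[0])
    grand = sum(map(sum, x)) / (I * J)
    row_means = [sum(row) / J for row in x]
    col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]

    ssa = J * sum((m - grand) ** 2 for m in row_means)    # Factor A
    ssb = I * sum((m - grand) ** 2 for m in col_means)    # Factor B
    sse = sum((x[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(I) for j in range(J))        # squared residuals
    sst = sum((x[i][j] - grand) ** 2 for i in range(I) for j in range(J))

    msa, msb = ssa / (I - 1), ssb / (J - 1)
    mse = sse / ((I - 1) * (J - 1))
    return {"SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst,
            "fA": msa / mse, "fB": msb / mse}

# Hypothetical 3 x 2 table (rows = Factor A levels, columns = Factor B levels).
table = additive_anova([[4.0, 6.0], [5.0, 9.0], [9.0, 9.0]])
print(table)
```

As in the one-factor case, the pieces add up: SSA + SSB + SSE equals SST.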

 

Conclusions

If statistically significant results are found in either factor, then Tukey’s method (which adjusts for multiple comparisons) can be applied to see exactly where the significant differences are. Then you should provide a conclusion in the context/words of the problem (and investigate practical significance, if there is indeed statistical significance).

 

Important Notes (for any two-factor analysis)

·         The normality condition can be checked by looking at graphs (histogram and normal-probability plot) of the residuals

·         In a two-factor experiment, the treatment groups are often small, so it’s more difficult to check the equal-variances condition using the sample standard deviations of the different groups. Instead, we can use the residuals (since the condition of equal-variances in our model is actually on the error terms). A standard residual plot is residuals (y-axis) versus predicted/fitted values (x-axis). Within this plot, look for relatively non-changing variation in the residuals. If the equal-variance condition appears to be violated and if there is a systematic change in variance (e.g., increasing or decreasing variation), then a transformation (e.g., log or square root) can be done on the response variable, and the ANOVA analysis rerun (this sometimes, but not always, works). If the equal-variances condition is violated and a transformation does not help, then ANOVA should not be used.

 

·         A block design is often employed when certain experimental units share something in common that affects the response variable. In a block design, units are first separated into blocks, and then random assignment of treatments occurs within each block. Then the blocking variable is considered one of the two factors in the ANOVA analysis.
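The within-block randomization described in the last note can be sketched as follows. This assumes the simplest case, in which each block contains exactly one unit per treatment; the function name, block labels, and unit labels are all hypothetical.

```python
import random

def assign_within_blocks(blocks, treatments, seed=0):
    """Randomly assign each treatment once within every block.

    blocks: dict mapping a block label to a list of unit labels; each block
    is assumed to hold exactly len(treatments) units.
    """
    rng = random.Random(seed)          # seeded so the plan is reproducible
    assignment = {}
    for label, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)          # random order of treatments in this block
        for unit, trt in zip(units, shuffled):
            assignment[unit] = (label, trt)
    return assignment

blocks = {"block1": ["u1", "u2", "u3"], "block2": ["u4", "u5", "u6"]}
plan = assign_within_blocks(blocks, ["T1", "T2", "T3"])
print(plan)
```

In the resulting analysis the block label plays the role of one factor and the treatment plays the role of the other.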

 

The last model was simplistic because it assumed only one observation for each combination of factor levels. It’s good practice to replicate the experiment on many units within each combination of factor levels. Furthermore, the previous model was additive. Sometimes there is additionally an interaction between factors. That is, the effects of one factor change for different levels of another factor. This can be described visually via an interaction plot, which shows the mean response at each level of one factor, with a separate line for each level of the other factor. Roughly parallel lines suggest no interaction; clearly non-parallel lines suggest an interaction.
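For a 2 x 2 layout, the idea that "the effect of one factor changes across levels of the other" can be checked numerically with a difference of differences. The sketch below uses two made-up tables of cell means, one additive and one with an interaction; the function name is hypothetical.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of cell means.

    Rows are levels of Factor A, columns are levels of Factor B. A value of
    zero means the Factor B effect is the same at both levels of Factor A.
    """
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    return (a2b2 - a2b1) - (a1b2 - a1b1)

additive = [[10.0, 14.0], [13.0, 17.0]]      # B effect is +4 at both levels of A
interacting = [[10.0, 14.0], [13.0, 11.0]]   # B effect is +4 at A1 but -2 at A2

print(interaction_contrast(additive))
print(interaction_contrast(interacting))
```

With real data the cell means are estimates, so a nonzero contrast alone does not establish an interaction; that is what the F-test for interaction is for.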

 

Two-factor ANOVA Model With Interaction

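$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$, where the $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$ random variables, and $\sum_{i=1}^{I} \alpha_i = 0$, $\sum_{j=1}^{J} \beta_j = 0$, $\sum_{i=1}^{I} \gamma_{ij} = 0$ for each j, and $\sum_{j=1}^{J} \gamma_{ij} = 0$ for each i. (I is the number of levels of Factor A, J is the number of levels of Factor B, and K is the number of replications within each treatment. Also the $\gamma_{ij}$ values indicate the interaction effect. Note: Here we assume an equal number of replications within each treatment, but realize Minitab can easily handle the situation when we do not have equal numbers of replications.)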

 

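The null hypotheses are

$H_{0A}: \alpha_1 = \cdots = \alpha_I = 0$ (no Factor A effect), $H_{0B}: \beta_1 = \cdots = \beta_J = 0$ (no Factor B effect), and $H_{0AB}: \gamma_{ij} = 0$ for all i and j (no interaction effect).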

The ANOVA Table is

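Source        df                Sum of Squares   Mean Square                    f                 P-value
Factor A      I – 1             SSA              MSA = SSA/(I – 1)              fA = MSA/MSE      P(F ≥ fA)
Factor B      J – 1             SSB              MSB = SSB/(J – 1)              fB = MSB/MSE      P(F ≥ fB)
Interaction   (I – 1)(J – 1)    SSAB             MSAB = SSAB/[(I – 1)(J – 1)]   fAB = MSAB/MSE    P(F ≥ fAB)
Error         IJ(K – 1)         SSE              MSE = SSE/[IJ(K – 1)]
Total         IJK – 1           SST

(In each P-value, F has an F distribution with that row's df in the numerator and the error df in the denominator.)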

 

(The sums-of-squares and distributional results are grungier in this two-factor model, but the general idea is exactly the same as for the one-factor model; that’s why we went through the one-factor derivations in such detail. Apply the general concepts from the one-factor model to this two-factor model. Also, notice that if we do not have replications (K = 1), then the error degrees of freedom, IJ(K – 1), is 0; that is, the interaction model can only be tested when there are replications in the experiment.)
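A minimal sketch of the calculations behind this table, for the balanced case with K of at least 2 replications per cell, is given below. It uses the standard balanced-design sums of squares; the function name and the 2 x 2 data with K = 2 are hypothetical, and P-values are omitted because they need an F-distribution routine.

```python
def two_factor_anova(x):
    """x[i][j] is the list of K replicate responses at level i of Factor A
    and level j of Factor B (balanced design, K >= 2 assumed)."""
    I, J, K = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[i][j]) / K for j in range(J)] for i in range(I)]   # cell means
    row = [sum(cell[i]) / J for i in range(I)]                        # Factor A level means
    col = [sum(cell[i][j] for i in range(I)) / I for j in range(J)]   # Factor B level means
    grand = sum(row) / I

    ss = {
        "A": J * K * sum((r - grand) ** 2 for r in row),
        "B": I * K * sum((c - grand) ** 2 for c in col),
        "AB": K * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                      for i in range(I) for j in range(J)),
        "Error": sum((v - cell[i][j]) ** 2
                     for i in range(I) for j in range(J) for v in x[i][j]),
    }
    df = {"A": I - 1, "B": J - 1, "AB": (I - 1) * (J - 1), "Error": I * J * (K - 1)}
    ms = {k: ss[k] / df[k] for k in ss}           # fails if K = 1 (error df is 0)
    f = {k: ms[k] / ms["Error"] for k in ("A", "B", "AB")}
    return ss, df, f

# Hypothetical data: 2 levels of A, 2 levels of B, 2 replications per cell.
data = [[[1.0, 3.0], [5.0, 7.0]],
        [[4.0, 6.0], [6.0, 8.0]]]
ss, df, f = two_factor_anova(data)
print(ss, df, f)
```

With K = 1 the error degrees of freedom are zero and the mean-square step divides by zero, which mirrors the remark above that interaction cannot be tested without replications.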

 

Conclusions

Per usual, the residuals can be used to assess the conditions of normality and constant-variance.

 

If the interaction effect is statistically significant, then the interaction is often the interesting story—provide an interpretation of the results based on an interaction plot. (Additionally, if the main effects are statistically significant, they can be interpreted and discussed.)

 

If the interaction effect is not statistically significant, then the main effects can be tested and interpreted. If there is a significant main effect, then Tukey’s method can be used to see exactly where the differences are.

 

 

Two-Factor ANOVA Example

An experiment is conducted to gauge the effect of both temperature setting and detergent type on the cleanliness of soiled t-shirts put through a washing cycle. Eighteen identically soiled t-shirts are randomly assigned to a treatment (that is, to a combination of temperature setting and detergent). Three wash-cycle temperatures are used: cold, warm, and hot. Two different detergents are considered: Detergent A and Detergent B. The response variable is a cleanliness rating (on a 1–10 scale).

 

Note there are a total of 6 treatments (3 temperatures times 2 detergents), and since there are 18 t-shirts, we have 3 replications within each treatment. Below is the output from Minitab’s general linear model procedure (which is equivalent to a two-factor ANOVA analysis). [In lab this week, we’ll discuss how to perform this analysis using Minitab.]

 


 

General Linear Model: Cleanliness Score versus Temperature, Detergent

Factor       Type   Levels  Values

Temperature  fixed       3  Cold, Hot, Warm

Detergent    fixed       2  A, B

 

Analysis of Variance for Cleanliness Score, using Adjusted SS for Tests

 

Source                 DF   Seq SS   Adj SS   Adj MS      F      P

Temperature             2  22.3333  22.3333  11.1667  14.36  0.001

Detergent               1  20.0556  20.0556  20.0556  25.79  0.000

Temperature*Detergent   2   0.7778   0.7778   0.3889   0.50  0.619

Error                  12   9.3333   9.3333   0.7778

Total                  17  52.5000

 

[Note: You can ignore the Seq SS column. The Adj SS column is simply the usual sum of squares we’ve discussed; it just has a different name within the general linear model procedure.]
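As a quick check on how the columns of the output above fit together, the sketch below recomputes the F statistics from the printed sums of squares and degrees of freedom (mean square = SS/df, F = mean square/MSE). The numbers are copied from the Minitab table; only the variable names are new.

```python
# Adjusted sums of squares and degrees of freedom copied from the Minitab output.
rows = {
    "Temperature": (22.3333, 2),
    "Detergent": (20.0556, 1),
    "Temperature*Detergent": (0.7778, 2),
}
sse, df_error = 9.3333, 12

mse = sse / df_error                                   # error mean square
f_stats = {name: round((ss / df) / mse, 2)             # F = MS / MSE
           for name, (ss, df) in rows.items()}
total_ss = round(sum(ss for ss, _ in rows.values()) + sse, 4)

print(f_stats)     # should agree with the F column of the output
print(total_ss)    # should agree with the Total row
```

The degrees of freedom follow the general table as well: with I = 3, J = 2, and K = 3, the error df is IJ(K – 1) = 12 and the total df is IJK – 1 = 17.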

 

Before analyzing the results, we must check the normality and constant-variance conditions. Appropriate graphs of the residuals are shown below. The normality assumption is a bit shaky, but plausible. The residual plot shows a fairly constant variation in the residuals. Hence, both conditions seem to be plausible for these data.

 

The F-test for an interaction effect has a large P-value (0.619), so there is no evidence of an interaction effect. This can be seen visually via an interaction plot (also shown above).

 

Both main effects are significant. Hence, it’s appropriate to use Tukey’s procedure to get 95% simultaneous confidence intervals for the pairwise differences in means. Via Minitab:

 

Tukey 95.0% Simultaneous Confidence Intervals

Temperature = Cold subtracted from:

 

Temperature    Lower  Center  Upper  ------+---------+---------+---------+

Hot           1.3093  2.6667  4.024                         (-----*------)

Warm         -0.5240  0.8333  2.191               (------*------)

                                     ------+---------+---------+---------+

                                        -2.0       0.0       2.0       4.0

Temperature = Hot subtracted from:

 

Temperature   Lower  Center    Upper  ------+---------+---------+---------+

Warm         -3.191  -1.833  -0.4760  (------*------)

                                      ------+---------+---------+---------+

                                         -2.0       0.0       2.0       4.0

 

Tukey 95.0% Simultaneous Confidence Intervals

Detergent = A subtracted from:

 

Detergent  Lower  Center  Upper  ------+---------+---------+---------+

B          1.205   2.111  3.017  (-----------------*-----------------)

                                 ------+---------+---------+---------+

                                     1.50      2.00      2.50      3.00
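One way to read the Tukey output above is to check whether each simultaneous interval excludes zero. The sketch below does that for the printed intervals; the interval endpoints are copied from the output, while the labels and the function name are illustrative.

```python
# (difference, lower, upper) copied from the Tukey output. "Hot - Cold" means
# the Hot mean minus the Cold mean ("Cold subtracted from" Hot), and so on.
intervals = [
    ("Hot - Cold", 1.3093, 4.024),
    ("Warm - Cold", -0.5240, 2.191),
    ("Warm - Hot", -3.191, -0.4760),
    ("B - A", 1.205, 3.017),
]

def verdict(lower, upper):
    """Classify a confidence interval for a difference in means."""
    if lower > 0:
        return "significantly positive"
    if upper < 0:
        return "significantly negative"
    return "not significant (interval contains 0)"

results = {name: verdict(lo, hi) for name, lo, hi in intervals}
for name, result in results.items():
    print(name, "->", result)
```

An interval entirely above zero means the first-named level has the higher mean; an interval entirely below zero means the second-named level does.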

 

Detergent B’s cleanliness rating is significantly higher, on average, than Detergent A’s. Furthermore, the cleanliness rating for the hot-water wash is significantly higher, on average, than for both the cold temperature and the warm temperature; the warm and cold temperatures do not differ significantly, since that interval contains 0. Do you think these results are of practical importance?