Math 445—Two-Factor Analysis of Variance
(ANOVA)
Recall our model-based expression of the one-factor ANOVA set-up: $Y_{ij} = \mu_i + \epsilon_{ij}$, where the $\epsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables.
Now we’ll re-parameterize this model. Let $\mu = \frac{1}{I}\sum_{i=1}^{I}\mu_i$ (this is the average overall response), and let $\alpha_i = \mu_i - \mu$ (this measures the effect of the $i$th treatment as a departure from the average overall response, $\mu$). Note that by the definition of the average overall response, $\sum_{i=1}^{I}\alpha_i = 0$. Then $\mu_i = \mu + \alpha_i$, so our re-parameterized model is $Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$, where the $\epsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables. [The first model expression has I parameters: I $\mu_i$ values. Yet the new model statement has (I+1) parameters: 1 $\mu$ and I $\alpha_i$ values. But $\sum_{i=1}^{I}\alpha_i = 0$, so only (I-1) of these $\alpha_i$ values are independently determined. Hence, the second model still has only I parameters that are independently determined.]
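As a small numerical illustration of the re-parameterization, the sketch below starts from a set of treatment means and recovers the overall mean and the treatment effects. The three means are made-up values chosen only for illustration, and the variable names are hypothetical.

```python
# Hypothetical treatment means mu_1, ..., mu_I (illustrative values only).
treatment_means = [10.0, 12.0, 17.0]
I = len(treatment_means)

# Overall average response: the plain average of the treatment means.
mu = sum(treatment_means) / I

# Treatment effects: each mean's departure from the overall average.
alpha = [m - mu for m in treatment_means]

# The effects sum to zero by construction, so only I - 1 of them are free.
effects_sum = sum(alpha)

# Adding mu back to each effect recovers the original treatment means.
recovered = [mu + a for a in alpha]

print("mu =", mu)
print("alpha =", alpha)
print("sum of effects =", effects_sum)
```

Both parameterizations carry the same information: the I means on one side, and one overall mean plus I effects tied together by the sum-to-zero constraint on the other.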
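Based on this new parameterization (which is equivalent to the first parameterization), $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ becomes $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I$, or, more simply, $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$, since if all the $\alpha_i$ values are the same, they must be zero (recall $\sum_{i=1}^{I}\alpha_i = 0$). What this null hypothesis says is that there is only an overall average effect, but no treatment effect.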
This type
of model parameterization is especially helpful when thinking about a
two-factor ANOVA. In a two-factor
experiment, there is one response variable (the variable of interest), but
now two factor/explanatory variables.
(For example, the response variable might be decrease in blood-pressure and the
two factors might be drug and diet, both of which might have multiple levels—e.g.,
drug/placebo, DietA/DietB.)
Experimental units are randomly assigned to all treatments (to all combinations
of the different factor levels). This is the simplest kind of two-factor
experimental design (there are actually many types of experimental design).
Two-factor/Two-way Analysis-Of-Variance
(ANOVA) Model (No interaction, no replications)
Let’s first consider a very simple two-factor model, where there is only one observation for each combination of factor levels (no replications) and there is no interaction between the factors. (Note: In practice, this is not at all realistic. We always have replications and we typically want to check for an interaction. This is simply an easy way to discuss the general two-factor idea before moving to more complicated modeling.) The two-factor model is $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where the $\epsilon_{ij}$ are independent $N(0, \sigma^2)$ random variables, and $\sum_{i=1}^{I}\alpha_i = 0$ and $\sum_{j=1}^{J}\beta_j = 0$. (I is the number of levels of Factor A, and J is the number of levels of Factor B.)
Note that $\mu$ is the overall mean response, the $\alpha_i$ values indicate the effect of Factor A (measured as a deviation from $\mu$), and the $\beta_j$ values indicate the effect of Factor B (measured as a deviation from $\mu$).
Estimation
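How can we estimate these model parameters from our data? We can simply use appropriate sample averages: $\hat{\mu} = \bar{Y}_{\cdot\cdot}$, $\hat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}$, and $\hat{\beta}_j = \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}$, where $\bar{Y}_{\cdot\cdot}$ is the average of all observations, $\bar{Y}_{i\cdot}$ is the average at level $i$ of Factor A, and $\bar{Y}_{\cdot j}$ is the average at level $j$ of Factor B. Then the predicted response variable (based on our model) is $\hat{Y}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j = \bar{Y}_{i\cdot} + \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}$. Based on this predicted value, the residual for a specific observation is $e_{ij} = Y_{ij} - \hat{Y}_{ij}$. Recall we can use the residuals all together to check conditions (normality and constant-variance) of our model, since the residuals are our estimates of the model errors.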
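The estimation steps can be sketched in a few lines of Python. The 2-by-3 data table below is hypothetical (rows are levels of Factor A, columns are levels of Factor B, one observation per cell), and all variable names are illustrative assumptions rather than anything from Minitab.

```python
# Hypothetical responses: rows = levels of Factor A, columns = levels of Factor B.
data = [
    [8.0, 10.0, 12.0],
    [11.0, 12.0, 16.0],
]
I, J = len(data), len(data[0])

# Sample averages: overall, per level of Factor A, per level of Factor B.
grand_mean = sum(sum(row) for row in data) / (I * J)
row_means = [sum(row) / J for row in data]
col_means = [sum(data[i][j] for i in range(I)) / I for j in range(J)]

# Estimated effects are deviations of the level averages from the overall average.
alpha_hat = [m - grand_mean for m in row_means]
beta_hat = [m - grand_mean for m in col_means]

# Fitted value = overall mean + Factor A effect + Factor B effect.
fitted = [[grand_mean + alpha_hat[i] + beta_hat[j] for j in range(J)]
          for i in range(I)]

# Residual = observed response minus fitted value.
residuals = [[data[i][j] - fitted[i][j] for j in range(J)] for i in range(I)]

print("grand mean:", grand_mean)
print("alpha_hat:", alpha_hat)
print("beta_hat:", beta_hat)
print("residuals:", residuals)
```

In this additive fit the residuals sum to zero within every row and every column, which is one quick sanity check on the arithmetic.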
Hypotheses
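Now we have two sets of hypotheses (for Factor A and Factor B):
$H_{0A}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$ against ($H_{aA}$: at least one of the $\alpha_i$ is different from 0)
$H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_J = 0$ against ($H_{aB}$: at least one of the $\beta_j$ is different from 0)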
ANOVA Table
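| Source | df | Sum of Squares | Mean Square | f | P-value |
| Factor A | I – 1 | SSA | MSA = SSA/(I – 1) | $f_A$ = MSA/MSE | $P(F \ge f_A)$ |
| Factor B | J – 1 | SSB | MSB = SSB/(J – 1) | $f_B$ = MSB/MSE | $P(F \ge f_B)$ |
| Error | (I – 1)(J – 1) | SSE | MSE = SSE/[(I – 1)(J – 1)] | | |
| Total | IJ – 1 | SST | | | |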
(The distributional results are messier in this two-factor model, but the general idea is exactly the same as for the one-factor model, which is why we went through the one-factor derivations in such detail. Apply the general concepts from the one-factor model to this two-factor model.)
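To make the table entries concrete, here is a minimal sketch that fills in the sums of squares, mean squares, and f statistics for the additive model. It uses a hypothetical 2-by-3 data table with one observation per cell; the numbers and variable names are illustrative assumptions only.

```python
# Hypothetical responses: rows = levels of Factor A, columns = levels of Factor B.
data = [
    [8.0, 10.0, 12.0],
    [11.0, 12.0, 16.0],
]
I, J = len(data), len(data[0])

grand_mean = sum(sum(row) for row in data) / (I * J)
row_means = [sum(row) / J for row in data]
col_means = [sum(data[i][j] for i in range(I)) / I for j in range(J)]

# Sums of squares for each source of variation.
ssa = J * sum((m - grand_mean) ** 2 for m in row_means)
ssb = I * sum((m - grand_mean) ** 2 for m in col_means)
sst = sum((data[i][j] - grand_mean) ** 2 for i in range(I) for j in range(J))
sse = sum((data[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
          for i in range(I) for j in range(J))

# Degrees of freedom, mean squares, and f statistics.
df_a, df_b, df_error = I - 1, J - 1, (I - 1) * (J - 1)
msa, msb, mse = ssa / df_a, ssb / df_b, sse / df_error
f_a, f_b = msa / mse, msb / mse

print("SSA, SSB, SSE, SST:", ssa, ssb, sse, sst)
print("f_A, f_B:", f_a, f_b)
```

The sums of squares add up (SSA + SSB + SSE = SST), just as in the one-factor case. Each P-value is the upper-tail area of the appropriate F distribution beyond the f statistic; if SciPy is available, `scipy.stats.f.sf(f_a, df_a, df_error)` is one way to compute it.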
Conclusions
If
statistically significant results are found in either factor, then Tukey’s method (which adjusts for multiple comparisons) can
be applied to see exactly where the significant differences are. Then you
should provide a conclusion in the context/words of the problem (and
investigate practical significance, if there is indeed statistical
significance).
Important Notes (for any two-factor analysis)
· The normality condition can be checked by looking at graphs (histogram and normal-probability plot) of the residuals.
· In a two-factor experiment, the treatment groups are often small, so it’s more difficult to check the equal-variances condition using the sample standard deviations of the different groups. Instead, we can use the residuals (since the equal-variances condition in our model is actually on the error terms). A standard residual plot is residuals (y-axis) versus predicted/fitted values (x-axis). Within this plot, look for roughly constant variation in the residuals. If the equal-variances condition appears to be violated and there is a systematic change in variance (e.g., increasing or decreasing variation), then a transformation (e.g., log or square root) can be applied to the response variable and the ANOVA rerun (this sometimes, but not always, works). If the equal-variances condition is violated and a transformation does not help, then ANOVA should not be used.
· A block design is often employed when certain experimental units share something in common that affects the response variable. In a block design, units are first separated into blocks, and then random assignment of treatments occurs within each block. The blocking variable is then treated as one of the two factors in the ANOVA.
The
last model was simplistic because it assumed only one observation for each
combination of factor levels. It’s good
practice to replicate the experiment on many units within each combination of
factor levels. Furthermore, the previous model was additive. Sometimes
there is additionally an interaction
between factors. That is, the effects of one factor change for different
levels of another factor. This can be described visually via an interaction
plot:

Two-factor ANOVA Model With Interaction
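The model is $Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$, where the $\epsilon_{ijk}$ are independent $N(0, \sigma^2)$ random variables, and $\sum_{i=1}^{I}\alpha_i = 0$, $\sum_{j=1}^{J}\beta_j = 0$, $\sum_{i=1}^{I}\gamma_{ij} = 0$ for each $j$, and $\sum_{j=1}^{J}\gamma_{ij} = 0$ for each $i$. (I is the number of levels of Factor A, J is the number of levels of Factor B, and K is the number of replications within each treatment. Also, the $\gamma_{ij}$ values indicate the interaction effect. Note: Here we assume an equal number of replications within each treatment, but Minitab can easily handle the situation when we do not have equal numbers of replications.)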
The null hypotheses are
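$H_{0A}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$
$H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_J = 0$
$H_{0AB}: \gamma_{ij} = 0$ for all $i, j$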
The ANOVA Table is
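| Source | df | Sum of Squares | Mean Square | f | P-value |
| Factor A | I – 1 | SSA | MSA = SSA/(I – 1) | $f_A$ = MSA/MSE | $P(F \ge f_A)$ |
| Factor B | J – 1 | SSB | MSB = SSB/(J – 1) | $f_B$ = MSB/MSE | $P(F \ge f_B)$ |
| Interaction | (I – 1)(J – 1) | SSAB | MSAB = SSAB/[(I – 1)(J – 1)] | $f_{AB}$ = MSAB/MSE | $P(F \ge f_{AB})$ |
| Error | IJ(K – 1) | SSE | MSE = SSE/[IJ(K – 1)] | | |
| Total | IJK – 1 | SST | | | |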
(The sums-of-squares and distributional results are messier in this two-factor model, but the general idea is exactly the same as for the one-factor model, which is why we went through the one-factor derivations in such detail. Apply the general concepts from the one-factor model to this two-factor model. Also, notice that if we do not have replications (K = 1), then the error degrees of freedom, IJ(K – 1), is 0; that is, the interaction model can only be tested when there are replications in the experiment.)
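The following sketch computes the sums of squares for the interaction model on a small balanced design. The 2-by-2 layout with K = 2 replications and the data values are hypothetical, chosen only so the arithmetic is easy to follow; the variable names are illustrative.

```python
# Hypothetical balanced data: data[i][j] holds the K replicate responses
# for level i of Factor A and level j of Factor B.
data = [
    [[4.0, 6.0], [7.0, 9.0]],
    [[5.0, 7.0], [12.0, 14.0]],
]
I, J, K = len(data), len(data[0]), len(data[0][0])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


cell_means = [[mean(data[i][j]) for j in range(J)] for i in range(I)]
grand_mean = mean(y for i in range(I) for j in range(J) for y in data[i][j])
# With equal replication, level means are plain averages of the cell means.
row_means = [mean(cell_means[i]) for i in range(I)]
col_means = [mean(cell_means[i][j] for i in range(I)) for j in range(J)]

ssa = J * K * sum((m - grand_mean) ** 2 for m in row_means)
ssb = I * K * sum((m - grand_mean) ** 2 for m in col_means)
# Interaction: what is left in each cell mean after removing both main effects.
ssab = K * sum((cell_means[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
               for i in range(I) for j in range(J))
# Error: variation of the replicates around their own cell mean.
sse = sum((y - cell_means[i][j]) ** 2
          for i in range(I) for j in range(J) for y in data[i][j])
sst = sum((y - grand_mean) ** 2
          for i in range(I) for j in range(J) for y in data[i][j])

df_error = I * J * (K - 1)
mse = sse / df_error
f_a = (ssa / (I - 1)) / mse
f_b = (ssb / (J - 1)) / mse
f_ab = (ssab / ((I - 1) * (J - 1))) / mse

print("SSA, SSB, SSAB, SSE, SST:", ssa, ssb, ssab, sse, sst)
print("f_A, f_B, f_AB:", f_a, f_b, f_ab)
```

Note that with K = 1 every replicate equals its own cell mean, so SSE and the error degrees of freedom are both 0 and the f statistics cannot be formed, which is the point made above about needing replications.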
Conclusions
Per
usual, the residuals can be used to assess the conditions of normality and
constant-variance.
If
the interaction effect is statistically significant, then the interaction is
often the interesting story—provide an interpretation of the results based on
an interaction plot. (Additionally, if the main effects are statistically
significant, they can be interpreted and discussed.)
If the interaction effect is not statistically significant, then the main effects can be tested and interpreted. If there is a significant main effect, then Tukey’s method can be used to see exactly where the differences are.
Two-Factor ANOVA Example
An
experiment is conducted to gauge the effect of both temperature setting and
detergent type on the cleanliness of soiled t-shirts put through a washing
cycle. Eighteen identically soiled t-shirts are randomly assigned to a
treatment (that is, to a combination of temperature setting and detergent).
Three wash-cycle temperatures are used: cold, warm, and hot. Two different
detergents are considered: Detergent A and Detergent B. The response variable
is a cleanliness rating (on a 1–10 scale).
Note there are a total of 6 treatments, and since there are 18 t-shirts, we have 3 replications within each treatment. Below is the output from Minitab’s general linear model procedure (which is equivalent to a two-factor ANOVA analysis). [In lab this week, we’ll discuss how to perform this analysis using Minitab.]
General Linear Model: Cleanliness Score versus Temperature, Detergent

Factor       Type   Levels  Values
Temperature  fixed       3  Cold, Hot, Warm
Detergent    fixed       2  A, B

Analysis of Variance for Cleanliness Score, using Adjusted SS for Tests

Source                 DF   Seq SS   Adj SS   Adj MS      F      P
Temperature             2  22.3333  22.3333  11.1667  14.36  0.001
Detergent               1  20.0556  20.0556  20.0556  25.79  0.000
Temperature*Detergent   2   0.7778   0.7778   0.3889   0.50  0.619
Error                  12   9.3333   9.3333   0.7778
Total                  17  52.5000
[Note: You can ignore the Seq SS column. The Adj SS column is simply the usual sum of squares we’ve discussed; it just has a different name within the general linear model procedure.]
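As a quick check on how the columns of the Minitab table relate to one another, the sketch below recomputes the mean squares and F statistics from the reported degrees of freedom and adjusted sums of squares. The numbers are copied from the output above; the dictionary layout and variable names are illustrative.

```python
# (DF, Adj SS) for each source, copied from the Minitab ANOVA table.
table = {
    "Temperature": (2, 22.3333),
    "Detergent": (1, 20.0556),
    "Temperature*Detergent": (2, 0.7778),
    "Error": (12, 9.3333),
}

# Mean square for error is the denominator of every F statistic.
df_error, ss_error = table["Error"]
mse = ss_error / df_error

# F = (SS / DF) / MSE for each non-error source, rounded as Minitab prints it.
f_stats = {}
for source, (df, ss) in table.items():
    if source != "Error":
        f_stats[source] = round((ss / df) / mse, 2)

# The sources should add up to the Total row (DF = 17, SS = 52.5).
total_df = sum(df for df, _ in table.values())
total_ss = sum(ss for _, ss in table.values())

print(f_stats)
print("total DF:", total_df, "total SS:", round(total_ss, 4))
```

The recomputed values agree with the F column (14.36, 25.79, and 0.50) and with the Total row of the output.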
Before
analyzing the results, we must check the normality and constant-variance conditions.
Appropriate graphs of the residuals are shown below. The normality assumption
is a bit shaky, but plausible. The residual plot shows a fairly constant
variation in the residuals. Hence, both conditions seem to be plausible for
these data.



The F-test for an interaction effect has a large P-value (0.619), so there is no evidence of an interaction effect. This can also be seen visually via an interaction plot (shown above).
Both main
effects are significant. Hence, it’s appropriate to use Tukey’s
procedure to get 95% simultaneous confidence intervals for the pairwise differences in means. Via Minitab:
Tukey 95.0% Simultaneous Confidence Intervals

Temperature = Cold subtracted from:

Temperature    Lower  Center    Upper
Hot           1.3093  2.6667   4.024
Warm         -0.5240  0.8333   2.191

Temperature = Hot subtracted from:

Temperature    Lower  Center    Upper
Warm          -3.191  -1.833  -0.4760

Tukey 95.0% Simultaneous Confidence Intervals

Detergent = A subtracted from:

Detergent   Lower  Center  Upper
B           1.205   2.111  3.017
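The Tukey intervals can be reproduced approximately by hand. The sketch below assumes the usual balanced-design form, center ± q · sqrt(MSE / n), where n is the number of observations at each level of the factor. The critical values q = 3.77 (3 levels, 12 error df) and q = 3.08 (2 levels, 12 error df) are taken from a standard studentized range table and are not part of the Minitab output, so small rounding differences from Minitab are to be expected.

```python
import math

# From the ANOVA table: mean square error (its error df, 12, selects the q values).
mse = 0.7778

# Assumed studentized range critical values at the 5% level with 12 error df,
# read from a standard table (not from the Minitab output).
q_temperature = 3.77  # comparing 3 temperature levels
q_detergent = 3.08    # comparing 2 detergent levels

# Observations per level: 18 shirts over 3 temperatures or 2 detergents.
n_per_temperature = 6
n_per_detergent = 9


def tukey_interval(center, q, mse, n):
    """Return (lower, upper) for a difference in level means."""
    half_width = q * math.sqrt(mse / n)
    return center - half_width, center + half_width


# Centers are the differences in level means reported by Minitab.
hot_minus_cold = tukey_interval(2.6667, q_temperature, mse, n_per_temperature)
warm_minus_cold = tukey_interval(0.8333, q_temperature, mse, n_per_temperature)
warm_minus_hot = tukey_interval(-1.833, q_temperature, mse, n_per_temperature)
b_minus_a = tukey_interval(2.111, q_detergent, mse, n_per_detergent)

for name, (lo, hi) in [("Hot - Cold", hot_minus_cold),
                       ("Warm - Cold", warm_minus_cold),
                       ("Warm - Hot", warm_minus_hot),
                       ("B - A", b_minus_a)]:
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
```

An interval that excludes 0 corresponds to a significant pairwise difference; here only the Warm minus Cold interval contains 0.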
Detergent B’s cleanliness rating is significantly higher, on average, than Detergent A’s. Furthermore, the cleanliness rating for the hot-water wash is significantly higher, on average, than for both the cold temperature and the warm temperature. The cold and warm temperatures do not differ significantly, since the interval for that difference contains 0. Do you think these results are of practical importance?