d <- tibble::tribble(
  ~id, ~group, ~X, ~Z, ~Rx, ~condition, ~Y,
  1L, "Gb", 102, 1L, "Placebo", "Ca", 584.07,
  2L, "Ga", 52, 1L, "Placebo", "Cb", 790.29,
  3L, "Gb", 134, 2L, "Dose100", "Ca", 875.76,
  4L, "Gb", 128, 3L, "Dose100", "Cb", 848.37,
  5L, "Ga", 78, 1L, "Dose250", "Ca", 270.42,
  6L, "Gb", 150, 2L, "Dose250", "Cb", 999.87,
  7L, "Ga", 73, 1L, "Placebo", "Ca", 364.1,
  8L, "Ga", 87, 7L, "Placebo", "Cb", 420.84,
  9L, "Gb", 115, 6L, "Dose100", "Ca", 335.78,
  10L, "Gb", 113, 4L, "Dose100", "Cb", 627,
  11L, "Gc", 148, 3L, "Dose250", "Ca", 607.79,
  12L, "Gc", 82, 3L, "Dose250", "Cb", 329.32,
  13L, "Ga", 139, 1L, "Placebo", "Ca", 335.56,
  14L, "Ga", 65, 2L, "Placebo", "Cb", 669.04,
  15L, "Gb", 139, 1L, "Dose100", "Ca", 405.04,
  16L, "Gc", 96, 1L, "Dose100", "Cb", 367.15,
  17L, "Gb", 50, 5L, "Dose250", "Ca", 27.37,
  18L, "Gc", 90, 2L, "Dose250", "Cb", 468.69,
  19L, "Ga", 90, 2L, "Placebo", "Ca", 584.67,
  20L, "Ga", 116, 2L, "Placebo", "Cb", 277.71,
  21L, "Gb", 78, 2L, "Dose100", "Ca", 266.01,
  22L, "Gb", 60, 1L, "Dose100", "Cb", 0.04,
  23L, "Gc", 112, 4L, "Dose250", "Ca", 593.25,
  24L, "Ga", 63, 4L, "Dose250", "Cb", 512.26,
  25L, "Ga", 89, 1L, "Placebo", "Ca", 635.57,
  26L, "Ga", 97, 2L, "Placebo", "Cb", 468.69,
  27L, "Gc", 76, 3L, "Dose100", "Ca", 514.66,
  28L, "Gb", 83, 1L, "Dose100", "Cb", 264.87,
  29L, "Gc", 84, 4L, "Dose250", "Ca", 220.34,
  30L, "Gb", 88, 1L, "Dose250", "Cb", 216.54
)
Analysis of variance (ANOVA) is a statistical procedure, developed by R. A. Fisher, used to analyze the relationship between a continuous outcome (dependent variable) and categorical predictors (independent variables). This procedure produces a linear model, which can be used to estimate the conditional and marginal means of the outcome; in the presence of multiple factorial predictors, the model will include all interaction terms between the factors.
The ANOVA is part of a wider family of statistical procedures that includes ANCOVA (which incorporates continuous predictors) and Analysis of Deviance (which allows for non-continuous outcomes). All the procedures in this family produce an ANOVA table (or an ANOVA-like table) that summarizes the relationship between the underlying model and the outcome by partitioning the variation in the outcome into components that can be uniquely attributed to different sources, according to the law of total variance. Essentially, each of the model’s terms is represented by a line in the ANOVA table that answers the question: how much of the variation in \(Y\) can be attributed to the variation in \(X\)? Where applicable, each source of variance has an accompanying test statistic (often an F statistic), sometimes called the omnibus test, which indicates the significance of the variance attributable to that term, often accompanied by some measure of effect size.
In this post I will demonstrate the various types of ANOVA tables and how R does ANOVA - what the defaults are, and how to produce alternatives.
Along the way, I hope to illustrate the applicability of ANOVA tables to other types of models - besides the classical case of a maximal model (all main effects and interactions) with strictly categorical predictors and a continuous outcome.
Here are some assumptions I make about you, the reader, in this post:
- You’re familiar with the ideas of multi-factor ANOVAs (what a main effect is, what interactions are…).
- You know some R - how to fit a linear model, how to wrangle some data.
- You are IID and normally distributed.
Let’s dive right in!
ANOVAs in R
Toy Data
d$id <- factor(d$id)
d$group <- factor(d$group)
d$Rx <- factor(d$Rx, levels = c("Placebo", "Dose100", "Dose250"))
d$condition <- factor(d$condition)

dplyr::glimpse(d)
#> Rows: 30
#> Columns: 7
#> $ id <fct> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
#> $ group <fct> Gb, Ga, Gb, Gb, Ga, Gb, Ga, Ga, Gb, Gb, Gc, Gc, Ga, Ga, Gb, …
#> $ X <dw_trnsf> 102, 52, 134, 128, 78, 150, 73, 87, 115, 113, 148, 82, …
#> $ Z <int> 1, 1, 2, 3, 1, 2, 1, 7, 6, 4, 3, 3, 1, 2, 1, 1, 5, 2, 2, 2, …
#> $ Rx <fct> Placebo, Placebo, Dose100, Dose100, Dose250, Dose250, Placeb…
#> $ condition <fct> Ca, Cb, Ca, Cb, Ca, Cb, Ca, Cb, Ca, Cb, Ca, Cb, Ca, Cb, Ca, …
#> $ Y <dw_trnsf> 584.07, 790.29, 875.76, 848.37, 270.42, 999.87, 364.10,…
m <- lm(Y ~ group + X, data = d)
This is a multiple regression model with a covariable X and a 3-level factor group. We can summarize the results in a coefficient table (aka a “regression” table):
summary(m)
#>
#> Call:
#> lm(formula = Y ~ group + X, data = d)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -372.51 -166.47 -53.67 134.57 451.16
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 118.608 145.557 0.815 0.42256
#> groupGb -102.591 94.853 -1.082 0.28937
#> groupGc -92.384 107.302 -0.861 0.39712
#> X 4.241 1.504 2.820 0.00908 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 218.8 on 26 degrees of freedom
#> Multiple R-squared: 0.2383, Adjusted R-squared: 0.1504
#> F-statistic: 2.711 on 3 and 26 DF, p-value: 0.06556
Or we can produce an ANOVA table:
anova(m)
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> group 2 8783 4391 0.0918 0.912617
#> X 1 380471 380471 7.9503 0.009077 **
#> Residuals 26 1244265 47856
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
While the group term had 2 parameters in the coefficient table, it now has a single test with a Df of 2. This omnibus test can be thought of as representing the total significance of the two parameters combined!
Note that the model being summarized here is neither completely factorial (X is a continuous covariable), nor maximal (the group:X interaction is missing)!
By default, R calculates type 1 sums of squares (SS) - these are also called sequential SS, because each term is attributed with a portion of the variation in \(Y\) (represented by its SS) that has not yet been attributed to any of the previous terms. Thus, in our example, the effect of X represents only what X explains on top of what group has already explained - the variance attributed to X is strictly the variance that can be uniquely attributed to X, controlling for group. The effect of group, however, does not represent its unique contribution to \(Y\)'s variance, but instead its total contribution.
This means that although the following models have the same terms, they will produce different type 1 ANOVA tables because those terms are in a different order:
anova(lm(Y ~ group + X, data = d))
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> group 2 8783 4391 0.0918 0.912617
#> X 1 380471 380471 7.9503 0.009077 **
#> Residuals 26 1244265 47856
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm(Y ~ X + group, data = d))
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> X 1 325745 325745 6.8067 0.01486 *
#> group 2 63509 31754 0.6635 0.52353
#> Residuals 26 1244265 47856
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can recreate the ANOVA table above by building a sequence of models and comparing them (see [Judd, McClelland, & Ryan, 2017](https://doi.org/10.4324/9781315744131)):
m0 <- lm(Y ~ 1, data = d) # Intercept-only model
m1 <- lm(Y ~ group, data = d)
anova(m0, m1, m)
#> Analysis of Variance Table
#>
#> Model 1: Y ~ 1
#> Model 2: Y ~ group
#> Model 3: Y ~ group + X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 29 1633519
#> 2 27 1624737 2 8783 0.0918 0.912617
#> 3 26 1244265 1 380471 7.9503 0.009077 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m) # Same SS values
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> group 2 8783 4391 0.0918 0.912617
#> X 1 380471 380471 7.9503 0.009077 **
#> Residuals 26 1244265 47856
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Simultaneous Sum of Squares
There is also type 2 SS - also called simultaneous SS, because each term is attributed with the portion of the variation in \(Y\) that is not attributable to any of the other terms in the model - its unique contribution while controlling for the other terms. Type 2 SS can be obtained with the Anova() function from the {car} package:
car::Anova(m, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> group 63509 2 0.6635 0.523533
#> X 380471 1 7.9503 0.009077 **
#> Residuals 1244265 26
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can recreate the above ANOVA table by building two sequences of models, and comparing them:
m_sans_X <- lm(Y ~ group, data = d)
m_sans_group <- lm(Y ~ X, data = d)
anova(m_sans_group, m) # Same SS as the type 2 test for group
#> Analysis of Variance Table
#>
#> Model 1: Y ~ X
#> Model 2: Y ~ group + X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 28 1307774
#> 2 26 1244265 2 63509 0.6635 0.5235
anova(m_sans_X, m) # Same SS as the type 2 test for X
#> Analysis of Variance Table
#>
#> Model 1: Y ~ group
#> Model 2: Y ~ group + X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 27 1624737
#> 2 26 1244265 1 380471 7.9503 0.009077 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Because the order of terms is usually of little importance, type 1 tests are rarely used in practice…
Unfortunately, they are R’s default…
Adding Interactions
Things get a bit more complicated when interactions are involved, as type 2 SS treat interactions differently than main effects:
m_int <- lm(Y ~ group * X, data = d)
Each main effect term is attributed the variance (its SS) that is unique to it and not attributable to any of the other main effects (simultaneously, as we've already seen), but without accounting for the variance attributable to the interaction; the SS of the interaction term, in turn, represents its unique variance after accounting for the underlying main effects (sequentially). So we get the unique contribution of each main effect controlling only for the other main effects, and the unique contribution of the interaction controlling for the already-included, combined contribution of the main effects.
car::Anova(m_int, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> group 63509 2 1.3555 0.2768607
#> X 380471 1 16.2410 0.0004884 ***
#> group:X 682026 2 14.5566 7.246e-05 ***
#> Residuals 562240 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can again recreate this type 2 ANOVA table with model comparisons:
anova(m_sans_group, m) # Same SS as the type 2 test for group
#> Analysis of Variance Table
#>
#> Model 1: Y ~ X
#> Model 2: Y ~ group + X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 28 1307774
#> 2 26 1244265 2 63509 0.6635 0.5235
anova(m_sans_X, m) # Same SS as the type 2 test for X
#> Analysis of Variance Table
#>
#> Model 1: Y ~ group
#> Model 2: Y ~ group + X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 27 1624737
#> 2 26 1244265 1 380471 7.9503 0.009077 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m, m_int) # Same SS as the type 2 test for group:X
#> Analysis of Variance Table
#>
#> Model 1: Y ~ group + X
#> Model 2: Y ~ group * X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 26 1244265
#> 2 24 562240 2 682026 14.557 7.246e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In designs of higher order, each “order” is tested in a similar simultaneous-sequential manner. E.g., in a 3-way design, all main effects (1st order) are tested simultaneously (accounting for one another), then all 2-way interactions (2nd order) are tested simultaneously (accounting for the main effects and one another), and then the 3-way interaction is tested (accounting for all main effects and 2-way interactions).
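For example, here is a minimal sketch of what such a call looks like for a hypothetical 3-way design (the data frame dat3 and the factors A, B, and C are simulated here purely for illustration - they are not part of the toy data above):
set.seed(1)
# A small, balanced 2 x 2 x 3 design with 4 observations per cell
dat3 <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), C = c("c1", "c2", "c3"))
dat3 <- dat3[rep(seq_len(nrow(dat3)), times = 4), ]
dat3$Y <- rnorm(nrow(dat3))

m3 <- lm(Y ~ A * B * C, data = dat3)

# Type 2: A, B and C are each tested controlling for the other main effects;
# A:B, A:C and B:C are each tested controlling for all main effects and the
# other 2-way interactions; A:B:C is tested controlling for everything else.
# (With perfectly balanced data like this all types coincide anyway - the
# point here is only the structure of the tests.)
car::Anova(m3, type = 2)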
Type 3
There is another type of simultaneous SS - the type 3 test, which treats interactions and main effects equally: the SS for each main effect or interaction is calculated as its unique contribution that is not attributable to any of the other effects in the model - main effects or interactions. So the effect of X is its unique contribution while controlling both for group and for group:X!
However, recall that, as we saw earlier, these methods in R actually produce omnibus tests for the combined effect of each term's parameters. But in the m_int model, the parameters labeled X, groupGb, and groupGc no longer represent parameters of the main effects - instead, they are parameters of simple (i.e., conditional) effects!
summary(m_int)
#>
#> Call:
#> lm(formula = Y ~ group * X, data = d)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -370.18 -67.15 33.68 111.35 172.14
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 833.130 173.749 4.795 6.99e-05 ***
#> groupGb -1308.898 233.218 -5.612 8.90e-06 ***
#> groupGc -752.718 307.747 -2.446 0.0222 *
#> X -4.041 1.942 -2.081 0.0482 *
#> groupGb:X 13.041 2.419 5.390 1.55e-05 ***
#> groupGc:X 7.731 3.178 2.432 0.0228 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 153.1 on 24 degrees of freedom
#> Multiple R-squared: 0.6558, Adjusted R-squared: 0.5841
#> F-statistic: 9.146 on 5 and 24 DF, p-value: 5.652e-05
- X is the slope of X when group=Ga (when both dummy variables are fixed at 0, as Ga is the reference level).
- groupGb is the difference between group=Ga and group=Gb when X=0.
- groupGc is the difference between group=Ga and group=Gc when X=0.

(Pay attention to the “when” - this is what makes them conditional.)
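To make the “conditional” part concrete, the simple slope of X within each group can be recovered directly from these coefficients (a quick sketch, relying on the coefficient names shown in the summary above):
b <- coef(m_int)

b["X"]                  # the slope of X when group = Ga (the reference level)
b["X"] + b["groupGb:X"] # the slope of X when group = Gb
b["X"] + b["groupGc:X"] # the slope of X when group = Gc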
We can see that changing the reference group changes the test for X:
d$group <- relevel(d$group, ref = "Gb")
m_int2 <- lm(Y ~ group * X, data = d)
car::Anova(m_int, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 538630 1 22.9922 6.994e-05 ***
#> group 738108 2 15.7536 4.269e-05 ***
#> X 101495 1 4.3325 0.04823 *
#> group:X 682026 2 14.5566 7.246e-05 ***
#> Residuals 562240 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
car::Anova(m_int2, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 219106 1 9.3528 0.005402 **
#> group 738108 2 15.7536 4.269e-05 ***
#> X 910646 1 38.8722 1.918e-06 ***
#> group:X 682026 2 14.5566 7.246e-05 ***
#> Residuals 562240 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How can we resolve this?
Centering
By centering our predictors! Centering is transforming our data in such a way that 0 represents the overall mean. When this is done, conditioning on 0 is the same as conditioning on the overall mean = looking at the main effect!
For covariables this is easy enough:
d$X_c <- d$X - mean(d$X) # or scale(d$X, center = TRUE, scale = FALSE)
But how do we center a factor? The answer is to use some type of orthogonal coding, for example contr.sum() (effects coding). This makes the coefficients harder to interpret, but we're not looking at those anyway!
contrasts(d$group) <- contr.sum
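To get a feel for what effects coding does, we can inspect the contrast matrix itself; for a 3-level factor it should look something like this:
contr.sum(3)
#>   [,1] [,2]
#> 1    1    0
#> 2    0    1
#> 3   -1   -1
# Each column sums to zero: in a one-factor model the intercept is then the
# (unweighted) grand mean, and each coefficient is a level's deviation from it.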
Now when looking at type 3 tests, the main effects terms actually are main effects!
m_int3 <- lm(Y ~ group * X_c, data = d)
car::Anova(m_int3, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 4743668 1 202.4902 3.401e-13 ***
#> group 19640 2 0.4192 0.66231
#> X_c 143772 1 6.1371 0.02067 *
#> group:X_c 682026 2 14.5566 7.246e-05 ***
#> Residuals 562240 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Remember: ABC - Always Be Centering (your predictors) - type 3 ANOVA tables make little sense without centering!
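As an aside, instead of assigning contrasts factor by factor, the coding used for unordered factors can also be changed globally; a minimal sketch (not run here - the rest of this post keeps R's defaults and sets contrasts per factor):
# R's factory default: treatment (dummy) coding for unordered factors,
# polynomial coding for ordered factors
getOption("contrasts")

# Use sum-to-zero (effects) coding for all subsequently fitted models
options(contrasts = c("contr.sum", "contr.poly"))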
Recreate the Type 3 ANOVA
Unfortunately, we can’t just build a model without an interaction term and use it to recreate the type 3 ANOVA. Instead, we need to actually build the model matrix (i.e., the design matrix), and drop the columns of each term in turn:
mm <- model.matrix(m_int2)
head(mm)
#> (Intercept) groupGa groupGc X groupGa:X groupGc:X
#> 1 1 0 0 102 0 0
#> 2 1 1 0 52 52 0
#> 3 1 0 0 134 0 0
#> 4 1 0 0 128 0 0
#> 5 1 1 0 78 78 0
#> 6 1 0 0 150 0 0
A type 3 test for a term is equal to the comparison between a model without the parameters associated with that term and the full model:
m_sans_group <- lm(Y ~ mm[, -(2:3)], data = d)
m_sans_X <- lm(Y ~ mm[, -4], data = d)
m_sans_int <- lm(Y ~ mm[, -(5:6)], data = d)
anova(m_sans_group, m_int2) # Same SS as type 3 for group
#> Analysis of Variance Table
#>
#> Model 1: Y ~ mm[, -(2:3)]
#> Model 2: Y ~ group * X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 26 1300348
#> 2 24 562240 2 738108 15.754 4.269e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m_sans_X, m_int2) # Same SS as type 3 for X
#> Analysis of Variance Table
#>
#> Model 1: Y ~ mm[, -4]
#> Model 2: Y ~ group * X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 25 1472886
#> 2 24 562240 1 910646 38.872 1.918e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m_sans_int, m_int2) # Same SS as type 3 for group:X
#> Analysis of Variance Table
#>
#> Model 1: Y ~ mm[, -(5:6)]
#> Model 2: Y ~ group * X
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 26 1244265
#> 2 24 562240 2 682026 14.557 7.246e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Compare to:
car::Anova(m_int2, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 219106 1 9.3528 0.005402 **
#> group 738108 2 15.7536 4.269e-05 ***
#> X 910646 1 38.8722 1.918e-06 ***
#> group:X 682026 2 14.5566 7.246e-05 ***
#> Residuals 562240 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 2 vs Type 3
As mentioned above, the distinction between types 2 and 3 comes from how they estimate main effects in the presence of interactions.
Let’s look at the following factorial design:
m_factorial <- lm(Y ~ condition * group, data = d,
                  # Another way to specify effects coding:
                  contrasts = list(condition = contr.sum,
                                   group = contr.sum))
car::Anova(m_factorial, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> condition 12048 1 0.1840 0.6718
#> group 7165 2 0.0547 0.9469
#> condition:group 41204 2 0.3146 0.7330
#> Residuals 1571485 24
car::Anova(m_factorial, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 5858845 1 89.4773 1.439e-09 ***
#> condition 3452 1 0.0527 0.8203
#> group 8913 2 0.0681 0.9344
#> condition:group 41204 2 0.3146 0.7330
#> Residuals 1571485 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
But where do these differences between type 2 and 3 come from?
Type 2 SS looks at the SS between the marginal means of group, pooling observations across the levels of condition. So the marginal mean of group Ga is estimated as a cell-frequency-weighted average of the cell means:

\[
\bar{Y}_{Ga} = \frac{n_{Ga,Ca} \, \bar{Y}_{Ga,Ca} + n_{Ga,Cb} \, \bar{Y}_{Ga,Cb}}{n_{Ga,Ca} + n_{Ga,Cb}}
\]

Type 3 SS, however, looks at the SS between the marginal means of group, with each level of condition weighted equally. So the marginal mean of group Ga is estimated as the unweighted average of the cell means:

\[
\bar{Y}_{Ga} = \frac{\bar{Y}_{Ga,Ca} + \bar{Y}_{Ga,Cb}}{2}
\]

This makes type 3 SS invariant to the cell frequencies!
But as we will soon see, this need not always be the case…
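As a side note, we can compute these two flavors of the marginal mean for group Ga directly from the data (a small sketch):
cell_means <- tapply(d$Y, list(d$group, d$condition), mean)
cell_ns    <- table(d$group, d$condition)

# The frequency-weighted mean the type 2 description above refers to
# (equivalent to simply averaging all of Ga's observations):
weighted.mean(cell_means["Ga", ], w = cell_ns["Ga", ])

# The unweighted average of cell means used by type 3:
mean(cell_means["Ga", ])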
A lot has been said about type 2 vs. type 3. I will not go into the weeds here, but it is important to note that:
- Most statistical software (SAS, Stata, SPSS, …) defaults to type 3 SS with orthogonal factor coding (though covariables are usually not mean-centered by default; see Langsrud, 2003). This makes R inconsistent: as we've seen, it defaults to type 1 ANOVA tables and treatment coding.
- Often in factorial designs, any imbalance in the design is incidental, so it is often beneficial to have a method that is invariant to such imbalances. (Though this may not be true if the data is observational.)
- Coefficient tables give results that are analogous to type 3 SS when all terms are covariables:
m_covs <- lm(Y ~ X * Z, data = d)

car::Anova(m_covs, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> X 324676 1 6.7657 0.01513 *
#> Z 2430 1 0.0506 0.82373
#> X:Z 57640 1 1.2011 0.28315
#> Residuals 1247704 26
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
car::Anova(m_covs, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 83130 1 1.7323 0.1996
#> X 5866 1 0.1222 0.7294
#> Z 59917 1 1.2486 0.2740
#> X:Z 57640 1 1.2011 0.2831
#> Residuals 1247704 26
summary(m_covs) # same p-values as type 3
#>
#> Call:
#> lm(formula = Y ~ X * Z, data = d)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -384.97 -153.72 -23.98 141.28 422.79
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 366.352 278.348 1.316 0.200
#> X 1.014 2.900 0.350 0.729
#> Z -112.681 100.843 -1.117 0.274
#> X:Z 1.175 1.072 1.096 0.283
#>
#> Residual standard error: 219.1 on 26 degrees of freedom
#> Multiple R-squared: 0.2362, Adjusted R-squared: 0.1481
#> F-statistic: 2.68 on 3 and 26 DF, p-value: 0.06772
Note once again that the model being summarized here is not factorial at all - both Z and X are continuous covariables!
Balanced vs. Unbalanced Data
The distinction between types 1, 2 and 3 SS is only relevant when there is some dependency between predictors (aka some collinearity). In our example, we can see that group and X are somewhat collinear (VIF / tolerance are not strictly 1):
performance::check_collinearity(m)
#> # Check for Multicollinearity
#>
#> Low Correlation
#>
#> Term VIF VIF 95% CI Increased SE Tolerance Tolerance 95% CI
#> group 1.08 [1.00, 5.73] 1.04 0.92 [0.17, 1.00]
#> X 1.08 [1.00, 5.73] 1.04 0.92 [0.17, 1.00]
In a factorial design, we might call this dependence / collinearity among our predictors an “unbalanced design” (the number of observations differs between cells), and when the predictors are completely independent we would call this a “balanced design” (equal number of observations in all cells).
Let’s look at two examples:
Balanced data
We can see that Rx and condition are balanced:
table(d$Rx, d$condition)
#>
#> Ca Cb
#> Placebo 5 5
#> Dose100 5 5
#> Dose250 5 5
chisq.test(d$Rx, d$condition)$statistic # Chisq is exactly 0
#> X-squared
#> 0
And so type 1, 2 and 3 ANOVA tables are identical:
contrasts(d$Rx) <- contr.sum
contrasts(d$condition) <- contr.sum
m_balanced <- lm(Y ~ condition * Rx, data = d)
anova(m_balanced)
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> condition 1 13666 13666 0.2162 0.6461
#> Rx 2 41379 20690 0.3273 0.7240
#> condition:Rx 2 61444 30722 0.4860 0.6210
#> Residuals 24 1517030 63210
car::Anova(m_balanced, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> condition 13666 1 0.2162 0.6461
#> Rx 41379 2 0.3273 0.7240
#> condition:Rx 61444 2 0.4860 0.6210
#> Residuals 1517030 24
car::Anova(m_balanced, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 6422803 1 101.6112 4.204e-10 ***
#> condition 13666 1 0.2162 0.6461
#> Rx 41379 2 0.3273 0.7240
#> condition:Rx 61444 2 0.4860 0.6210
#> Residuals 1517030 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Unbalanced data
However, condition and group are NOT balanced:
table(d$group, d$condition)
#>
#> Ca Cb
#> Gb 6 6
#> Ga 5 6
#> Gc 4 3
chisq.test(d$group, d$condition)$statistic # Chisq is NOT 0
#> Warning in chisq.test(d$group, d$condition): Chi-squared approximation may be
#> incorrect
#> X-squared
#> 0.2337662
And so type 1, 2 and type 3 ANOVA tables are NOT identical (recall how type 2 and 3 estimate marginal means differently in the presence of interactions):
anova(m_factorial)
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> condition 1 13666 13666 0.2087 0.6519
#> group 2 7165 3582 0.0547 0.9469
#> condition:group 2 41204 20602 0.3146 0.7330
#> Residuals 24 1571485 65479
car::Anova(m_factorial, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> condition 12048 1 0.1840 0.6718
#> group 7165 2 0.0547 0.9469
#> condition:group 41204 2 0.3146 0.7330
#> Residuals 1571485 24
car::Anova(m_factorial, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 5858845 1 89.4773 1.439e-09 ***
#> condition 3452 1 0.0527 0.8203
#> group 8913 2 0.0681 0.9344
#> condition:group 41204 2 0.3146 0.7330
#> Residuals 1571485 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The larger the deviation from perfect balance (i.e., independence), the larger the differences between the types will be.
Why Does This Happen?
We’ve seen that types 1, 2 and 3 all attribute variance in \(Y\) to the model’s terms by partialling out the model’s other terms in some way - sequentially, simultaneously, or some mix of both.
However, if the predictors in the model are independent (such as in a balanced design, with zero collinearity), then regardless of the order of their inclusion in the model, all of the variance that can be attributed to some term A is unique to A - none of it is also attributable to any of the other terms in the model, and vice versa. There is nothing to partial out, so the order does not matter!
This also means that because the SS in type 2 and type 3 ANOVA tables represent each term's uniquely attributable variation in \(Y\), when the design is not balanced (i.e., there is collinearity in the data) the SS in the ANOVA table will not sum to the total SS - some overlap (some non-unique variation) is not represented in the table. However, this is where type 1 ANOVA tables shine: their sequential nature means their SS always sum to the total SS!
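A quick sketch verifying this with the model from before (m, whose predictors group and X are somewhat collinear):
SS_total <- sum((d$Y - mean(d$Y))^2)

# Type 1 SS (including the residual SS) always add up to the total SS:
sum(anova(m)[["Sum Sq"]]) - SS_total

# Type 2 SS generally do not - the difference here is non-zero:
sum(car::Anova(m, type = 2)[["Sum Sq"]]) - SS_total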
Let’s look at an extreme example of collinearity:
Collinear data
d2 <- MASS::mvrnorm(
  n = 100,
  mu = rep(0, 3),
  Sigma = matrix(c(1,    0.99, 0.4,
                   0.99, 1,    0.41,
                   0.4,  0.41, 1), nrow = 3)
)
d2 <- data.frame(d2)
colnames(d2) <- c("X", "Z", "Y")

m_collinear <- lm(Y ~ X + Z, data = d2)

performance::check_collinearity(m_collinear)
#> # Check for Multicollinearity
#>
#> High Correlation
#>
#> Term VIF VIF 95% CI Increased SE Tolerance Tolerance 95% CI
#> X 47.02 [32.64, 67.95] 6.86 0.02 [0.01, 0.03]
#> Z 47.02 [32.64, 67.95] 6.86 0.02 [0.01, 0.03]
Looking at a type 1 ANOVA table, we can see that X accounts for a significant amount of the variation in Y, but that Z does not add anything significant on top of X:
anova(m_collinear)
#> Analysis of Variance Table
#>
#> Response: Y
#> Df Sum Sq Mean Sq F value Pr(>F)
#> X 1 16.927 16.9265 22.664 6.743e-06 ***
#> Z 1 0.000 0.0000 0.000 0.9999
#> Residuals 97 72.444 0.7468
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
However, were we to look at a type 2 ANOVA table, we might get the impression that neither X nor Z contributes to the model:
car::Anova(m_collinear, type = 2)
#> Anova Table (Type II tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> X 0.360 1 0.4822 0.4891
#> Z 0.000 1 0.0000 0.9999
#> Residuals 72.444 97
This demonstrates the importance of always interpreting type 2 and 3 ANOVA tables in light of any collinearity that might exist between your predictors. Remember: ABC - Always Be mindful of Collinearity (okay, that one was a bit of a stretch).
ANOVA Made Easy
We can use the lm() -> car::Anova() method to conduct a proper ANOVA on a maximal factorial design. However, making sure that our factors are orthogonally coded is a pain in the @$$.
d$group <- factor(d$group)
d$condition <- factor(d$condition)
contrasts(d$group) <- contr.sum
contrasts(d$condition) <- contr.sum

m_lm <- lm(Y ~ group * condition, d)
car::Anova(m_lm, type = 3)
#> Anova Table (Type III tests)
#>
#> Response: Y
#> Sum Sq Df F value Pr(>F)
#> (Intercept) 5858845 1 89.4773 1.439e-09 ***
#> group 8913 2 0.0681 0.9344
#> condition 3452 1 0.0527 0.8203
#> group:condition 41204 2 0.3146 0.7330
#> Residuals 1571485 24
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Thankfully, we have the afex package, which turns all of that mess into something much more palatable:
afex::aov_car(Y ~ group * condition + Error(id), data = d)
#> Anova Table (Type 3 tests)
#>
#> Response: Y
#> Effect df MSE F ges p.value
#> 1 group 2, 24 65478.53 0.07 .006 .934
#> 2 condition 1, 24 65478.53 0.05 .002 .820
#> 3 group:condition 2, 24 65478.53 0.31 .026 .733
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Much easier!
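If you prefer a different interface, afex also provides aov_ez() and aov_4(), which should produce the same type 3 table (a sketch of the equivalent calls):
# Column-name interface:
afex::aov_ez(id = "id", dv = "Y", data = d,
             between = c("group", "condition"))

# lme4-style formula interface:
afex::aov_4(Y ~ group * condition + (1 | id), data = d)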
Other Types of Models
So far we've seen how ANOVA tables are applied to non-factorial linear OLS models. However, the idea of per-term omnibus tests can be extended to many other types of models. For models where SS cannot be calculated, analogous methods based on deviance or likelihood are used instead (read more in the car::Anova() docs). Here are some examples:
GLMs
Logistic
m_logistic <- glm(condition ~ group * X_c, data = d,
                  family = binomial())

car::Anova(m_logistic, type = 2)
#> Analysis of Deviance Table (Type II tests)
#>
#> Response: condition
#> LR Chisq Df Pr(>Chisq)
#> group 0.15679 2 0.9246
#> X_c 0.75134 1 0.3861
#> group:X_c 1.10225 2 0.5763
Poisson
m_poisson <- glm(Z ~ group * X_c, data = d,
                 family = poisson())

car::Anova(m_poisson, type = 3)
#> Analysis of Deviance Table (Type III tests)
#>
#> Response: Z
#> LR Chisq Df Pr(>Chisq)
#> group 0.88074 2 0.6438
#> X_c 0.04370 1 0.8344
#> group:X_c 0.11357 2 0.9448
Ordinal
m_ordinal <- ordinal::clm(group ~ X_c * condition, data = d)
anova(m_ordinal)
#> Type I Analysis of Deviance Table with Wald chi-square tests
#>
#> Df Chisq Pr(>Chisq)
#> X_c 1 0.4469 0.5038
#> condition 1 0.1584 0.6907
#> X_c:condition 1 0.6129 0.4337
Note that none of the models summarized here with an ANOVA table has a continuous (conditionally normal) outcome, and not all of their predictors are categorical. We might have been inclined to summarize these models with a coefficient table, but it is equally valid to present an ANOVA table!
(G)LMMs
m_mixed <- lmerTest::lmer(Y ~ X_c * group + (group | Z), data = d)
anova(m_mixed, type = 2)
#> Type II Analysis of Variance Table with Satterthwaite's method
#> Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
#> X_c 290878 290878 1 23.645 14.4210 0.0008951 ***
#> group 7198 3599 2 9.520 0.1784 0.8393243
#> X_c:group 582071 291036 2 21.938 14.4288 0.0001001 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Also for GLMMs:
m_mixed2 <- lme4::glmer(condition ~ X_c * group + (1 | Z), data = d,
                        family = binomial())

car::Anova(m_mixed2, type = 3)
#> Analysis of Deviance Table (Type III Wald chisquare tests)
#>
#> Response: condition
#> Chisq Df Pr(>Chisq)
#> (Intercept) 0.0886 1 0.7660
#> X_c 1.1917 1 0.2750
#> group 0.0829 2 0.9594
#> X_c:group 0.9972 2 0.6074
Concluding Remarks
Although we might be more inclined to summarize our model with an ANOVA table when our model contains categorical predictors, especially when interactions are involved, we’ve seen that ANOVA tables do not require any special data (factorial, balanced, normal outcome), and can be used as an alternative to a coefficient table.
Regardless of how we summarize our model - with a coefficient table or with an ANOVA table, using type 1, 2 or 3 SS, with orthogonal or treatment coding, with centered or uncentered covariables - the underlying model is equivalent, and it will produce the same estimated simple effects, marginal means and contrasts. That is, the method we use to summarize our model has no bearing on whatever follow-up analysis we may wish to carry out (using emmeans, of course! Check out the materials from my R course: Analysis of Factorial Designs).
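For example, here is a quick sketch of this equivalence using emmeans (assuming the package is installed): the estimated marginal means of group from the treatment-coded, uncentered model and from the effects-coded, centered model should be identical (up to the ordering of the levels).
library(emmeans)

emmeans(m_int,  ~ group) # treatment coding, uncentered X
emmeans(m_int3, ~ group) # effects coding, centered X_c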
I hope you now have a fuller grasp of what goes on behind the scenes when producing ANOVA tables, how the different types of ANOVA tables work, when they should be used, and how to interpret their results. ANOVA tables are a powerful tool that can be applied not only to factorial data coupled with an OLS model, but also to a wide variety of (generalized) linear (mixed) regression models.