14.2: Sources of variation

Introduction

Sources of variation, or components of the two-way ANOVA, include two factors, each with two or more levels (groups); collectively, the factors are often referred to as the main effects in these types of ANOVA. The other source of variation in a two-way ANOVA is the interaction between the two factors. Below, I have listed the important components, although I have not included how the sums of squares are calculated. You are expected to know the sources of variation for this most basic two-way ANOVA table (Table \(\PageIndex{1}\)). You should also be able to solve for any missing elements in one of these tables by using the information that is provided.

Table \(\PageIndex{1}\). ANOVA table for two-way, balanced, replicated design.
Source of variation	Degrees of Freedom	Mean Squares	\(F\)-statistic
Factor A (Diet)	\(a-1\)	\(\frac{\text{Factor A SS}}{\text{Factor A DF}}\)	\(\frac{\text{Factor A MS}}{\text{Error MS}}\)
Factor B (Population)	\(b-1\)	\(\frac{\text{Factor B SS}}{\text{Factor B DF}}\)	\(\frac{\text{Factor B MS}}{\text{Error MS}}\)
\(A \times B\) interaction	\((a-1)(b-1)\)	\(\frac{A \times B \ \text{SS}}{A \times B \ \text{DF}}\)	\(\frac{A \times B \ \text{MS}}{\text{Error MS}}\)
Error (within-cells)	\(ab(n-1)\)	\(\frac{\text{Error SS}}{\text{Error DF}}\)
Total	\(N-1\)
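
As a rough illustration of solving for missing elements in Table \(\PageIndex{1}\), the sketch below fills in the degrees of freedom, mean squares, and \(F\)-statistics from the sums of squares. The function name `complete_two_way_table` and the numbers are hypothetical, chosen only for the example; here \(a\) and \(b\) are the numbers of levels of the two factors and \(n\) is the number of replicates in each cell, which is assumed to be the same for every cell (a balanced design).

```python
def complete_two_way_table(a, b, n, ss):
    """Fill in DF, MS, and F for a balanced, replicated two-way ANOVA.

    a, b : number of levels of Factor A and Factor B
    n    : replicates per cell (assumed equal for every cell, i.e. balanced)
    ss   : dict of sums of squares keyed by "A", "B", "AxB", "Error"
    """
    df = {
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
    }
    # Mean square = sum of squares / degrees of freedom, row by row.
    ms = {source: ss[source] / df[source] for source in df}
    # Each F-statistic uses the error mean square as its denominator.
    f = {source: ms[source] / ms["Error"] for source in ("A", "B", "AxB")}
    return df, ms, f


# Hypothetical sums of squares, chosen only to give round numbers.
df, ms, f = complete_two_way_table(
    a=3, b=2, n=5, ss={"A": 40.0, "B": 10.0, "AxB": 8.0, "Error": 48.0}
)
total_df = sum(df.values())  # should equal N - 1 = a*b*n - 1

for source in ("A", "B", "AxB", "Error"):
    print(source, df[source], ms[source], f.get(source, ""))
print("Total", total_df)
```

The same relationships can be run backwards: if a table reports a mean square and its degrees of freedom, the missing sum of squares is their product.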

Taking each row from Table \(\PageIndex{1}\) one at a time, we have:

Source of variation	Degrees of Freedom	Mean Squares	\(F\)-statistic
First Factor	\(a-1\)	\(\frac{\text{Factor A SS}}{\text{Factor A DF}}\)	\(\frac{\text{Factor A MS}}{\text{Error MS}}\)

where Source refers to the source of variation, \(DF\) refers to Degrees of Freedom, \(a\) is the number of levels (groups) of the first factor, \(SS\) refers to Sum of Squares, and \(MS\) refers to the Mean Squares.

Next is the second factor:

Source of variation	Degrees of Freedom	Mean Squares	\(F\)-statistic
Second Factor	\(b-1\)	\(\frac{\text{Factor B SS}}{\text{Factor B DF}}\)	\(\frac{\text{Factor B MS}}{\text{Error MS}}\)

where \(b\) is the number of levels (groups) of the second factor. Next is the interaction between the first and second factors.

Source of variation	Degrees of Freedom	Mean Squares	\(F\)-statistic
Interaction	\((a-1)(b-1)\)	\(\frac{A \times B \ \text{SS}}{A \times B \ \text{DF}}\)	\(\frac{A \times B \ \text{MS}}{\text{Error MS}}\)

Lastly, there is the within-cell error, or residual, source of variation:

Source of variation	Degrees of Freedom	Mean Squares	\(F\)-statistic
Error (within-cells)	\(ab(n-1)\)	\(\frac{\text{Error SS}}{\text{Error DF}}\)	N/A

where \(n\) is the number of experimental units in each group (that is, in each combination of a level of the first factor with a level of the second factor). Note that if the sample size differs for one or more groups, then the design is unbalanced and this formula does not give the error degrees of freedom. The total degrees of freedom for the two-way ANOVA is simply \(N - 1\), where \(N\) is the sample size for the entire problem; a little algebra shows that, for a balanced design, \(N\) may be calculated as \(N = abn\).
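
As a check on the bookkeeping for the balanced design, the degrees of freedom of the four sources add up to the total degrees of freedom:

\[ (a-1) + (b-1) + (a-1)(b-1) + ab(n-1) = (ab - 1) + (abn - ab) = abn - 1 = N - 1 \]

This is also a quick way to recover a missing degrees of freedom entry: subtract the known entries from \(N - 1\).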

Unbalanced designs

An unbalanced design implies that observations are missing for one or more groups. What should you do if data are missing? The decision depends on how the data are missing (see Chapter 5). For example, if data are missing at random with respect to treatment, then this should not affect inference. If data are missing not at random, then inference, logically, must be impacted. Calculating the ANOVA, moreover, becomes a different matter. In the one-way ANOVA, no real problem arises, although setting up contrasts among the levels requires a weighting term to be factored into the calculations. For higher-level ANOVA involving two or more factors, the sums of squares for treatment effects no longer partition simply into the different sources of variation. The sources overlap, and the order in which the factors enter the statistical model now affects the calculations. Thus, while setting up the calculations for the balanced design is straightforward, perhaps surprisingly, if group sizes differ, the simple relationships for calculating the degrees of freedom, sums of squares, and mean squares no longer hold. This problem is largely solved by the general linear model.
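
To make the order dependence concrete, here is a minimal sketch in plain Python, not the general linear model machinery of any particular statistics package. It fits main-effects regression models by least squares and compares the sum of squares attributed to Factor A when A enters the model first with the sum of squares when A enters after Factor B. The helper names (`solve`, `rss`, `ss_for_a`) and both small data sets are hypothetical, and the 0/1 coding assumes each factor has only two levels.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def ss_for_a(data):
    """data: list of (a, b, y) with a and b coded 0/1.

    Returns (SS for A entered first, SS for A entered after B).
    """
    y = [obs[2] for obs in data]
    x_null = [[1.0] for _ in data]                # intercept only
    x_a = [[1.0, a] for a, _, _ in data]          # intercept + A
    x_b = [[1.0, b] for _, b, _ in data]          # intercept + B
    x_ab = [[1.0, a, b] for a, b, _ in data]      # intercept + A + B
    return rss(x_null, y) - rss(x_a, y), rss(x_b, y) - rss(x_ab, y)


# Hypothetical balanced data: two observations in every cell.
balanced = [(0, 0, 10), (0, 0, 12), (0, 1, 14), (0, 1, 16),
            (1, 0, 20), (1, 0, 22), (1, 1, 30), (1, 1, 32)]
# Hypothetical unbalanced data: cell sizes 3, 1, 1, 3.
unbalanced = [(0, 0, 10), (0, 0, 12), (0, 0, 11), (0, 1, 15),
              (1, 0, 21), (1, 1, 30), (1, 1, 32), (1, 1, 31)]

bal_first, bal_after = ss_for_a(balanced)
unb_first, unb_after = ss_for_a(unbalanced)
print("balanced:  ", bal_first, bal_after)
print("unbalanced:", unb_first, unb_after)
```

With the balanced data the two values agree (up to rounding error), so the order of entry does not matter. With the unbalanced data they differ, because the two factors are no longer orthogonal and part of the variation can be credited to either one.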

Questions

1. In two-way ANOVA, what should you always test first?
    A. The significance of Factor 1.
    B. The significance of Factor 2.
    C. The significance of the interaction between Factor 1 and Factor 2.
    D. It doesn’t matter which is tested first because you have three null hypotheses in the two-way ANOVA.
2. Why is the cell for the \(F\)-statistic empty in the within-cell error, or residual, source of variation?
3. Based on the results of a two-way ANOVA, the error sum of squares \((SSE)\) was computed to be 160. If we ignore one of the factors and perform a one-way ANOVA using the same data, will the \(SSE\) be the same as in the two-way ANOVA, or will it increase? Decrease? Explain your choice.
4. While conducting a two-way ANOVA, you conclude that no statistically significant interaction exists between Factor 1 and Factor 2. What should be your next step? Do you drop the interaction term from the model and redo the analysis, or do you report the results of the factor effects including the non-significant interaction?
