Two-Factor ANOVA model with n = 1 (no replication)
1. Two-factor ANOVA model with n = 1 (no replication)
- For some studies there is only one replicate per treatment, i.e., \(n = 1\).
- The ANOVA model for two-factor studies needs to be modified, since the degrees of freedom associated with \(SSE\) would be \((n - 1)ab = 0\); thus the error variance \(\sigma^2\) can no longer be estimated by \(SSE\).
- Idea: simplify the model by assuming that the two factors do not interact with each other. The validity of this assumption needs to be checked.
1.1 Two-factor model without interaction
Consider a two-factor study with only \(n = 1\) observation per treatment.
- Model equation:
\[Y_{ij} = \mu_{..} + \alpha_i + \beta_j + \epsilon_{ij}, i = 1, ..., a, j = 1, ..., b.\]
- Identifiability constraints:
\[\sum_{i=1}^{a}\alpha_i = 0, \sum_{j=1}^{b}\beta_j = 0.\]
- Distributional assumptions: \(\epsilon_{ij}\) are i.i.d. \(N(0,\sigma^2)\)
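To make the setup concrete, the following sketch (the notes themselves contain no code; Python with NumPy is assumed here, and all parameter values are hypothetical) generates one \(a \times b\) table of observations from this additive model.

```python
# A minimal sketch (hypothetical parameter values) of generating one a x b table
# of observations from the additive model Y_ij = mu + alpha_i + beta_j + eps_ij.
import numpy as np

rng = np.random.default_rng(0)
a, b, sigma = 3, 2, 5.0                      # illustrative dimensions and error SD
mu = 100.0
alpha = np.array([-10.0, 2.0, 8.0])          # sums to zero (identifiability constraint)
beta = np.array([-4.0, 4.0])                 # sums to zero

# One observation per (i, j) cell, i.e., n = 1 (no replication).
Y = mu + alpha[:, None] + beta[None, :] + rng.normal(0.0, sigma, size=(a, b))
print(Y)
```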
Sum of squares
The interaction sum of squares now plays the role of the error sum of squares.
\[SSAB = n\sum_{i=1}^{a} \sum_{j=1}^{b}(\overline{Y}_{ij.} - \overline{Y}_{i..} - \overline{Y}_{.j.} + \overline{Y}_{...})^2 = \sum_{i=1}^{a} \sum_{j=1}^{b}(\overline{Y}_{ij} - \overline{Y}_{i.} - \overline{Y}_{.j} + \overline{Y}_{..})^2 \]
\(MSAB = \frac{SSAB}{(a-1)(b-1)}\) since \(d.f.(SSAB) = (a-1)(b-1)\).
- In the general two-factor ANOVA model (i.e., with interaction terms) and \(n = 1\),
\[E(MSAB) = \sigma^2 + \frac{\sum_{i=1}^{a}\sum_{j=1}^{b} (\alpha\beta)^2_{ij}}{(a-1)(b-1)}\]
- Under the model without interaction: \(E(MSAB) = \sigma^2\)
- Thus \(MSAB\) can be used to estimate \(\sigma^2\).
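As a minimal sketch (Python/NumPy assumed, as above; the function name is illustrative), these sums of squares can be computed directly from the \(a \times b\) data matrix:

```python
# A minimal sketch of the sums of squares for an a x b data matrix Y with n = 1
# (rows index the levels of factor A, columns the levels of factor B).
import numpy as np

def two_factor_no_rep_ss(Y):
    a, b = Y.shape
    grand = Y.mean()                # Ybar_{..}
    row = Y.mean(axis=1)            # Ybar_{i.}
    col = Y.mean(axis=0)            # Ybar_{.j}
    SSA = b * np.sum((row - grand) ** 2)
    SSB = a * np.sum((col - grand) ** 2)
    resid = Y - row[:, None] - col[None, :] + grand
    SSAB = np.sum(resid ** 2)       # plays the role of the error sum of squares
    MSAB = SSAB / ((a - 1) * (b - 1))
    return SSA, SSB, SSAB, MSAB
```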
ANOVA Table
ANOVA table for two-factor model without interaction and \(n=1\)
| Source of Variation | SS | df | MS |
|---|---|---|---|
| Factor A | \(SSA = b\sum_i(\overline{Y}_{i.} - \overline{Y}_{..})^2\) | \(a - 1\) | \(MSA\) |
| Factor B | \(SSB = a\sum_j(\overline{Y}_{.j} - \overline{Y}_{..})^2\) | \(b - 1\) | \(MSB\) |
| Error | \(SSAB = \sum_{i=1}^{a}\sum_{j=1}^{b}(\overline{Y}_{ij} - \overline{Y}_{i.} - \overline{Y}_{.j} + \overline{Y}_{..})^2\) | \((a - 1)(b - 1)\) | \(MSAB\) |
| Total | \(SSTO = \sum_{i=1}^{a}\sum_{j=1}^{b}(\overline{Y}_{ij} - \overline{Y}_{..})^2\) | \(ab - 1\) | |
Expected mean squares (under no interaction):
\[E(MSA) = \sigma^2 + \frac{b\sum_{i=1}^{a}\alpha_i^2}{a - 1}, E(MSB) = \sigma^2 + \frac{a\sum_{j=1}^{b}\beta_j^2}{b - 1}, E(MSAB) = \sigma^2\]
F tests (for main effects)
Test factor A main effects: \(H_o: \alpha_1 = ... = \alpha_a = 0\) vs. \(H_a:\) not all \(\alpha_i\)'s are equal to zero.
- \(F_A^* = \frac{MSA}{MSAB} \sim F_{a - 1, (a - 1)(b - 1)}\) under \(H_o\).
- Reject \(H_o\) at level of significance \(\alpha\) if observed \(F_A^* > F(1 - \alpha; a - 1, (a - 1)(b - 1))\).
Test factor B main effects: \(H_o: \beta_1 = ... = \beta_b = 0\) vs. \(H_a:\) not all \(\beta_j\)'s are equal to zero.
- \(F_B^* = \frac{MSB}{MSAB} \sim F_{b - 1, (a - 1)(b - 1)}\) under \(H_o\).
- Reject \(H_o\) at level of significance \(\alpha\) if observed \(F_B^* > F(1 - \alpha; b - 1, (a - 1)(b - 1))\).
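A sketch of the two main-effect F tests under the same assumed Python setup (SciPy is assumed for the F distribution; function and variable names are illustrative):

```python
# A minimal sketch of the two main-effect F tests for an a x b data matrix Y with n = 1.
import numpy as np
from scipy import stats

def main_effect_tests(Y):
    a, b = Y.shape
    grand, row, col = Y.mean(), Y.mean(axis=1), Y.mean(axis=0)
    MSA = b * np.sum((row - grand) ** 2) / (a - 1)
    MSB = a * np.sum((col - grand) ** 2) / (b - 1)
    dfe = (a - 1) * (b - 1)
    MSAB = np.sum((Y - row[:, None] - col[None, :] + grand) ** 2) / dfe
    FA, FB = MSA / MSAB, MSB / MSAB
    # p-values from the F distributions with the degrees of freedom above
    return (FA, stats.f.sf(FA, a - 1, dfe)), (FB, stats.f.sf(FB, b - 1, dfe))
```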
Estimation of means
Estimation of factor level means \(\mu_{i.}\)'s , \(\mu_{.j}\)'s.
- Proceed as before, viz., use the unbiased estimator \(\overline{Y}_{i.}\) for \(\mu_{i.}\) and \(\overline{Y}_{.j}\) for \(\mu_{.j}\), but replace \(MSE\) by \(MSAB\) and use the degrees of freedom of \(MSAB\), that is \((a - 1)(b - 1)\). Thus, estimated standard errors:
\[s(\overline{Y}_{i.}) = \sqrt{\frac{MSAB}{b}}, s(\overline{Y}_{.j}) = \sqrt{\frac{MSAB}{a}}.\]
Estimation of treatment means \(\mu_{ij}\)'s.
- \(\mu_{ij} = E(Y_{ij}) = \mu_{..} + \alpha_i + \beta_j = \mu_{i.} + \mu_{.j} - \mu_{..}\)
Thus, an unbiased estimator: \(\widehat{\mu}_{ij} = \overline{Y}_{i.} + \overline{Y}_{.j} - \overline{Y}_{..}\)
Estimated standard error:
\[s(\widehat{\mu}_{ij}) = \sqrt{MSAB(\frac{1}{b} + \frac{1}{a} - \frac{1}{ab})} = \sqrt{MSAB(\frac{a + b - 1}{ab})}\]
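These point estimates and standard errors can be computed as in the following sketch (same assumed Python setup; names are illustrative):

```python
# A minimal sketch of the point estimates and standard errors in this subsection.
import numpy as np

def estimated_means(Y, MSAB):
    """Y is the a x b data matrix; MSAB is the error mean square defined above."""
    a, b = Y.shape
    grand, row, col = Y.mean(), Y.mean(axis=1), Y.mean(axis=0)
    mu_hat = row[:, None] + col[None, :] - grand        # estimates of mu_{ij}
    se_row = np.sqrt(MSAB / b)                          # s(Ybar_{i.})
    se_col = np.sqrt(MSAB / a)                          # s(Ybar_{.j})
    se_cell = np.sqrt(MSAB * (a + b - 1) / (a * b))     # s(muhat_{ij})
    return mu_hat, se_row, se_col, se_cell
```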
1.2 Example: Insurance
An analyst studied the premium for auto insurance charged by an insurance company in six cities. The six cities were selected to represent different sizes (Factor A: small, medium, large) and different regions of the state (Factor B: east, west). There is only one city for each combination of size and region. The amounts of premiums charged for a specific type of coverage in a given risk category for each of the six cities are given in the following table.
Table 1: Premium amounts; numbers in parentheses are \(\widehat{\mu}_{ij} = \overline{Y}_{i.} + \overline{Y}_{.j} - \overline{Y}_{..}\)
| Factor A | Factor B: East | Factor B: West | Row mean |
|---|---|---|---|
| Small | 140 (135) | 100 (105) | \(\overline{Y}_{1.} = 120\) |
| Medium | 210 (210) | 180 (180) | \(\overline{Y}_{2.} = 195\) |
| Large | 220 (225) | 200 (195) | \(\overline{Y}_{3.} = 210\) |
| Column mean | \(\overline{Y}_{.1} = 190\) | \(\overline{Y}_{.2} = 160\) | \(\overline{Y}_{..} = 175\) |
An interaction plot based on the treatment sample means \(Y_{ij}\) shows no strong interaction.
Sum of squares:
- Here \(a = 3\), \(b = 2\), \(n = 1\).
- \(SSA = 2[(120 - 175)^2 + (195 - 175)^2 + (210 - 175)^2] = 9300\).
- \(SSB = 3[(190 - 175)^2 + (160 - 175)^2] = 1350\).
- \(SSAB = (140 - 120 - 190 + 175)^2 + ... + (200 - 210 - 160 + 175)^2 = 100\).
- \(SSTO = SSA + SSB + SSAB = 10750\).
Hypothesis testing:
- Test \(H_o: \mu_{1.} = \mu_{2.} = \mu_{3.}\) (equivalently, \(H_o: \alpha_1 = \alpha_2 = \alpha_3 = 0\)) at level 0.05.
Table 2: ANOVA table for the insurance example
| Source of Variation | SS | df | MS |
|---|---|---|---|
| Factor A | \(SSA = 9300\) | \(a - 1 = 2\) | \(MSA = 4650\) |
| Factor B | \(SSB = 1350\) | \(b - 1 = 1\) | \(MSB = 1350\) |
| Error | \(SSAB = 100\) | \((a - 1)(b - 1) = 2\) | \(MSAB = 50\) |
| Total | \(SSTO = 10750\) | \(ab - 1 = 5\) | |
\(F_A^* = \frac{MSA}{MSAB} = \frac{4650}{50} = 93\) and \(F(0.95; 2, 2) = 19\). Thus reject \(H_o\) at level 0.05.
- Estimation of \(\mu_{ij}\): e.g., \(\widehat{\mu}_{11} = \overline{Y}_{1.} + \overline{Y}_{.1} - \overline{Y}_{..} = 120 + 190 - 175 = 135\).
- Estimation of \(\mu_{i.}\) and \(\mu_{.j}\): e.g.,
\(\widehat{\mu}_{1.} = \overline{Y}_{1.} = 120\).
\(s(\overline{Y}_{1.}) = \sqrt{\frac{MSAB}{b}} = \sqrt{\frac{50}{2}} = 5\).
The 95% C.I. for \(\mu_{1.}\) is:
\[\overline{Y}_{1.} \pm t(0.975; 2) * s(\overline{Y}_{1.}) = 120 \pm 4.3*5 = (98.5, 141.5).\]
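The numbers in this example can be verified numerically, as in the following sketch (Python with NumPy/SciPy assumed; the data matrix is taken from Table 1):

```python
# A numerical check of the insurance example (data from Table 1;
# rows = small/medium/large cities, columns = east/west regions).
import numpy as np
from scipy import stats

Y = np.array([[140.0, 100.0],
              [210.0, 180.0],
              [220.0, 200.0]])
a, b = Y.shape
grand, row, col = Y.mean(), Y.mean(axis=1), Y.mean(axis=0)
SSA = b * np.sum((row - grand) ** 2)                            # 9300
SSB = a * np.sum((col - grand) ** 2)                            # 1350
SSAB = np.sum((Y - row[:, None] - col[None, :] + grand) ** 2)   # 100
MSAB = SSAB / ((a - 1) * (b - 1))                               # 50
FA = (SSA / (a - 1)) / MSAB                                     # 93
print(FA, stats.f.ppf(0.95, a - 1, (a - 1) * (b - 1)))          # 93.0, 19.0 -> reject H_o
# 95% confidence interval for mu_{1.}
ci = row[0] + np.array([-1.0, 1.0]) * stats.t.ppf(0.975, 2) * np.sqrt(MSAB / b)
print(ci)                                                       # roughly (98.5, 141.5)
```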
1.3 Checking for the presence of interaction: Tukey's test for additivity
For a two-factor study with \(n = 1\), we need to decide whether or not the two factors interact.
- In the no-interaction model, we assume that all \((\alpha\beta)_{ij} = 0\).
- Idea: use a less severe restriction on the interaction effects, by assuming
\[(\alpha\beta)_{ij} = D\alpha_i\beta_j, i = 1, ... , a, j = 1, ... , b,\]
where \(D\) is an unknown parameter.
- The model becomes:
\[Y_{ij} = \mu_{..} + \alpha_i + \beta_j + D\alpha_i\beta_j + \epsilon_{ij}, i = 1, ... , a, j = 1, ..., b,\]
under the constraints that
\[\sum_{i = 1}^{a}\alpha_i = \sum_{j = 1}^{b}\beta_j = 0.\]
Estimation of \(D\)
- Multiply both sides of the model equation by \(\alpha_i\beta_j\):
\[\alpha_i\beta_jY_{ij} = \mu_{..}\alpha_i\beta_j + \alpha_i^2\beta_j + \alpha_i\beta_j^2 + D\alpha_i^2\beta_j^2 + \epsilon_{ij}\alpha_i\beta_j\]
- Sum over all pairs \((i, j)\); the first three terms on the right vanish because of the identifiability constraints:
\[\sum_{i=1}^{a}\sum_{j=1}^{b}\alpha_i\beta_jY_{ij}=D\sum_{i=1}^{a}\sum_{j=1}^{b}\alpha_i^2\beta_j^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\epsilon_{ij}\alpha_i\beta_j\]
- Ignoring the error term (which has mean zero) and noting that \(\sum_{i,j}\alpha_i^2\beta_j^2 = (\sum_{i}\alpha_i^2)(\sum_{j}\beta_j^2)\), we obtain
\[\widetilde{D} := \frac{\sum_{i=1}^{a}\sum_{j=1}^{b}\alpha_i\beta_jY_{ij}}{(\sum_{i=1}^{a}\alpha_i^2)(\sum_{j=1}^{b}\beta_j^2)} \approx D\]
- We have the following estimates of the main effects:
\[\widehat{\alpha}_i = \overline{Y}_{i.} - \overline{Y}_{..}, \quad \widehat{\beta}_j = \overline{Y}_{.j} - \overline{Y}_{..}\]
- Thus, an estimator of \(D\) (which is also the least squares and the maximum likelihood estimator) is given by
\[\widehat{D} = \frac{\sum_{i=1}^{a}\sum_{j=1}^b ( \overline{Y}_{i.} - \overline{Y}_{..})( \overline{Y}_{.j} - \overline{Y}_{..})Y_{ij}}{(\sum_{i=1}^{a}( \overline{Y}_{i.} - \overline{Y}_{..})^2)(\sum_{j=1}^{b}( \overline{Y}_{.j} - \overline{Y}_{..})^2)}.\]
ANOVA decomposition
\[SSTO = SSA + SSB + SSAB^* + SSRem^*.\]
- Interaction sum of squares:
\[SSAB^* = \sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{D}^2\widehat{\alpha}_i^2\widehat{\beta}_j^2 = \frac{(\sum_{i=1}^{a}\sum_{j=1}^b ( \overline{Y}_{i.} - \overline{Y}_{..})( \overline{Y}_{.j} - \overline{Y}_{..})Y_{ij})^2}{(\sum_{i=1}^{a}( \overline{Y}_{i.} - \overline{Y}_{..})^2)(\sum_{j=1}^{b}( \overline{Y}_{.j} - \overline{Y}_{..})^2)}\]
- Remainder sum of squares:
\[SSRem^* = SSTO - SSA - SSB - SSAB^*\]
- Decomposition of degrees of freedom:
\[df(SSTO) = df(SSA) + df(SSB) + df(SSAB^*) + df(SSRem^*)\]
\[ab - 1 = (a - 1) + (b - 1) + 1 + (ab - a - b)\]
- Tukey's one degree of freedom test for additivity: \(H_o: D = 0\) (i.e., no interaction) vs. \(H_a: D \neq 0\).
- \(F\) ratio \(F_{Tukey}^{*} = \frac{SSAB^*/1}{SSRem^*/(ab - a - b)}\sim F_{1, ab - a - b}\) under \(H_o\).
- Decision rule: reject \(H_o: D = 0\) at level of significance \(\alpha\) if \(F_{Tukey}^{*} > F(1 - \alpha; 1, ab - a - b)\).
Example: Insurance
- \(\sum_{ij}(\overline{Y}_{i.} - \overline{Y}_{..})( \overline{Y}_{.j} - \overline{Y}_{..})Y_{ij} = -13500.\)
- \(\sum_{i=1}^{a}( \overline{Y}_{i.} - \overline{Y}_{..})^2 = 4650\), and \(\sum_{j=1}^{b}( \overline{Y}_{.j} - \overline{Y}_{..})^2 = 450.\)
- \(SSAB^* = \frac{(-13500)^2}{4650 * 450} = 87.1.\)
- \(SSRem^* = 10750 - 9300 - 1350 - 87.1 = 12.9.\)
- \(ab - a - b = 3*2 - 3 - 2 = 1.\)
- \(F\)-ratio for Tukey's test:
\[F_{Tukey}^{*} = \frac{SSAB^*/1}{SSRem^*/1} = \frac{87.1}{12.9} = 6.8.\]
- When \(\alpha = 0.05\), \(F(0.95; 1, 1) = 161.4 > 6.8\).
- Thus, we cannot reject \(H_o: D = 0\) at the 0.05 level, and we conclude that there is no significant interaction between the two factors.
- Indeed, the p-value is \(p = P(F_{1,1} > 6.8) = 0.23\), which is not at all significant.
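A numerical check of Tukey's test for the insurance data, following the formulas above (same assumed Python setup; the data matrix is from Table 1):

```python
# A numerical check of Tukey's one-degree-of-freedom test for the insurance data.
import numpy as np
from scipy import stats

Y = np.array([[140.0, 100.0],
              [210.0, 180.0],
              [220.0, 200.0]])
a, b = Y.shape
grand, row, col = Y.mean(), Y.mean(axis=1), Y.mean(axis=0)
alpha_hat, beta_hat = row - grand, col - grand             # estimated main effects

num = np.sum(alpha_hat[:, None] * beta_hat[None, :] * Y)   # -13500
denom = np.sum(alpha_hat ** 2) * np.sum(beta_hat ** 2)     # 4650 * 450
D_hat = num / denom
SSAB_star = num ** 2 / denom                               # about 87.1
SSTO = np.sum((Y - grand) ** 2)                            # 10750
SSA = b * np.sum(alpha_hat ** 2)                           # 9300
SSB = a * np.sum(beta_hat ** 2)                            # 1350
SSRem_star = SSTO - SSA - SSB - SSAB_star                  # about 12.9
df_rem = a * b - a - b                                     # 1
F_tukey = (SSAB_star / 1) / (SSRem_star / df_rem)          # about 6.8
p_value = stats.f.sf(F_tukey, 1, df_rem)                   # about 0.23 -> cannot reject D = 0
print(D_hat, F_tukey, p_value)
```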
Contributors
- Scott Brunstein (UCD)
- Debashis Paul (UCD)