Statistics LibreTexts

Analysis of Factor Level Means and Contrasts


1 Analysis of Factor Level Means

Suppose we reject H_{0}: \mu_{1} = \cdots = \mu_{r}. Then we want to investigate the nature of the differences among the factor level means by studying the following:
  • One factor level mean: \mu_{i}
  • Difference between two factor level means: D = \mu_{i} - \mu_{j}
  • Contrast of factor level means: L = \sum_{i=1}^{r}c_{i}\mu_{i} where \sum_{i=1}^{r}c_{i} = 0

When more than one contrast is involved, we also need to consider procedures that account for multiple comparisons, including:

  • Bonferroni's procedure
  • Tukey's procedure
  • Scheffe's procedure

1.1 Inference about one factor level mean

The ith factor level sample mean \bar{Y}_{i\cdot} is a point estimator of \mu_{i}. Here are some properties of this estimator:
  • \bar{Y}_{i\cdot} is an unbiased estimator of \mu_{i}: E(\bar{Y}_{i\cdot}) = \mu_{i}
  • Var(\bar{Y}_{i\cdot}) = \frac{\sigma^{2}}{n_{i}}

MSE = SSE/(n_{T} - r) is a point estimator of \sigma^{2}:

  • it is an unbiased estimator: E(MSE) = \sigma^{2}
  • SSE/\sigma^{2} \sim \chi^{2}_{(n_{T} - r)}, and SSE is independent of \{\bar{Y}_{i\cdot}: i = 1, ..., r\}

Thus,

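  • the estimated standard error of \bar{Y}_{i\cdot} is

s(\bar{Y}_{i\cdot}) = \sqrt{\frac{MSE}{n_{i}}}

  • \frac{\bar{Y}_{i\cdot} - \mu_{i}}{\sqrt{MSE/n_{i}}} \sim t_{(n_{T} - r)}, i.e. a t-distribution with n_{T} - r degrees of freedom
  • A 100(1 - \alpha)% two-sided confidence interval of \mu_{i} is given by
    \bar{Y}_{i\cdot} \pm s(\bar{Y}_{i\cdot})t(1 - \frac{\alpha}{2}; n_{T} - r)
    where t(1 - \frac{\alpha}{2}; n_{T} - r) denotes the 1 - \alpha/2 quantile of the t-distribution with n_{T} - r degrees of freedom.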
  • Test H_{0}: \mu_{i} = c against H_{a}: \mu_{i} \neq c
    • T-statistic:

T = \frac{\bar{Y}_{i\cdot} - c}{\sqrt{MSE/n_{i}}}

    • Under H_{0}: T \sim t_{(n_{T} - r)}
    • At significance level \alpha, reject H_{0} if |T| > t(1 - \frac{\alpha}{2}; n_{T} - r)
    • Confidence interval approach: if c does not belong to the 100(1 - \alpha)% (two-sided) confidence interval of \mu_{i}, then reject H_{0} at level \alpha
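The two-sided t test for a single factor level mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name one_mean_t_test and the hypothesized value c = 12 are hypothetical, the summary numbers (sample mean 14.6, MSE = 10.55, n_{1} = 5) are those quoted in the package design example, and the quantile t(0.975; 15) = 2.131 is passed in as a table value rather than computed.

```python
import math


def one_mean_t_test(ybar, c, mse, n_i, t_crit):
    """Two-sided t test of H0: mu_i = c.

    t_crit is the quantile t(1 - alpha/2; n_T - r), supplied by the caller
    (for example from a t table).
    """
    se = math.sqrt(mse / n_i)        # s(Ybar_i.) = sqrt(MSE / n_i)
    t_stat = (ybar - c) / se         # T = (Ybar_i. - c) / s(Ybar_i.)
    reject = abs(t_stat) > t_crit    # reject H0 when |T| exceeds the quantile
    return t_stat, reject


# c = 12.0 is a hypothetical null value chosen only for illustration.
t_stat, reject = one_mean_t_test(ybar=14.6, c=12.0, mse=10.55, n_i=5, t_crit=2.131)
print(round(t_stat, 3), reject)
```

With these inputs |T| is below 2.131, so H_{0}: \mu_{1} = 12 would not be rejected at level 0.05, which agrees with 12 lying inside the 95% confidence interval for \mu_{1}.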

1.2 Example

  • In the "package design" example, the estimate of \mu_{1} is \bar{Y}_{1\cdot} = 14.6
  • MSE = 10.55 and n_{1} = 5
  • Thus s(\bar{Y}_{1\cdot}) = \sqrt{10.55/5} = 1.45258
  • The degrees of freedom of MSE is 19 - 4 = 15 (since n_{T} = 19 and r = 4)
  • The 95% confidence interval of \mu_{1} is
    14.6 \pm 1.45258 \times t(0.975; 15) = 14.6 \pm 1.45258 \times 2.131 = 14.6 \pm 3.09545 = (11.50455, 17.69545)
    Here, from Table B.2, we get that t(0.975; 15) = 2.131 (or, use the R command: \textit{qt(0.975, 15)})
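The arithmetic of this example can be checked with a short Python sketch. The variable names are illustrative, and the quantile 2.131 is taken from the text rather than computed.

```python
import math

y_bar_1 = 14.6        # sample mean for design 1 (from the example)
mse = 10.55           # mean squared error (from the example)
n_1 = 5               # sample size for design 1
t_quantile = 2.131    # t(0.975; 15), taken from Table B.2 as quoted in the text

se = math.sqrt(mse / n_1)          # estimated standard error s(Ybar_1.)
margin = t_quantile * se           # half-width of the 95% interval
ci = (y_bar_1 - margin, y_bar_1 + margin)

print(round(se, 5), round(margin, 5))
print(tuple(round(v, 5) for v in ci))
```

Small differences in the last digits relative to the hand calculation can arise because the text rounds the standard error before multiplying.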

1.3 Interpretation of confidence intervals

  • A wrong statement: P(11.51 \leq \mu_{1} \leq 17.69) = 0.95. Why? Because \mu_{1} is a fixed, non-random quantity, the statement "11.51 \leq \mu_{1} \leq 17.69" is simply either true or false; there is no probability attached to it.
  • Interpretation of C.I.: if exactly the same study on package designs were repeated many times, and each time a 95% C.I. for \mu_{1} were constructed as above, then about 95% of the time the C.I. would contain the true value \mu_{1}.
  • A correct statement: based on the observed data, we are 95% confident that \mu_{1} is between 11.51 and 17.69.
  • The difference between a random variable and its realization: \bar{Y}_{1\cdot} is a random variable; 14.6 is the realization of \bar{Y}_{1\cdot} in the current sample.
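The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below is only illustrative: the true means, the common standard deviation, the group sizes (chosen so that n_{T} = 19 and r = 4) and the function name coverage are all assumptions, not values from the study. Only the quantile t(0.975; 15) = 2.131 comes from the text.

```python
import math
import random


def coverage(n_reps=4000, seed=0):
    """Fraction of simulated studies whose 95% C.I. for mu_1 contains mu_1."""
    rng = random.Random(seed)
    true_means = [15.0, 13.0, 20.0, 27.0]   # hypothetical true factor level means
    sigma = 3.0                             # hypothetical common standard deviation
    sizes = [5, 5, 4, 5]                    # assumed group sizes, n_T = 19
    df = sum(sizes) - len(sizes)            # n_T - r = 15
    t_quantile = 2.131                      # t(0.975; 15)

    hits = 0
    for _ in range(n_reps):
        # Draw one complete study: normal errors with equal variance.
        samples = [[rng.gauss(mu, sigma) for _ in range(n)]
                   for mu, n in zip(true_means, sizes)]
        means = [sum(s) / len(s) for s in samples]
        sse = sum((y - m) ** 2 for s, m in zip(samples, means) for y in s)
        mse = sse / df
        margin = t_quantile * math.sqrt(mse / sizes[0])
        # The interval is random; the target mu_1 is fixed.
        if means[0] - margin <= true_means[0] <= means[0] + margin:
            hits += 1
    return hits / n_reps


estimated = coverage()
print(estimated)
```

The printed proportion should be close to 0.95. Each simulated interval either covers \mu_{1} or does not; the 95% refers to the long-run proportion of intervals that do.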

1.4 Difference between two factor level means

Let D = \mu_{i} - \mu_{j} for some i \neq j

  • \hat{D} = \bar{Y}_{i\cdot} - \bar{Y}_{j\cdot} is an unbiased estimator of D
  • Var(\hat{D}) = Var(\bar{Y}_{i\cdot}) + Var(\bar{Y}_{j\cdot}) = \sigma^{2}(\frac{1}{n_{i}} + \frac{1}{n_{j}}) (since \bar{Y}_{i\cdot} and \bar{Y}_{j\cdot} are independent)
  • estimated standard error of \hat{D}: s(\hat{D}) = \sqrt{MSE(1/n_{i} + 1/n_{j})}
  • for every \mu_{i} and \mu_{j}, the ratio \frac{\hat{D} - D}{s(\hat{D})} has a t_{(n_{T} - r)} distribution

1.5 Inference on the difference between two factor level means

  • 100(1 - \alpha)% (two-sided) confidence interval of D

\hat{D} \pm s(\hat{D})t(1 - \frac{\alpha}{2}; n_{T} - r)

Test H_{0}: D = 0 against H_{a}: D \neq 0. At the significance level \alpha, check whether
\hat{D} - s(\hat{D})t(1 - \frac{\alpha}{2}; n_{T} - r) \leq 0 \leq \hat{D} + s(\hat{D})t(1 - \frac{\alpha}{2}; n_{T} - r)
If not, reject H_{0} at level \alpha and conclude that the two means are different.

1.6 Example

In a study of the effectiveness of different rust inhibitors, four brands (1, 2, 3, 4) were tested. Altogether, 40 experimental units were randomly assigned to the four brands, with 10 units assigned to each brand. The resistance to rust was evaluated in a coded form after exposing the experimental units to severe conditions.

  • This is a balanced completely randomized design (CRD)

Summary statistics and ANOVA table

n_{1} = n_{2} = n_{3} = n_{4} =10 and \bar{Y}_{1\cdot} = 43.14, \bar{Y}_{2\cdot} = 89.44, \bar{Y}_{3\cdot} = 67.95, \bar{Y}_{4\cdot} = 40.47

Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS)
Between Treatments | SSTR = 15953.47 | r - 1 = 3 | MST = 5317.82
Within Treatments | SSE = 221.03 | n_{T} - r = 36 | MSE = 6.140
Total | SSTO = 16174.50 | n_{T} - 1 = 39 |
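The entries of the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the two sums of squares add up to the total. A minimal Python check, with illustrative variable names:

```python
sstr = 15953.47    # between-treatments sum of squares (from the table)
sse = 221.03       # within-treatments sum of squares (from the table)
r = 4              # number of brands
n_t = 40           # total number of experimental units

df_treatment = r - 1       # 3
df_error = n_t - r         # 36

mst = sstr / df_treatment  # between-treatments mean square
mse = sse / df_error       # within-treatments mean square, estimates sigma^2
ssto = sstr + sse          # total sum of squares

print(round(mst, 2), round(mse, 3), round(ssto, 2))
```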

95% confidence interval for D = \mu_{1} - \mu_{2}

We compute

\hat{D} = \bar{Y}_{1\cdot} - \bar{Y}_{2\cdot} = 43.14 - 89.44 = -46.3

s(\hat{D}) = \sqrt{MSE(1/n_{1} + 1/n_{2})} = \sqrt{6.140(1/10 + 1/10)} = 1.108152

Also, since \alpha = 1 - 0.95 = 0.05, we have t(1-\alpha/2; n_{T} - r) = t(0.975; 36) = 2.028094 (use R command: \textit{qt(0.975, 36)}; or use Table B.2 and approximate the value by averaging the value of the 0.975-th quantile of t - distribution with degrees of freedom v = 30 and 40).
Therefore, the 95% confidence interval for D = \mu_{1} - \mu_{2} is

-46.3 \pm 1.108152 \times 2.028094 = -46.3 \pm 2.247436 = (-48.54744, -44.05256)
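The same calculation can be written as a small reusable Python function. This is a sketch only: the function name diff_ci is hypothetical, and the quantile t(0.975; 36) = 2.028094 is taken from the text rather than computed.

```python
import math


def diff_ci(ybar_i, ybar_j, mse, n_i, n_j, t_quantile):
    """Point estimate, standard error and two-sided C.I. for D = mu_i - mu_j."""
    d_hat = ybar_i - ybar_j
    se = math.sqrt(mse * (1 / n_i + 1 / n_j))   # s(D-hat)
    margin = t_quantile * se
    return d_hat, se, (d_hat - margin, d_hat + margin)


# Rust inhibitor example: brands 1 and 2, MSE = 6.140, 10 units per brand.
d_hat, se, ci = diff_ci(43.14, 89.44, 6.140, 10, 10, t_quantile=2.028094)

# Test of H0: D = 0 at level 0.05 via the confidence interval approach.
reject_h0 = not (ci[0] <= 0 <= ci[1])

print(round(d_hat, 2), round(se, 6))
print(tuple(round(v, 5) for v in ci), reject_h0)
```

Because zero lies far outside the interval, H_{0}: \mu_{1} = \mu_{2} is rejected at level 0.05.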

2 Contrasts

A contrast is a linear combination of the factor level means: L = \sum_{i = 1}^{r}c_{i}\mu_{i} where c_{i}'s are prespecified constants with the constraint: \sum_{i=1}^{r}c_{i} = 0.

  • Examples:

- Pairwise comparisons: \mu_{i} - \mu_{j}

- \frac{\mu_{1} + \mu_{2}}{2} - \mu_{3}

  • Unbiased estimator:

\hat{L} = \sum_{i = 1}^{r}c_{i}\bar{Y}_{i\cdot}

  • Estimated standard error:

s(\hat{L}) = \sqrt{MSE\sum_{i = 1}^{r}\frac{c^{2}_{i}}{n_{i}}}
since Var(\hat{L}) = \sum_{i = 1}^{r}\sigma^{2}c^{2}_{i}/n_{i}.
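For completeness, the unbiasedness and variance of \hat{L} follow in one line each from the properties of the factor level sample means stated in Section 1.1 (unbiasedness, Var(\bar{Y}_{i\cdot}) = \sigma^{2}/n_{i}) together with the independence of the samples from different factor levels. A sketch of the derivation:

```latex
\begin{align*}
E(\hat{L}) &= \sum_{i=1}^{r} c_{i}\, E(\bar{Y}_{i\cdot})
            = \sum_{i=1}^{r} c_{i}\mu_{i} = L, \\
\operatorname{Var}(\hat{L}) &= \sum_{i=1}^{r} c_{i}^{2}\, \operatorname{Var}(\bar{Y}_{i\cdot})
            = \sigma^{2}\sum_{i=1}^{r}\frac{c_{i}^{2}}{n_{i}}.
\end{align*}
```

Replacing \sigma^{2} by its estimator MSE and taking the square root gives s(\hat{L}).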

2.1 Example of a contrast for the package design problem

Suppose, designs one and two are 3-color designs, while designs three and four are 5-color designs. The goal is to compare 3-color designs to 5-color designs in terms of sales.

  • Consider the contrast: L = \frac{\mu_{1} + \mu_{2}}{2} - \frac{\mu_{3} + \mu_{4}}{2}
  • An unbiased point estimate of L is

\hat{L} = \frac{\bar{Y}_{1\cdot} + \bar{Y}_{2\cdot}}{2} - \frac{\bar{Y}_{3\cdot} + \bar{Y}_{4\cdot}}{2}
= \frac{14.6 + 13.4}{2} - \frac{19.5 + 27.2}{2} = -9.35

  • c_{1} = c_{2} = 0.5, c_{3} = c_{4} = -0.5 (note that, they add up to zero), so

s(\hat{L}) = \sqrt{MSE\sum_{i =1}^{r}\frac{c^{2}_{i}}{n_{i}}}
= \sqrt{10.55 \times (\frac{(0.5)^{2}}{5} + \frac{(0.5)^{2}}{5} + \frac{(-0.5)^{2}}{4} + \frac{(-0.5)^{2}}{5})}
= \sqrt{10.55 \times 0.2125} = 1.5

  • A 90% C.I. for L is

\hat{L} \pm t(0.95; 15) \times s(\hat{L})
= -9.35 \pm 1.5 \times 1.753 = [-11.98, -6.72]

  • Since the 90% C.I. for L does not contain zero (it lies entirely below zero), we are 90% confident that 5-color designs work better than 3-color designs.
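The contrast calculation can be sketched as a general Python function. Several things here are assumptions: the function name contrast_ci is hypothetical, and the group sizes (5, 5, 4, 5) are inferred from n_{T} = 19 and the sum 0.2125 used above. Which design has only four observations does not change the result, since all coefficients have the same absolute value. The quantile t(0.95; 15) = 1.753 is taken from the text.

```python
import math


def contrast_ci(coefs, means, sizes, mse, t_quantile):
    """Estimate, standard error and two-sided C.I. for L = sum_i c_i * mu_i."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    l_hat = sum(c * m for c, m in zip(coefs, means))
    se = math.sqrt(mse * sum(c ** 2 / n for c, n in zip(coefs, sizes)))
    margin = t_quantile * se
    return l_hat, se, (l_hat - margin, l_hat + margin)


l_hat, se, ci = contrast_ci(
    coefs=[0.5, 0.5, -0.5, -0.5],
    means=[14.6, 13.4, 19.5, 27.2],
    sizes=[5, 5, 4, 5],          # assumed group sizes, see the note above
    mse=10.55,
    t_quantile=1.753,            # t(0.95; 15)
)
print(round(l_hat, 2), round(se, 4))
print(tuple(round(v, 2) for v in ci))
```

The endpoints differ slightly in the second decimal from the hand calculation because the text rounds s(\hat{L}) to 1.5 before multiplying.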

Contributors

  • Joy Wei, Debashis Paul


This page titled Analysis of Factor Level Means and Contrasts is shared under a not declared license and was authored, remixed, and/or curated by Debashis Paul.
