# Experimental Design and Introduction to Analysis of Variance (LN 3)



## An overview of experimental designs

1. **Completely randomized design (CRD)**: treatments (combinations of the factor levels of the different factors) are randomly assigned to the experimental units.

*Examples*: one-factor, two-factor, and multi-factor studies (factorial designs).

2. **Crossed factor design**: in a multi-factor study, the factors are crossed if all combinations of the levels of all the factors are included in the study.

*Example*: in a study of the effects of two factors -- temperature (at three levels) and concentration of a solvent (at two levels) -- on the yield of a chemical process, all combinations of temperature and solvent concentration are considered.

| Solvent conc. \ Temperature | low | medium | high |
|---|---|---|---|
| low | x | x | x |
| high | x | x | x |

Table 1: Chemical yield study: Crossed factor design

3. **Nested design**: one factor is nested within another factor in a multi-factor study.

*Example*: in a study of the effects of operators on a production process, three manufacturing plants and six operators are considered. However, the first two operators operate in plant 1, the next two in plant 2, and the last two in plant 3. Here, operators are said to be nested within manufacturing plants.

| Plant \ Operator | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | x | x | | | | |
| 2 | | | x | x | | |
| 3 | | | | | x | x |

Table 2: Production study: Nested design

4. **Repeated measurement design**: the same experimental unit receives all the treatment combinations. This helps eliminate the effects of confounding factors associated with the experimental units.

*Example*: a taste-testing experiment where each tester rates all three brands of breakfast cereal being tested.

5. **Randomized complete block design (RCBD)**: every treatment appears with every other treatment in the same block the same number of times, every block receives the full suite of treatments (possibly with replications), and the treatments are randomly assigned to the experimental units within each block. Used when the block sizes are integer multiples of the number of treatments.

6. **Balanced incomplete block design (BIBD)**: every treatment appears with every other treatment in the same block the same number of times, but not every block receives the full suite of treatments. Used when the block sizes are smaller than the number of treatments.

7. **Fractional factorial design, response surface experiments, etc.**

## An overview of observational studies

1. **Cross-sectional studies**: measurements are taken from one or more populations or subpopulations at a single time point or time interval, and exposure to a potential causal factor and the response are determined simultaneously.

*Example*: a study of incomes by gender in the decade 1981-1990, stratified by geographical location.

2. **Cohort study (prospective)**: one or more groups are formed in a nonrandom manner according to some hypothesized causal factors, and then these groups are observed over time with respect to the outcome of interest. *(What is going to happen?)*

*Example*: a group of smokers and a group of non-smokers were followed from their early 30's, and their health indices were recorded over the years until their late 60's.

3. **Case-control study (retrospective)**: groups are defined on the basis of an observed outcome, and the differences among the groups at an earlier time point are identified as potential causal effects. *(What has happened?)*

*Example*: a study of lung cancer (outcome) and smoking (explanatory factor) based on data collected on lung cancer patients.

## A single factor study

A food company wanted to test four different package designs for a new breakfast cereal. Twenty stores with approximately the same sales conditions (such as sales volume, price, etc.) were selected as experimental units. Five stores were randomly assigned to each of the 4 package designs.

- A balanced completely randomized design.
- A single, 4-level, qualitative factor: package design;
- A quantitative response variable: sales -- the number of packets of cereal sold during the period of study;
- Goal: exploring the relationship between package design and sales.
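The randomization step of such a balanced CRD can be sketched in a few lines of code. This is a minimal illustration, not part of the original study; the store labels A-T and design names D1-D4 follow the example, while the random seed is an arbitrary choice for reproducibility.

```python
# Sketch of the randomization for a balanced CRD: 20 stores (A..T)
# randomly split into 4 groups of 5, one group per package design.
import random

stores = [chr(ord("A") + k) for k in range(20)]   # stores A..T
designs = ["D1", "D2", "D3", "D4"]

random.seed(0)            # arbitrary seed, for reproducibility only
random.shuffle(stores)    # random ordering of the experimental units

# assign 5 consecutive shuffled stores to each design
assignment = {d: sorted(stores[5 * i:5 * (i + 1)])
              for i, d in enumerate(designs)}
for d in designs:
    print(d, assignment[d])
```

Shuffling once and slicing into equal groups guarantees the design stays balanced (every treatment gets exactly 5 units) while keeping the assignment random.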

### Data

The assignment of the different stores (indicated by letters A to T) to the package designs (D1 to D4) is given in the following table.

| Design \ Store | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|
| D1 | A | B | C | D | E |
| D2 | F | G | H | I | J |
| D3 | K | L | M | N | O |
| D4 | P | Q | R | S | T |

The observed sales are given below; one observation for Design 3 is missing.

| Design \ Store | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|
| D1 | 11 | 17 | 16 | 14 | 15 |
| D2 | 12 | 10 | 15 | 19 | 11 |
| D3 | 23 | 20 | 18 | 17 | Miss |
| D4 | 27 | 33 | 22 | 26 | 28 |

### ANOVA for single factor study

A simple statistical model for the data is as follows: $$ \large Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad j=1,\ldots,n_i; ~~i=1,\ldots,r ; $$ where:

- \(r\) is the number of factor levels (treatments) and \(n_i\) is the number of experimental units corresponding to the \(i\)-th factor level;
- \(Y_{ij}\) is the measurement for the \(j\)-th experimental unit corresponding to the \(i\)-th factor level;
- \(\mu_i\) is the mean of all the measurements corresponding to the \(i\)-th factor level (unknown);
- \(\varepsilon_{ij}\)'s are random errors (unobserved).

### Model Assumptions

The following assumptions are made about the above model:

- \(\varepsilon_{ij}\) are independently and identically distributed as \(N(0,\sigma^2)\).
- \(\mu_i\)'s are unknown fixed parameters (so-called *fixed effects*), so that \(\mathbb{E}(Y_{ij}) = \mu_i\) and \(\mbox{Var}(Y_{ij}) = \sigma^2\). The above assumption is thus equivalent to assuming that the \(Y_{ij}\) are independently distributed as \(N(\mu_i,\sigma^2)\).

*The assumptions that the distributions are normal and the variances are equal play a crucial role in determining whether the means corresponding to the different factor levels are the same.*
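One way to build intuition for this model is to simulate from it. The sketch below draws \(Y_{ij} \sim N(\mu_i, \sigma^2)\) independently for each level; the values of \(\mu_i\), \(\sigma\), and \(n_i\) are illustrative choices, not quantities from the study.

```python
# Minimal simulation from the one-way fixed-effects model
# Y_ij = mu_i + eps_ij, with eps_ij i.i.d. N(0, sigma^2).
# The means, sigma, and group size here are purely illustrative.
import random

mu = [14.6, 13.4, 19.5, 27.2]   # hypothetical factor-level means
sigma = 3.0                     # common error standard deviation
n_i = 5                         # replicates per level (balanced case)

random.seed(1)                  # arbitrary seed for reproducibility
Y = [[random.gauss(m, sigma) for _ in range(n_i)] for m in mu]
for i, row in enumerate(Y, start=1):
    print(f"level {i}: " + ", ".join(f"{y:5.1f}" for y in row))
```

Repeating the simulation with different seeds shows how much the group sample means fluctuate around the true \(\mu_i\) purely from sampling variation.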

### Interpretations

- Factor level means (\(\mu_i\)): in an experimental study, the factor level mean \(\mu_i\) stands for the mean response that would be obtained if the \(i\)-th factor level were applied to the entire population from which the experimental units were sampled.
- Residual variance (\(\sigma^2\)) : refers to the variability among the responses if any given treatment were applied to the entire population.

### Steps in the analysis of factor level means

1. Determine whether or not the factor level means are all the same: \( \large \mu_1=\cdots=\mu_r \).

*What does \(\mu_1=\cdots=\mu_r\) mean?* The factor has no effect on the distribution of the response variable. *How do we evaluate the evidence for the statement \(\mu_1=\cdots=\mu_r\) based on observed data?*

2. If the factor level means do differ, examine how they differ and what the implications of these differences are (Chapter 17).

In the example, we want to answer whether there is any effect of package design on sales. The first step is to obtain estimates of the factor level means.

### Estimation of \(\mu_i\)

Define the sample mean for the \(i\)-th factor level: $$ \large \overline{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} = \frac{1}{n_i} Y_{i\cdot} $$ where \(Y_{i\cdot} = \sum_{j=1}^{n_i} Y_{ij}\) is the sum of responses for the \(i\)-th treatment group, for \(i=1,\ldots,r\); and the overall sample mean: $$ \large \overline{Y}_{\cdot\cdot} = \frac{1}{\sum_{i=1}^r n_i} \sum_{i=1}^r \sum_{j=1}^{n_i} Y_{ij} = \frac{1}{\sum_{i=1}^r n_i} \sum_{i=1}^r n_i \overline{Y}_{i\cdot} = \frac{1}{\sum_{i=1}^r n_i} Y_{\cdot\cdot}~. $$ Then \( \overline{Y}_{i\cdot}\) is an estimate of \( \mu_i\) for each \(i=1,\ldots,r\). Under the assumptions, \(\overline{Y}_{i\cdot}\) is an *unbiased estimator* of \(\mu_i\) since $$ \large \mathbb{E}(\overline{Y}_{i\cdot}) = \frac{1}{n_i} \sum_{j=1}^{n_i} \mathbb{E}(Y_{ij}) = \frac{1}{n_i} \sum_{j=1}^{n_i} \mu_i = \mu_i. $$

| Design \ Store | S1 | S2 | S3 | S4 | S5 | Total (\(Y_{i\cdot}\)) | Mean (\(\overline{Y}_{i\cdot}\)) | \(n_i\) |
|---|---|---|---|---|---|---|---|---|
| D1 | 11 | 17 | 16 | 14 | 15 | 73 | 14.6 | 5 |
| D2 | 12 | 10 | 15 | 19 | 11 | 67 | 13.4 | 5 |
| D3 | 23 | 20 | 18 | 17 | Miss | 78 | 19.5 | 4 |
| D4 | 27 | 33 | 22 | 26 | 28 | 136 | 27.2 | 5 |
| Total | | | | | | \(Y_{\cdot\cdot} = 354\) | \(\overline{Y}_{\cdot\cdot} = 18.63\) | 19 |
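The group totals, group means, and overall mean can be reproduced with a short computation. The values below are transcribed from the data table; the D3 entry marked "Miss" is simply omitted, so \(n_3 = 4\).

```python
# Factor-level totals/means and the overall mean for the package-design data.
# The missing D3 observation is dropped, so that group has only 4 values.
data = {
    "D1": [11, 17, 16, 14, 15],
    "D2": [12, 10, 15, 19, 11],
    "D3": [23, 20, 18, 17],          # one observation missing
    "D4": [27, 33, 22, 26, 28],
}

totals = {d: sum(y) for d, y in data.items()}          # Y_i.
means = {d: sum(y) / len(y) for d, y in data.items()}  # Ybar_i.
n_total = sum(len(y) for y in data.values())           # sum of n_i
grand_mean = sum(totals.values()) / n_total            # Ybar_..

print(means)   # {'D1': 14.6, 'D2': 13.4, 'D3': 19.5, 'D4': 27.2}
print(sum(totals.values()), round(grand_mean, 2))   # 354 18.63
```

Note that with unequal \(n_i\) the overall mean is the weighted average \(\sum_i n_i \overline{Y}_{i\cdot} / \sum_i n_i\), not the simple average of the four group means.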

*It is easy to see that Designs 3 and 4 lead to larger sales than Designs 1 and 2. How to quantify these differences? How large is large?*

### Pairwise comparison of factor level means

Suppose we want to compare Designs 1 and 2. We can formulate this as a hypothesis testing problem for the following hypotheses: \(H_0 : \mu_1 = \mu_2\) against \(H_a : \mu_1 \neq \mu_2\). The standard test procedure is the two-sample \(z\)-test described below (assuming for the time being that \(\sigma\) is known).

- Null hypothesis \( \large H_0 : \mu_1 = \mu_2\) is tested against the alternative hypothesis \( \large H_a : \mu_1 \neq \mu_2\).

- The test procedure essentially asks the following question: *is the observed difference \( \overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}\) large enough, relative to typical sampling variation, to support the hypothesis \( H_a : \mu_1 \neq \mu_2\)?* Note that $$ \large \mbox{Var}(\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}. $$

- The \(z\)-test statistic for \(H_0 : \mu_1 = \mu_2\) vs. \(H_a : \mu_1 \neq \mu_2\) is $$ \large z = \frac{\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}~, $$ and \(H_0\) is rejected when \(|z|\) is large.

- *How large* is determined by the level of significance \(\alpha\) (where \(0 < \alpha < 1\) is pre-specified).
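As a worked illustration, the \(z\)-test for Designs 1 and 2 can be computed from the sample means and group sizes in the example. Since \(\sigma\) is treated as known here, a hypothetical value \(\sigma = 3.0\) is used purely for illustration; it is not an estimate from the data.

```python
# Two-sample z-test sketch for H0: mu_1 = mu_2 (sigma assumed known).
# Sample means and sizes are from the package-design example;
# sigma = 3.0 is a hypothetical value chosen for illustration.
from math import sqrt
from statistics import NormalDist

ybar1, n1 = 14.6, 5   # Design 1
ybar2, n2 = 13.4, 5   # Design 2
sigma = 3.0           # assumed known error s.d. (hypothetical)

z = (ybar1 - ybar2) / (sigma * sqrt(1 / n1 + 1 / n2))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(round(z, 3))        # 0.632
print(round(p_value, 3))
```

With this hypothetical \(\sigma\), \(|z|\) is well below the usual \(\alpha = 0.05\) cutoff of 1.96, so the Design 1 vs. Design 2 difference would not be judged larger than typical sampling variation.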

## Contributors:

- Valerie Regalia
- Debashis Paul