Experimental Design and Introduction to Analysis of Variance (LN 3)


An overview of experimental designs

1. Completely randomized design (CRD): treatments (combinations of the factor levels of the different factors) are randomly assigned to the experimental units.

Examples: one factor, two-factor, multi-factor studies (factorial designs).

2. Crossed factor design: in a multi-factor study, the factors are crossed if all combinations of the levels of the factors are included in the study.

Example: in a study of the effects of two factors -- temperature (at three levels) and concentration of a solvent (at two levels) -- on the yield of a chemical process, all combinations of temperature and solvent concentration are considered.

Solvent conc.		Temperature
	low	medium	high
low	x	x	x
high	x	x	x

Table 1: Chemical yield study: Crossed factor design
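
The crossed structure of Table 1 can be sketched in a few lines of Python. This is only an illustration: the variable names are hypothetical, and the level labels are taken from the table.

```python
from itertools import product

# Factor levels as listed in Table 1.
temperatures = ["low", "medium", "high"]
concentrations = ["low", "high"]

# In a crossed design, every combination of levels is a treatment.
treatments = list(product(concentrations, temperatures))

for conc, temp in treatments:
    print(f"solvent conc. = {conc:<4}  temperature = {temp}")
```

With two concentration levels and three temperature levels, the list contains 2 x 3 = 6 treatments, one for each cell marked x in the table.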

3. Nested design: one factor is nested within another factor in a multi-factor study.

Example: in a study involving the effects of operators on a production process, three manufacturing plants and six operators are considered. However, the first two operators operate in plant 1, the next two in plant 2 and the last two in plant 3. Here, operators are said to be nested within manufacturing plants.

Plant			Operator
	1	2	3	4	5	6
1	x	x
2			x	x
3					x	x

Table 2: Production study: Nested design

4. Repeated measurement design: the same experimental unit receives all the treatment combinations. This helps to eliminate the effects of confounding factors associated with the experimental units.

Example: taste-testing experiment where each tester rates all three brands of the breakfast cereals being tested.

5. Randomized complete block design (RCBD): every treatment appears with every other treatment in the same block the same number of times and every block receives the full suite of treatments (possibly with replications), and the treatments are randomly assigned to the experimental units within each block. Used when the block sizes are integer multiples of the number of treatments.

6. Balanced incomplete block design (BIBD): every treatment appears with every other treatment in the same block the same number of times, but no block receives the full suite of treatments. Used when the block sizes are smaller than the number of treatments.

7. Fractional factorial design, response surface experiments, etc.

An overview of observational studies

1. Cross-sectional studies: measurements are taken from one or more populations or subpopulations at a single time point or time interval, and exposure to a potential causal factor and the response are determined simultaneously.

Example: A study of incomes by gender in the decade 1981-1990, stratified by geographical locations.

2. Cohort study (prospective): one or more groups are formed in a nonrandom manner according to some hypothesized causal factors, and then these groups are observed over time with respect to the outcome of interest. (What is going to happen?)

Example: A group of smokers and a group of non-smokers were followed since their early 30's, and their health indices were recorded over years until their late 60's.

3. Case-control study (retrospective): groups are defined on the basis of an observed outcome, and the differences among the groups at an earlier time point are identified as potential causal effects. (What has happened?)

Example: study of lung cancer (outcome) and smoking (explanatory factor) based on data collected on lung cancer patients.

A single factor study

A food company wanted to test four different package designs for a new breakfast cereal. Twenty stores with approximately the same sales conditions (such as sales volume, price, etc.) were selected as experimental units. Five stores were randomly assigned to each of the four package designs.

A balanced completely randomized design.
A single, 4-level, qualitative factor: package design;
A quantitative response variable: sales -- number of packets of cereal sold during the period of study;
Goal: to explore the relationship between package design and sales.
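
One way to carry out such a random assignment is sketched below. The seed and variable names are arbitrary choices made for illustration, not part of the original study; the store labels A to T and design labels D1 to D4 follow the notation of this section.

```python
import random
import string

stores = list(string.ascii_uppercase[:20])   # store labels A..T
designs = ["D1", "D2", "D3", "D4"]
stores_per_design = 5

# Arbitrary fixed seed, used only so that the sketch is reproducible.
rng = random.Random(2024)

# Shuffle the stores, then give consecutive blocks of five to each design.
shuffled = stores[:]
rng.shuffle(shuffled)
assignment = {
    design: sorted(shuffled[k * stores_per_design:(k + 1) * stores_per_design])
    for k, design in enumerate(designs)
}

for design, group in assignment.items():
    print(design, group)
```

Because every store is equally likely to end up in any design group, systematic differences between stores are not expected to favor a particular design.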

Data

The assignment of the different stores (indicated by letters A to T) to the package designs (D1 to D4) is given in the following table.

			Store IDs
		S1	S2	S3	S4	S5
	D1	A	B	C	D	E
Design	D2	F	G	H	I	J
	D3	K	L	M	N	O
	D4	P	Q	R	S	T

The observed sales data are given in the following table. Store O was dropped from the study because of a fire. As a result, the design is no longer balanced.

			Store IDs
		S1	S2	S3	S4	S5
	D1	11	17	16	14	15
Design	D2	12	10	15	19	11
	D3	23	20	18	17	Miss
	D4	27	33	22	26	28

ANOVA for single factor study

A simple statistical model for the data is as follows: $$ \large Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad j=1,\ldots,n_i; ~~i=1,\ldots,r ; $$ where:

$r $ is the number of factor levels (treatments) and $n_i$ is the number of experimental units corresponding to the $i$-th factor level;
$Y_{ij}$ is the measurement for the $j$-th experimental unit corresponding to the $i$-th factor level;
$ \mu_i$ is the mean of all the measurements corresponding to the $i$-th factor level (unknown);

$\varepsilon_{ij}$'s are random errors (unobserved).

In this example, $r=4$, $n_1 = n_2 = n_4 =5$, $n_3 = 4$; $Y_{23} = 15$; $\mu_2 =$ average sales in the population if Design 2 is used, where the population consists of all stores with sales conditions similar to those of the stores in this study.

Model Assumptions

The following assumptions are made about the previous model:

$\varepsilon_{ij}$ are independently and identically distributed as $N(0,\sigma^2)$.

The $\mu_i$'s are unknown fixed parameters (so-called fixed effects), so that $\mathbb{E}(Y_{ij}) = \mu_i$ and $\mbox{Var}(Y_{ij}) = \sigma^2$. The above assumption is thus equivalent to assuming that the $Y_{ij}$ are independently distributed as $N(\mu_i,\sigma^2)$.

The assumptions that the distribution is normal and that the variances are equal play a crucial role in determining whether the means corresponding to the different factor levels are the same.
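
To make the model assumptions concrete, the following sketch simulates one data set from $Y_{ij} = \mu_i + \varepsilon_{ij}$ with independent $N(0,\sigma^2)$ errors. The values of `mu` and `sigma` are hypothetical, chosen only for illustration (the true parameters are unknown); the group sizes match the cereal example.

```python
import random

rng = random.Random(1)  # arbitrary seed, for reproducibility only

# Hypothetical parameter values; the true mu_i and sigma are unknown.
mu = {1: 15.0, 2: 13.0, 3: 20.0, 4: 27.0}
sigma = 3.0

# Group sizes n_i as in the cereal example (store O is missing in group 3).
n = {1: 5, 2: 5, 3: 4, 4: 5}

# Y[i][j-1] plays the role of Y_ij = mu_i + eps_ij, with eps_ij ~ N(0, sigma^2)
# drawn independently for every experimental unit.
Y = {i: [mu[i] + rng.gauss(0.0, sigma) for _ in range(n[i])] for i in mu}

for i, values in Y.items():
    print(i, [round(v, 1) for v in values])
```

Note that every group shares the same error standard deviation `sigma`; only the mean changes from group to group, which is exactly the equal-variance assumption.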

Interpretations

Factor level means ($\mu_i$): in an experimental study, the factor level mean $\mu_i$ stands for the mean response that would be obtained if the $i$-th factor level were applied to the entire population from which the experimental units were sampled.
Residual variance ($\sigma^2$): refers to the variability among the responses if any given treatment were applied to the entire population.

Steps in the analysis of factor level means

1. Determine whether or not the factor level means are all the same: $ \large \mu_1=\cdots=\mu_r $

What does $\mu_1=\cdots=\mu_r$ mean? The factor has no effect on the distribution of the response variable.
How can the evidence for the statement $\mu_1=\cdots=\mu_r$ be evaluated based on the observed data?

2. If the factor level means do differ, examine how they differ and what the implications of these differences are. (Chapter 17)

In the example, we want to determine whether package design has any effect on sales. The first step is to obtain estimates of the factor level means.

Estimation of $\mu_i$

Define the sample mean for the $i$-th factor level as $$ \large \overline{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} = \frac{1}{n_i} Y_{i\cdot} $$ where $Y_{i\cdot} = \sum_{j=1}^{n_i} Y_{ij}$ is the sum of responses for the $i$-th treatment group, for $i=1,\ldots,r$, and the overall sample mean as $$ \large \overline{Y}_{\cdot\cdot} = \frac{1}{\sum_{i=1}^r n_i} \sum_{i=1}^r \sum_{j=1}^{n_i} Y_{ij} = \frac{1}{\sum_{i=1}^r n_i} \sum_{i=1}^r n_i \overline{Y}_{i\cdot} = \frac{1}{\sum_{i=1}^r n_i} Y_{\cdot\cdot}~. $$ Then $ \overline{Y}_{i\cdot}$ is an estimate of $ \mu_i$ for each $i=1,\ldots,r$. Under the model assumptions, $\overline{Y}_{i\cdot}$ is an unbiased estimator of $\mu_i$ since $$ \large \mathbb{E}(\overline{Y}_{i\cdot}) = \frac{1}{n_i} \sum_{j=1}^{n_i} \mathbb{E}(Y_{ij}) = \frac{1}{n_i} \sum_{j=1}^{n_i} \mu_i = \mu_i. $$

				Store IDs			Total	Mean	$ n_i $
		S1	S2	S3	S4	S5	( $Y_{i.} $ )	( $ \overline{Y}_{i.} $ )
	D1	11	17	16	14	15	73	14.6	5
Design	D2	12	10	15	19	11	67	13.4	5
	D3	23	20	18	17	Miss	78	19.5	4
	D4	27	33	22	26	28	136	27.2	5
	Total						$Y_{..} = 354 $	$ \overline{Y}_{..} = 18.63 $	19
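
The totals and means in the table can be reproduced with a short Python sketch. The data values are those of the sales table; the variable names are hypothetical.

```python
# Observed sales per design; store O (design D3) is missing.
sales = {
    "D1": [11, 17, 16, 14, 15],
    "D2": [12, 10, 15, 19, 11],
    "D3": [23, 20, 18, 17],
    "D4": [27, 33, 22, 26, 28],
}

n = {d: len(y) for d, y in sales.items()}        # group sizes n_i
totals = {d: sum(y) for d, y in sales.items()}   # group totals Y_i.
means = {d: totals[d] / n[d] for d in sales}     # group means Ybar_i.

n_total = sum(n.values())                        # total number of stores
grand_total = sum(totals.values())               # overall total Y_..
grand_mean = grand_total / n_total               # overall mean Ybar_..

for d in sales:
    print(f"{d}: total = {totals[d]:3d}, mean = {means[d]:.1f}, n = {n[d]}")
print(f"overall: total = {grand_total}, mean = {grand_mean:.2f}, n = {n_total}")
```

Because the design is unbalanced, the overall mean is the average of the group means weighted by $n_i$, not the simple average of the four group means.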

The sample means for Designs 3 and 4 (19.5 and 27.2) are larger than those for Designs 1 and 2 (14.6 and 13.4). How can these differences be quantified? How large is large?

Pairwise comparison of factor level means

Suppose we want to compare Designs 1 and 2. We can formulate this as a hypothesis testing problem: $ H_0 : \mu_1 = \mu_2$ against $ H_a : \mu_1 \neq \mu_2$. The standard test procedure is the two-sample $z$-test described below (assuming for the time being that $\sigma$ is known).

Null hypothesis $ \large H_0 : \mu_1 = \mu_2$ tested against alternative hypothesis $ \large H_a : \mu_1 \neq \mu_2$.

The test procedure essentially asks the following question: is the observed difference $ \overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}$ large enough to support the hypothesis $ H_a : \mu_1 \neq \mu_2$?

The answer depends on the magnitude of $\mbox{Var}(\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot})$, which measures the typical sampling variation of the difference. Note that $$ \large \mbox{Var}(\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}. $$
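
As a brief worked step: under the model, observations in different groups are independent, so the two sample means are independent and their variances add, with $\mbox{Var}(\overline{Y}_{i\cdot}) = \sigma^2/n_i$ for each group: $$ \large \mbox{Var}(\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}) = \mbox{Var}(\overline{Y}_{1\cdot}) + \mbox{Var}(\overline{Y}_{2\cdot}) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}. $$ In the example, $n_1 = n_2 = 5$, so this variance equals $2\sigma^2/5$.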

The $z$-test statistic for $H_0 : \mu_1 = \mu_2$ vs. $H_a : \mu_1 \neq \mu_2$ is

$$ \large Z = \frac{\overline{Y}_{1\cdot} - \overline{Y}_{2\cdot}}{\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}} $$ which has a $N(0,1)$ distribution if $H_0$ is true. Thus, we reject $H_0$ (more evidence towards $\mu_1 \neq \mu_2$) for large values of $|Z|$. How large is determined by the pre-specified level of significance $\alpha$ (where $0 < \alpha < 1$): $H_0$ is rejected when $|Z| > z_{1-\alpha/2}$, the $(1-\alpha/2)$ quantile of the $N(0,1)$ distribution.
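
The $z$-test for Designs 1 and 2 can be sketched in Python as follows. The sample means and group sizes come from the example, but the value `sigma = 3.0` is purely hypothetical (the notes only assume that $\sigma$ is known, without giving its value), and `alpha = 0.05` is an illustrative choice. The rejection rule used is the usual two-sided one: reject $H_0$ when $|Z|$ exceeds the $(1-\alpha/2)$ quantile of $N(0,1)$.

```python
from math import sqrt
from statistics import NormalDist

# Sample means and group sizes for Designs 1 and 2 from the example.
ybar1, n1 = 14.6, 5
ybar2, n2 = 13.4, 5

sigma = 3.0    # HYPOTHETICAL known standard deviation, for illustration only
alpha = 0.05   # illustrative significance level

# Standard deviation of the difference of the two sample means.
se = sqrt(sigma**2 / n1 + sigma**2 / n2)

# z statistic: N(0, 1) under H0: mu_1 = mu_2.
z = (ybar1 - ybar2) / se

# Two-sided critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = abs(z) > z_crit

print(f"Z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "do not reject H0")
```

With this hypothetical $\sigma$, $|Z|$ is well below the critical value, so the observed difference of 1.2 between the two sample means would not be considered evidence against $H_0$. A different value of $\sigma$ could change this conclusion.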

Contributors:

Valerie Regalia
Debashis Paul