3.7: One-Way ANOVA Greenhouse Example in R

Last updated
Save as PDF

Page ID: 33440

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

R Instructions: Code for the Greenhouse Data

Load the greenhouse data.
Calculate the overall mean, standard deviation, and standard error.
Calculate the mean, standard deviation, and standard error for each group.
Produce a boxplot to plot the differences in heights for each fertilizer.
Produce a "means plot" (interval plot) to view the differences in heights for each fertilizer.
Obtain the ANOVA table.
Obtain Tukey’s multiple comparisons CIs and difference in means plot.
Produce diagnostic (residuals) plots.
Power analysis.

setwd("~/path-to-folder/")
greenhouse_data<-read.table("greenhouse_data.txt",header=T)

Note that greenhouse data are in separate columns.

attach(greenhouse_data)
my_data<-stack(greenhouse_data)

With this command, we put our data in a stacked format (the first column has the response variable (values) and the second column has the treatment levels (ind).

To calculate the overall mean, standard deviation, and standard error we can use the following commands:

overall_mean<-mean(my_data$values) #26.16667
overall_sd<-sd(my_data$values) #3.685892
overall_standard_error<-overall_sd/sqrt(length(my_data$values)) #0.7523795

To calculate the group means we can use the following command:

group_means<-aggregate(my_data[, 1],list(my_data$ind),mean)
#  group_means
#  Group.1        x
# 1 Control 21.00000
# 2      F1 28.60000
# 3      F2 25.86667
# 4      F3 29.20000

To calculate the group standard deviations and standard errors we can use the following commands:

group_sd<-aggregate(my_data[, 1],list(my_data$ind),sd)
# group_sd  Group.1        x
# 1 Control 1.000000
# 2      F1 2.437212
# 3      F2 1.899123
# 4      F3 1.288410

group_standard_error<-group_sd$x/sqrt(length(my_data$ind)/4)
# group_standard_error
# 0.4082483 0.9949874 0.7753136 0.5259911

To produce the Boxplot we can use the following commands:

library("ggpubr")       
boxplot(values~ind,data=my_data,
xlab="Fertilizer",ylab="Plant Height",
main="Distribution of Plant Heights by Fertilizer",
frame=TRUE)

R-generated boxplot for distribution of plant heights by fertilizer. — Figure \(\PageIndex{1}\): Box plot for distribution of plant heights by fertilizer.

To produce the means plot (interval plot) we can use the following commands:

library("gplots")
plotmeans(values~ind,data=my_data,connect=FALSE,
xlab="Fertilizer",ylab="Plant Height",
main="Means Plot with 95% CI")

R-generated means plot with 95% confidence intervals, for plant height vs fertilizer. — Figure \(\PageIndex{2}\): Means plot with 95% confidence interval.

To obtain the ANOVA table we can use the following commands:

anova<-aov(values~ind,my_data)
summary(anova)

The command summary (anova) will give you the following output:

summary(anova)
Df Sum Sq Mean Sq F value   Pr(>F)    
ind          3 251.44   83.81   27.46 2.71e-07 ***
Residuals   20  61.03    3.05                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We can see the degrees of freedom in the first column, the sum of squares in the second column, the mean sum of squares in the third column, the \(F\)-test statistic in the fourth column, and finally, we can see the \(p\)-value.

Note that the output doesn't give the \(SSTO\). To find it, use the identity \(SSTO=SSR+SSE\). Similarly, for the \(df\) associated with \(SSTO\), add the \(df\) of \(SSR\) and \(SSE\).

For our example, \(SSTO=251.44+61.03=312.47\)

To obtain Tukey multiple comparisons of means with a 95% family-wise confidence level we use the following command:

library(multcomp)
library(multcompView)
tukey_multiple_comparisons<-TukeyHSD(anova,conf.level=0.95)
plot(tukey_multiple_comparisons)
tukey_multiple_comparisons
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = values ~ ind, data = my_data)
$ind
diff        lwr         upr     p adj
F1-Control  7.600000  4.7770648 10.42293521 0.0000016
F2-Control  4.866667  2.0437315  7.68960188 0.0005509
F3-Control  8.200000  5.3770648 11.02293521 0.0000005
F2-F1      -2.733333 -5.5562685  0.08960188 0.0598655
F3-F1       0.600000 -2.2229352  3.42293521 0.9324380
F3-F2       3.333333  0.5103981  6.15626854 0.0171033

Based on this output, the Control group is significantly different from the 3 treatment groups and F3 is significantly different from F2.

Plot of 95% family-wise confidence levels for the fertilizer data. — Figure \(\PageIndex{3}\): 95% famil-wise confidence level plot.

To produce diagnostic (residuals) plots we use the following commands:

#Residuals vs Fits plot
plot(anova,1)

R-generated plot of residuals vs fitted values. — Figure \(\PageIndex{6}\): Residuals vs fitted values plot.

#QQ plot
plot(anova,2)

R-generated plot of standardized residuals vs theoretical quantiles. — Figure \(\PageIndex{7}\): Normal Q-Q plot.

#Histogram of residuals
residuals<-anova$res #with this command we get the residuals from ANOVA
hist(residuals)

R-generated histogram of residuals. — Figure \(\PageIndex{8}\): Histogram of residuals.

Search

Text Color

Text Size

Margin Size

Font Type