3.9: Summary of important R code
The main components of the R code used in this chapter follow, with the components to modify shown in lighter and/or ALL CAPS text. Remember that any R packages mentioned need to be installed and loaded for this code to work:
- MODELNAME <- lm(Y ~ X, data = DATASETNAME)
- Probably the most frequently used command in R.
- Here it is used to fit the reference-coded One-Way ANOVA model with Y as the response variable and X as the grouping variable, storing the estimated model object in MODELNAME. Remember that X should be defined as a factor variable.
- MODELNAME <- lm(Y ~ X - 1, data = DATASETNAME)
- Fits the cell means version of the One-Way ANOVA model.
- summary(MODELNAME)
- Generates model summary information including the estimated model coefficients, SEs, t-tests, and p-values.
- anova(MODELNAME)
- Generates the ANOVA table but must only be run on the reference-coded version of the model.
- Results are incorrect if run on the cell means model since the reduced model under the null is that the mean of all the observations is 0!
- pf(FSTATISTIC, df1 = NUMDF, df2 = DENOMDF, lower.tail = FALSE)
- Finds the p-value for an observed \(F\)-statistic with NUMDF and DENOMDF degrees of freedom.
- Tobs <- anova(lm(Y ~ X, data = DATASETNAME))[1,4]; Tobs
  B <- 1000
  Tstar <- matrix(NA, nrow = B)
  for (b in (1:B)){
    Tstar[b] <- anova(lm(Y ~ shuffle(X), data = DATASETNAME))[1,4]
  }
  pdata(Tstar, Tobs, lower.tail = FALSE)
- Code to run a for loop to generate 1000 permuted F-statistics, store them, and calculate the permutation-based p-value from Tstar.
- par(mfrow = c(2,2)); plot(MODELNAME)
- Generates four diagnostic plots including the Residuals vs Fitted and Normal Q-Q plot.
- plot(allEffects(MODELNAME))
- Requires the effects package to be loaded.
- Plots the estimated model component.
- Tm2 <- glht(MODELNAME, linfct = mcp(X = "Tukey")); confint(Tm2); plot(Tm2); summary(Tm2); cld(Tm2)
- Requires the multcomp package to be installed and loaded.
- Can only be run on the reference-coded version of the model.
- Generates the text output and plot for Tukey’s HSD as well as the compact letter display information.
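To see how several of these pieces fit together, the following is a minimal sketch that runs the model fitting, ANOVA table, F-distribution p-value, permutation loop, and diagnostic plots from start to finish. It rests on a few assumptions that are not from this chapter: it uses the built-in PlantGrowth data set (response weight, three-level factor group) in place of DATASETNAME, and it uses only base R, so sample() stands in for shuffle() and a simple proportion stands in for pdata(). The object names m_ref, m_cell, aov_tab, p_param, and p_perm are made up for illustration.

```r
# Illustrative data set (an assumption): PlantGrowth ships with base R.
data(PlantGrowth)

# Reference-coded and cell means versions of the One-Way ANOVA model.
m_ref  <- lm(weight ~ group, data = PlantGrowth)
m_cell <- lm(weight ~ group - 1, data = PlantGrowth)

summary(m_ref)          # coefficients, SEs, t-tests, and p-values
coef(m_cell)            # cell means coding: one estimated mean per group

# ANOVA table, run on the reference-coded model only.
aov_tab <- anova(m_ref)
aov_tab

# Observed F-statistic and its p-value from the F-distribution.
Tobs <- aov_tab[1, 4]
p_param <- pf(Tobs, df1 = aov_tab[1, 1], df2 = aov_tab[2, 1],
              lower.tail = FALSE)

# Permutation distribution of the F-statistic.
# sample(group) permutes the group labels (base R stand-in for shuffle()).
set.seed(123)           # arbitrary seed so the sketch is reproducible
B <- 1000
Tstar <- numeric(B)
for (b in 1:B) {
  Tstar[b] <- anova(lm(weight ~ sample(group), data = PlantGrowth))[1, 4]
}

# Proportion of permuted F-statistics at least as large as the observed one
# (base R stand-in for the pdata() call).
p_perm <- mean(Tstar >= Tobs)
c(parametric = p_param, permutation = p_perm)

# Four diagnostic plots for the reference-coded model.
par(mfrow = c(2, 2))
plot(m_ref)
```

The value of p_param should match the p-value printed in the ANOVA table, which is a quick check that the degrees of freedom were pulled from the right rows. The effects and multcomp steps are left out of this sketch because they depend on packages that may not be installed.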