8.4: Testing...

The significance of the difference between means for paired parametric data (t-test for paired data):

Code \(\PageIndex{1}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
t.test(data$WEIGHT, data$LENGTH, paired=TRUE)

... t-test for independent data:

Code \(\PageIndex{2}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
t.test(data$WEIGHT, data$LENGTH, paired=FALSE)

(The last example is for learning purposes only: our data are paired, because every row corresponds to one animal. Also, "paired=FALSE" is the default for t.test(), so it can be omitted.)
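To see what pairing changes, here is a minimal sketch on simulated measurements (the vectors `before` and `after` are hypothetical and do not come from bugs.txt). A paired t-test is the same as a one-sample t-test on the within-pair differences, which is why it only makes sense when each row describes one object:

```r
# Simulated paired measurements (hypothetical values, not from bugs.txt)
set.seed(1)
before <- rnorm(20, mean = 10, sd = 2)
after <- before + rnorm(20, mean = 0.5, sd = 1)

paired <- t.test(before, after, paired = TRUE)
# The paired test is a one-sample test on the differences
onesample <- t.test(before - after, mu = 0)
# paired = FALSE is the default, so it is omitted here
independent <- t.test(before, after)

c(paired = paired$p.value,
  onesample = onesample$p.value,
  independent = independent$p.value)
```

The first two p-values coincide; the third usually differs because the independent test ignores which values belong together.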

Here is how to compare the values of one variable between two groups using the formula interface:

Code \(\PageIndex{3}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
t.test(data$WEIGHT ~ data$SEX)

The formula was used because our weight/sex data are in the long form:

Code \(\PageIndex{4}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
data[, c("WEIGHT", "SEX")]

Convert the weight/sex data into the short form and test:

Code \(\PageIndex{5}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
data3 <- unstack(data[, c("WEIGHT", "SEX")])
t.test(data3[[1]], data3[[2]])

(Note that the test results are exactly the same; only the data format was different.)
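The difference between the two layouts is easier to see on a tiny data frame. The sketch below uses made-up values (the data frame `long` and its contents are hypothetical) and names the formula explicitly in unstack():

```r
# Long form: one column of values, one column of group labels
long <- data.frame(
  WEIGHT = c(10, 12, 11, 13, 20, 22, 21, 19),
  SEX = rep(c("female", "male"), each = 4)
)

# Short form: one column per group
short <- unstack(long, WEIGHT ~ SEX)
short

# Both layouts describe the same data, so the tests agree
p_long <- t.test(WEIGHT ~ SEX, data = long)$p.value
p_short <- t.test(short$female, short$male)$p.value
c(long = p_long, short = p_short)
```

With groups of unequal size, unstack() returns a list instead of a data frame, but the `[[1]]` and `[[2]]` indexing still works.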

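If the p-value is less than or equal to 0.05, the difference is statistically supported. You do not need to check beforehand that the dispersion (variance) is the same in both groups, because t.test() applies the Welch correction for unequal variances by default.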
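The handling of dispersion can be made explicit with the `var.equal` argument of t.test(). A small sketch on simulated samples with deliberately different spread (the vectors `x` and `y` are hypothetical):

```r
# Two simulated samples with different standard deviations
set.seed(2)
x <- rnorm(30, mean = 5, sd = 1)
y <- rnorm(30, mean = 6, sd = 3)

welch <- t.test(x, y)                     # default: var.equal = FALSE
pooled <- t.test(x, y, var.equal = TRUE)  # classical Student t-test

welch$method
pooled$method
# The Welch test lowers the degrees of freedom when variances differ
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
```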

Nonparametric Wilcoxon test for the differences:

Code \(\PageIndex{6}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
wilcox.test(data$WEIGHT, data$LENGTH, paired=TRUE)
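As with the paired t-test, the paired Wilcoxon test works on the within-pair differences. A minimal sketch on simulated data (the vectors `first` and `second` are hypothetical):

```r
# Simulated paired measurements (hypothetical values)
set.seed(3)
first <- rnorm(15, mean = 10, sd = 2)
second <- first + rnorm(15, mean = 1, sd = 1)

paired <- wilcox.test(first, second, paired = TRUE)
# Signed rank test applied directly to the differences
signed_rank <- wilcox.test(first - second)

c(paired = paired$p.value, signed_rank = signed_rank$p.value)
```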

One-way test for the differences between three or more groups (a simple variant of ANOVA, analysis of variance):

Code \(\PageIndex{7}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
oneway.test(data$WEIGHT ~ data$COLOR)
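R's oneway.test() does not assume equal variances by default, just like t.test(). The sketch below, on simulated data (the vectors `weight` and `color` are hypothetical), runs it on three groups and then shows that with only two groups it reduces to the Welch t-test:

```r
# Three simulated groups with different spread
set.seed(4)
weight <- c(rnorm(12, 10, 1), rnorm(12, 11, 2), rnorm(12, 13, 3))
color <- factor(rep(c("red", "green", "blue"), each = 12))

three_groups <- oneway.test(weight ~ color)
three_groups$p.value

# Keep two groups only: same p-value as the Welch t-test
two <- color != "blue"
w2 <- weight[two]
c2 <- droplevels(color[two])
p_oneway <- oneway.test(w2 ~ c2)$p.value
p_welch <- t.test(w2 ~ c2)$p.value
c(oneway = p_oneway, welch = p_welch)
```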

Which pair(s) are significantly different?

Code \(\PageIndex{8}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
pairwise.t.test(data$WEIGHT, data$COLOR, p.adj="bonferroni")

(We used Bonferroni correction for multiple comparisons.)
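The Bonferroni correction itself is simple arithmetic: each raw p-value is multiplied by the number of comparisons and capped at 1. A short illustration with made-up p-values (the vector `raw` is hypothetical):

```r
# Three raw p-values from three pairwise comparisons (made-up numbers)
raw <- c(0.01, 0.02, 0.04)

adjusted <- p.adjust(raw, method = "bonferroni")
by_hand <- pmin(1, raw * length(raw))

adjusted
by_hand
```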

Nonparametric Kruskal-Wallis test for differences between three or more groups:

Code \(\PageIndex{9}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
kruskal.test(data$WEIGHT ~ data$COLOR)
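Besides the formula interface, kruskal.test() also accepts a list with one vector per group, which is convenient for data in the short form. A sketch on simulated data (the names `weight` and `color` are hypothetical):

```r
# Three simulated groups (hypothetical values)
set.seed(5)
weight <- c(rnorm(10, 10), rnorm(10, 11), rnorm(10, 13))
color <- factor(rep(c("red", "green", "blue"), each = 10))

by_formula <- kruskal.test(weight ~ color)
# split() turns the long form into a list of per-group vectors
by_list <- kruskal.test(split(weight, color))

c(formula = by_formula$p.value, list = by_list$p.value)
```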

Which pairs are significantly different in this nonparametric test?

Code \(\PageIndex{10}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
pairwise.wilcox.test(data$WEIGHT, data$COLOR)
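When no correction is named, the pairwise tests in R fall back on the Holm method, which is never more conservative than Bonferroni. The sketch below checks this on simulated data (the vectors `weight` and `color` are hypothetical):

```r
# Three simulated groups (hypothetical values)
set.seed(6)
weight <- c(rnorm(10, 10), rnorm(10, 11), rnorm(10, 13))
color <- factor(rep(c("red", "green", "blue"), each = 10))

holm <- pairwise.wilcox.test(weight, color)
bonf <- pairwise.wilcox.test(weight, color,
                             p.adjust.method = "bonferroni")

holm$p.adjust.method
# Matrices of adjusted p-values; NA marks the unused upper triangle
holm$p.value
bonf$p.value
```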

The significance of the correspondence between categorical data (nonparametric Pearson chi-squared, or \(\chi^2\) test):

Code \(\PageIndex{11}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
chisq.test(data$COLOR, data$SEX)
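When chisq.test() receives two vectors, it cross-tabulates them first; a ready-made contingency table can be passed as well. The sketch below uses made-up counts (the table `tab` is hypothetical) and reproduces the expected counts, which are row total times column total divided by the grand total:

```r
# Made-up 2 x 2 contingency table of counts
tab <- matrix(c(30, 20, 10, 40), nrow = 2,
              dimnames = list(COLOR = c("red", "green"),
                              SEX = c("female", "male")))

res <- chisq.test(tab)

# Expected counts under independence, computed by hand
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

res$expected
expected
res$p.value
```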

The significance of proportions (nonparametric):

Code \(\PageIndex{12}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
prop.test(sum(data$SEX), length(data$SEX), 0.5)

(Here we checked whether the proportion of males is different from 50%.)
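The call above relies on the grouping variable being coded as 0 and 1, so that the sum counts the ones. A minimal sketch with a made-up vector (`sex` is hypothetical, with 1 assumed to mean male):

```r
# Made-up sample: 60 ones (assumed males) and 40 zeros
sex <- c(rep(1, 60), rep(0, 40))

# Number of "successes", number of trials, proportion under the null
res <- prop.test(sum(sex), length(sex), p = 0.5)

res$estimate   # observed proportion
res$conf.int   # confidence interval for the proportion
res$p.value
```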

The significance of linear correlation between variables, parametric way (Pearson correlation test):

Code \(\PageIndex{13}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
cor.test(data$WEIGHT, data$LENGTH, method="pearson")

... and nonparametric way (Spearman’s correlation test):

Code \(\PageIndex{14}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
cor.test(data$WEIGHT, data$LENGTH, method="spearman")
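The two methods differ in what they measure: Pearson's coefficient describes linear association, whereas Spearman's is Pearson's coefficient computed on ranks and therefore describes any monotonic association. A sketch on simulated, deliberately non-linear data (the vectors `x` and `y` are hypothetical):

```r
# Simulated monotonic but non-linear relation
set.seed(8)
x <- runif(25, 0, 3)
y <- exp(x) + rnorm(25, sd = 0.5)

pearson <- cor.test(x, y, method = "pearson")
spearman <- cor.test(x, y, method = "spearman")

# Spearman's rho is Pearson's r of the ranks
rho_by_hand <- cor(rank(x), rank(y))

c(pearson = unname(pearson$estimate),
  spearman = unname(spearman$estimate),
  by_hand = rho_by_hand)
```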

The significance (and much more) of the linear model describing the dependence of one variable on another:

Code \(\PageIndex{15}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
summary(lm(data$LENGTH ~ data$SEX))
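With a two-level predictor such as sex, the linear model is closely related to the t-test: the p-value of the slope equals that of the t-test with equal variances assumed. A sketch on simulated data (the vectors `sex` and `len` are hypothetical):

```r
# Simulated lengths for two groups coded 0 and 1
set.seed(9)
sex <- rep(c(0, 1), each = 15)
len <- 20 + 2 * sex + rnorm(30)

fit <- summary(lm(len ~ sex))
fit$coefficients

# p-value of the slope versus the classical (pooled) t-test
p_lm <- fit$coefficients["sex", "Pr(>|t|)"]
p_t <- t.test(len ~ sex, var.equal = TRUE)$p.value
c(lm = p_lm, t.test = p_t)
```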

... and analysis of variance (ANOVA) based on the linear model:

Code \(\PageIndex{16}\) (R):

data <- read.table("data/bugs.txt", h=TRUE)
aov(lm(data$LENGTH ~ data$SEX))
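Printing an aov object shows sums of squares only; the F statistic and its p-value appear after summary(), or in the table returned by anova() for the fitted linear model. A sketch on simulated data (the vectors `sex` and `len` are hypothetical):

```r
# Simulated lengths for two groups
set.seed(10)
sex <- factor(rep(c("female", "male"), each = 15))
len <- 20 + 2 * (sex == "male") + rnorm(30)

tab_anova <- anova(lm(len ~ sex))
tab_aov <- summary(aov(len ~ sex))

tab_anova
# Both tables contain the same F test
p_anova <- tab_anova[["Pr(>F)"]][1]
p_aov <- tab_aov[[1]][["Pr(>F)"]][1]
c(anova = p_anova, aov = p_aov)
```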