29.3: The t-test as a Linear Model (Section 28.3)
- Page ID
- 8872
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)
We can also use lm()
to implement these t-tests.
The one-sample t-test is basically a test for whether the intercept is different from zero, so we use a model with only an intercept and apply this to the data after subtracting the null hypothesis mean (so that the expectation under the null hypothesis is an intercept of zero):
NHANES_BP_sample <- NHANES_BP_sample %>%
mutate(BPSysAveDiff = BPSysAve-120)
lm_result <- lm(BPSysAveDiff ~ 1, data=NHANES_BP_sample)
summary(lm_result)
##
## Call:
## lm(formula = BPSysAveDiff ~ 1, data = NHANES_BP_sample)
##
## Residuals:
## Min 1Q Median 3Q Max
## -36.11 -13.11 -1.11 9.39 67.89
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.11 1.41 2.2 0.029 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18 on 154 degrees of freedom
You will notice that this p-value is twice as big as the one obtained from the one-sample t-test above; this is because that was a one-tailed test, while lm()
is performing a two-tailed test.
We can also run the two-sample t-test using lm()
:
lm_ttest_result <- lm(BPSysAve ~ SmokeNow, data=sample_df)
summary(lm_ttest_result)
##
## Call:
## lm(formula = BPSysAve ~ SmokeNow, data = sample_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -45.16 -11.16 -2.16 8.84 101.18
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 125.160 0.897 139.54 < 2e-16 ***
## SmokeNowYes -5.341 1.269 -4.21 2.8e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18 on 784 degrees of freedom
## Multiple R-squared: 0.0221, Adjusted R-squared: 0.0209
## F-statistic: 17.7 on 1 and 784 DF, p-value: 2.84e-05
This gives the same p-value for the SmokeNowYes variable as it did for the two-sample t-test above.