13.7: Using the t.test() Function
In this chapter, we’ve talked about three different kinds of t-test: the one sample test, the independent samples test (Student’s and Welch’s), and the paired samples test. In order to run these different tests, I’ve shown you three different functions: oneSampleTTest(), independentSamplesTTest() and pairedSamplesTTest(). I wrote these as three different functions for two reasons. Firstly, I thought it made sense to have separate functions for each test, in order to help make it clear to beginners that there are different tests. Secondly, I wanted to show you some functions that produced “verbose” output, to help you see what hypotheses are being tested and so on.
However, once you’ve started to become familiar with t-tests and with using R, you might find it easier to use the t.test() function. It’s one function, but it can run all four of the different t-tests that we’ve talked about (one sample, Student’s, Welch’s and paired samples). Here’s how it works. Firstly, suppose you want to run a one sample t-test. To run the test on the grades data from Dr Zeppo’s class (Section 13.2), we’d use a command like this:
t.test( x = grades, mu = 67.5 )
##
## One Sample t-test
##
## data: grades
## t = 2.2547, df = 19, p-value = 0.03615
## alternative hypothesis: true mean is not equal to 67.5
## 95 percent confidence interval:
## 67.84422 76.75578
## sample estimates:
## mean of x
## 72.3
The input is the same as for the oneSampleTTest() function: we specify the sample data using the argument x, and the value against which it is to be tested using the argument mu. The output is a lot more compressed.
As you can see, it still has all the information you need. It tells you what type of test it ran and the data it tested it on. It gives you the t-statistic, the degrees of freedom and the p-value. And so on. There’s nothing wrong with this output, but in my experience it can be a little confusing when you’re just starting to learn statistics, because it’s a little disorganised. Once you know what you’re looking at though, it’s pretty easy to read off the relevant information.
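If you'd rather not read the numbers off the printed output, you can save the result to a variable and pull the pieces out by name. Here's a minimal sketch of that idea. The scores vector is made up purely for illustration (it is not Dr Zeppo's data), and the variable names result and t_by_hand are my own.

```r
# Hypothetical scores, invented for illustration (not Dr Zeppo's data)
scores <- c(72, 65, 80, 58, 91, 77, 69, 84, 73, 66)

# Save the output of the test instead of just printing it
result <- t.test( x = scores, mu = 67.5 )

result$statistic   # the t-statistic
result$parameter   # the degrees of freedom
result$p.value     # the p-value
result$conf.int    # the 95% confidence interval for the mean
result$estimate    # the sample mean

# The t-statistic computed by hand, for comparison with result$statistic
t_by_hand <- (mean(scores) - 67.5) / (sd(scores) / sqrt(length(scores)))
```

Typing names(result) lists everything that is stored in the saved output.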
What about independent samples t-tests? As it happens, the t.test() function can be used in much the same way as the independentSamplesTTest() function, by specifying a formula, a data frame, and using var.equal to indicate whether you want a Student test or a Welch test. If you want to run the Welch test from Section 13.4, then you’d use this command:
t.test( formula = grade ~ tutor, data = harpo )
##
## Welch Two Sample t-test
##
## data: grade by tutor
## t = 2.0342, df = 23.025, p-value = 0.05361
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.09249349 11.04804904
## sample estimates:
## mean in group Anastasia mean in group Bernadette
## 74.53333 69.05556
If you want to do the Student test, it’s exactly the same except that you need to add an additional argument indicating that var.equal = TRUE. This is no different to how it worked in the independentSamplesTTest() function.
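To see the two versions side by side, here's a small sketch that runs both tests on the same data. The toy data frame is invented for illustration (it is not the harpo data), and the names toy, welch and student are my own.

```r
# Hypothetical data, invented for illustration (not the harpo data frame)
toy <- data.frame(
  grade = c(71, 80, 65, 90, 77, 84, 68, 59, 72, 66, 75, 61),
  tutor = factor(rep(c("A", "B"), times = c(7, 5)))
)

# Default behaviour: Welch test (variances not assumed equal)
welch <- t.test( formula = grade ~ tutor, data = toy )

# Student test: same command, plus var.equal = TRUE
student <- t.test( formula = grade ~ tutor, data = toy, var.equal = TRUE )

welch$method        # the name of the test that was run
student$method
welch$parameter     # Welch degrees of freedom, usually not a whole number
student$parameter   # Student degrees of freedom: 7 + 5 - 2 = 10
```

The method element is a quick way to confirm which of the two tests you actually ran.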
Finally, we come to the paired samples t-test. Somewhat surprisingly, given that most R functions for dealing with repeated measures data require data to be in long form, the t.test() function isn’t really set up to handle data in long form. Instead it expects to be given two separate variables, x and y, and you need to specify paired = TRUE. And on top of that, you’d better make sure that the first element of x and the first element of y actually correspond to the same person, because it doesn’t ask for an “id” variable. I don’t know why. So, in order to run the paired samples t-test on the data from Dr Chico’s class, we’d use this command:
t.test( x = chico$grade_test2,   # variable 1 is the "test2" scores
        y = chico$grade_test1,   # variable 2 is the "test1" scores
        paired = TRUE            # paired test
)
##
## Paired t-test
##
## data: chico$grade_test2 and chico$grade_test1
## t = 6.4754, df = 19, p-value = 3.321e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.9508686 1.8591314
## sample estimates:
## mean of the differences
## 1.405
Yet again, these are the same numbers that we saw in Section 13.5. Feel free to check.
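Because t.test() never sees an “id” variable, lining up the two sets of scores is your job. Here's one way you might do it, sketched with made-up data in which the two tests were recorded in different student orders. The data frames test1 and test2, the id labels and the grades are all hypothetical (this is not the chico data frame). The sketch also checks that the paired test gives the same answer as a one sample test on the difference scores.

```r
# Hypothetical data, invented for illustration (not the chico data frame).
# Note that the students appear in a different order in each data frame.
test1 <- data.frame( id    = c("s1", "s2", "s3", "s4", "s5"),
                     grade = c(42.9, 51.8, 71.7, 51.6, 63.5) )
test2 <- data.frame( id    = c("s3", "s1", "s5", "s2", "s4"),
                     grade = c(72.3, 44.6, 64.9, 54.0, 52.2) )

# Match the rows up by student, so each row holds one person's two grades
both <- merge( test1, test2, by = "id", suffixes = c("_test1", "_test2") )

# Paired samples t-test on the aligned scores
paired <- t.test( x = both$grade_test2, y = both$grade_test1, paired = TRUE )

# The same analysis, run as a one sample test on the improvement scores
improvement <- both$grade_test2 - both$grade_test1
one_sample  <- t.test( x = improvement, mu = 0 )

paired$statistic
one_sample$statistic
```

If you had passed test2$grade and test1$grade straight to t.test() here, the function would have run without complaint, but it would have paired each student with the wrong person.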