4.1: One-Sample t-Test


Learning Objectives

Use Student's \(t\)-test for one sample when you have one measurement variable and a theoretical expectation of what the mean should be under the null hypothesis. It tests whether the mean of the measurement variable is different from the null expectation.

There are several statistical tests that use the \(t\)-distribution and can be called a \(t\)-test. One is Student's \(t\)-test for one sample, named after "Student," the pseudonym that William Gosset used to hide his employment by the Guinness brewery in the early 1900s (they had a rule that their employees weren't allowed to publish, and Guinness didn't want other employees to know that they were making an exception for Gosset). Student's \(t\)-test for one sample compares a sample to a theoretical mean. It has so few uses in biology that I didn't cover it in previous editions of this Handbook, but then I recently found myself using it (McDonald and Dunn 2013), so here it is.

When to use it

Use Student's \(t\)-test when you have one measurement variable, and you want to compare the mean value of the measurement variable to some theoretical expectation. It is commonly used in fields such as physics (you've made several observations of the mass of a new subatomic particle—does the mean fit the mass predicted by the Standard Model of particle physics?) and product testing (you've measured the amount of drug in several aliquots from a new batch—is the mean of the new batch significantly less than the standard you've established for that drug?). It's rare to have this kind of theoretical expectation in biology, so you'll probably never use the one-sample \(t\)-test.

I've had a hard time finding a real biological example of a one-sample \(t\)-test, so imagine that you're studying joint position sense, our ability to know what position our joints are in without looking or touching. You want to know whether people over- or underestimate their knee angle. You blindfold \(10\) volunteers, bend their knee to a \(120^{\circ}\) angle for a few seconds, then return the knee to a \(90^{\circ}\) angle. Then you ask each person to bend their knee to the \(120^{\circ}\) angle. The measurement variable is the angle of the knee, and the theoretical expectation from the null hypothesis is \(120^{\circ}\). You get the following imaginary data:

Individual	Angle
A	120.6
B	116.4
C	117.2
D	118.1
E	114.1
F	116.9
G	113.3
H	121.1
I	116.9
J	117.0

If the null hypothesis that people don't over- or underestimate their knee angle were true, you would expect the mean of these \(10\) numbers to be close to \(120\). The observed mean is \(117.2\); the one-sample \(t\)-test will tell you whether that is significantly different from \(120\).

Null hypothesis

The statistical null hypothesis is that the mean of the measurement variable is equal to a number that you decided on before doing the experiment. For the knee example, the biological null hypothesis is that people don't under- or overestimate their knee angle. You decided to move people's knees to \(120^{\circ}\), so the statistical null hypothesis is that the mean angle of the subjects' knees will be \(120^{\circ}\).

How the test works

Calculate the test statistic, \(t_s\), using this formula:

\[t_s=\frac{(\bar{x}-\mu_0)}{(s/\sqrt{n})}\]

where \(\bar{x}\) is the sample mean, \(\mu_0\) is the mean expected under the null hypothesis, \(s\) is the sample standard deviation and \(n\) is the sample size. The absolute value of the test statistic, \(t_s\), gets bigger as the difference between the observed and expected means gets bigger, as the standard deviation gets smaller, or as the sample size gets bigger.

Applying this formula to the imaginary knee position data gives a \(t\)-value of \(-3.69\).
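Here is a minimal Python sketch of that calculation, using only the standard library. The function name `one_sample_t` and the variable names are my own choices for illustration; the data are the imaginary knee angles from the table above.

```python
import math
import statistics

# Imaginary knee-angle data from the table above (degrees).
angles = [120.6, 116.4, 117.2, 118.1, 114.1, 116.9, 113.3, 121.1, 116.9, 117.0]
expected_angle = 120.0  # mean expected under the null hypothesis


def one_sample_t(data, expected_mean):
    """Return (t_s, degrees of freedom) for a one-sample t-test."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - expected_mean) / standard_error, n - 1


t_s, df = one_sample_t(angles, expected_angle)
print(f"t_s = {t_s:.2f}, d.f. = {df}")  # t_s = -3.69, d.f. = 9
```

The sign of \(t_s\) is negative because the observed mean is smaller than the expected mean.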

You calculate the probability of getting the observed \(t_s\) value under the null hypothesis using the \(t\)-distribution. The shape of the \(t\)-distribution, and thus the probability of getting a particular \(t_s\) value, depends on the number of degrees of freedom. The degrees of freedom for a one-sample \(t\)-test is the total number of observations in the group minus \(1\). For our example data, the \(P\) value for a \(t\)-value of \(-3.69\) with \(9\) degrees of freedom is \(0.005\), so you would reject the null hypothesis and conclude that people return their knee to a significantly smaller angle than the original position.
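If you want to see where the \(P\) value comes from, the following sketch integrates the density of the \(t\)-distribution numerically, again with only the Python standard library. This is one simple way to get the two-tailed area, not necessarily how any statistics package does it; the function names `t_pdf` and `two_tailed_p` are my own.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_s, df, steps=10_000):
    """Two-tailed P value: area of the t-distribution beyond +/-|t_s|.

    Integrates the density from 0 to |t_s| with Simpson's rule (steps must
    be even) and subtracts twice that area from 1, because the
    distribution is symmetrical around zero.
    """
    upper = abs(t_s)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


p = two_tailed_p(-3.69, 9)
print(f"P = {p:.3f}")  # P = 0.005
```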

Assumptions

The one-sample \(t\)-test assumes that the observations are normally distributed. If the distribution is symmetrical, such as a flat or bimodal distribution, the test is not at all sensitive to the non-normality; you will get accurate estimates of the \(P\) value, even with small sample sizes. A severely skewed distribution can give you too many false positives unless the sample size is large; simulations I've done suggest that \(50\) or so is large enough for the test to give accurate results even with severely skewed data. If your data are severely skewed and you have a small sample size, you should try a data transformation to make them less skewed.
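You can explore this kind of claim yourself with a small simulation. The sketch below is not the simulation referred to above; it is an illustration under my own assumptions: an exponential distribution stands in for "severely skewed" data, the critical \(t\) values for a two-tailed test at the \(0.05\) level are hard-coded from a \(t\) table, and all function names are hypothetical. In each trial the null hypothesis is true, so every rejection is a false positive.

```python
import math
import random


def t_statistic(sample, expected_mean):
    """One-sample t statistic for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - expected_mean) / math.sqrt(variance / n)


def false_positive_rate(draw, true_mean, n, t_critical, trials=5000, seed=1):
    """Fraction of simulated samples whose |t_s| exceeds t_critical
    even though the null hypothesis (mean == true_mean) is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        if abs(t_statistic(sample, true_mean)) > t_critical:
            rejections += 1
    return rejections / trials


def normal(rng):
    """Symmetrical data with a true mean of 0."""
    return rng.gauss(0.0, 1.0)


def skewed(rng):
    """Right-skewed (exponential) data with a true mean of 1."""
    return rng.expovariate(1.0)


# Two-tailed critical t values for alpha = 0.05 with 9 and 49 d.f.
T_CRIT = {10: 2.262, 50: 2.010}

for n, t_crit in T_CRIT.items():
    print(n,
          false_positive_rate(normal, 0.0, n, t_crit),
          false_positive_rate(skewed, 1.0, n, t_crit))
```

A false positive rate near \(0.05\) means the test is behaving as advertised; a rate well above \(0.05\) means you are rejecting a true null hypothesis too often.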

Example

McDonald and Dunn (2013) measured the correlation of transferrin (labeled red) and Rab-10 (labeled green) in five cells. The biological null hypothesis is that transferrin and Rab-10 are not colocalized (found in the same subcellular structures), so the statistical null hypothesis is that the correlation coefficient between red and green signals in each cell image has a mean of zero. The correlation coefficients were \(0.52,\; 0.20,\; 0.59,\; 0.62\) and \(0.60\) in the five cells. The mean is \(0.51\), which is highly significantly different from \(0\) (\(t=6.46\), \(4\) d.f., \(P=0.003\)), indicating that transferrin and Rab-10 are colocalized in these cells.
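As a check on the arithmetic, here are the five correlation coefficients plugged into the formula for \(t_s\). The intermediate values are my own calculations from the numbers above, rounded to three or four decimal places: the mean is \(0.506\) and the sample standard deviation is about \(0.1752\).

\[t_s=\frac{(\bar{x}-0)}{(s/\sqrt{n})}=\frac{0.506}{0.1752/\sqrt{5}}\approx \frac{0.506}{0.0783}\approx 6.46\]

With \(n=5\) observations, there are \(5-1=4\) degrees of freedom.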

Graphing the results

Because you're just comparing one observed mean to one expected value, you probably won't put the results of a one-sample \(t\)-test in a graph. If you've done a bunch of them, I guess you could draw a bar graph with one bar for each mean, and a dotted horizontal line for the null expectation.

Similar tests

The paired \(t\)-test is a special case of the one-sample \(t\)-test; it tests the null hypothesis that the mean difference between two measurements (such as the strength of the right arm minus the strength of the left arm) is equal to zero. Experiments that use a paired \(t\)-test are much more common in biology than experiments using the one-sample \(t\)-test, so I treat the paired \(t\)-test as a completely different test.

The two-sample \(t\)-test compares the means of two different samples. If one of your samples is very large, you may be tempted to treat the mean of the large sample as a theoretical expectation, but this is incorrect. For example, let's say you want to know whether college softball pitchers have greater shoulder flexion angles than normal people. You might be tempted to look up the "normal" shoulder flexion angle (\(150^{\circ}\)) and compare your data on pitchers to the normal angle using a one-sample \(t\)-test. However, the "normal" value doesn't come from some theory; it is based on data that has a mean, a standard deviation, and a sample size. At the very least, you should dig out the original study and compare your sample to the sample the \(150^{\circ}\) "normal" was based on, using a two-sample \(t\)-test that takes the variation and sample size of both samples into account.

How to do the test

Spreadsheets

I have set up a spreadsheet to perform the one-sample \(t\)-test (onesamplettest.xls). It will handle up to \(1000\) observations.

Web pages

There are web pages to do the one-sample \(t\)-test here and here.

R

Salvatore Mangiafico's R Companion has a sample R program for the one-sample \(t\)-test.

SAS

You can use PROC TTEST for Student's \(t\)-test. In general, the CLASS parameter is the nominal variable and the VAR parameter is the measurement variable; a one-sample test has no nominal variable, so you only need VAR, plus the H0 parameter for the theoretical value. Here is an example program for the joint position sense data above. Note that the H0 parameter is H followed by the numeral zero, not a capital letter O.

DATA jps;
INPUT angle;
DATALINES;
120.6
116.4
117.2
118.1
114.1
116.9
113.3
121.1
116.9
117.0
;
PROC TTEST DATA=jps H0=120;
VAR angle;
RUN;

The output includes some descriptive statistics, plus the \(t\)-value and \(P\) value. For these data, the \(P\) value is \(0.005\).

DF	t Value	Pr > |t|
9	-3.69	0.0050

Power analysis

To estimate the sample size you need to detect a significant difference between a mean and a theoretical value, you need the following:

the effect size, or the difference between the observed mean and the theoretical value that you hope to detect
the standard deviation
alpha, or the significance level (usually \(0.05\))
power, or \(1-\text{beta}\), where beta is the probability of accepting the null hypothesis when it is false (\(0.50,\; 0.80\) and \(0.90\) are common values for power)

The G*Power program will calculate the sample size needed for a one-sample \(t\)-test. Choose "t tests" from the "Test family" menu and "Means: Difference from constant (one sample case)" from the "Statistical test" menu. Click on the "Determine" button and enter the theoretical value ("Mean \(H0\)") and a mean with the smallest difference from the theoretical that you hope to detect ("Mean \(H1\)"). Enter an estimate of the standard deviation. Click on "Calculate and transfer to main window". Change "tails" to two, set your alpha (this will almost always be \(0.05\)) and your power (\(0.5\), \(0.8\), or \(0.9\) are commonly used).

As an example, let's say you want to follow up the knee joint position sense study that I made up above with a study of hip joint position sense. You're going to set the hip angle to \(70^{\circ}\) (Mean \(H0=70\)) and you want to detect an over- or underestimation of this angle of \(1^{\circ}\), so you set Mean \(H1=71\). You don't have any hip angle data, so you use the standard deviation from your knee study and enter \(2.4\) for SD. You want to do a two-tailed test at the \(P<0.05\) level, with a probability of detecting a difference this large, if it exists, of \(90\%\) (\(1-\text {beta}=0.90\)). Entering all these numbers in G*Power gives a sample size of \(63\) people.
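If you don't have G*Power handy, you can get a rough check on this number with a normal approximation. The sketch below is not G*Power's algorithm (which, as far as I know, works with the noncentral \(t\)-distribution); it uses the textbook formula \(n\approx \left ( (z_{\alpha /2}+z_{power})/d \right )^2\) plus a commonly used small-sample correction of \(z_{\alpha /2}^2/2\), often attributed to Guenther, where \(d\) is the difference you want to detect divided by the standard deviation. The function name `approx_sample_size` is my own.

```python
import math
from statistics import NormalDist


def approx_sample_size(mean_h0, mean_h1, sd, alpha=0.05, power=0.90):
    """Approximate n for a two-tailed one-sample t-test.

    Normal approximation with a small-sample correction; treat the result
    as a rough check, not a replacement for an exact power calculation.
    """
    d = abs(mean_h1 - mean_h0) / sd                # effect size in SD units
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    z_power = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)


print(approx_sample_size(70, 71, 2.4))  # 63
```

For the hip example, the approximation happens to agree with the G*Power result of \(63\); for other inputs the two may differ by an observation or so.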

Reference

McDonald, J.H., and K.W. Dunn. 2013. Statistical tests for measures of colocalization in biological microscopy. Journal of Microscopy 252: 295-302.