2.4.2: Dependent Sample t-test Calculations



When we work with difference scores, our research questions have to do with change. Did scores improve? Did symptoms get better? Did prevalence go up or down? Our hypotheses will reflect this. As with our other hypotheses, we express the hypothesis for paired samples \(t\)-tests in both words and mathematical notation. The exact wording of the written-out version should be changed to match whatever research question we are addressing (e.g., “The mean at Time 1 will be lower than the mean at Time 2 after training.”).

    Research Hypothesis

    Our research hypotheses will follow the same format that they did before:

    • Research Hypothesis: The average score increases, such that Time 1 will have a lower mean than Time 2.
      • Symbols: \(\overline{X}_{T1} < \overline{X}_{T2} \)


    • Research Hypothesis: The average score decreases, such that Time 1 will have a higher mean than Time 2.
      • Symbols: \(\overline{X}_{T1} > \overline{X}_{T2} \)

When might you want scores to decrease? There are plenty of examples! Off the top of my head, I can imagine that a weight loss program would want lower scores after the program than before. Or a therapist might want their clients to score lower on a measure of depression (being less depressed) after the treatment. Or a police chief might want fewer citizen complaints after initiating a community advisory board than before the board. For mindset, we would want scores to be higher after the treatment (more growth, less fixed).


    What Before/After test (pretest/post-test) can you think of for your future career? Would you expect scores to be higher or lower after the intervention?

As before, your choice of which research hypothesis to use should be specified before you collect data, based on your research question and any evidence you might have that would indicate a specific directional change. However, since we are just beginning to learn all of this stuff, Dr. MO might let you peek at the group means before you're asked for a research hypothesis.

    Null Hypothesis

Remember that the null hypothesis is the idea that there is nothing interesting, notable, or impactful represented in our dataset. In a paired samples \(t\)-test, that takes the form of ‘no change’. There is no improvement in scores or decrease in symptoms. Thus, our null hypothesis is:

    • Null Hypothesis: The means of Time 1 and Time 2 will be similar; there is no change or difference.
    • Symbols: \(\overline{X}_{T1} = \overline{X}_{T2} \)

    The mathematical version of the null hypothesis is always exactly the same when comparing two means: the average score of one group is equal to the average score of another group.

    Critical Values and Decision Criteria

As before, once we have our hypotheses laid out, we need to find the critical values that will serve as our decision criteria. This step has not changed at all from the last chapter. Our critical values are based on our level of significance (still usually \(α\) = 0.05), the directionality of our test (still usually one-tailed), and the degrees of freedom. For degrees of freedom, we go back to \(df = N - 1\), but here the "N" is the number of pairs. In a Before/After (pretest/post-test) design, the number of pairs equals the number of people. However, if you have matched pairs (say, 30 pairs of romantic partners), then N is still the number of pairs (N = 30), even though the study has 60 people. Because this is a \(t\)-test like the last chapter, we will find our critical values on the same \(t\)-table, using the same process: identify the correct column based on our significance level and directionality, and the correct row based on our degrees of freedom. After we calculate our test statistic, our decision criteria are the same as well:

    \((Critical < |Calculated|) =\) Reject null \(=\) means are different \(= p<.05\)

    \((Critical > |Calculated|) =\) Retain null \(=\) means are similar \(= p>.05\)
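If you prefer code to the \(t\)-table, the critical value lookup can be sketched in Python with scipy. This is a minimal example assuming a hypothetical study of 10 pairs with the usual one-tailed test at \(α\) = 0.05:

```python
from scipy import stats

alpha = 0.05        # level of significance
n_pairs = 10        # hypothetical number of pairs
df = n_pairs - 1    # degrees of freedom = N - 1, where N is the number of pairs

# One-tailed critical value: the t-score that cuts off the top 5% of the t-distribution
critical = stats.t.ppf(1 - alpha, df)
print(round(critical, 3))  # → 1.833, matching the t-table for df = 9, one-tailed, alpha = .05
```

The same function gives two-tailed cutoffs by using `1 - alpha/2` instead.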

    Test Statistic

Our test statistic for change scores follows a similar format to our prior \(t\)-tests: we subtract one mean from the other, and divide by a standard error. Sure, the formula changes, but the idea stays the same.

\[ t = \cfrac{\overline{X}_{D}}{\left(\cfrac{s_{D}}{\sqrt{N}} \right)} = \dfrac{\overline{X}_{D}}{SE} \nonumber \]

This formula is mostly symbols of other formulas, so it’s only useful when you are provided the mean of the difference (\( \overline{X}_{D}\)) and the standard deviation of the difference (\(s_{D}\)). Below, we'll go through how to get the numerator and the denominator, then combine them into the full formula.
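For example, suppose (hypothetically) you were handed the summary statistics \( \overline{X}_{D} = 1.6\), \(s_{D} = 0.55\), and \(N = 5\) pairs; the shortcut formula is then just two divisions:

```python
import math

# Hypothetical summary statistics (made up for illustration, not from the text)
mean_diff = 1.6   # mean of the difference scores (X-bar_D)
s_d = 0.55        # standard deviation of the differences (s_D)
n = 5             # N = number of pairs

se = s_d / math.sqrt(n)   # standard error = s_D / sqrt(N)
t = mean_diff / se        # t = X-bar_D / SE
print(round(t, 3))        # → 6.505
```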

    Numerator (Mean Differences)

    Let's start with the numerator (top) which deals with the mean differences (subtracting one mean from another).

It turns out, you already found the differences! That's the Differences column in the table. But what we need is the average of those difference scores, so that looks like:

    \[\overline{X}_{D}=\dfrac{\Sigma {D}}{N} \nonumber \]

    The mean of the difference is calculated in the same way as any other mean: sum each of the individual difference scores and divide by the sample size. But remember, the sample size is the number of pairs! The D is the difference score for each pair.
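As a sketch with hypothetical before/after scores for five participants (made up for illustration), the difference scores and their mean work out like this:

```python
# Hypothetical pretest/post-test scores for 5 participants (5 pairs)
before = [3, 4, 2, 5, 3]
after  = [5, 6, 3, 6, 5]

# D = difference score for each pair (After - Before)
differences = [a - b for a, b in zip(after, before)]
n = len(differences)               # N = number of pairs

mean_diff = sum(differences) / n   # X-bar_D = (sum of D) / N
print(differences)                 # → [2, 2, 1, 1, 2]
print(mean_diff)                   # → 1.6
```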

    Denominator (Standard Error)

The denominator is made of the standard deviation of the differences and the square root of the sample size. Dividing the standard deviation of the differences by the square root of the sample size gives the standard error for a dependent \(t\)-test.

    Standard Deviation of the Difference

    The standard deviation of the difference is the same formula as the standard deviation for a sample, but using difference scores for each participant, instead of their raw scores.

    \[s_{D}=\sqrt{\dfrac{\sum\left((X_{D}-\overline{X}_{D})^{2}\right)}{N-1}}=\sqrt{\dfrac{S S}{d f}} \nonumber \]

    And just like in the standard deviation of a sample, the Sum of Squares (the numerator in the equation directly above) is most easily completed in the table of scores (and differences), using the same table format that we learned in chapter 3.
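Continuing with the same hypothetical difference scores from before (illustrative numbers only), the Sum of Squares and \(s_{D}\) can be computed directly:

```python
import math

# Hypothetical difference scores for 5 pairs
differences = [2, 2, 1, 1, 2]
n = len(differences)
mean_diff = sum(differences) / n                       # 1.6

# Sum of Squares: each D's deviation from the mean of D, squared, then summed
ss = sum((d - mean_diff) ** 2 for d in differences)    # 1.2

# s_D = sqrt(SS / df), with df = N - 1
s_d = math.sqrt(ss / (n - 1))
print(round(s_d, 4))                                   # → 0.5477
```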

    Standard Error

Once we have our standard deviation, we can find the standard error by dividing the standard deviation of the differences by the square root of N (why we do this is beyond the scope of this book, but it's related to the sample size and the paired samples):

    \[\dfrac{s_{D}}{(\sqrt{N})} \nonumber \]

    Full Formula for Dependent t-test

Finally, putting that all together, we can write the full formula!

\[ t = \cfrac{ \left(\cfrac{\Sigma {D}}{N}\right)} { \left(\cfrac{\sqrt{\cfrac{\sum\left(X_{D}-\overline{X}_{D}\right)^{2}}{N-1}}}{\sqrt{N}}\right) } \nonumber \]

Okay, I know that looks like a lot. And there are lots of parentheses to try to make clear the order of operations. But really, this is only finding the mean of the difference, then dividing that by the standard error (the standard deviation of the difference divided by the square root of the number of pairs). Basically,

1. Calculate the numerator (mean of the difference, \(\bar{X}_{D}\)), and
2. Calculate the standard deviation of the difference (\(s_{D}\)), then
3. Divide the standard deviation of the difference by the square root of the number of pairs to get the standard error, and
4. Divide the mean of the difference by the standard error. Easy-peasy!
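The steps above can be sketched end-to-end in Python, using the same hypothetical scores as earlier (illustrative data, not from the text) and cross-checking the hand calculation against scipy's dependent-samples test:

```python
import math
from scipy import stats

# Hypothetical pretest/post-test scores for 5 pairs
before = [3, 4, 2, 5, 3]
after  = [5, 6, 3, 6, 5]

# Step 1: mean of the difference scores
differences = [a - b for a, b in zip(after, before)]
n = len(differences)                                   # N = number of pairs
mean_diff = sum(differences) / n

# Step 2: standard deviation of the differences
ss = sum((d - mean_diff) ** 2 for d in differences)
s_d = math.sqrt(ss / (n - 1))

# Step 3: standard error = s_D / sqrt(N)
se = s_d / math.sqrt(n)

# Step 4: divide the mean difference by the standard error
t_calc = mean_diff / se

# Cross-check against scipy's paired (dependent) samples t-test
t_scipy, p = stats.ttest_rel(after, before)
print(round(t_calc, 3), round(t_scipy, 3))  # both → 6.532
```

If the hand calculation and `ttest_rel` disagree, a common culprit is subtracting the pairs in the opposite order, which flips the sign of \(t\).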

    Don't worry, we'll walk through a couple of examples so that you can see what this looks like next!

    This page titled 2.4.2: Dependent Sample t-test Calculations is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Michelle Oja.