
9.2: Hypotheses of Change and Differences


When we work with difference scores, our research questions have to do with change. Did scores improve? Did symptoms get better? Did prevalence go up or down? Our hypotheses will reflect this. Remember that the null hypothesis is the idea that there is nothing interesting, notable, or impactful represented in our dataset. In a paired samples \(t\)-test, that takes the form of ‘no change’. There is no improvement in scores or decrease in symptoms. Thus, our null hypothesis is:

    \(H_0\): There is no change or difference

\(H_0: \mu_D = 0\)

As with our other null hypotheses, we express the null hypothesis for paired samples \(t\)-tests in both words and mathematical notation. The exact wording of the written-out version should be changed to match whatever research question we are addressing (e.g., “There is no change in ability scores after training”). However, the mathematical version of the null hypothesis is always exactly the same: the average change score is equal to zero. Our population parameter for the average is still \(\mu\), but it now has a subscript \(D\) to denote the fact that it is the average change score and not the average raw observation before or after our manipulation. Obviously, individual difference scores can go up or down, but the null hypothesis states that these positive or negative change values are just random chance and that the true average change score across all people is 0.

    Our alternative hypotheses will also follow the same format that they did before: they can be directional if we suspect a change or difference in a specific direction, or we can use an inequality sign to test for any change:

    \(H_A\): There is a change or difference

\(H_A: \mu_D \neq 0\)

    \(H_A\): The average score increases

\(H_A: \mu_D > 0\)

    \(H_A\): The average score decreases

\(H_A: \mu_D < 0\)

As before, your choice of which alternative hypothesis to use should be specified before you collect data, based on your research question and any evidence you might have that would indicate a specific directional (or non-directional) change.
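If you later carry out this test in software, the same choice shows up as an explicit argument. Below is a minimal sketch in Python with SciPy, using hypothetical before/after scores purely for illustration; scipy.stats.ttest_rel performs a paired samples \(t\)-test, and its alternative argument mirrors the three forms of \(H_A\) above.

```python
# Minimal sketch (assumptions: SciPy installed; the data below are hypothetical).
from scipy import stats

before = [12, 15, 14, 10, 13, 16, 11, 14]   # scores before the manipulation
after  = [14, 17, 13, 12, 15, 18, 12, 16]   # scores after the manipulation

# H_A: mu_D != 0 -- any change, two-tailed test
t_two, p_two = stats.ttest_rel(after, before, alternative='two-sided')

# H_A: mu_D > 0 -- scores increase (difference defined as after - before)
t_inc, p_inc = stats.ttest_rel(after, before, alternative='greater')

# H_A: mu_D < 0 -- scores decrease
t_dec, p_dec = stats.ttest_rel(after, before, alternative='less')
```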

    Critical Values and Decision Criteria

As before, once we have our hypotheses laid out, we need to find our critical values that will serve as our decision criteria. This step has not changed at all from the last chapter. Our critical values are based on our level of significance (still usually \(\alpha = 0.05\)), the directionality of our test (one-tailed or two-tailed), and the degrees of freedom, which are still calculated as \(df = n - 1\). Because this is a \(t\)-test like the last chapter, we will find our critical values on the same \(t\)-table, using the same process: identify the correct column based on our significance level and directionality, and the correct row based on our degrees of freedom (or the next lowest value if our exact degrees of freedom are not presented). After we calculate our test statistic, our decision criteria are the same as well: \(p < \alpha\) or \(t_{obt} > t^*\).
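If you would rather compute the cutoff directly than read it from the \(t\)-table, the critical value comes from the \(t\) distribution itself. Here is a minimal sketch in Python with SciPy; the sample size and \(\alpha\) are hypothetical choices used only for illustration.

```python
# Minimal sketch (assumption: SciPy installed; n and alpha are illustrative).
from scipy import stats

n = 10
df = n - 1          # degrees of freedom for a paired samples t-test
alpha = 0.05

# Two-tailed test: split alpha across both tails
t_crit_two = stats.t.ppf(1 - alpha / 2, df)   # about 2.262 for df = 9

# One-tailed test: all of alpha in one tail
t_crit_one = stats.t.ppf(1 - alpha, df)       # about 1.833 for df = 9
```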

    Test Statistic

    Our test statistic for our change scores follows exactly the same format as it did for our 1-sample \(t\)-test. In fact, the only difference is in the data that we use. For our change test, we first calculate a difference score as shown above. Then, we use those scores as the raw data in the same mean calculation, standard error formula, and \(t\)-statistic. Let’s look at each of these.

    The mean difference score is calculated in the same way as any other mean: sum each of the individual difference scores and divide by the sample size.

    \[\overline{X_{D}}=\dfrac{\Sigma X_{D}}{n} \]

Here we are using the subscript \(D\) to keep track of the fact that these are difference scores instead of raw scores; it has no actual effect on our calculation. Using this, we calculate the standard deviation of the difference scores the same way as well:

\[s_{D}=\sqrt{\dfrac{\sum\left(X_{D}-\overline{X_{D}}\right)^{2}}{n-1}}=\sqrt{\dfrac{SS}{df}} \]

    We will find the numerator, the Sum of Squares, using the same table format that we learned in chapter 3. Once we have our standard deviation, we can find the standard error:

\[s_{\overline{X}_{D}}=\dfrac{s_{D}}{\sqrt{n}} \]

Finally, our test statistic \(t\) has the same structure as well:

    \[t=\dfrac{\overline{X_{D}}-\mu_{D}}{s_{\overline{X}_{D}}} \]
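To make the arithmetic concrete, here is a minimal sketch in Python that walks through each formula above: the mean difference, the standard deviation of the differences, the standard error, and \(t\). The difference scores are hypothetical and used only for illustration.

```python
# Minimal sketch: hand computation of the paired samples t statistic.
# The difference scores (after - before for each person) are hypothetical.
import math

diffs = [2, 2, -1, 2, 2, 2, 1, 2]

n = len(diffs)
mean_d = sum(diffs) / n                        # X-bar_D = (sum of X_D) / n
ss = sum((x - mean_d) ** 2 for x in diffs)     # Sum of Squares
s_d = math.sqrt(ss / (n - 1))                  # s_D = sqrt(SS / df)
se_d = s_d / math.sqrt(n)                      # standard error of the mean difference
t_obt = (mean_d - 0) / se_d                    # mu_D = 0 under the null hypothesis

print(f"mean difference = {mean_d:.3f}, s_D = {s_d:.3f}, t = {t_obt:.3f}")
```

If the raw before and after scores are available instead of the differences, scipy.stats.ttest_rel(after, before) should reproduce the same \(t\) value.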

    As we can see, once we calculate our difference scores from our raw measurements, everything else is exactly the same. Let’s see an example.


    This page titled 9.2: Hypotheses of Change and Differences is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Foster et al. (University of Missouri’s Affordable and Open Access Educational Resources Initiative) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
