Statistics LibreTexts

8.1: Paired Samples



    In this lesson, we will make inferences about the population mean difference between two quantitative measurements. We will learn how to construct interval estimates and test hypotheses using paired data.
     

    Paired/Dependent and Independent Data

    Consider the two scenarios below.

    Example 1

    The planet’s average surface temperature has risen about 2 degrees Fahrenheit (1 degree Celsius) since the late 19th century. According to an analysis by NASA [14], 2021 tied with 2018 as the sixth-warmest year on record for Earth’s global average surface temperature. In Congress, there are politicians who believe that global warming is a hoax, and this belief drives important policy decisions. In 2005, The New York Times reported [15] that Michael Crichton, a novelist, was called to testify before the Senate Committee on Environment and Public Works. The chairman of the committee, Senator James M. Inhofe, who said that global warming is “the greatest hoax ever perpetrated on the American people,” had the committee read Crichton’s novel “State of Fear” (an environmental thriller that casts doubt on the idea that human activities contribute to global warming). Crichton asserted that cooling observed in the interior of Antarctica shows the lack of reliability of models used for global warming predictions and of climate science in general. The book remains one of the most cited works in climate change skeptic circles.

    Does the data in “State of Fear” disprove climate change? In the book, a graph of temperatures in Punta Arenas is shown for the last 116 years. The graph has a downward trend and suggests that the temperature is actually dropping over time. To investigate this claim, we randomly sample 32 pairs of latitudes and longitudes, find the nearest station for each set of coordinates, and measure the average temperature over two consecutive blocks of time.

     


     

     

     

     

     

     

     

     

     

     

     

     

    Example 2

    The Los Angeles Daily News reported [16] that “Low-income neighborhoods with higher Black, Hispanic, and Asian populations experience significantly more urban heat than wealthier and predominantly white neighborhoods in Southern California and within a vast majority of populous U.S. counties.” According to a study [17], “roughly 25% of all natural hazard mortality in the U.S. is due to heat exposure (Borden & Cutter, 2008) and heat waves are becoming more frequent, more intense, and are longer in season (Shiva et al., 2019; Wobus et al., 2018); understanding who is affected by urban heating and what drives exposure disparities is therefore critical for crafting just and effective policy responses, particularly under warming climate conditions.”

     

     

    A statistics student wants to know the mean difference in land surface temperature between poor and affluent communities. They randomly sample 556 counties with high rates of poverty and 500 affluent counties. They find the average temperature and sample standard deviation for each of the groups and use the sample data to estimate the population mean difference in temperature between poor and affluent communities.

    Summary

    1. Notice any differences and similarities between the two examples.

       

       

       

       

       

       

       

       

       

       

       

       

       

       

       

       

       

       

       

    In both examples, two sets of data were collected. In Example 1, the average temperature is measured twice (for 1901-1950 and for 1951-2000) at each randomly chosen station location, so the two data sets are directly related in pairs. We call such data paired or dependent. A sample is paired if each subject in the sample is measured twice. In Example 2, the average temperature is measured once for each group, and the two data sets are not directly related: the values of one set have no effect on the values of the other set. Such data is referred to as independent.
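
    To make the idea of pairing concrete, here is a minimal Python sketch. The station readings are invented for illustration and are not the data behind Example 1; the variable names are also assumptions. The point is only that paired data reduce to a single list of differences, one per subject, and all later inference is done on that list.

```python
from statistics import mean, stdev

# Hypothetical average temperatures (degrees Celsius) at five stations.
# These numbers are made up for illustration; they are not the textbook's data.
temps_1901_1950 = [14.2, 9.8, 21.5, 3.1, 17.0]
temps_1951_2000 = [14.6, 10.1, 21.4, 3.9, 17.5]

# Paired data: each station is measured twice, so subtract within each pair.
differences = [later - earlier
               for earlier, later in zip(temps_1901_1950, temps_1951_2000)]

n = len(differences)        # number of pairs
x_bar = mean(differences)   # sample mean of the differences
s = stdev(differences)      # sample standard deviation of the differences

print(f"n = {n}, mean difference = {x_bar:.3f}, s = {s:.3f}")
```

    With independent samples, as in Example 2, there is no natural way to line up one county from the first group with one county from the second, so a subtraction like this would not be meaningful.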

    Identify Paired Samples

    In the following questions, identify whether the question is solved by constructing a confidence interval or conducting a hypothesis test, and if this requires data from a paired sample or from two independent samples. Explain how you made your decision.

    1. Jason claims that a higher proportion of males than females pass their driver’s test on the first attempt.

       

       

       

       

       

    2. It is believed that the average grade on an English essay in a particular school system for females is higher than for males. A random sample of 31 females had a mean score of 82 with a standard deviation of three, and a random sample of 25 males had a mean score of 76 with a standard deviation of four. Estimate the average difference in grades for females and males.

       

       

       

       

       

    3. Eight subjects are picked at random and given a new sleep medication. The mean hours slept for each person were recorded before starting the medication and after. Estimate the mean difference in hours slept before and after use of the sleep medication.

       

       

       

       

       

       

    4. A new WiFi range booster is being offered to consumers. A researcher tests the native range of 12 different routers under the same conditions. The ranges are recorded. Then the researcher uses the new WiFi range booster and records the new ranges. Does the new WiFi range booster do a better job?

       

       

       

       

       

       

    Construct a Confidence Interval for the Mean Difference

    In Example 1, we want to determine whether the earth is warming. We randomly sample 32 pairs of latitudes and longitudes, find the nearest station for each set of coordinates, and measure the average temperature over two consecutive blocks of time.

    1. At the Punta Arenas station, the temperature cooled. Do you think that this difference is representative of climate patterns across the world?


      Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

       

       

       

       

       

       

       

    2. Let’s estimate the mean temperature change for the earth with 95% confidence. The mean difference from our sample is \(\bar{x}=0.452\) degrees Celsius, and the sample standard deviation of the differences is \(s=0.4361\) degrees Celsius.

    Step 1 Is the sampling distribution approximately normal? Explain.

     

     

     

     

     

    Step 2 Find the critical value (rounded to three decimal places) that corresponds to a ____ % confidence level and ______ degrees of freedom.

    \(T_c=\operatorname{tdist}(\underline{\ \ \ \ \ \ \ \ \ \ }).\operatorname{inversecdf}(\underline{\ \ \ \ \ \ \ \ \ \ })=\underline{\ \ \ \ \ \ \ \ \ \ }\)


    \(n=\underline{\ \ \ \ \ \ \ \ \ \ }\)


    \(df=\underline{\ \ \ \ \ \ \ \ \ \ }\)


    \(\bar{x}=\underline{\ \ \ \ \ \ \ \ \ \ }\)


    \(s=\underline{\ \ \ \ \ \ \ \ \ \ }\)
     

    Step 3 The margin of error rounded to three decimal places is

    \(E=T_c \cdot \dfrac{s}{\sqrt{n}}=\underline{\ \ \ \ \ \ \ \ \ \ }\cdot\dfrac{\underline{\ \ \ \ \ \ \ \ \ \ }}{\displaystyle\sqrt{\underline{\ \ \ \ \ \ \ \ \ \ }}}=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    Step 4 The interval is

    \((\bar{x}-E, \bar{x}+E)=(\underline{\ \ \ \ \ \ \ \ \ \ }-\underline{\ \ \ \ \ \ \ \ \ \ },\ \underline{\ \ \ \ \ \ \ \ \ \ }+\underline{\ \ \ \ \ \ \ \ \ \ })=(\underline{\ \ \ \ \ \ \ \ \ \ }, \underline{\ \ \ \ \ \ \ \ \ \ })\)

    Step 5 State the conclusion in context: 

     

     

     

     

     

    Based on the interval, do you think the earth is warming or cooling? Explain.
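
    For readers who want to check their work outside the graphing calculator, the steps above can be sketched in Python. This is an illustration under stated assumptions, not the method the worksheet prescribes: the helper functions t_pdf, t_cdf, and t_inverse_cdf are written for this sketch (numerical integration plus bisection) so that only the standard library is needed. A statistics package such as SciPy offers the same inverse CDF as scipy.stats.t.ppf.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inverse_cdf(p, df):
    """Find t with P(T <= t) = p by bisection (assumes 0.5 <= p < 1)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Sample summary from the text.
n = 32            # number of station pairs
x_bar = 0.452     # sample mean difference (degrees Celsius)
s = 0.4361        # sample standard deviation of the differences
confidence = 0.95

df = n - 1
# For a two-sided interval, the critical value leaves (1 - confidence)/2
# in the upper tail, so the area to its left is 0.975 at 95% confidence.
t_c = t_inverse_cdf(1 - (1 - confidence) / 2, df)
E = t_c * s / math.sqrt(n)             # margin of error
lower, upper = x_bar - E, x_bar + E    # interval endpoints

print(f"df = {df}, T_c = {t_c:.3f}, E = {E:.3f}")
print(f"{confidence:.0%} CI: ({lower:.3f}, {upper:.3f})")
```

    Compare the printed values with your hand computation only after filling in the blanks; small differences in the last decimal place can come from rounding the critical value before multiplying.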

     

     

     

     

     

     

     

     

    Conduct a Hypothesis Test about the Mean Difference

    On average, the sampled stations showed warming. Is the mean temperature change significantly greater than zero? Let’s test this claim at the 5% level of significance.


    Step 1 Let \(\mu\) represent the mean global temperature change.

    \(H_0: \mu=0\)

    \(H_a:\underline{\ \ \ \ \ \ \ \ \ \ }\ \underline{\ \ \ \ \ \ \ \ \ \ }\ \underline{\ \ \ \ \ \ \ \ \ \ }\)

    We will conduct a right-tailed test.

    Step 2 Is the sampling distribution approximately normal? Explain.

    \(n=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    \(df=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    \(\bar{x}=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    \(s=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    Step 3 Compute the test statistic rounded to three decimal places. Use it to compute the P-value.

    \(T=\dfrac{\bar{x}-\mu}{\dfrac{s}{\sqrt{n}}}=\dfrac{\underline{\ \ \ \ \ \ \ \ \ \ }}{\dfrac{\underline{\ \ \ \ \ \ \ \ \ \ }}{\sqrt{\underline{\ \ \ \ \ \ \ \ \ \ }}}}=\underline{\ \ \ \ \ \ \ \ \ \ }\)

    P-value is ______________

    Step 4 State the conclusion in context:
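
    The test statistic and P-value for this right-tailed test can also be sketched in Python. As before, this is a hedged illustration rather than the worksheet's own procedure: t_pdf and t_cdf are helper functions written for this sketch using only the standard library, and mu_0 and alpha are names chosen here for the hypothesized mean and the significance level.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Sample summary from the text.
n = 32          # number of station pairs
x_bar = 0.452   # sample mean difference (degrees Celsius)
s = 0.4361      # sample standard deviation of the differences
mu_0 = 0        # mean difference under the null hypothesis H_0
alpha = 0.05    # significance level

df = n - 1
T = (x_bar - mu_0) / (s / math.sqrt(n))   # test statistic
p_value = 1 - t_cdf(T, df)                # right tail: P(t >= T)
reject_null = p_value < alpha

print(f"T = {T:.3f}, P-value = {p_value:.2e}, reject H_0: {reject_null}")
```

    Because the test is right-tailed, the P-value is the area under the t-distribution curve to the right of T. A left-tailed test would use t_cdf(T, df) instead, and a two-tailed test would double the smaller tail area.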

     

     

    References

    [14] Climate Change: Vital Signs of the Planet. 2022. “Climate Change Evidence: How Do We Know?” Accessed June 28, 2022, https://climate.nasa.gov/evidence/

    [15] “Michael Crichton, Novelist, Becomes Senate Witness,” Michael K Janofsky, Sept 29, 2005, accessed June 28, 2022, https://www.nytimes.com/2005/09/29/books/michael-crichton-novelist-becomes-senate-witness.html

    [16] “Poor neighborhoods get up to 7° hotter than rich ones in Southern California, study finds,” July 13, 2021, accessed June 28, 2022, https://www.dailynews.com/2021/07/13/poor-southern-california-communities-suffer-more-from-extreme-heat-ucsd-study-finds/

    [17] “Widespread Race and Class Disparities in Surface Urban Heat Extremes Across the United States,” Susanne Amelie Benz and Jennifer Anne Burney, July 13, 2021, accessed June 28, 2022, https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021EF002016


    This page titled 8.1: Paired Samples is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Hannah Seidler-Wright.
