9.2: Quantifying Direction and Strength

    In the last lesson, we explored the direction and strength of relationships between two quantitative variables. Now we will begin to explore how to model these relationships. When two variables are related, we say that they are correlated, or that there is a correlation between them.

    Some relationships in scatterplots can be summarized well by a line; we call these linear. Others are better summarized by a curve; we call these nonlinear or curvilinear. Recall that the direction of a relationship can be either positive or negative.

    Lines and curves can often be fit to the data in a scatterplot. When the points deviate only slightly from the line or curve, the explanatory variable is a good predictor of the response variable, and we say the relationship is strong. When the points deviate widely from the line or curve, the relationship is weak.

    Linear Correlation

    One measure of the linear correlation between two variables is the linear correlation coefficient, represented by the letter \(r\). We will use Desmos to calculate the value of \(r\) for a given scatterplot. A dataset and its corresponding scatterplot are given below.

    x1 y1
    1 1.4
    3 3.2
    5 4
    7 6.9
    9 9
    10 10.2
    14 13.8
    15 15.7
    19 19
    20 19.5

    [Figure: scatterplot of the (x1, y1) dataset]

    Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

    To find the linear correlation coefficient,

    1. Go to https://www.desmos.com/calculator.
    2. Copy and paste the data set above into line 1, or click the plus icon in the top left corner of the calculator, select table, and enter the values into the table.
    3. To find \(r\), type \(\operatorname{corr}(x_1, y_1)\) into line 2.
      1. For the data above, \(r=\operatorname{corr}(x_1, y_1) \approx 0.997\).
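
The value above can be cross-checked outside Desmos. The sketch below is an illustration rather than part of the original activity. It assumes that corr returns the usual Pearson correlation coefficient, \( r = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2 \, \sum (y_i-\bar{y})^2}} \), and the helper name pearson_r is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# The (x1, y1) dataset from the table above
x1 = [1, 3, 5, 7, 9, 10, 14, 15, 19, 20]
y1 = [1.4, 3.2, 4, 6.9, 9, 10.2, 13.8, 15.7, 19, 19.5]

r = pearson_r(x1, y1)
print(round(r, 3))  # 0.997, in agreement with the Desmos result
```

The numerator is positive when x and y tend to sit above or below their means together, which is why the sign of the result tracks the direction of the relationship.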
         
    1. Find the linear correlation coefficient for the following two datasets. The corresponding scatterplots are provided. Round \(r\) to three decimal places.
      1. \(r=\operatorname{corr}(x_2, y_2) \approx\underline{\ \ \ \ \ \ \ \ \ \ }\)
        x2 y2
        1 19
        3 19
        5 16
        7 13.8
        9 10
        10 9
        14 6.9
        15 4
        19 3.2
        20 1.4


        [Figure: scatterplot of the (x2, y2) dataset]

        Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

      2. \(r=\operatorname{corr}(x_3, y_3) \approx\underline{\ \ \ \ \ \ \ \ \ \ }\)
        x3 y3
        1 2
        3 15
        5 5
        7 13.8
        9 8
        14 4
        14 12
        15 3
        19 18
        20 1.4

        [Figure: scatterplot of the (x3, y3) dataset]

        Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

    2. What do you think the linear correlation coefficient (r) measures?

       

       

       

    3. What is the largest possible value of r? What is the smallest possible value for r?

       

       

       

       

       

       

    4. Consider the following scatterplots:
       

      [Figure: three scatterplots]

      Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

       

      1. Without calculating r, decide which two of these scatterplots have an r-value close to 0. Explain your reasoning.

         

         

         

         

         

         

         

         

      2. Do both scatterplots with an r-value close to 0 show a weak relationship? If not, explain why not.

         

         

         

         

         

         

         

         

         

         

         

         

         

       
    5. Consider the following scatterplots:

    [Figure: three scatterplots]

    Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

     

    1. Without calculating r, determine which two of the scatterplots above have an r-value close to 0.92. Explain your reasoning.

       

       

       

       

       

       

       

       

    2. Can variables have a nonlinear relationship when r is close to 1? Explain why you think this. 

       

       

       

       

       

       

       

       

    Characteristics of the Linear Correlation Coefficient

    • r is between -1 and 1 (inclusive). An r-value close to -1 or 1 indicates a strong linear association; an r-value close to 0 indicates a weak linear association.
    • The linear correlation coefficient, r, is a number that describes the direction and strength of a linear association between two variables.
    • The sign of r indicates the direction of a linear association: r is positive for a positive association and negative for a negative association.

    The r-value only gives us reliable information about direction and strength of linear associations and should not be used to quantify patterns of a nonlinear association.
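
The caution above can be checked numerically. The sketch below uses made-up data (not from this lesson) in which y is exactly the square of x, so the relationship is perfectly predictable but not linear. The helper name pearson_r and both datasets are assumptions introduced only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical U-shaped data: y = x^2 for x = -5, ..., 5 (symmetric about x = 0)
x_u = list(range(-5, 6))
y_u = [x ** 2 for x in x_u]

# Hypothetical curved data: y = x^2 for x = 1, ..., 10 (increasing side only)
x_c = list(range(1, 11))
y_c = [x ** 2 for x in x_c]

r_u = pearson_r(x_u, y_u)
r_c = pearson_r(x_c, y_c)

print(round(r_u, 3))  # 0.0: no linear trend, even though y is fully determined by x
print(round(r_c, 3))  # 0.975: close to 1, even though the pattern is a curve
```

In both cases r describes only how well a line summarizes the points, so a scatterplot is still needed to judge the form of the relationship.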

    Correlation and Causation

    1. The scatterplot below shows the rate of individuals using the internet (% of population) and the life expectancy at birth for 15 countries in 2020 [21].

      [Figure: scatterplot of internet usage (% of population) and life expectancy at birth for 15 countries]

      Images are created with the graphing calculator, used with permission from Desmos Studio PBC.

       

      1. In this group of 15 countries, does an increase in internet usage tend to be associated with an increase or decrease in life expectancy? 

         

         

         

         

         

         

      2. The pattern in the scatterplot above indicates a fairly strong, positive, nonlinear relationship. Based on this observation, someone might suggest that one way to improve a country’s life expectancy would be to get more people online. Is this a reasonable conclusion? Explain. 

         

         

         

         

         

         

      3. Is the relationship between internet usage and life expectancy one of cause and effect? Consider the type of study that was performed to collect the data.

         

         

         

         



        In an observational study, a relationship between two variables does not allow us to conclude that one variable causes the other. Cause-and-effect conclusions require an experimental study. Sometimes a third variable, possibly unconsidered, drives changes in both the explanatory and response variables. This additional variable is called a lurking variable.
         
      4. Suggest a possible lurking variable that might explain the relationship between internet usage and life expectancy in the data above.
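
The idea of a lurking variable can be illustrated with a small simulation. The sketch below uses invented, abstract variables (z, x, y) rather than the World Bank data, and the helper name pearson_r, the sample size, and the noise levels are all assumptions. Here z drives both x and y, while x has no direct effect on y.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical lurking variable z (never shown on the scatterplot of x versus y)
z = [rng.uniform(0, 100) for _ in range(200)]

# x and y each depend on z plus independent noise; neither depends on the other
x = [zi + rng.gauss(0, 5) for zi in z]
y = [zi + rng.gauss(0, 5) for zi in z]

r_xy = pearson_r(x, y)
print(round(r_xy, 2))  # strongly positive, although x does not cause y
```

Changing x by hand in this simulation would leave y untouched, which is the sense in which a strong correlation in observational data does not establish cause and effect.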

    Reference

    [21] World Bank Group. (n.d.). World Bank. Accessed July 13, 2022, from https://www.worldbank.org/en/home


    This page titled 9.2: Quantifying Direction and Strength is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Hannah Seidler-Wright.
