10.2: Correlation Coefficient
- Define the correlation coefficient (r) and explain its purpose.
- Describe how the correlation coefficient measures the strength and direction of a linear relationship between two variables.
- Interpret the value of r on a scale from -1 to 1.
- Identify that values of r near -1 or 1 indicate stronger linear relationships.
Correlation describes how two variables move together. If one variable tends to increase when the other increases, they have a positive correlation. If one tends to increase while the other decreases, they have a negative correlation. If neither pattern appears, there is no correlation. Correlation describes association only; it does not show that one variable affects the other. The correlation coefficient is computed to measure the strength and direction of the correlation between two quantitative variables \(x\) and \(y\).
The correlation coefficient, denoted as \(r\), measures the strength and direction of the linear relationship between two variables. Its purpose is to provide a numerical value that quantifies how closely the variables \(x\) and \(y\) are related. The range of \(r\) is from \(-1\) to \(1\).
Correlation and Scatter Plots
When the correlation coefficient \(r\) is near \(-1\), it indicates a strong negative linear relationship. As the x-values on the horizontal axis increase, the y-values on the vertical axis tend to decrease. The closer \(r\) is to \(-1\), the stronger the linear relationship, and the more closely the points of the scatter plot follow a downward-sloping line. An example of a scatter plot with \(r\) close to \(-1\) is presented below.
When the correlation coefficient \(r\) is near \(0\), it indicates little or no linear relationship between \(x\) and \(y\). The scatter plot will have an amorphous shape when \(r\) is close to \(0\), as presented in the image below. Because \(r\) measures only linear association, a value near \(0\) does not rule out a nonlinear relationship.
When the correlation coefficient \(r\) is near \(1\), it indicates a strong positive linear relationship. As the x-values on the horizontal axis increase, the y-values on the vertical axis tend to increase as well. The closer \(r\) is to \(1\), the stronger the linear relationship, and the more closely the points of the scatter plot follow an upward-sloping line. An example of a scatter plot with \(r\) close to \(1\) is presented below.
Correlation Coefficient Formula
\(r = \dfrac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n(\sum x^2)-(\sum x)^2][n(\sum y^2)-(\sum y)^2]}}\)
Where,
- \(n\) = total number of pairs.
- \(\sum x\) = sum of all x values.
- \(\sum y\) = sum of all y values.
- \(\sum x^2\) = sum of all the squares of the x values.
- \(\sum y^2\) = sum of all the squares of the y values.
- \(\sum xy\) = sum of all the products of the corresponding x and y values.
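The formula translates directly into a short program. The sketch below is one possible way to write it in Python; the function name `correlation_coefficient` and the sample data are assumptions made for illustration and are not part of this section.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Compute r from paired data using the sum-based formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")

    n = len(xs)                                  # total number of pairs
    sum_x = sum(xs)                              # sum of all x values
    sum_y = sum(ys)                              # sum of all y values
    sum_x2 = sum(x * x for x in xs)              # sum of the squares of x
    sum_y2 = sum(y * y for y in ys)              # sum of the squares of y
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum of the products xy

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data lying exactly on the line y = 2x.
r_line = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
print(r_line)
```

Note that the denominator is zero when all x values (or all y values) are identical, so \(r\) is undefined for such data and this sketch would raise a `ZeroDivisionError`.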
Professor Martinez is conducting a study to understand the relationship between the number of hours students study per week and their performance on the midterm exam in Math 400, an advanced calculus course at the university. She collects data from 8 randomly selected students in her class. The exam is out of 100 points and time is measured in hours per week. Compute the correlation coefficient for the data set below. Use the formula and round the final answer to three decimal places.
| x: Hours Studied Per Week | y: Midterm Exam Score (out of 100 points) |
|---|---|
| 10 | 51 |
| 10 | 53 |
| 12 | 64 |
| 13 | 68 |
| 14 | 71 |
| 15 | 79 |
| 16 | 84 |
| 20 | 92 |
Table \(\PageIndex{1}\): Hours Studied Per Week and Midterm Exam Score
Solution
- Add three columns to the table for \(xy\), \(x^2\), and \(y^2\). Fill in the columns by performing the required computations.
| \(x\) | \(y\) | \(xy\) | \(x^2\) | \(y^2\) |
|---|---|---|---|---|
| 10 | 51 | 510 | 100 | 2601 |
| 10 | 53 | 530 | 100 | 2809 |
| 12 | 64 | 768 | 144 | 4096 |
| 13 | 68 | 884 | 169 | 4624 |
| 14 | 71 | 994 | 196 | 5041 |
| 15 | 79 | 1185 | 225 | 6241 |
| 16 | 84 | 1344 | 256 | 7056 |
| 20 | 92 | 1840 | 400 | 8464 |
Table \(\PageIndex{2}\): Added Columns Needed for Computation of the Correlation Coefficient.
- Find the sum of each column.
- \(n\) = 8
- \(\sum x\) = 110
- \(\sum y\) = 562
- \(\sum x^2\) = 1590
- \(\sum y^2\) = 40932
- \(\sum xy\) = 8055
- Plug the information into the formula and compute using the order of operations.
\(r = \dfrac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n(\sum x^2)-(\sum x)^2][n(\sum y^2)-(\sum y)^2]}}\)
\(r = \dfrac{8(8055)-(110)(562)}{\sqrt{[8(1590)-(110)^2][8(40932)-(562)^2]}}\)
\(r = \dfrac{64440-61820}{\sqrt{[620][11612]}}\)
\(r = \dfrac{2620}{\sqrt{7199440}}\)
\(r \approx \dfrac{2620}{2683.17722}\)
\(r \approx 0.976\)
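The hand computation can be checked step by step. The following Python sketch mirrors the three steps of the solution (build the extra columns, sum each column, substitute into the formula) using the data from the table; the variable names are assumptions chosen for readability.

```python
from math import sqrt

# Data from the table: hours studied per week and midterm exam score.
hours = [10, 10, 12, 13, 14, 15, 16, 20]
scores = [51, 53, 64, 68, 71, 79, 84, 92]

# Step 1: build the xy, x^2, and y^2 columns.
xy = [x * y for x, y in zip(hours, scores)]
x_sq = [x * x for x in hours]
y_sq = [y * y for y in scores]

# Step 2: sum each column.
n = len(hours)
sum_x, sum_y = sum(hours), sum(scores)
sum_xy, sum_x2, sum_y2 = sum(xy), sum(x_sq), sum(y_sq)

# Step 3: substitute into the formula.
numerator = n * sum_xy - sum_x * sum_y                           # 64440 - 61820
radicand = (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)  # 620 * 11612
r = numerator / sqrt(radicand)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(round(r, 3))
```

Printing the intermediate sums makes it easy to spot which column is wrong if the final value of \(r\) does not match.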
Correlation Coefficient and Technology
Using technology such as a scientific calculator, graphing calculator, Microsoft Excel, or other computational tools is a more efficient way to calculate the correlation coefficient. These methods minimize the risk of errors during calculations and save time by eliminating the need for extensive intermediate computations. Below is an example of how to compute the correlation coefficient using the TI-84+ calculator. All future calculations will be done using the TI-84+ calculator.
Using the data from example 1, compute the correlation coefficient \(r\) using a TI-84+ calculator.
Solution
First, make sure Stat Wizards is turned on:
- Press [MODE].
- Scroll down using the arrow keys until you find [STAT WIZARDS].
- Highlight [ON] using the right arrow key.
- Press [ENTER] to confirm your selection.
Next, enter the data and compute \(r\):
- Press [STAT], make sure that [EDIT] and [1:Edit] are selected, then press [ENTER].
- Enter the x-values in List 1 [\(L_1\)] and the y-values in List 2 [\(L_2\)].
- Press [STAT] again, use the right arrow to select [CALC], use the down arrow to select [8: LinReg(a+bx)], and then press [ENTER].
- Make sure that Xlist is \(L_1\) and Ylist is \(L_2\). Use the down arrow to select [Calculate] and press [ENTER].
- On the output screen, \(r\) is on the last line. Rounded to three decimal places, \(r \approx 0.976\). If \(r\) is not displayed, turn on [STAT DIAGNOSTICS] in the [MODE] menu and repeat the calculation.
Author
"10.2: Correlation Coefficient" by Alfie Swan is licensed under CC BY 4.0
Exercises
- Compute the correlation coefficient for the data set below. Use the formula and round the final answer to three decimal places.
| X: Hours of Sleep on the Night Before the Exam | Y: Points (out of 100) Earned on the Exam |
|---|---|
| 8 | 75 |
| 6 | 86 |
| 3 | 72 |
| 4 | 65 |
| 2 | 68 |
| 0 | 50 |
| 12 | 90 |
| 7 | 98 |
| 10 | 84 |
Scan the QR code or click on it to open the MyOpenMath version of the above question with step-by-step guidance.
MyOpenMath is a free online learning platform designed to support math instruction through automated homework, quizzes, and assessments. You must register for MyOpenMath and sign in to view the question.
- Compute the correlation coefficient for the data set below. Use a graphing/scientific calculator and round the final answer to three decimal places.
| Years (Time) | Car Value ($) |
|---|---|
| 0 | 30,000 |
| 1 | 25,000 |
| 2 | 21,000 |
| 3 | 18,000 |
| 4 | 15,000 |
| 5 | 13,000 |
| 6 | 11,000 |
| 7 | 9,000 |
Figure \(\PageIndex{4}\): Years and Car Value in $
- Answers
If you are an instructor and want the solutions to all the exercise questions for each section, please email Toros Berberyan.




