
3.3.6: \( r^2\), The Coefficient of Determination


    The Regression ANOVA hypothesis test can be used to determine if there is a significant correlation between the independent variable (\(X\)) and the dependent variable (\(Y\)). We now want to investigate the strength of correlation.  

    In the earlier chapter on descriptive statistics, we introduced the correlation coefficient (\(r\)), a value between \(-1\) and \(1\). Values of \(r\) close to 0 meant there was little correlation between the variables, while values closer to \(1\) or \(-1\) represented stronger correlations.

    In practice, most statisticians and researchers prefer to use \(r^{2}\), the coefficient of determination, as a measure of strength, because it represents the proportion or percentage of the variability of \(Y\) that is explained by the variability of \(X\).

    \(r^2\)

    \(r^{2}=\dfrac{SS_{\text{Regression}}}{SS_{\text{Total}}} \qquad 0 \% \leq r^{2} \leq 100 \%\)

    \(r^{2}\) represents the percentage of the variability of \(Y\) that is explained by the variability of \(X\).
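
As a rough illustration of the definition, the sketch below fits a least-squares line and forms \(r^2\) as the ratio of the regression sum of squares to the total sum of squares. The function name `coefficient_of_determination` and the small data set are made up for this illustration and do not come from the text.

```python
def coefficient_of_determination(x, y):
    """Return (ss_regression, ss_total, r_squared) for a simple linear fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of the slope (b1) and intercept (b0)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Fitted values on the regression line
    y_hat = [b0 + b1 * xi for xi in x]

    # SS_Regression: variability of the fitted values about the mean of y
    ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
    # SS_Total: variability of the observed values about the mean of y
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    return ss_regression, ss_total, ss_regression / ss_total


# Hypothetical data, chosen only to keep the arithmetic small
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_reg, ss_tot, r2 = coefficient_of_determination(x, y)
print(f"SS_Regression = {ss_reg:.3f}, SS_Total = {ss_tot:.3f}, r^2 = {r2:.2%}")
```

Because \(SS_{\text{Regression}}\) can never exceed \(SS_{\text{Total}}\) for a least-squares line, the ratio always falls between 0% and 100%.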


    We can also calculate the correlation coefficient (\(r\)) by taking the appropriate square root of \(r^{2}\), depending on whether the estimate of the slope (\(b_1\)) is positive or negative:

    If \(b_{1}>0, r=\sqrt{r^{2}}\)

    If \(b_{1}<0, r=-\sqrt{r^{2}}\)
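
The sign rule can be sketched as a small helper. The name `correlation_from_r_squared` is hypothetical, and the handling of a slope of exactly zero (returning 0) is an assumption added here, since the text only covers positive and negative slopes.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """Recover r from r^2 (given as a proportion in [0, 1]) using the sign of the slope b1."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be a proportion between 0 and 1")

    root = math.sqrt(r_squared)
    if slope > 0:
        return root      # positive slope: take the positive square root
    if slope < 0:
        return -root     # negative slope: take the negative square root
    return 0.0           # assumption: a zero slope corresponds to r = 0


# Hypothetical inputs for illustration
print(correlation_from_r_squared(0.64, slope=2.5))
print(correlation_from_r_squared(0.64, slope=-2.5))
```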

    Example: Rainfall and sales of sunglasses

    For the rainfall data, the coefficient of determination is:

    \(r^{2}=\dfrac{341.422}{380}=89.85 \%\)

    89.85% of the variability of sales of sunglasses is explained by rainfall.

    We can calculate the correlation coefficient (\(r\)) by taking the appropriate square root of \(r^{2}\):

    \(r=-\sqrt{0.8985}=-0.9479\)

    Here we take the negative square root since the slope of the regression line is negative. This shows that there is a strong, negative correlation between sales of sunglasses and rainfall.
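
The arithmetic in this example can be checked with a few lines of Python. The two sums of squares and the negative slope are taken from the example; the variable names are chosen here for illustration only.

```python
import math

ss_regression = 341.422    # regression sum of squares from the example
ss_total = 380.0           # total sum of squares from the example
slope_is_negative = True   # the example states the regression slope is negative

# Coefficient of determination as a proportion
r_squared = ss_regression / ss_total

# Take the square root whose sign matches the slope
r = -math.sqrt(r_squared) if slope_is_negative else math.sqrt(r_squared)

print(f"r^2 = {r_squared:.2%}")
print(f"r   = {r:.4f}")
```

Using the unrounded ratio rather than the rounded value 0.8985 gives the same result to four decimal places.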

     


    3.3.6: \( r^2\), The Correlation of Determination is shared under a CC BY-SA license and was authored, remixed, and/or curated by LibreTexts.
