14.2: Hypothesis Testing in Regression
Regression, like all of our other analyses, tests a null hypothesis about our data. In regression, we are interested in predicting Y scores and explaining variance using a line, the slope of which is what allows us to get closer to our observed scores than the mean of Y can. Thus, our hypotheses concern the slope of the line: the population slope, \(\beta\), which is estimated in the prediction equation by \(b\), the slope calculated from our sample. Specifically, we want to test that the slope is not zero:
\[
\begin{aligned}
H_0&: \text{There is no explanatory relationship between our variables} \\
H_0&: \beta = 0 \\[2.5ex]
H_A&: \text{There is an explanatory relationship between our variables} \\
H_A&: \beta \neq 0
\end{aligned}
\nonumber
\]
A non-zero slope indicates that we can explain values in Y based on X and therefore predict future values of Y based on X. Our alternative hypotheses are analogous to those in correlation: positive relationships have values above zero, negative relationships have values below zero, and two-tailed tests are possible. Just as in ANOVA, we will test the significance of this relationship by comparing the F statistic calculated in our ANOVA table to a critical value from the F distribution table. Let's take a look at an example to see regression in action.
Example: Happiness and Well-being
Researchers are interested in explaining differences in how happy people are based on how healthy people are. They gather data on each of these variables from 18 people and fit a linear regression model to explain the variance. We will follow the four-step hypothesis-testing procedure to see if there is a relationship between these variables that is statistically significant.
Step 1: State the Hypotheses
The null hypothesis in regression states that there is no relationship between our variables. The alternative states that there is a relationship, but because our research description did not explicitly state a direction of the relationship, we will use a non-directional hypothesis.
\[
\begin{aligned}
H_0&: \text{There is no explanatory relationship between health and happiness} \\
H_0&: \beta = 0 \\[2.5ex]
H_A&: \text{There is an explanatory relationship between health and happiness} \\
H_A&: \beta \neq 0 \\[2.5ex]
\end{aligned}
\nonumber
\]
Step 2: Find the Critical Value
Because regression and ANOVA are the same analysis, our critical value for regression will come from the same place: the F distribution table, which uses two types of degrees of freedom. We saw in the ANOVA table that the degrees of freedom for our numerator—the Model line—is always 1 in simple linear regression, and that the denominator degrees of freedom—from the Error line—is N − 2. In this instance, we have 18 people, so our degrees of freedom for the denominator is 16. Going to our F table (a portion of which is shown in Table \(\PageIndex{1}\)), we find that the appropriate critical value for 1 and 16 degrees of freedom is F* = 4.49. (The complete F table can be found in section 16.3.)
Table \(\PageIndex{1}\): Critical values of F at \(\alpha = .05\). Columns give the numerator (Model) degrees of freedom; rows give the denominator (Error) degrees of freedom.
| Denominator df | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 15 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 161.4476 | 199.5000 | 215.7073 | 224.5832 | 230.1619 | 233.9860 | 236.7684 | 238.8827 | 240.5433 | 241.8817 | 243.9060 | 245.9499 | 248.0131 |
| 2 | 18.5128 | 19.0000 | 19.1643 | 19.2468 | 19.2964 | 19.3295 | 19.3532 | 19.3710 | 19.3848 | 19.3959 | 19.4125 | 19.4291 | 19.4458 |
| 3 | 10.1280 | 9.5521 | 9.2766 | 9.1172 | 9.0135 | 8.9406 | 8.8867 | 8.8452 | 8.8123 | 8.7855 | 8.7446 | 8.7029 | 8.6602 |
| 4 | 7.7086 | 6.9443 | 6.5914 | 6.3882 | 6.2561 | 6.1631 | 6.0942 | 6.0410 | 5.9988 | 5.9644 | 5.9117 | 5.8578 | 5.8025 |
| 5 | 6.6079 | 5.7861 | 5.4095 | 5.1922 | 5.0503 | 4.9503 | 4.8759 | 4.8183 | 4.7725 | 4.7351 | 4.6777 | 4.6188 | 4.5581 |
| 6 | 5.9874 | 5.1433 | 4.7571 | 4.5337 | 4.3874 | 4.2839 | 4.2067 | 4.1468 | 4.0990 | 4.0600 | 3.9999 | 3.9381 | 3.8742 |
| 7 | 5.5914 | 4.7374 | 4.3468 | 4.1203 | 3.9715 | 3.8660 | 3.7870 | 3.7257 | 3.6767 | 3.6365 | 3.5747 | 3.5107 | 3.4445 |
| 8 | 5.3177 | 4.4590 | 4.0662 | 3.8379 | 3.6875 | 3.5806 | 3.5005 | 3.4381 | 3.3881 | 3.3472 | 3.2839 | 3.2184 | 3.1503 |
| 9 | 5.1174 | 4.2565 | 3.8625 | 3.6331 | 3.4817 | 3.3738 | 3.2927 | 3.2296 | 3.1789 | 3.1373 | 3.0729 | 3.0061 | 2.9365 |
| 10 | 4.9646 | 4.1028 | 3.7083 | 3.4780 | 3.3258 | 3.2172 | 3.1355 | 3.0717 | 3.0204 | 2.9782 | 2.9130 | 2.8450 | 2.7740 |
| 11 | 4.8443 | 3.9823 | 3.5874 | 3.3567 | 3.2039 | 3.0946 | 3.0123 | 2.9480 | 2.8962 | 2.8536 | 2.7876 | 2.7186 | 2.6464 |
| 12 | 4.7472 | 3.8853 | 3.4903 | 3.2592 | 3.1059 | 2.9961 | 2.9134 | 2.8486 | 2.7964 | 2.7534 | 2.6866 | 2.6169 | 2.5436 |
| 13 | 4.6672 | 3.8056 | 3.4105 | 3.1791 | 3.0254 | 2.9153 | 2.8321 | 2.7669 | 2.7144 | 2.6710 | 2.6037 | 2.5331 | 2.4589 |
| 14 | 4.6001 | 3.7389 | 3.3439 | 3.1122 | 2.9582 | 2.8477 | 2.7642 | 2.6987 | 2.6458 | 2.6022 | 2.5342 | 2.4630 | 2.3879 |
| 15 | 4.5431 | 3.6823 | 3.2874 | 3.0556 | 2.9013 | 2.7905 | 2.7066 | 2.6408 | 2.5876 | 2.5437 | 2.4753 | 2.4034 | 2.3275 |
| 16 | 4.4940 | 3.6337 | 3.2389 | 3.0069 | 2.8524 | 2.7413 | 2.6572 | 2.5911 | 2.5377 | 2.4935 | 2.4247 | 2.3522 | 2.2756 |
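If you have Python available, you can also obtain this critical value directly rather than reading it from the printed table. The short sketch below is illustrative only and assumes the SciPy library is installed; it is not part of the hand-calculation procedure.

```python
# Sketch (assumes SciPy is installed): look up the critical F value
# for alpha = .05 with 1 and 16 degrees of freedom.
from scipy import stats

alpha = 0.05
df_model, df_error = 1, 16                      # numerator and denominator df
f_crit = stats.f.ppf(1 - alpha, df_model, df_error)
print(round(f_crit, 2))                         # ≈ 4.49, matching the table
```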
Step 3: Calculate the Test Statistic and Effect Size
The process of calculating the test statistic for regression first involves computing the parameter estimates for the line of best fit. To do this, we first calculate the means, standard deviations, and sum of products for our X and Y variables, as shown in Table \(\PageIndex{2}\).
Table \(\PageIndex{2}\): Computations for the health (X) and happiness (Y) scores.
| \(X\) | \(X-M_X\) | \((X-M_X)^2\) | \(Y\) | \(Y-M_Y\) | \((Y-M_Y)^2\) | \((X-M_X)(Y-M_Y)\) |
|---|---|---|---|---|---|---|
| 17.65 | -2.13 | 4.53 | 10.36 | −7.10 | 50.37 | 15.10 |
| 16.99 | -2.79 | 7.80 | 16.38 | −1.08 | 1.16 | 3.01 |
| 18.30 | -1.48 | 2.18 | 15.23 | −2.23 | 4.97 | 3.29 |
| 18.28 | -1.50 | 2.25 | 14.26 | −3.19 | 10.18 | 4.79 |
| 21.89 | 2.11 | 4.47 | 17.71 | 0.26 | 0.07 | 0.55 |
| 22.61 | 2.83 | 8.01 | 16.47 | −0.98 | 0.97 | −2.79 |
| 17.42 | -2.36 | 5.57 | 16.89 | −0.56 | 0.32 | 1.33 |
| 20.35 | 0.57 | 0.32 | 18.74 | 1.29 | 1.66 | 0.73 |
| 18.89 | -0.89 | 0.79 | 21.96 | 4.50 | 20.26 | −4.00 |
| 18.63 | -1.15 | 1.32 | 17.57 | 0.11 | 0.01 | −0.13 |
| 19.67 | -0.11 | 0.01 | 18.12 | 0.66 | 0.44 | −0.08 |
| 18.39 | -1.39 | 1.94 | 12.08 | −5.37 | 28.87 | 7.48 |
| 22.48 | 2.71 | 7.32 | 17.11 | −0.34 | 0.12 | −0.93 |
| 23.25 | 3.47 | 12.07 | 21.66 | 4.21 | 17.73 | 14.63 |
| 19.91 | 0.13 | 0.02 | 17.86 | 0.40 | 0.16 | 0.05 |
| 18.21 | -1.57 | 2.45 | 18.49 | 1.03 | 1.07 | −1.62 |
| 23.65 | 3.87 | 14.99 | 22.13 | 4.67 | 21.82 | 18.08 |
| 19.45 | -0.33 | 0.11 | 21.17 | 3.72 | 13.82 | −1.22 |
| \(\Sigma X = 356.02\) | \(\Sigma(X-M_X) = 0.00\) | \(\Sigma(X-M_X)^2 = 76.14\) | \(\Sigma Y = 314.18\) | \(\Sigma(Y-M_Y) = 0.00\) | \(\Sigma(Y-M_Y)^2 = 173.99\) | \(\Sigma(X-M_X)(Y-M_Y) = 58.29\) |
From the raw data in our X and Y columns, we find that the means are \(M_X=19.78\) and \(M_Y=17.45\). The deviation scores for each variable sum to zero, so all is well there. The sums of squares for X and Y ultimately lead us to standard deviations of \(s_X=2.12\) and \(s_Y=3.20\). Finally, our sum of products is 58.29, which gives us a covariance of \(\text{cov}_{XY}=3.43\), so we know our relationship will be positive. This is all the information we need for our equations for the line of best fit.
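For readers who want to check these computations with software, here is a minimal Python sketch that reproduces the summary statistics from the raw scores in Table \(\PageIndex{2}\). It uses only built-in functions; small rounding differences from the hand-rounded table entries are expected.

```python
# Sketch: reproduce the summary statistics from the raw health (X) and
# happiness (Y) scores listed in Table 2.
health = [17.65, 16.99, 18.30, 18.28, 21.89, 22.61, 17.42, 20.35, 18.89,
          18.63, 19.67, 18.39, 22.48, 23.25, 19.91, 18.21, 23.65, 19.45]
happy  = [10.36, 16.38, 15.23, 14.26, 17.71, 16.47, 16.89, 18.74, 21.96,
          17.57, 18.12, 12.08, 17.11, 21.66, 17.86, 18.49, 22.13, 21.17]

n = len(health)
mx = sum(health) / n                                        # ≈ 19.78
my = sum(happy) / n                                         # ≈ 17.45
ss_x = sum((x - mx) ** 2 for x in health)                   # ≈ 76.14
ss_y = sum((y - my) ** 2 for y in happy)                    # ≈ 173.99
sp = sum((x - mx) * (y - my) for x, y in zip(health, happy))  # ≈ 58.29
cov_xy = sp / (n - 1)                                       # ≈ 3.43
sd_x = (ss_x / (n - 1)) ** 0.5                              # ≈ 2.12
sd_y = (ss_y / (n - 1)) ** 0.5                              # ≈ 3.20
```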
First, we must calculate the slope of the line:
\[\Large b = \frac{SP}{SS_X} = \frac{58.29}{76.14} = 0.77 \nonumber \]
This means that as X increases by 1 unit, our predicted value of Y increases by 0.77. In terms of our problem, as health increases by 1, predicted happiness goes up by 0.77, which is a positive relationship. Next, we use the slope, along with the means of each variable, to compute the intercept:
\[
\Large
\begin{aligned}
a &= M_Y - bM_X \\
&= 17.45 - (0.77)(19.78) \\
&= 17.45 - 15.23 \\
&= 2.22
\end{aligned}
\nonumber
\]
For this particular problem (and most regressions), the intercept is not an important or interpretable value, so we will not read into it further. Now that we have all of our parameters estimated, we can give the full equation for our line of best fit:
\[\Large \hat{Y}=2.22+0.77X \nonumber \]
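As an optional check, the following Python sketch rebuilds the line of best fit from the summary values above and wraps it in a small prediction function. The function name `predict` and the example health score of 20 are illustrative choices, not values from the text.

```python
# Sketch: build the regression line from the summary statistics and use it
# to predict a happiness score for a given health score.
sp, ss_x = 58.29, 76.14        # sum of products and sum of squares for X
mx, my = 19.78, 17.45          # means of health (X) and happiness (Y)

b = round(sp / ss_x, 2)        # slope, rounded as in the hand calculation: 0.77
a = round(my - b * mx, 2)      # intercept: ≈ 2.22 (full precision shifts this slightly)

def predict(health_score):
    """Predicted happiness from the line of best fit."""
    return a + b * health_score

print(predict(20))             # predicted happiness for a hypothetical health score of 20
```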
We can plot this relationship in a scatter plot and overlay our line onto it, as shown in Figure \(\PageIndex{1}\).
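A plot along these lines can be recreated with a few lines of Python; the sketch below assumes matplotlib is installed and reuses the raw scores from Table \(\PageIndex{2}\) together with the fitted equation.

```python
# Sketch (assumes matplotlib is installed): scatter plot of the raw scores
# with the estimated line of best fit overlaid.
import matplotlib.pyplot as plt

health = [17.65, 16.99, 18.30, 18.28, 21.89, 22.61, 17.42, 20.35, 18.89,
          18.63, 19.67, 18.39, 22.48, 23.25, 19.91, 18.21, 23.65, 19.45]
happy  = [10.36, 16.38, 15.23, 14.26, 17.71, 16.47, 16.89, 18.74, 21.96,
          17.57, 18.12, 12.08, 17.11, 21.66, 17.86, 18.49, 22.13, 21.17]

plt.scatter(health, happy)
xs = [min(health), max(health)]
plt.plot(xs, [2.22 + 0.77 * x for x in xs])   # line of best fit from above
plt.xlabel("Health")
plt.ylabel("Happiness")
plt.show()
```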
Completing the ANOVA Summary Table for Regression
- We can use the line equation to find a predicted value for each observation and use those predictions to calculate our sums of squares for the model and the error. This is tedious to do by hand, so we will let statistical software do the heavy lifting, which gives us \(SS_\text{Model}=44.62\) and \(SS_\text{Error}=129.37\).
- Now that we have these, we can fill in the rest of the ANOVA table. We already found our degrees of freedom in Step 2.
- Our Total line for SS and df is always the sum of the other two lines, giving us \(SS_\text{Total}=173.99\) and \(df_\text{Total}=17\).
- Our mean squares column is only calculated for the model and error lines and is always our SS divided by our df, which are \(\frac{SS_M}{df_M}=\frac{44.62}{1}=44.62\) and \(\frac{SS_E}{df_E}=\frac{129.37}{16}=8.09\).
- Finally, our F statistic is the ratio of the mean squares, \(\frac{MS_M}{MS_E}=\frac{44.62}{8.09}=5.52\). This gives us an obtained F statistic of 5.52, which we will use to test our hypothesis.
- The completed ANOVA table is shown in Table \(\PageIndex{3}\).
Table \(\PageIndex{3}\): Completed ANOVA table for the regression of happiness on health.
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Model | 44.62 | 1 | 44.62 | 5.52 |
| Error | 129.37 | 16 | 8.09 | |
| Total | 173.99 | 17 | | |
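As a check on the arithmetic in the table above, the following Python sketch fills in the mean squares, F statistic, and variance explained from the two sums of squares reported by the software. SciPy is assumed to be available only for the p-value, which should come out below .05, consistent with the decision we make in Step 4.

```python
# Sketch: complete the ANOVA table from SS_Model and SS_Error.
from scipy import stats

ss_model, ss_error = 44.62, 129.37
df_model, df_error = 1, 16

ss_total = ss_model + ss_error          # 173.99
ms_model = ss_model / df_model          # 44.62
ms_error = ss_error / df_error          # ≈ 8.09
f_obtained = ms_model / ms_error        # ≈ 5.52
r_squared = ss_model / ss_total         # ≈ .26 (effect size, discussed next)
p_value = stats.f.sf(f_obtained, df_model, df_error)   # ≈ .03, below alpha = .05
```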
Effect Size in Regression
Alongside the test statistic, we should also calculate an effect size. In regression, our effect size is variance explained, just as it was in ANOVA. Instead of using \(\eta^2\) to represent this, we use \(R^2\), as we saw in correlation, which is yet more evidence that all of these are the same analysis. (Note that in regression analysis \(R^2\) is typically capitalized, although for simple linear regression it represents the same value as the \(r^2\) we used in correlation.) Variance explained is still the ratio of \(SS_M\) to \(SS_T\):
\[\Large R^2 = \frac{SS_M}{SS_T} = \frac{44.62}{173.99} = .26 \nonumber \]
We are explaining 26% of the variance in happiness based on health, which is a large effect size (\(R^2\) uses the same effect-size cutoffs as \(\eta^2\)).
Step 4: Make the Decision
We now have everything we need to make our final decision. Our obtained test statistic was F = 5.52 and our critical value was F* = 4.49. Since our obtained test statistic is greater than our critical value, we can reject the null hypothesis.
Reject \(H_0\). Based on our sample of 18 people, we can predict levels of happiness based on how healthy someone is, and the effect size was large, F(1, 16) = 5.52, p < .05, \(R^2 = .26\).
Figure \(\PageIndex{2}\) shows the output from JASP for this example.
Accuracy in Prediction
We found a large, statistically significant relationship between our variables, which is what we hoped for. However, if we want to use our estimated line of best fit for future prediction, we will also want to know how precise or accurate our predicted values are. What we want to know is the average distance from our predictions to our actual observed values, or the average size of the residual (\(Y-\hat{Y}\)). The average size of the residual is known by a specific name: the standard error of the estimate (\(s_{(Y-\hat{Y})}\)), which is given by the formula
\[\Large s_{(Y-\hat{Y})} = \sqrt{\frac{\sum{(Y-\hat{Y})^2}}{N-2}} \nonumber \]
This formula is almost identical to our standard deviation formula, and it follows the same logic. We square our residuals, add them up, and then divide by the degrees of freedom. Although this sounds like a long process, we already have the sum of the squared residuals in our ANOVA table! In fact, the value under the square root sign is just the SSE divided by the dfE, which is called the mean squared error, or MSE:
\[\Large s_{(Y-\hat{Y})} = \sqrt{\frac{\sum{(Y-\hat{Y})^2}}{N-2}} = \sqrt{MS_E} \nonumber \]
For our example:
\[\Large s_{(Y-\hat{Y})} = \sqrt{\frac{129.37}{16}} = \sqrt{8.09} = 2.84 \nonumber \]
So, on average, our predictions are just under 3 points away from our actual values. There are no specific cutoffs or guidelines for how big our standard error of the estimate can or should be; it is highly dependent on both our sample size and the scale of our original Y variable, so expert judgment should be used. In this case, the estimate is not that far off and can be considered reasonably precise.
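Because the standard error of the estimate is just the square root of \(MS_E\), it takes only a couple of lines to verify; the short Python sketch below is illustrative.

```python
# Sketch: standard error of the estimate as the square root of MS_E.
ss_error, df_error = 129.37, 16
ms_error = ss_error / df_error          # ≈ 8.09
s_estimate = ms_error ** 0.5            # ≈ 2.84 happiness points, on average
print(round(s_estimate, 2))
```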