
Simple Linear Regression (with one predictor)



    \(X\) and \(Y\) are the predictor and response variables, respectively. Fit the model
    \[ Y_i = \beta_0+\beta_1X_i+\epsilon_i, \quad i = 1,2,\ldots,n, \]
    where \( \epsilon_1 ,\ldots, \epsilon_n \) are uncorrelated, with \( E(\epsilon_i)=0 \) and \( \mathrm{Var}(\epsilon_i)=\sigma^2 \) for all \(i\).
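    The model above can be simulated directly. The sketch below uses made-up parameter values (\(\beta_0 = 2\), \(\beta_1 = 0.5\), \(\sigma = 1\)) purely for illustration:

```python
import random

# Simulate Y_i = beta0 + beta1*X_i + eps_i for i = 1, ..., n,
# with hypothetical (made-up) parameter values.
random.seed(0)

n = 100
beta0, beta1, sigma = 2.0, 0.5, 1.0               # illustrative true parameters
X = [random.uniform(0, 10) for _ in range(n)]     # predictor values
eps = [random.gauss(0, sigma) for _ in range(n)]  # errors: mean 0, variance sigma^2
Y = [beta0 + beta1 * x + e for x, e in zip(X, eps)]
```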


    Look at the scatter plot of \(Y\) (vertical axis) versus \(X\) (horizontal axis). Consider narrow vertical strips around the different values of \(X\):
    1. Means (measure of center) of the points falling in the vertical strips lie (approximately) on a straight line with slope \(\beta_1\) and intercept \(\beta_0\).
    2. Standard deviations (measure of spread) of the points falling in each vertical strip are (roughly) the same.

    Estimation of \(\beta_0 \) and \( \beta_1 \)

    We employ the method of least squares to estimate \(\beta_0\) and \(\beta_1\); that is, we minimize the sum of squared errors: \(Q(\beta_0,\beta_1) = \sum_{i=1}^n(Y_i-\beta_0-\beta_1X_i)^2\). This involves differentiating \(Q(\beta_0,\beta_1)\) with respect to the parameters \(\beta_0\) and \(\beta_1\) and setting the derivatives to zero, which gives the normal equations: \[nb_0 + b_1\sum_{i=1}^nX_i = \sum_{i=1}^nY_i\] \[b_0\sum_{i=1}^nX_i+b_1\sum_{i=1}^nX_i^2 = \sum_{i=1}^nX_iY_i\] Solving these equations, we have: \[b_1=\frac{\sum_{i=1}^nX_iY_i-n\overline{X}\,\overline{Y}}{\sum_{i=1}^nX_i^2-n\overline{X}^2} = \frac{\sum_{i=1}^n(X_i-\overline{X})(Y_i-\overline{Y})}{\sum_{i=1}^n(X_i-\overline{X})^2}, \quad b_0 = \overline{Y}-b_1\overline{X}.\] \(b_0\) and \(b_1\) are the estimates of \(\beta_0\) and \(\beta_1\), respectively, and are sometimes denoted \(\widehat\beta_0\) and \(\widehat\beta_1\).
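    These closed-form solutions translate directly into code. The sketch below computes \(b_0\) and \(b_1\) on a small toy data set (the values are hypothetical) and checks that the two equivalent forms of \(b_1\) agree:

```python
# Least-squares estimates b0 and b1 from the closed-form solution
# of the normal equations, on toy data (hypothetical values).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 2.9, 3.6, 4.8, 5.1]
n = len(X)

Xbar = sum(X) / n
Ybar = sum(Y) / n

# b1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar

# The equivalent "computational" form gives the same slope:
b1_alt = (sum(x * y for x, y in zip(X, Y)) - n * Xbar * Ybar) / \
         (sum(x * x for x in X) - n * Xbar ** 2)
assert abs(b1 - b1_alt) < 1e-9
```

    For this toy data, \(\overline{X} = 3\), \(\overline{Y} = 3.7\), and the estimates work out to \(b_1 = 0.79\) and \(b_0 = 1.33\).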


    The fitted regression line is given by the equation: \[\widehat{Y} = b_0 + b_1X\] and is used to predict the value of \(Y\) given a value of \(X\).
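    Prediction is then a single evaluation of the fitted line. Here the fitted coefficients are hypothetical values carried over from the toy example above:

```python
# Point prediction from the fitted line Yhat = b0 + b1*X,
# using hypothetical fitted coefficients.
b0, b1 = 1.33, 0.79
x_new = 3.5
y_hat = b0 + b1 * x_new   # predicted Y at X = 3.5
```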


    Residuals

    The residuals are the quantities \(e_i = Y_i - \widehat{Y}_i = Y_i - (b_0 + b_1X_i)\), where \(\widehat{Y}_i = b_0 + b_1X_i\) is the fitted value. Since \(\epsilon_i = Y_i - \beta_0 - \beta_1X_i\), the \(e_i\)'s estimate the \(\epsilon_i\)'s. Some properties of the regression line and residuals are:
    1. \(\sum_{i}e_i = 0\).
    2. \(\sum_{i}e_i^2 \leq \sum_{i}(Y_i - u_0 - u_1X_i)^2\) for any \((u_0, u_1)\) (with equality when \((u_0, u_1)\) = \((b_0, b_1)\)).
    3. \(\sum_{i}Y_i = \sum_{i}\widehat{Y}_i\).
    4. \(\sum_{i}X_ie_i = 0\).
    5. \(\sum_{i}\widehat{Y}_ie_i = 0\).
    6. The regression line passes through the point \((\overline{X},\overline{Y})\).
    7. The slope \(b_1\) of the regression line can be expressed as \(b_1 = r_{XY}\frac{s_Y}{s_X}\), where \(r_{XY}\) is the correlation coefficient between \(X\) and \(Y\), and \(s_X\) and \(s_Y\) are the standard deviations of \(X\) and \(Y\).
    The error sum of squares, denoted \(SSE\), is given by \[SSE = \sum_{i=1}^ne_i^2 = \sum_{i=1}^n(Y_i - \overline{Y})^2 - b_1^2\sum_{i=1}^n(X_i-\overline{X})^2.\]
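    The residual properties and the \(SSE\) identity above can be verified numerically. The sketch below fits the line on toy data (illustrative values) and checks properties 1, 4, and 5 along with the two expressions for \(SSE\):

```python
# Numerical check of the residual properties and the SSE identity,
# on toy data (illustrative values).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 2.9, 3.6, 4.8, 5.1]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar
Yhat = [b0 + b1 * x for x in X]
e = [y - yh for y, yh in zip(Y, Yhat)]                       # residuals

assert abs(sum(e)) < 1e-9                                    # property 1: sum e_i = 0
assert abs(sum(x * ei for x, ei in zip(X, e))) < 1e-9        # property 4: sum X_i e_i = 0
assert abs(sum(yh * ei for yh, ei in zip(Yhat, e))) < 1e-9   # property 5: sum Yhat_i e_i = 0

SSE = sum(ei ** 2 for ei in e)
SSE_alt = sum((y - Ybar) ** 2 for y in Y) - \
          b1 ** 2 * sum((x - Xbar) ** 2 for x in X)
assert abs(SSE - SSE_alt) < 1e-9                             # identity from the text
```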

    Estimation of \(\sigma^2\)

    It can be shown that \(E(SSE) = (n-2)\sigma^2.\) Therefore, \(\sigma^2\) is estimated by the mean squared error, i.e., \(MSE = \frac{SSE}{n-2}.\) This also justifies the statement that the degrees of freedom of the errors is \(n-2\), which is the sample size \((n)\) minus the number of regression coefficients (\(\beta_0\) and \(\beta_1\)) being estimated.
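    Continuing with the same toy data (illustrative values), the estimate of \(\sigma^2\) is obtained by dividing \(SSE\) by the \(n-2\) error degrees of freedom:

```python
# Estimating sigma^2 by MSE = SSE / (n - 2), on toy data (illustrative).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 2.9, 3.6, 4.8, 5.1]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar
e = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

SSE = sum(ei ** 2 for ei in e)
MSE = SSE / (n - 2)   # n - 2 degrees of freedom: beta0 and beta1 are estimated
```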


    • Debashis Paul (UCD)
    • Scott Brunstein (UCD)

    This page titled Simple Linear Regression (with one predictor) is shared under a not declared license and was authored, remixed, and/or curated by Debashis Paul.
