
Simple Linear Regression (with one predictor)


\(X\) and \(Y\) are the predictor and response variables, respectively. Fit the model,
\[ Y_i = \beta_0+\beta_1X_i+\epsilon_i, \qquad i = 1,2,\ldots,n \]
where \( \epsilon_1,\ldots,\epsilon_n \) are uncorrelated, with \( E(\epsilon_i)=0 \) and \( \mathrm{Var}(\epsilon_i)=\sigma^2 \) for each \(i\).


Look at the scatter plot of \(Y\) (vertical axis) versus \(X\) (horizontal axis). Consider narrow vertical strips around the different values of \(X\): 
  1. Means (measure of center) of the points falling in the vertical strips lie (approximately) on a straight line with slope \(\beta_1\) and intercept \(\beta_0\).
  2. Standard deviations (measure of spread) of the points falling in each vertical strip are (roughly) the same.

Estimation of \(\beta_0 \) and \( \beta_1 \)

We employ the method of least squares to estimate \(\beta_0\) and \(\beta_1\).
This means we minimize the sum of squared errors: \(Q(\beta_0,\beta_1) = \sum_{i=1}^n(Y_i-\beta_0-\beta_1X_i)^2\).
This involves differentiating \(Q(\beta_0,\beta_1)\) with respect to the parameters \(\beta_0\) and \(\beta_1\) and setting the derivatives to zero. This gives us the normal equations:
\[nb_0 + b_1\sum_{i=1}^nX_i = \sum_{i=1}^nY_i\]
\[b_0\sum_{i=1}^nX_i+b_1\sum_{i=1}^nX_i^2 = \sum_{i=1}^nX_iY_i\]
Solving these equations, we have:
\[b_1=\frac{\sum_{i=1}^nX_iY_i-n\overline{X}\,\overline{Y}}{\sum_{i=1}^nX_i^2-n\overline{X}^2} = \frac{\sum_{i=1}^n(X_i-\overline{X})(Y_i-\overline{Y})}{\sum_{i=1}^n(X_i-\overline{X})^2}, \qquad b_0 = \overline{Y}-b_1\overline{X}\]
\(b_0\) and \(b_1\) are the estimates of \(\beta_0\) and \(\beta_1\), respectively, and are sometimes denoted as \(\widehat\beta_0\) and \(\widehat\beta_1\).
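As a minimal sketch of these closed-form estimates, the following computes \(b_1\) and \(b_0\) for a small made-up data set (the values of `X` and `Y` are hypothetical, not from the text):

```python
# Hypothetical sample data: X = predictor, Y = response.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(X)
Xbar = sum(X) / n
Ybar = sum(Y) / n

# Slope: b1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
Sxy = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y))
Sxx = sum((x - Xbar) ** 2 for x in X)
b1 = Sxy / Sxx

# Intercept: b0 = Ybar - b1 * Xbar
b0 = Ybar - b1 * Xbar

print(b0, b1)

# The fitted line Yhat = b0 + b1 * X can then predict Y at a new X, say X = 6.
y_new = b0 + b1 * 6.0
print(y_new)
```

For these data the point estimates come out to \(b_1 = 1.96\) and \(b_0 = 0.14\), so the fitted line is \(\widehat{Y} = 0.14 + 1.96X\).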


The fitted regression line is given by the equation:
\[\widehat{Y} = b_0 + b_1X\]
and is used to predict the value of \(Y\) given a value of \(X\).


Residuals

The residuals are the quantities \(e_i = Y_i - \widehat{Y}_i = Y_i - (b_0 + b_1X_i)\), where \(\widehat{Y}_i = b_0 + b_1X_i\) is the \(i\)th fitted value. Since \(\epsilon_i = Y_i - \beta_0 - \beta_1X_i\), the residuals \(e_i\) serve as estimates of the errors \(\epsilon_i\). Some properties of the regression line and residuals are:
  1. \(\sum_{i}e_i = 0\).
  2. \(\sum_{i}e_i^2 \leq \sum_{i}(Y_i - u_0 - u_1X_i)^2\) for any \((u_0, u_1)\), with equality when \((u_0, u_1) = (b_0, b_1)\).
  3. \(\sum_{i}Y_i = \sum_{i}\widehat{Y}_i\).
  4. \(\sum_{i}X_ie_i = 0\).
  5. \(\sum_{i}\widehat{Y}_ie_i = 0\).
  6. The regression line passes through the point \((\overline{X},\overline{Y})\).
  7. The slope \(b_1\) of the regression line can be expressed as \(b_1 = r_{XY}\frac{s_Y}{s_X}\), where \(r_{XY}\) is the correlation coefficient between \(X\) and \(Y\), and \(s_X\) and \(s_Y\) are the sample standard deviations of \(X\) and \(Y\).
The error sum of squares, denoted \(SSE\), is given by
\[SSE = \sum_{i=1}^ne_i^2 = \sum_{i=1}^n(Y_i - \overline{Y})^2 - b_1^2\sum_{i=1}^n(X_i-\overline{X})^2.\]
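The residual properties and the \(SSE\) identity above can be checked numerically. This sketch fits the line to a small hypothetical data set and verifies properties 1, 4, and 5 together with the \(SSE\) decomposition (the data values are made up for illustration):

```python
# Small hypothetical data set.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

# Least-squares estimates (closed form).
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar

Yhat = [b0 + b1 * x for x in X]          # fitted values
e = [y - yh for y, yh in zip(Y, Yhat)]   # residuals

# Property 1: residuals sum to zero.
print(abs(sum(e)) < 1e-9)
# Property 4: residuals are orthogonal to the predictor values.
print(abs(sum(x * ei for x, ei in zip(X, e))) < 1e-9)
# Property 5: residuals are orthogonal to the fitted values.
print(abs(sum(yh * ei for yh, ei in zip(Yhat, e))) < 1e-9)

# SSE identity: sum e_i^2 = sum (Y_i - Ybar)^2 - b1^2 * sum (X_i - Xbar)^2
SSE = sum(ei ** 2 for ei in e)
SST = sum((y - Ybar) ** 2 for y in Y)
Sxx = sum((x - Xbar) ** 2 for x in X)
print(abs(SSE - (SST - b1 ** 2 * Sxx)) < 1e-9)
```

All four checks print `True`, up to floating-point rounding.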

Estimation of \(\sigma^2\) 

It can be shown that \(E(SSE) = (n-2)\sigma^2\). Therefore, \(\sigma^2\) is estimated by the mean squared error, \(MSE = \frac{SSE}{n-2}\). This also justifies the statement that the degrees of freedom of the errors is \(n-2\): the sample size \(n\) minus the number of regression coefficients (\(\beta_0\) and \(\beta_1\)) being estimated.
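Continuing the hypothetical example, \(MSE\) is obtained by dividing \(SSE\) by \(n-2\) rather than \(n\), since two coefficients were estimated from the data:

```python
# Small hypothetical data set.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar

# SSE from the residuals, then MSE with n - 2 degrees of freedom.
SSE = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
MSE = SSE / (n - 2)   # divide by n - 2: two coefficients (b0, b1) were estimated
print(MSE)
```

\(\sqrt{MSE}\) then estimates \(\sigma\), the common standard deviation of the errors.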


  • Debashis Paul (UCD)
  • Scott Brunstein (UCD)