# Simple Linear Regression (with one predictor)

### Model

\(X\) and \(Y\) are the predictor and response variables, respectively. Fit the model,

\[ Y_i = \beta_0+\beta_1X_i+\epsilon_i, \qquad i = 1,2,...,n \]

where \( \epsilon_1 ,..., \epsilon_n \) are **uncorrelated**, with \( E(\epsilon_i)=0 \) and \( \mathrm{Var}(\epsilon_i)=\sigma^2 \) for every \(i\).

### Interpretation

Look at the scatter plot of \(Y\) (vertical axis) versus \(X\) (horizontal axis). Consider narrow vertical strips around the different values of \(X\):

- Means (measure of center) of the points falling in the vertical strips lie (approximately) on a straight line with slope \(\beta_1\) and intercept \(\beta_0\).
- Standard deviations (measure of spread) of the points falling in each vertical strip are (roughly) the same.
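
The strip picture can be checked on simulated data. The sketch below is only an illustration: the parameter values, the choice of five distinct \(X\) values (each acting as one strip), and the Gaussian errors are all assumptions made for the example, not part of the model statement.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical "true" parameters, chosen only for illustration.
beta0, beta1, sigma = 1.0, 2.0, 0.5

# Each distinct X value plays the role of one narrow vertical strip.
x_values = [1, 2, 3, 4, 5]
points_per_strip = 2000

strip_summary = {}
for x in x_values:
    # Gaussian errors are an assumption; the model itself only requires
    # mean 0, common variance sigma^2, and no correlation.
    ys = [beta0 + beta1 * x + random.gauss(0.0, sigma)
          for _ in range(points_per_strip)]
    strip_summary[x] = (statistics.mean(ys), statistics.stdev(ys))

for x, (center, spread) in strip_summary.items():
    line_value = beta0 + beta1 * x
    print(f"X = {x}: strip mean = {center:.3f} "
          f"(line gives {line_value:.3f}), strip SD = {spread:.3f}")
```

In each strip the printed mean should sit close to \(\beta_0 + \beta_1X\), and the printed standard deviations should all be close to \(\sigma\).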

### Estimation of \(\beta_0 \) and \( \beta_1 \)

We employ the method of least squares to estimate \(\beta_0\) and \(\beta_1\).

That is, we minimize the sum of squared errors: \(Q(\beta_0,\beta_1) = \sum_{i=1}^n(Y_i-\beta_0-\beta_1X_i)^2\).

This involves differentiating \(Q(\beta_0,\beta_1)\) with respect to the *parameters* \(\beta_0\) and \(\beta_1\) and setting the derivatives to zero. This gives us the **normal equations:**

\[nb_0 + b_1\sum_{i=1}^nX_i = \sum_{i=1}^nY_i\]

\[b_0\sum_{i=1}^nX_i+b_1\sum_{i=1}^nX_i^2 = \sum_{i=1}^nX_iY_i\]
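
For reference, the two partial derivatives of \(Q\) that lead to the normal equations are:

\[\frac{\partial Q}{\partial \beta_0} = -2\sum_{i=1}^n(Y_i-\beta_0-\beta_1X_i), \qquad \frac{\partial Q}{\partial \beta_1} = -2\sum_{i=1}^nX_i(Y_i-\beta_0-\beta_1X_i)\]

Setting both to zero at \((\beta_0,\beta_1) = (b_0,b_1)\), dividing by \(-2\), and moving the terms in \(Y_i\) to the right-hand side gives the two normal equations.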

Solving these equations, we have:

\[b_1=\frac{\sum_{i=1}^nX_iY_i-n\overline{X}\,\overline{Y}}{\sum_{i=1}^nX_i^2-n\overline{X}^2} = \frac{\sum_{i=1}^n(X_i-\overline{X})(Y_i-\overline{Y})}{\sum_{i=1}^n(X_i-\overline{X})^2}, \qquad b_0 = \overline{Y}-b_1\overline{X}\]
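
A minimal sketch of these formulas in plain Python follows. The function names `least_squares` and `slope_raw_sums` and the five data points are made up for illustration. The first function uses the centered form of the slope, and the second uses the raw-sum form so the two expressions can be compared.

```python
def least_squares(x, y):
    """Return (b0, b1) for the least squares line through (x[i], y[i])."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered form: sum (X_i - X_bar)(Y_i - Y_bar) / sum (X_i - X_bar)^2.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("all X values are equal, so the slope is undefined")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def slope_raw_sums(x, y):
    """Slope from the raw-sum form of the same formula."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    denominator = sum(xi ** 2 for xi in x) - n * x_bar ** 2
    return numerator / denominator


# Made-up data, used only to exercise the formulas.
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]

b0, b1 = least_squares(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # b0 = 0.22, b1 = 1.96
```

The centered form is usually the safer one numerically, because the raw-sum form subtracts two large, nearly equal quantities when the \(X_i\) are far from zero.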

\(b_0\) and \(b_1\) are the *estimates* of \(\beta_0\) and \(\beta_1\), respectively, and are sometimes denoted as \(\widehat\beta_0\) and \(\widehat\beta_1\).

### Prediction

The **fitted regression line** is given by the equation:

\[\widehat{Y} = b_0 + b_1X\]

and is used to predict the value of \(Y\) given a value of \(X\).
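
As a small illustration, prediction is just evaluation of the fitted line. The helper name `make_predictor` and the coefficient values below are hypothetical; in practice \(b_0\) and \(b_1\) would come from a least squares fit.

```python
def make_predictor(b0, b1):
    """Return a function that evaluates the fitted line Y_hat = b0 + b1 * X."""
    def predict(x):
        return b0 + b1 * x
    return predict


# Hypothetical estimates, assumed to come from an earlier fit.
predict = make_predictor(b0=0.22, b1=1.96)

y_hat = predict(6.0)
print(f"Predicted Y at X = 6: {y_hat:.2f}")  # Predicted Y at X = 6: 11.98
```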

### Residuals

These are the quantities \(e_i = Y_i - \widehat{Y}_i = Y_i - (b_0 + b_1X_i)\), where \(\widehat{Y}_i = b_0 + b_1X_i\). Note that \(\epsilon_i = Y_i - \beta_0 - \beta_1X_i\). This means that the residuals \(e_i\) estimate the errors \(\epsilon_i\). Some properties of the regression line and residuals are:

- \(\sum_{i}e_i = 0\).
- \(\sum_{i}e_i^2 \leq \sum_{i}(Y_i - u_0 - u_1X_i)^2\) for any \((u_0, u_1)\) (with equality when \((u_0, u_1)\) = \((b_0, b_1)\)).
- \(\sum_{i}Y_i = \sum_{i}\widehat{Y}_i\).
- \(\sum_{i}X_ie_i = 0\).
- \(\sum_{i}\widehat{Y}_ie_i = 0\).
- The regression line passes through the point \((\overline{X},\overline{Y})\).
- The slope \(b_1\) of the regression line can be expressed as \(b_1 = r_{XY}\frac{s_Y}{s_X}\), where \(r_{XY}\) is the correlation coefficient between \(X\) and \(Y\), and \(s_X\) and \(s_Y\) are the standard deviations of \(X\) and \(Y\).
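
These properties can be verified numerically. The sketch below uses a small made-up data set and plain Python. Each entry of `checks` is a quantity that should be zero up to floating-point rounding, and the last comparison tries one arbitrary alternative line \((u_0, u_1)\) to illustrate the least squares inequality.

```python
import math

# Made-up data, used only to illustrate the properties.
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sample standard deviations and correlation coefficient.
s_x = math.sqrt(s_xx / (n - 1))
s_y = math.sqrt(s_yy / (n - 1))
r_xy = s_xy / math.sqrt(s_xx * s_yy)

# Every value here should be zero up to rounding error.
checks = {
    "sum of e_i": sum(residuals),
    "sum of Y_i minus sum of fitted Y_i": sum(y) - sum(fitted),
    "sum of X_i * e_i": sum(xi * ei for xi, ei in zip(x, residuals)),
    "sum of fitted Y_i * e_i": sum(fi * ei for fi, ei in zip(fitted, residuals)),
    "line at X_bar minus Y_bar": (b0 + b1 * x_bar) - y_bar,
    "b1 minus r * s_Y / s_X": b1 - r_xy * s_y / s_x,
}
for name, value in checks.items():
    print(f"{name}: {value:.2e}")

# Any other line (u0, u1) gives a sum of squares at least as large.
u0, u1 = b0 + 0.1, b1 - 0.05  # arbitrary alternative line
sse = sum(ei ** 2 for ei in residuals)
sse_other = sum((yi - u0 - u1 * xi) ** 2 for xi, yi in zip(x, y))
print(f"SSE at (b0, b1): {sse:.4f}, at (u0, u1): {sse_other:.4f}")
```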

The **error sum of squares**, denoted \(SSE\), is given by

\[SSE = \sum_{i=1}^ne_i^2 = \sum_{i=1}^n(Y_i - \overline{Y})^2 - b_1^2\sum_{i=1}^n(X_i-\overline{X})^2.\]
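
The second equality can be derived in two steps. Substituting \(b_0 = \overline{Y}-b_1\overline{X}\) into the residual gives

\[e_i = (Y_i-\overline{Y}) - b_1(X_i-\overline{X}),\]

and squaring and summing gives

\[SSE = \sum_{i=1}^n(Y_i-\overline{Y})^2 - 2b_1\sum_{i=1}^n(X_i-\overline{X})(Y_i-\overline{Y}) + b_1^2\sum_{i=1}^n(X_i-\overline{X})^2.\]

From the formula for the slope, \(\sum_{i=1}^n(X_i-\overline{X})(Y_i-\overline{Y}) = b_1\sum_{i=1}^n(X_i-\overline{X})^2\), so the middle term equals \(-2b_1^2\sum_{i=1}^n(X_i-\overline{X})^2\) and combines with the last term to give the stated result.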

### Estimation of \(\sigma^2\)

It can be shown that \(E(SSE) = (n-2)\sigma^2.\) Therefore, \(\sigma^2\) is estimated by the **mean squared error**, i.e., \(MSE = \frac{SSE}{n-2}.\) This also justifies the statement that the errors have \(n-2\) **degrees of freedom**: the sample size \((n)\) minus the number of regression coefficients (\(\beta_0\) and \(\beta_1\)) being estimated.

### Contributors

- Debashis Paul (UCD)
- Scott Brunstein (UCD)