Statistics LibreTexts

Parameter Estimation in Simple Linear Regression


Model: \(X\) and \(Y\) are the predictor and response variables, respectively. Fit the model,

\[Y_i = \beta_0 + \beta_1X_i + \epsilon_i, \qquad i = 1,...,n \tag{1}\]

  where \(\epsilon_1,...,\epsilon_n\) are uncorrelated, \(E(\epsilon_i) = 0\), and \(\mathrm{Var}(\epsilon_i) = \sigma^2\) for all \(i\).

Estimates of the parameters: We have the following estimates for \(\beta_0, \beta_1,\) and \(\sigma^2\), respectively.

\[b_0 = \overline{Y} - b_1 \overline{X},\]

\[b_1 = \dfrac{\sum_{i=1}^n(X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n(X_i - \overline{X})^2},\]

\[\widehat{\sigma}^2 = MSE = \frac{SSE}{n-2}, \tag{2}\]

where \(SSE\) is the error (residual) sum of squares:

\[SSE = \sum_{i=1}^n(Y_i - b_0 - b_1X_i)^2 = \sum_{i=1}^n(Y_i - \overline{Y})^2 - b_1^2\sum_{i=1}^n(X_i - \overline{X})^2 . \]
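The estimates above can be computed directly from the data. The following is a minimal sketch in plain Python, not a reference implementation; the function name `fit_simple_linear_regression` and the five-point data set are illustrative assumptions that do not come from the text.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1, sse, mse) for the least-squares line Y = b0 + b1*X.

    Hypothetical helper for illustration; follows the formulas for
    b0, b1, SSE and MSE = SSE / (n - 2) given in the text.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length n >= 3")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares of X about its mean, and the cross-product sum.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all X values are equal; the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Error sum of squares from the residuals Y_i - b0 - b1*X_i.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    return b0, b1, sse, mse


# Made-up data, used only to exercise the formulas.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, sse, mse = fit_simple_linear_regression(x, y)
print(b0, b1, sse, mse)
```

For this data set, \(\overline{X} = 3\), \(\overline{Y} = 4\), \(\sum_i(X_i - \overline{X})(Y_i - \overline{Y}) = 6\) and \(\sum_i(X_i - \overline{X})^2 = 10\), so \(b_1 = 0.6\), \(b_0 = 2.2\), \(SSE = 2.4\) and \(MSE = 0.8\), up to floating-point rounding. The second expression for \(SSE\) gives the same value: \(6 - 0.6^2 \times 10 = 2.4\).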

  • Prediction: The predicted value of \(Y\), given \(X = X_h\) is \(\widehat{Y}_h = b_0 + b_1X_h = \overline{Y} + b_1(X_h - \overline{X})\).
  • Expected values and variances: Under the assumptions of the simple linear regression model, we have \(E(b_0) = \beta_0,  E(b_1) = \beta_1\) and \(E(\widehat{\sigma}^2) = E(MSE) = \sigma^2\). In other words, the estimators \(b_0, b_1, \widehat{\sigma}^2\) are unbiased. Also, \(E(\widehat{Y}_h|X_h) = \beta_0 + \beta_1X_h\).
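Unbiasedness can be checked informally by simulation. The sketch below assumes normally distributed errors for convenience (the unbiasedness result itself only needs the weaker assumptions of model (1)); the true parameter values, the design points, the seed, and the helper names `least_squares` and `simulate_means` are all illustrative assumptions.

```python
import random


def least_squares(x, y):
    """Return (b0, b1, mse) for the least-squares fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse / (n - 2)


def simulate_means(beta0, beta1, sigma, x, reps, seed):
    """Average b0, b1 and MSE over `reps` simulated data sets with fixed x."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(reps):
        # Generate responses from model (1) with N(0, sigma^2) errors.
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        for k, value in enumerate(least_squares(x, y)):
            totals[k] += value
    return tuple(total / reps for total in totals)


# Assumed true values: beta0 = 1, beta1 = 2, sigma^2 = 1; X fixed at 0, ..., 9.
mean_b0, mean_b1, mean_mse = simulate_means(1.0, 2.0, 1.0, list(range(10)), 2000, 0)
print(mean_b0, mean_b1, mean_mse)
```

With a few thousand replications, the three averages should land close to 1, 2 and 1 respectively; the exact values depend on the seed and the number of replications.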

 Assuming that \(X_1,...,X_n\) are non-random, the variances of \(b_0\) and \(b_1\) are given by:

\[\sigma^2(b_0) = \sigma^2\left [ \frac{1}{n} + \frac{\overline{X}^2}{\sum_i(X_i-\overline{X})^2} \right ], \]


\[ \sigma^2(b_1) = \frac{\sigma^2}{\sum_{i=1}^n (X_i - \overline{X})^2}. \tag{3} \]

Replacing \(\sigma^2\) by \(MSE\), we obtain the estimates of the variances of \(b_0\) and \(b_1\), and these are denoted by

\[s^2(b_0) = MSE \left [ \frac{1}{n} + \frac{\overline{X}^2}{\sum_i(X_i - \overline{X})^2} \right ],\]


\[s^2(b_1) = \frac{MSE}{\sum_{i=1}^n(X_i - \overline{X})^2}, \tag{4}\]

respectively. Thus, \(s(b_0)\) and \(s(b_1)\) are the estimated standard errors of the estimators of \(\beta_0\) and \(\beta_1\), respectively.

Similarly, the variance of \(\widehat{Y}_h\) and its estimate are

\[\sigma^2(\widehat{Y}_h) = \sigma^2 \left [ \dfrac{1}{n} + \dfrac{(X_h - \overline{X})^2}{\sum_i(X_i-\overline{X})^2} \right ], \]

\[s^2(\widehat{Y}_h) = MSE \left [ \dfrac{1}{n} + \dfrac{(X_h-\overline{X})^2}{\sum_i(X_i - \overline{X})^2} \right ]. \tag{5}\]
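As an illustration of (4) and (5), the following sketch computes the estimated variances of \(b_0\), \(b_1\) and \(\widehat{Y}_h\) in plain Python. The function name `estimated_variances`, the data, and the choice \(X_h = 4\) are assumptions made for this example only.

```python
import math


def estimated_variances(x, y, x_h):
    """Return (s2_b0, s2_b1, s2_yhat_h), the estimated variances in (4) and (5).

    Hypothetical helper for illustration; sigma^2 is replaced by MSE throughout.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    s2_b0 = mse * (1.0 / n + x_bar ** 2 / sxx)
    s2_b1 = mse / sxx
    s2_yhat_h = mse * (1.0 / n + (x_h - x_bar) ** 2 / sxx)
    return s2_b0, s2_b1, s2_yhat_h


# Made-up data, used only to exercise the formulas.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s2_b0, s2_b1, s2_yhat_h = estimated_variances(x, y, x_h=4)
# Standard errors are the square roots of the estimated variances.
print(math.sqrt(s2_b0), math.sqrt(s2_b1), math.sqrt(s2_yhat_h))
```

Here \(MSE = 0.8\), \(\overline{X} = 3\) and \(\sum_i(X_i - \overline{X})^2 = 10\), so \(s^2(b_0) = 0.8(0.2 + 0.9) = 0.88\), \(s^2(b_1) = 0.08\) and \(s^2(\widehat{Y}_h) = 0.8(0.2 + 0.1) = 0.24\). Note that \(s^2(\widehat{Y}_h)\) is smallest at \(X_h = \overline{X}\) and grows as \(X_h\) moves away from the mean of the predictor.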


Normal linear regression model

In the model specified by (1), if the random variables \(\epsilon_1, ..., \epsilon_n\) are independent and identically distributed as \(N(0,\sigma^2)\), then we have a normal linear regression model. This means that for each fixed value of \(X\), the conditional distribution of \(Y\) given \(X\) is \(N(\beta_0 + \beta_1X, \sigma^2)\).

Maximum likelihood estimation

Under this model, one can also obtain estimates of \(\beta_0, \beta_1,\) and \(\sigma^2\) by the method of maximum likelihood. This means that one treats the joint probability density function of \(Y_1, ..., Y_n\) given \(X_1, ..., X_n\)

\[f(Y_1, ..., Y_n|X_1, ..., X_n;\beta_0,\beta_1,\sigma^2) =\]

\[ \dfrac{1}{(\sigma\sqrt{2\pi})^n} \exp \left(\dfrac{-1}{2\sigma^2}\sum_{i=1}^n(Y_i - \beta_0 - \beta_1X_i)^2 \right )\]

as a function, say \(L(\beta_0,\beta_1,\sigma^2)\) of the parameters, and then maximizes this function w.r.t. the parameters by solving the equations:

\[\frac{\partial(\log L)}{\partial\beta_0} = 0,\]

\[\frac{\partial(\log L)}{\partial\beta_1} = 0,\]

\[\frac{\partial(\log L)}{\partial\sigma^2} = 0 \tag{6}\]

to obtain the maximum likelihood estimates:

\[\widehat{\beta}_1 = \frac{\sum_{i=1}^n(X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n(X_i - \overline{X})^2} = b_1,\]

\[\widehat{\beta}_0 = \overline{Y} - \widehat{\beta}_1\overline{X} = b_0,\]

\[\widehat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1X_i)^2 = \frac{n - 2}{n}MSE. \tag{7}\]
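For reference, the intermediate steps behind (6) and (7) can be sketched as follows, using only the density given above. Taking logarithms,

\[\log L(\beta_0,\beta_1,\sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n(Y_i - \beta_0 - \beta_1X_i)^2,\]

so that the three equations in (6) become

\[\frac{1}{\sigma^2}\sum_{i=1}^n(Y_i - \beta_0 - \beta_1X_i) = 0,\]

\[\frac{1}{\sigma^2}\sum_{i=1}^nX_i(Y_i - \beta_0 - \beta_1X_i) = 0,\]

\[-\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n(Y_i - \beta_0 - \beta_1X_i)^2 = 0.\]

The first two equations do not involve \(\sigma^2\) once the factor \(1/\sigma^2\) is cancelled, and they coincide with the least squares normal equations, which is why the maximum likelihood estimates of \(\beta_0\) and \(\beta_1\) equal \(b_0\) and \(b_1\). Substituting these into the third equation and solving for \(\sigma^2\) gives \(SSE/n\), which differs from the unbiased estimate \(MSE = SSE/(n-2)\) by the factor \((n-2)/n\).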

Exact distribution

Under the normality assumption, we can compute the exact distributions of certain random variables that are important for conducting tests of hypotheses about the parameters. Specifically, \(SSE\) and \((b_0,b_1)\) are independently distributed, and

\[SSE \sim \sigma^2\chi^2_{n-2},\]

\[\dfrac{b_0 - \beta_0}{s(b_0)} \sim t_{n-2},\]


\[\dfrac{b_1 - \beta_1}{s(b_1)} \sim t_{n-2}, \tag{8}\]

where \(\chi^2_k\) and \(t_k\) denote the chi-square and t-distributions, respectively, with \(k\) degrees of freedom.
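The result in (8) is what makes t-tests and confidence intervals for the slope possible. The sketch below computes the statistic \((b_1 - \beta_1)/s(b_1)\) under a hypothesized value of \(\beta_1\), together with a confidence interval. The function names, the data, and the null value are illustrative assumptions; the critical value is not computed here but supplied by the user, and 3.182 is the tabulated 97.5th percentile of the t-distribution with 3 degrees of freedom.

```python
import math


def slope_t_statistic(x, y, beta1_null=0.0):
    """Return (b1, s_b1, t_star), where t_star = (b1 - beta1_null) / s(b1).

    Hypothetical helper for illustration. Under the normal model and the
    hypothesized slope, t_star follows a t-distribution with n - 2 degrees
    of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    s_b1 = math.sqrt(mse / sxx)
    return b1, s_b1, (b1 - beta1_null) / s_b1


def slope_confidence_interval(b1, s_b1, t_crit):
    """Return the interval b1 -/+ t_crit * s(b1) for a user-supplied t_crit."""
    return b1 - t_crit * s_b1, b1 + t_crit * s_b1


# Made-up data with n = 5, so the reference distribution has 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, s_b1, t_star = slope_t_statistic(x, y, beta1_null=0.0)
# 3.182 is taken from a standard t table (97.5th percentile, 3 d.f.).
lower, upper = slope_confidence_interval(b1, s_b1, t_crit=3.182)
print(t_star, lower, upper)
```

For this data set, \(b_1 = 0.6\) and \(s(b_1) = \sqrt{0.08} \approx 0.283\), so the statistic is about 2.12. This is smaller than 3.182 in absolute value, and the corresponding 95% interval, roughly \((-0.30, 1.50)\), contains zero.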


  • Scott Brunstein
  • Debashis Paul