Parameter Estimation in Simple Linear Regression

Model: $$X$$ and $$Y$$ are the predictor and response variables, respectively. Fit the model,

$Y_i = \beta_0 + \beta_1X_i + \epsilon_i, i = 1,...,n \tag{1}$

where $$\epsilon_1,...,\epsilon_n$$ are uncorrelated, E$$(\epsilon_i)$$ = 0, Var$$(\epsilon_i) = \sigma^2$$ for all $$i$$.

Estimates of the parameters: We have the following estimates for $$\beta_0, \beta_1,$$ and $$\sigma^2$$, respectively.

$b_0 = \overline{Y} - b_1 \overline{X},$

$b_1 = \dfrac{\sum_{i=1}^n(X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n(X_i - \overline{X})^2},$

$\widehat{\sigma}^2 = MSE = \frac{SSE}{n-2}, \tag{2}$

where

$SSE = \sum_{i=1}^n(Y_i - b_0 - b_1X_i)^2 = \sum_{i=1}^n(Y_i - \overline{Y})^2 - b_1^2\sum_{i=1}^n(X_i - \overline{X})^2 .$
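The closed-form estimates above can be computed directly. A minimal sketch in Python with NumPy; the data vectors `X` and `Y` are hypothetical, chosen only for illustration:

```python
import numpy as np

# Illustrative (hypothetical) sample: X is the predictor, Y the response
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(X)

Xbar, Ybar = X.mean(), Y.mean()

# Slope: b1 = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2
Sxx = np.sum((X - Xbar) ** 2)
Sxy = np.sum((X - Xbar) * (Y - Ybar))
b1 = Sxy / Sxx

# Intercept: b0 = Ybar - b1 * Xbar
b0 = Ybar - b1 * Xbar

# Residual sum of squares and the unbiased variance estimate MSE = SSE/(n-2)
SSE = np.sum((Y - b0 - b1 * X) ** 2)
MSE = SSE / (n - 2)
```

Note that the two expressions for $$SSE$$ agree: subtracting $$b_1^2\sum_i(X_i-\overline{X})^2$$ from the total sum of squares gives the same number as summing squared residuals directly.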

• Prediction: The predicted value of $$Y$$, given $$X = X_h$$, is $$\widehat{Y}_h = b_0 + b_1X_h = \overline{Y} + b_1(X_h - \overline{X})$$.
• Expected values and variances: Under the assumptions of the simple linear regression model, we have $$E(b_0) = \beta_0, E(b_1) = \beta_1$$ and $$E(\widehat{\sigma}^2) = E(MSE) = \sigma^2$$. In other words, the estimators $$b_0, b_1, \widehat{\sigma}^2$$ are unbiased. Also, $$E(\widehat{Y}_h|X_h) = \beta_0 + \beta_1X_h$$.

Assuming that $$X_1,...,X_n$$ are non-random, the variances of $$b_0$$ and $$b_1$$ are given by:

$\sigma^2(b_0) = \sigma^2\left [ \frac{1}{n} + \frac{\overline{X}^2}{\sum_i(X_i-\overline{X})^2} \right ],$

and

$\sigma^2(b_1) = \frac{\sigma^2}{ \sum_{i=1}^n (X_i - \overline{X})^2 }. \tag{3}$

Replacing $$\sigma^2$$ by $$MSE$$, we obtain the estimates of the variances of $$b_0$$ and $$b_1$$, denoted by

$s^2(b_0) = MSE \left [ \frac{1}{n} + \frac{\overline{X}^2}{\sum_i(X_i - \overline{X})^2} \right ],$

and

$s^2(b_1) = \frac{MSE}{\sum_{i=1}^n(X_i - \overline{X})^2}, \tag{4}$

respectively. Thus, $$s(b_0)$$ and $$s(b_1)$$ are the estimated standard errors of the estimators of $$\beta_0$$ and $$\beta_1$$, respectively.
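A quick numerical check of (4): the sketch below (same hypothetical data as before) computes $$s(b_0)$$ and $$s(b_1)$$ from the formulas and compares them with the standard errors reported by `scipy.stats.linregress`:

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical sample; fitting steps as in the closed-form estimates
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(X)
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)
b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * Xbar
MSE = np.sum((Y - b0 - b1 * X) ** 2) / (n - 2)

# Estimated variances from (4); standard errors are their square roots
s2_b0 = MSE * (1.0 / n + Xbar ** 2 / Sxx)
s2_b1 = MSE / Sxx
s_b0, s_b1 = np.sqrt(s2_b0), np.sqrt(s2_b1)
```

Both standard errors agree with the `stderr` and `intercept_stderr` fields of the `linregress` result, which implement the same formulas.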

Similarly, the variance of $$\widehat{Y}_h$$ and its estimate are

$\sigma^2(\widehat{Y}_h) = \sigma^2\left [ \dfrac{1}{n} + \dfrac{(X_h - \overline{X})^2}{\sum_i(X_i-\overline{X})^2} \right ],$

and

$s^2(\widehat{Y}_h) = MSE \left [ \dfrac{1}{n} + \dfrac{(X_h-\overline{X})^2}{\sum_i(X_i - \overline{X})^2} \right ], \tag{5}$

respectively.
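To make (5) concrete, the sketch below evaluates $$\widehat{Y}_h$$ and $$s^2(\widehat{Y}_h)$$ at a new point; the data and the value of $$X_h$$ are hypothetical. Note that the variance is smallest when $$X_h = \overline{X}$$ and grows as $$X_h$$ moves away from the center of the data:

```python
import numpy as np

# Hypothetical sample and fit
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(X)
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)
b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * Xbar
MSE = np.sum((Y - b0 - b1 * X) ** 2) / (n - 2)

Xh = 3.5                       # new predictor value (illustrative)
Yhat_h = b0 + b1 * Xh          # predicted mean response at Xh
s2_Yhat = MSE * (1.0 / n + (Xh - Xbar) ** 2 / Sxx)   # equation (5)
```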

Normal linear regression model

In the model specified by (1), if the random variables $$\epsilon_1, ..., \epsilon_n$$ are independent and identically distributed as $$N(0,\sigma^2)$$, then we have a normal linear regression model. This means that for each fixed value of $$X$$, the conditional distribution of $$Y$$ given $$X$$ is $$N(\beta_0 + \beta_1X, \sigma^2)$$.

Maximum likelihood estimation

Under this model, one can also obtain the estimates of $$\beta_0, \beta_1,$$ and $$\sigma^2$$ by the method of maximum likelihood. This means that one treats the joint probability density function of $$Y_1, ..., Y_n$$ given $$X_1, ..., X_n$$

$f(Y_1, ..., Y_n|X_1, ..., X_n;\beta_0,\beta_1,\sigma^2) =$

$\dfrac{1}{(\sigma\sqrt{2\pi})^n} \exp \left(\dfrac{-1}{2\sigma^2}\sum_{i=1}^n(Y_i - \beta_0 - \beta_1X_i)^2 \right )$

as a function, say $$L(\beta_0,\beta_1,\sigma^2)$$ of the parameters, and then maximizes this function w.r.t. the parameters by solving the equations:

$\frac{\partial(\log L)}{\partial\beta_0} = 0,$

$\frac{\partial(\log L)}{\partial\beta_1} = 0,$

$\frac{\partial(\log L)}{\partial\sigma^2} = 0 \tag{6}$

to obtain the maximum likelihood estimates:

$\widehat{\beta}_1 = \frac{\sum_{i=1}^n(X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n(X_i - \overline{X})^2} = b_1,$

$\widehat{\beta}_0 = \overline{Y} - \widehat{\beta}_1\overline{X} = b_0$

$\widehat{\sigma^2} = \frac{1}{n}\sum_{i=1}^n(Y_i - \widehat{\beta}_0 - \widehat{\beta}_1X_i)^2 = \frac{n - 2}{n}MSE. \tag{7}$
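The identities in (7) can be verified numerically by maximizing the log-likelihood directly, without using the closed forms. The sketch below (hypothetical data) minimizes the negative log-likelihood with `scipy.optimize.minimize` and checks that the optimizer recovers $$b_0$$, $$b_1$$, and $$\widehat{\sigma^2} = \frac{n-2}{n}MSE$$:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(X)

def neg_log_lik(theta):
    """Negative log-likelihood of the normal linear regression model."""
    b0, b1, sigma2 = theta
    resid = Y - b0 - b1 * X
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum(resid ** 2) / (2 * sigma2)

# Numerical MLE; starting values are arbitrary, sigma2 is kept positive
res = minimize(neg_log_lik, x0=[0.0, 1.0, 1.0], method="L-BFGS-B",
               bounds=[(None, None), (None, None), (1e-8, None)])
b0_mle, b1_mle, sigma2_mle = res.x

# Closed-form estimates from (7) for comparison
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)
b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * Xbar
sigma2_hat = np.sum((Y - b0 - b1 * X) ** 2) / n   # = (n-2)/n * MSE
```

The numerical optimum coincides with the least-squares estimates for the slope and intercept, while the likelihood-based variance estimate divides by $$n$$ rather than $$n-2$$ and is therefore biased.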

Exact distribution

Under the normality assumption, we can compute the exact distributions of certain random variables that are very important for conducting tests of hypotheses about the different parameters. In particular, $$SSE$$ and $$(b_0,b_1)$$ are independently distributed, and

$SSE \sim \sigma^2\chi^2_{n-2},$

$\dfrac{b_0 - \beta_0}{s(b_0)} \sim t_{n-2},$

and

$\dfrac{b_1 - \beta_1}{s(b_1)} \sim t_{n-2}, \tag{8}$

where $$\chi^2_k$$ and $$t_k$$ denote the chi-square and $$t$$-distributions, respectively, with $$k$$ degrees of freedom.
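A standard use of (8) is a $$t$$-based test and confidence interval for $$\beta_1$$. The sketch below (hypothetical data, 95% level) forms the interval $$b_1 \pm t_{n-2,\,0.975}\,s(b_1)$$ and the two-sided $$p$$-value for $$H_0\!: \beta_1 = 0$$:

```python
import numpy as np
from scipy.stats import t

# Hypothetical sample and fit
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(X)
Xbar = X.mean()
Sxx = np.sum((X - Xbar) ** 2)
b1 = np.sum((X - Xbar) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * Xbar
MSE = np.sum((Y - b0 - b1 * X) ** 2) / (n - 2)
s_b1 = np.sqrt(MSE / Sxx)

# 95% confidence interval: b1 +/- t_{n-2, 0.975} * s(b1)
tcrit = t.ppf(0.975, df=n - 2)
ci = (b1 - tcrit * s_b1, b1 + tcrit * s_b1)

# Two-sided t-test of H0: beta1 = 0
t_stat = b1 / s_b1
p_value = 2 * t.sf(abs(t_stat), df=n - 2)
```

The $$p$$-value matches the one returned by `scipy.stats.linregress`, which performs the same $$t_{n-2}$$ test.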

Contributors

• Scott Brunstein
• Debashis Paul