# Diagnostics for residuals (continued)


## Nonnormality of errors

Nonnormality of the errors can be studied graphically using the normal probability plot, or **Q-Q (quantile-quantile) plot**. In this plot, the ordered residuals (the observed quantiles) are plotted against the quantiles that would be expected if the \(\epsilon_i\)'s were approximately normal and independent with mean 0 and variance equal to MSE. This amounts to plotting the k-th smallest residual against

$$\sqrt{MSE}\; z\left[\dfrac{k-0.375}{n+0.25}\right],$$

where \(z(q)\) is the q-th quantile of the \(N(0,1)\) distribution, \(0 < q < 1\). If the errors are normally distributed, the points on the plot should lie almost along the diagonal line. Departures from that line could indicate skewness or a heavier-tailed distribution.
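The plotting positions described above can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `qq_points` and the example residuals are illustrative assumptions, not part of the original notes.

```python
from statistics import NormalDist


def qq_points(residuals, mse):
    """Return (expected, observed) pairs for a normal probability plot.

    The k-th smallest residual is paired with
    sqrt(MSE) * z[(k - 0.375) / (n + 0.25)].
    """
    n = len(residuals)
    scale = mse ** 0.5
    std_normal = NormalDist()      # N(0, 1)
    ordered = sorted(residuals)    # k = 1 is the smallest residual
    points = []
    for k, e_k in enumerate(ordered, start=1):
        q = (k - 0.375) / (n + 0.25)
        points.append((scale * std_normal.inv_cdf(q), e_k))
    return points


# Made-up residuals and MSE, used only to show the call.
example = [-1.2, 0.4, 0.1, -0.3, 1.0]
for expected, observed in qq_points(example, mse=0.7):
    print(f"{expected:8.3f} {observed:8.3f}")
```

Plotting the second coordinate against the first gives the Q-Q plot; points close to the line with slope 1 through the origin are consistent with normal errors.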

(a) True model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim N(0,1)\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

| Coefficients | Estimate | Std. Error | t-statistic | P-value |
|---|---|---|---|---|
| Intercept | 1.5413 | 0.2196 | 7.02 | \(2.92 \times 10^{-10}\) |
| Slope | 3.08907 | 0.03775 | 81.84 | \(< 2 \times 10^{-16}\) |

\(\sqrt{MSE} = 1.09\), \(R^2 = 0.9856\).
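As a quick consistency check on output like this: in simple linear regression with an intercept, \(R^2\) and the slope t-statistic satisfy \(R^2 = t^2/(t^2 + n - 2)\). The sketch below applies this identity to the slope t-statistic reported in (a); the helper name `r_squared_from_t` is an illustrative assumption.

```python
def r_squared_from_t(t_slope, n):
    """R^2 implied by the slope t-statistic in simple linear regression."""
    return t_slope ** 2 / (t_slope ** 2 + (n - 2))


# Slope t-statistic and sample size from example (a).
print(round(r_squared_from_t(81.84, 100), 4))  # → 0.9856
```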

(b) True model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim t_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

| Coefficients | Estimate | Std. Error | t-statistic | P-value |
|---|---|---|---|---|
| Intercept | 2.11144 | 0.28279 | 7.467 | \(3.42 \times 10^{-11}\) |
| Slope | 2.97458 | 0.04862 | 61.185 | \(< 2 \times 10^{-16}\) |

\(\sqrt{MSE} = 1.403\), \(R^2 \approx 0.9745\) (obtained from the slope t-statistic, since \(R^2 = t^2/(t^2 + n - 2)\) in simple linear regression).

(c) True model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim \chi^2_5 - 5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

| Coefficients | Estimate | Std. Error | t-statistic | P-value |
|---|---|---|---|---|
| Intercept | 2.4615 | 0.6533 | 3.768 | 0.000281 |
| Slope | 2.9894 | 0.1123 | 26.617 | \(< 2 \times 10^{-16}\) |

\(\sqrt{MSE} = 3.242\), \(R^2 = 0.8785\).

(d) True model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim 5 - \chi^2_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

| Coefficients | Estimate | Std. Error | t-statistic | P-value |
|---|---|---|---|---|
| Intercept | 2.7402 | 0.4694 | 6.838 | \(6.87 \times 10^{-8}\) |
| Slope | 2.9896 | 0.0807 | 37.048 | \(< 2 \times 10^{-16}\) |

\(\sqrt{MSE} = 2.329\), \(R^2 = 0.9334\).
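The four examples above could be simulated roughly as follows. This is a sketch, not the code used to produce the tables: the seed, the helper names (`fit_line`, `chi2`, `t_dist`) and the use of Python's `random` module are assumptions, so the printed numbers will differ from those reported here.

```python
import random


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns estimates and diagnostics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "b0": b0,
        "b1": b1,
        "root_mse": (sse / (n - 2)) ** 0.5,
        "r2": 1 - sse / sst,
        "residuals": residuals,
    }


def chi2(rng, df):
    """Chi-square draw: a gamma variable with shape df/2 and scale 2."""
    return rng.gammavariate(df / 2, 2.0)


def t_dist(rng, df):
    """Student t draw: standard normal over sqrt(chi-square / df)."""
    return rng.gauss(0, 1) / (chi2(rng, df) / df) ** 0.5


ERRORS = {
    "(a) N(0,1)": lambda rng: rng.gauss(0, 1),
    "(b) t_5": lambda rng: t_dist(rng, 5),
    "(c) chi2_5 - 5": lambda rng: chi2(rng, 5) - 5,
    "(d) 5 - chi2_5": lambda rng: 5 - chi2(rng, 5),
}

rng = random.Random(0)                   # arbitrary seed, for reproducibility
x = [i / 10 for i in range(1, 101)]      # X_i = i/10, i = 1, ..., 100
fits = {}
for label, draw in ERRORS.items():
    y = [2 + 3 * xi + draw(rng) for xi in x]
    fits[label] = fit_line(x, y)
    f = fits[label]
    print(f"{label:16s} b0={f['b0']:.3f} b1={f['b1']:.3f} "
          f"root MSE={f['root_mse']:.3f} R2={f['r2']:.4f}")
```

The residuals stored in each fit can then be examined with a Q-Q plot to compare the normal, heavy-tailed, right-skewed and left-skewed cases.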

## Heteroscedasticity

Heteroscedasticity, or unequal variance, means that the variance of the error \(\epsilon_i\) may depend on the value of \(X_i\). This is often reflected in the plot of residuals versus X through an unequal spread of the residuals along the X-axis.

One possibility is that the variance either increases or decreases with increasing value of X. This is often true for financial data, where the volume of transactions usually has a role in the uncertainty of the market. Another possibility is that the data come from different strata with different variabilities. For example, different measuring instruments, with different precisions, may have been used.
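The first possibility can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the error standard deviation grows as \(0.5 X_i\); this variance pattern, the seed and the helper `rms` are assumptions and not taken from the notes. It fits the line by least squares and compares the spread of the residuals over the lower and upper halves of the X range.

```python
import random

rng = random.Random(1)                   # arbitrary seed, for reproducibility
x = [i / 10 for i in range(1, 101)]      # X_i = i/10, i = 1, ..., 100
# Assumed pattern: the error standard deviation is 0.5 * X_i.
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

# Ordinary least-squares fit of y on x.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def rms(values):
    """Root mean square, used here as a simple measure of spread."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


# x is already sorted, so the first half holds the smaller X values.
spread_low = rms(residuals[: n // 2])
spread_high = rms(residuals[n // 2:])
print(f"residual spread, small X: {spread_low:.3f}")
print(f"residual spread, large X: {spread_high:.3f}")
```

A clearly larger spread in one half corresponds to the funnel shape seen in a plot of residuals versus X.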

(a) True model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim 5 - \chi^2_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

| Coefficients | Estimate | Std. Error | t-statistic | P-value |
|---|---|---|---|---|
| Intercept | 1.0074 | 0.9729 | 1.035 | 0.303 |
| Slope | 3.3382 | 0.1673 | 19.958 | \(< 2 \times 10^{-16}\) |

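\(\sqrt{MSE} \approx 4.83\), \(R^2 \approx 0.803\). These follow from the table: \(\sqrt{MSE} = SE(b_1)\sqrt{\sum_i (X_i - \bar{X})^2} = 0.1673 \times \sqrt{833.25}\), and \(R^2 = t^2/(t^2 + n - 2)\) with \(t = 19.958\) and \(n = 100\).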

## Contributors

- Chengcheng Zhang