Statistics LibreTexts

Diagnostics for residuals (continued)

    Nonnormality of errors

    Nonnormality of the errors can be studied graphically with the normal probability plot, or Q-Q (quantile-quantile) plot. In this plot, the ordered residuals (the observed quantiles) are plotted against the quantiles expected when the \(\epsilon_i\)'s are approximately normal and independent with mean 0 and variance equal to MSE. This amounts to plotting the k-th smallest residual \(e_{(k)}\) against

    $$\sqrt{MSE}\; z\left(\dfrac{k-0.375}{n+0.25}\right),$$

    where \(z(q)\) is the q-th quantile of the N(0,1) distribution, for \(0 < q < 1\). If the errors are normally distributed, the points on the plot should fall almost along the diagonal line. Departures from that line could indicate skewness or a heavier-tailed distribution.
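The expected quantiles in the formula above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `expected_normal_quantiles` and `qq_pairs` are hypothetical, and the residuals and \(\sqrt{MSE}\) are assumed to come from a regression fitted elsewhere.

```python
from statistics import NormalDist


def expected_normal_quantiles(n, root_mse):
    """Expected value of the k-th smallest residual, k = 1..n, under normality.

    Implements sqrt(MSE) * z((k - 0.375) / (n + 0.25)), where z(q) is the
    q-th quantile of the standard normal distribution.
    """
    std_normal = NormalDist()  # N(0, 1)
    return [
        root_mse * std_normal.inv_cdf((k - 0.375) / (n + 0.25))
        for k in range(1, n + 1)
    ]


def qq_pairs(residuals, root_mse):
    """Pair each expected quantile with the corresponding ordered residual.

    The returned (expected, observed) pairs are the points of the Q-Q plot.
    """
    ordered = sorted(residuals)  # k-th smallest residual at position k - 1
    expected = expected_normal_quantiles(len(ordered), root_mse)
    return list(zip(expected, ordered))
```

Plotting the pairs returned by `qq_pairs` as a scatter plot, together with the line y = x, gives the normal probability plot described above.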

    (a) True Model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim N(0,1)\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

    Coefficients   Estimate   Std. Error   t-statistic   P-value
    Intercept      1.5413     0.2196       7.02          \(2.92 \times 10^{-10}\)
    Slope          3.08907    0.03775      81.84         \(< 2 \times 10^{-16}\)

    \(\sqrt{MSE} = 1.09\), \(R^2 = 0.9856\).

    [Figure: normal Q-Q plot of the residuals (qqplot_norm.gif)]
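A setup like example (a) can be reproduced with a short simulation. The sketch below is not the code used to produce the table above: the seed is arbitrary, the helper name `fit_simple_ols` is hypothetical, and the estimates it prints will differ from the tabulated ones because the random draws differ.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x.

    Returns (intercept, slope, residuals, mse), with MSE = SSE / (n - 2).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - 2)
    return intercept, slope, residuals, mse


rng = random.Random(0)  # ASSUMPTION: arbitrary seed, chosen only for repeatability
x = [i / 10 for i in range(1, 101)]             # X_i = i/10, i = 1..100
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]  # Y = 2 + 3X + eps, eps ~ N(0, 1)

intercept, slope, residuals, mse = fit_simple_ols(x, y)
print(f"intercept={intercept:.4f} slope={slope:.4f} root MSE={mse ** 0.5:.3f}")
```

The sorted `residuals` and `mse ** 0.5` are the inputs needed for the normal probability plot.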

    (b) True Model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim t_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

    Coefficients   Estimate   Std. Error   t-statistic   P-value
    Intercept      2.11144    0.28279      7.467         \(3.42 \times 10^{-11}\)
    Slope          2.97458    0.04862      61.185        \(< 2 \times 10^{-16}\)

    \(\sqrt{MSE} = 1.403\), \(R^2 = 0.9745\).

    (c) True Model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim \chi^2_5 - 5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

    Coefficients   Estimate   Std. Error   t-statistic   P-value
    Intercept      2.4615     0.6533       3.768         0.000281
    Slope          2.9894     0.1123       26.617        \(< 2 \times 10^{-16}\)

    \(\sqrt{MSE} = 3.242\), \(R^2 = 0.8785\).

    (d) True Model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim 5 - \chi^2_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

    Coefficients   Estimate   Std. Error   t-statistic   P-value
    Intercept      2.7402     0.4694       6.838         \(6.87 \times 10^{-8}\)
    Slope          2.9896     0.0807       37.048        \(< 2 \times 10^{-16}\)

    \(\sqrt{MSE} = 2.329\), \(R^2 = 0.9334\).
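The skewed error distributions in examples (c) and (d) can be illustrated numerically. The sketch below is an assumption-laden illustration, not the original simulation: it draws \(\chi^2_5\) variables as sums of five squared standard normals, works with the raw errors rather than fitted residuals, and uses a hypothetical helper `sample_skewness` with an arbitrary seed.

```python
import random


def chi2_draw(rng, df=5):
    """One chi-square draw with df degrees of freedom (sum of df squared N(0,1))."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))


def sample_skewness(values):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(1)  # ASSUMPTION: arbitrary seed, chosen only for repeatability
right_skewed = [chi2_draw(rng) - 5 for _ in range(100)]  # eps ~ chi2_5 - 5, as in (c)
left_skewed = [-e for e in right_skewed]                 # distributed as 5 - chi2_5, as in (d)

print(f"skewness, chi2_5 - 5: {sample_skewness(right_skewed):.3f}")
print(f"skewness, 5 - chi2_5: {sample_skewness(left_skewed):.3f}")
```

Negating the errors flips the sign of the skewness, which is why the Q-Q plots for (c) and (d) would be expected to bend away from the diagonal in opposite directions.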

    Heteroscedasticity

    Heteroscedasticity or unequal variance: the variance of the error \(\epsilon_i\) may sometimes depend on the value of \(X_i\). This is often reflected in the plot of residuals versus X through an unequal spread of the residuals along the X-axis.

    One possibility is that the variance either increases or decreases with increasing value of X. This is often true for financial data, where the volume of transactions usually has a role in the uncertainty of the market. Another possibility is that the data may come from different strata with different variabilities. For example, different measuring instruments with different precisions may have been used.
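The first possibility, variance that grows with X, can be sketched with a small simulation. Everything specific here is an assumption made for illustration: the error standard deviation is taken to be 0.5 times X, the seed is arbitrary, and the helpers `ols_residuals` and `residual_spread_by_half` are hypothetical names. Comparing the spread in the lower and upper halves of X is only a crude numeric stand-in for looking at the residual plot.

```python
import random


def ols_residuals(x, y):
    """Residuals from the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def residual_spread_by_half(x, residuals):
    """Root-mean-square residual for the lower and the upper half of the X values."""
    pairs = sorted(zip(x, residuals))  # order observations by X
    half = len(pairs) // 2

    def rms(chunk):
        return (sum(e ** 2 for _, e in chunk) / len(chunk)) ** 0.5

    return rms(pairs[:half]), rms(pairs[half:])


rng = random.Random(2)  # ASSUMPTION: arbitrary seed, chosen only for repeatability
x = [i / 10 for i in range(1, 101)]
# ASSUMPTION: error standard deviation proportional to X (0.5 * X)
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

low, high = residual_spread_by_half(x, ols_residuals(x, y))
print(f"RMS residual, lower half of X: {low:.3f}; upper half of X: {high:.3f}")
```

A clearly larger value for the upper half corresponds to the fan-shaped residual plot that signals increasing variance.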

    (a) True Model: \(Y = 2 + 3X + \epsilon\), where \(\epsilon \sim 5 - \chi^2_5\). 100 observations, with \(X_i = i/10\), \(i = 1, \ldots, 100\).

    Coefficients   Estimate   Std. Error   t-statistic   P-value
    Intercept      1.0074     0.9729       1.035         0.303
    Slope          3.3382     0.1673       19.958        \(< 2 \times 10^{-16}\)

    \(\sqrt{MSE} = 2.329\), \(R^2 = 0.9334\).

    Contributors

    • Chengcheng Zhang

    This page titled Diagnostics for residuals(continued) is shared under a not declared license and was authored, remixed, and/or curated by Debashis Paul.