Statistics LibreTexts

10.1: A Recap of Modeling Assumptions


    Recall from Chapter 4 that we identified three key assumptions about the error term that are necessary for OLS to provide unbiased, efficient linear estimators: a) errors have identical distributions, b) errors are independent, and c) errors are normally distributed.

    Error Assumptions

    • Errors have identical distributions

    \(E(\epsilon_i^2) = \sigma_\epsilon^2\)

    • Errors are independent of \(X\) and other \(\epsilon_i\)

    \(E(\epsilon_i) \equiv E(\epsilon \mid x_i) = 0\)

    and

    \(E(\epsilon_i \epsilon_j) = 0\) for \(i \neq j\)

    • Errors are normally distributed

    \(\epsilon_i \sim N(0, \sigma_\epsilon^2)\)

    Taken together these assumptions mean that the error term has a normal, independent, and identical distribution (normal i.i.d.). Figure \(\PageIndex{1}\) shows what these assumptions would imply for the distribution of residuals around the predicted values of \(Y\) given \(X\).

    Figure \(\PageIndex{1}\): Assumed Distributions of OLS Residuals
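
To make the normal i.i.d. assumptions concrete, here is a minimal sketch in Python (standard library only) that draws simulated errors and checks the three properties numerically. The value of `sigma`, the sample size, the seed, and the helper names `simulate_errors` and `pearson` are all illustrative assumptions, not part of the original text.

```python
import random
import statistics


def simulate_errors(n, sigma, seed=0):
    """Draw n errors independently from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


sigma = 2.0  # assumed error standard deviation, chosen for illustration
errors = simulate_errors(10_000, sigma)

mean_error = statistics.fmean(errors)      # should be close to 0
var_error = statistics.pvariance(errors)   # should be close to sigma^2
lag1 = pearson(errors[:-1], errors[1:])    # should be close to 0 if independent

print(f"mean = {mean_error:.3f}")
print(f"variance = {var_error:.3f} (sigma^2 = {sigma ** 2})")
print(f"correlation of successive errors = {lag1:.3f}")
```

In a sample this large, the mean and the correlation between successive errors should both sit near zero and the variance near \(\sigma_\epsilon^2\). Real residuals will not match this exactly, and a near-zero correlation between neighbors is only one symptom of independence, not proof of it.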

    How can we determine whether our residuals approximate the expected pattern? The most straightforward approach is to visually examine the distribution of the residuals over the range of the predicted values for \(Y\). If all is well, there should be no obvious pattern to the residuals – they should appear as a “sneeze plot” (i.e., it looks like you sneezed on the plot. How gross!) as shown in Figure \(\PageIndex{2}\).

    Figure \(\PageIndex{2}\): Ideal Pattern of Residuals from a Simple OLS Model
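
The visual check described above can be paired with a rough numeric one. The following sketch, in Python with only the standard library, fits a simple OLS line to simulated data, computes the residuals, and compares their spread in the lower and upper halves of the predicted values. The data-generating model (intercept 1.0, slope 0.5, unit error variance), the seed, and the helper name `fit_simple_ols` are assumptions made for illustration only.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(500)]
# Hypothetical true model with normal i.i.d. errors
y = [1.0 + 0.5 * xi + rng.gauss(0, 1.0) for xi in x]

intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Split the residuals at the median predicted value and compare their spread.
cut = statistics.median(fitted)
low = [r for r, f in zip(residuals, fitted) if f <= cut]
high = [r for r, f in zip(residuals, fitted) if f > cut]
spread_ratio = statistics.pstdev(high) / statistics.pstdev(low)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"sum of residuals = {sum(residuals):.2e}")
print(f"spread ratio (upper half / lower half) = {spread_ratio:.3f}")
```

When the assumptions hold, the spread ratio should be near 1, which corresponds to a sneeze plot with roughly equal scatter across the range of predicted values. A ratio far from 1 would suggest that the error variance changes with the predicted values, though a two-way split is a crude summary and does not replace looking at the plot.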

    Generally, there is no pattern in such a sneeze plot of residuals. One of the difficulties we have, as human beings, is that we tend to look at randomness and perceive patterns. Our brains are wired to see patterns, even where there are none. Moreover, with random distributions, there will in some samples be clumps and gaps that do appear to depict some kind of order when in fact there is none. There is the danger, then, of over-interpreting the pattern of residuals to see problems that aren’t there. The key is to know what kinds of patterns to look for, so when you do observe one you will know it.
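
One way to build intuition for how much apparent structure pure randomness produces is to simulate it. The sketch below, in Python with only the standard library, draws many samples of independent normal noise and records the longest streak of consecutive values with the same sign in each sample. The sample size of 100, the 200 repetitions, the seed, and the helper name `longest_same_sign_run` are illustrative assumptions.

```python
import random


def longest_same_sign_run(values):
    """Length of the longest streak of consecutive values sharing a sign."""
    longest = current = 0
    previous_sign = None
    for v in values:
        sign = v >= 0
        # Extend the streak if the sign repeats, otherwise start a new one.
        current = current + 1 if sign == previous_sign else 1
        longest = max(longest, current)
        previous_sign = sign
    return longest


rng = random.Random(7)
runs = []
for _ in range(200):
    noise = [rng.gauss(0, 1) for _ in range(100)]  # pure i.i.d. noise
    runs.append(longest_same_sign_run(noise))

typical = sorted(runs)[len(runs) // 2]  # median longest streak
print(f"median longest same-sign streak: {typical}")
print(f"shortest and longest observed: {min(runs)}, {max(runs)}")
```

Even though every sample here is patternless by construction, streaks of several same-signed values in a row are routine. A short clump of positive or negative residuals is therefore weak evidence of a problem on its own.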


    This page titled 10.1: A Recap of Modeling Assumptions is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Jenkins-Smith et al. (University of Oklahoma Libraries) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.