10.1: A Recap of Modeling Assumptions




Recall from Chapter 4 that we identified three key assumptions about the error term that are necessary for OLS to provide unbiased, efficient linear estimators: a) errors have identical distributions, b) errors are independent, and c) errors are normally distributed.¹⁷

Error Assumptions

Errors have identical distributions

\(E(\epsilon_i^2) = \sigma_\epsilon^2\)

Errors are independent of \(X\) and other \(\epsilon_i\)

\(E(\epsilon_i) \equiv E(\epsilon \mid x_i) = 0\)

and

\(E(\epsilon_i \epsilon_j) = 0\) for \(i \neq j\)

Errors are normally distributed

\(\epsilon_i \sim N(0, \sigma_\epsilon^2)\)

Taken together, these assumptions mean that the error term has a normal, independent, and identical distribution (normal i.i.d.). Figure \(\PageIndex{1}\) shows what these assumptions would imply for the distribution of residuals around the predicted values of \(Y\) given \(X\).

Figure \(\PageIndex{1}\): Assumed Distributions of OLS Residuals
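
To make the normal i.i.d. assumption concrete, here is a minimal sketch in Python that generates data whose errors satisfy all three assumptions by construction and then fits a simple OLS line. The sample size, the "true" intercept and slope, and the error standard deviation are hypothetical values chosen only for illustration, and `fit_ols` is a helper written for this sketch rather than a function from any library.

```python
import random
import statistics

def fit_ols(x, y):
    """Return (intercept, slope) for a simple OLS regression of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical values, assumed only for this illustration.
n, alpha, beta, sigma = 500, 2.0, 0.5, 1.0

x = [rng.uniform(0, 10) for _ in range(n)]
# Each error is drawn independently from the same N(0, sigma^2) distribution.
errors = [rng.gauss(0, sigma) for _ in range(n)]
y = [alpha + beta * xi + ei for xi, ei in zip(x, errors)]

a_hat, b_hat = fit_ols(x, y)
residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]

print(f"intercept = {a_hat:.3f}, slope = {b_hat:.3f}")
print(f"mean residual = {statistics.fmean(residuals):.6f}")
print(f"residual sd = {statistics.stdev(residuals):.3f}")
```

Because the errors were simulated to meet the assumptions, the estimated slope should land close to the assumed value and the residual standard deviation should be close to the assumed \(\sigma_\epsilon\). With real data we do not get to see the errors, which is why the residuals have to stand in for them.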

How can we determine whether our residuals approximate the expected pattern? The most straightforward approach is to visually examine the distribution of the residuals over the range of the predicted values for \(Y\). If all is well, there should be no obvious pattern to the residuals; they should appear as a “sneeze plot” (i.e., it looks like you sneezed on the plot. How gross!) as shown in Figure \(\PageIndex{2}\).

Figure \(\PageIndex{2}\): Ideal Pattern of Residuals from a Simple OLS Model
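
A rough numerical companion to the visual check is to sort the residuals by predicted value, split them into a few equal-sized groups, and compare the mean and spread of the residuals in each group. The sketch below does this in Python on simulated data. The data-generating values, the choice of four groups, and the helper `binned_summary` are all assumptions made for illustration; this is not a formal test and does not replace looking at the plot.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated data with well-behaved errors (hypothetical values).
n = 600
x = [rng.uniform(0, 10) for _ in range(n)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1.5) for xi in x]

# Simple OLS fit using the closed-form slope and intercept.
x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

def binned_summary(fitted, residuals, n_bins=4):
    """Return (mean, sd) of residuals within groups ordered by fitted value."""
    pairs = sorted(zip(fitted, residuals))  # order observations by fitted value
    size = len(pairs) // n_bins
    summary = []
    for b in range(n_bins):
        start = b * size
        # The last group absorbs any leftover observations.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = [r for _, r in pairs[start:end]]
        summary.append((statistics.fmean(chunk), statistics.stdev(chunk)))
    return summary

summary = binned_summary(fitted, residuals)
for i, (mean_r, sd_r) in enumerate(summary, start=1):
    print(f"group {i}: mean residual = {mean_r:6.3f}, sd = {sd_r:5.3f}")
```

In a sneeze plot, every group should show a mean residual near zero and a similar spread. Group means that drift steadily away from zero, or spreads that grow or shrink across groups, are the kinds of systematic departures worth a closer look.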

Generally, there is no pattern in such a sneeze plot of residuals. One of the difficulties we have, as human beings, is that we tend to look at randomness and perceive patterns. Our brains are wired to see patterns, even where there are none. Moreover, with random distributions, some samples will contain clumps and gaps that appear to depict some kind of order when in fact there is none. There is a danger, then, of over-interpreting the pattern of residuals and seeing problems that aren’t there. The key is to know what kinds of patterns to look for, so that when you do observe one you will recognize it.
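
The point about clumps in purely random data can be illustrated with a small simulation. The Python sketch below draws many samples of pure noise and records the longest streak of consecutive values with the same sign in each one. The sample size of 50, the 1,000 repetitions, the streak threshold of 5, and the helper `longest_run` are hypothetical choices made for illustration. The order in which the values are generated stands in for residuals ordered by predicted value.

```python
import random

def longest_run(signs):
    """Length of the longest streak of identical consecutive values."""
    best = current = 1
    for prev, cur in zip(signs, signs[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

rng = random.Random(7)  # fixed seed so the sketch is repeatable

runs = []
for _ in range(1000):
    # Pure noise: independent draws from N(0, 1), with no structure at all.
    noise = [rng.gauss(0, 1) for _ in range(50)]
    signs = [e > 0 for e in noise]  # True if positive, False otherwise
    runs.append(longest_run(signs))

# Share of purely random samples containing a same-sign streak of 5 or more.
share = sum(r >= 5 for r in runs) / len(runs)
print(f"share of samples with a streak of 5 or more: {share:.2f}")
```

Streaks like these can look like a meaningful cluster of positive or negative residuals, yet here they arise from noise alone, which is why a single clump is weak evidence of a problem.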