
Multiple Linear Regression


A response variable $Y$ is linearly related to $p-1$ different explanatory variables $X^{(1)}, \ldots, X^{(p-1)}$ (where $p \geq 2$). The regression model is given by

\[
Y_i = \beta_0 + \beta_1 X_i^{(1)} + \cdots + \beta_{p-1} X_i^{(p-1)} + \varepsilon_i, \qquad i = 1, \ldots, n,
\]

where the $\varepsilon_i$ have mean zero and variance $\sigma^2$, and are uncorrelated. This equation can be expressed in matrix notation as

\[
\mathbf{Y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\]

where

\[
\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \qquad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix},
\]

\[
X = \begin{bmatrix}
1 & X_1^{(1)} & X_1^{(2)} & \cdots & X_1^{(p-1)} \\
1 & X_2^{(1)} & X_2^{(2)} & \cdots & X_2^{(p-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_n^{(1)} & X_n^{(2)} & \cdots & X_n^{(p-1)}
\end{bmatrix}, \qquad \text{and} \qquad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}.
\]

So $X$ is an $n \times p$ matrix.
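As a concrete illustration, the design matrix $X$ can be assembled from observed data by prepending a column of ones for the intercept. The data values below are hypothetical, and NumPy is an assumed tool; this is a minimal sketch, not part of the original text.

```python
import numpy as np

# Hypothetical data: n = 5 observations of p - 1 = 2 explanatory variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Design matrix X: a leading column of ones (for the intercept beta_0)
# followed by one column per explanatory variable, so X is n x p.
X = np.column_stack([np.ones_like(X1), X1, X2])

print(X.shape)  # (5, 3): n = 5, p = 3
```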

Estimation Problem

The parameter $\boldsymbol{\beta}$ is estimated by least squares, that is, by minimizing the sum of squared errors
\[
\sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_i^{(1)} - \cdots - \beta_{p-1} X_i^{(p-1)} \right)^2.
\]
This quantity can be expressed in matrix notation as $\| \mathbf{Y} - X\boldsymbol{\beta} \|^2$. Minimization with respect to the parameter $\boldsymbol{\beta}$ (a $p \times 1$ vector) gives rise to the normal equations:

\[
\begin{aligned}
b_0 n + b_1 \sum_i X_i^{(1)} + b_2 \sum_i X_i^{(2)} + \cdots + b_{p-1} \sum_i X_i^{(p-1)} &= \sum_i Y_i \\
b_0 \sum_i X_i^{(1)} + b_1 \sum_i \left( X_i^{(1)} \right)^2 + b_2 \sum_i X_i^{(1)} X_i^{(2)} + \cdots + b_{p-1} \sum_i X_i^{(1)} X_i^{(p-1)} &= \sum_i X_i^{(1)} Y_i \\
&\;\;\vdots \\
b_0 \sum_i X_i^{(p-1)} + b_1 \sum_i X_i^{(p-1)} X_i^{(1)} + b_2 \sum_i X_i^{(p-1)} X_i^{(2)} + \cdots + b_{p-1} \sum_i \left( X_i^{(p-1)} \right)^2 &= \sum_i X_i^{(p-1)} Y_i
\end{aligned}
\]

Observe that we can express this system of $p$ equations in the $p$ unknowns $b_0, b_1, \ldots, b_{p-1}$ as $X^T X \mathbf{b} = X^T \mathbf{Y}$, where $\mathbf{b}$ is a $p \times 1$ vector with $\mathbf{b}^T = (b_0, b_1, \ldots, b_{p-1})$.

If the $p \times p$ matrix $X^T X$ is nonsingular (as we shall assume for the time being), then the solution to this system is given by $\hat{\boldsymbol{\beta}} = \mathbf{b} = (X^T X)^{-1} X^T \mathbf{Y}$. This is the least squares estimate of $\boldsymbol{\beta}$.
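The least squares estimate can be computed numerically by solving the normal equations. The simulated data below (true coefficients, noise level, sample size) are illustrative assumptions; solving $X^T X \mathbf{b} = X^T \mathbf{Y}$ with a linear solver is numerically preferable to forming $(X^T X)^{-1}$ explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from a known model (illustrative assumption):
# Y = 1 + 2*X1 - 0.5*X2 + noise
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), X1, X2])      # n x p design matrix
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Least squares estimate: solve the normal equations X^T X b = X^T Y,
# rather than inverting X^T X directly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Sanity check against NumPy's built-in least squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```

With this much data and little noise, `beta_hat` lands close to the true coefficients, as expected from the unbiasedness of the least squares estimator.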

Expected value and variance of random vectors

For an $m \times 1$ random vector $\mathbf{Z}$ with coordinates $Z_1, \ldots, Z_m$, the expected value (or mean) and variance of $\mathbf{Z}$ are defined as

\[
E(\mathbf{Z}) = E \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_m \end{bmatrix}
= \begin{bmatrix} E(Z_1) \\ E(Z_2) \\ \vdots \\ E(Z_m) \end{bmatrix}
\quad \text{and} \quad
\mbox{Var}(\mathbf{Z}) = \begin{bmatrix}
\mbox{Var}(Z_1) & \mbox{Cov}(Z_1,Z_2) & \cdots & \mbox{Cov}(Z_1,Z_m) \\
\mbox{Cov}(Z_2,Z_1) & \mbox{Var}(Z_2) & \cdots & \mbox{Cov}(Z_2,Z_m) \\
\vdots & \vdots & \ddots & \vdots \\
\mbox{Cov}(Z_m,Z_1) & \mbox{Cov}(Z_m,Z_2) & \cdots & \mbox{Var}(Z_m)
\end{bmatrix}.
\]

Observe that $\mbox{Var}(\mathbf{Z})$ is an $m \times m$ matrix. Also, since $\mbox{Cov}(Z_i, Z_j) = \mbox{Cov}(Z_j, Z_i)$ for all $1 \leq i, j \leq m$, $\mbox{Var}(\mathbf{Z})$ is a symmetric matrix. Moreover, using the relationship $\mbox{Cov}(Z_i, Z_j) = E(Z_i Z_j) - E(Z_i)E(Z_j)$, it can be checked that $\mbox{Var}(\mathbf{Z}) = E(\mathbf{Z}\mathbf{Z}^T) - (E(\mathbf{Z}))(E(\mathbf{Z}))^T$.
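The identity $\mbox{Var}(\mathbf{Z}) = E(\mathbf{Z}\mathbf{Z}^T) - (E(\mathbf{Z}))(E(\mathbf{Z}))^T$ can be checked exactly on a small discrete distribution. The particular values and probabilities below are illustrative assumptions chosen so every expectation is a finite sum.

```python
import numpy as np

# A small discrete random vector Z in R^2: each row of `values` is a
# possible value of Z, with the matching probability in `probs`.
# (Values and probabilities are illustrative assumptions.)
values = np.array([[0.0, 1.0],
                   [1.0, 0.0],
                   [1.0, 1.0]])
probs = np.array([0.2, 0.3, 0.5])

# E(Z) and E(Z Z^T), computed directly from the distribution.
EZ = probs @ values                                        # shape (m,)
EZZT = sum(p * np.outer(z, z) for p, z in zip(probs, values))

# Var(Z) via the identity Var(Z) = E(Z Z^T) - E(Z) E(Z)^T.
var_Z = EZZT - np.outer(EZ, EZ)

# Entry-by-entry check: diagonal = Var(Z_i), off-diagonal = Cov(Z_i, Z_j).
var_Z1 = probs @ values[:, 0]**2 - EZ[0]**2
cov_Z1Z2 = probs @ (values[:, 0] * values[:, 1]) - EZ[0] * EZ[1]
print(np.isclose(var_Z[0, 0], var_Z1), np.isclose(var_Z[0, 1], cov_Z1Z2))
```

The resulting matrix is symmetric, as the text observes, since $\mbox{Cov}(Z_i, Z_j) = \mbox{Cov}(Z_j, Z_i)$.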

Contributors

  • Agnes Oshiro

Multiple Linear Regression is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
