Multiple Linear Regression
A response variable \(Y\) is linearly related to \(p-1\) different explanatory variables \(X^{(1)}, \ldots, X^{(p-1)}\) (where \(p \geq 2\)). The regression model is given by
\[ Y_i = \beta_0 + \beta_1 X_i^{(1)} + \cdots + \beta_{p-1} X_i^{(p-1)} + \varepsilon_i, \qquad i = 1, \ldots, n, \]
where the \(\varepsilon_i\) have mean zero, variance \(\sigma^2\), and are uncorrelated. This equation can be expressed in matrix notation as
\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \]
where
\[ \mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \qquad \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}, \]
\[ \mathbf{X} = \begin{bmatrix} 1 & X_1^{(1)} & X_1^{(2)} & \cdots & X_1^{(p-1)} \\ 1 & X_2^{(1)} & X_2^{(2)} & \cdots & X_2^{(p-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_n^{(1)} & X_n^{(2)} & \cdots & X_n^{(p-1)} \end{bmatrix}, \quad \mbox{and} \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}. \]
So \(\mathbf{X}\) is an \(n \times p\) matrix.
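To make the matrix form concrete, here is a minimal simulation sketch (assuming NumPy; the dimensions, coefficient values, and error variance are illustrative choices, not from the text) that builds the \(n \times p\) design matrix with its leading column of ones and generates responses from the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n observations, p - 1 explanatory variables.
n, p = 50, 3

# Raw explanatory variables X^(1), ..., X^(p-1), one per column.
X_raw = rng.normal(size=(n, p - 1))

# Design matrix X: prepend a column of ones for the intercept beta_0,
# giving the n x p matrix described above.
X = np.column_stack([np.ones(n), X_raw])

# Responses Y = X beta + eps with illustrative coefficients and
# uncorrelated mean-zero errors of variance sigma^2.
beta = np.array([1.0, 2.0, -0.5])   # (beta_0, beta_1, beta_2)
sigma = 0.3
eps = rng.normal(scale=sigma, size=n)
Y = X @ beta + eps

print(X.shape)  # (50, 3), i.e., n x p
```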
Estimation Problem
The parameter \(\boldsymbol{\beta}\) is estimated by the least squares procedure, that is, by minimizing the sum of squared errors \(\sum_{i=1}^{n} \left(Y_i - \beta_0 - \beta_1 X_i^{(1)} - \cdots - \beta_{p-1} X_i^{(p-1)}\right)^2\). The latter quantity can be expressed in matrix notation as \(\|\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}\|^2\). Minimization with respect to the parameter \(\boldsymbol{\beta}\) (a \(p \times 1\) vector) gives rise to the normal equations:
\[ \begin{aligned}
b_0 n + b_1 \sum_i X_i^{(1)} + b_2 \sum_i X_i^{(2)} + \cdots + b_{p-1} \sum_i X_i^{(p-1)} &= \sum_i Y_i \\
b_0 \sum_i X_i^{(1)} + b_1 \sum_i \left(X_i^{(1)}\right)^2 + b_2 \sum_i X_i^{(1)} X_i^{(2)} + \cdots + b_{p-1} \sum_i X_i^{(1)} X_i^{(p-1)} &= \sum_i X_i^{(1)} Y_i \\
&\;\;\vdots \\
b_0 \sum_i X_i^{(p-1)} + b_1 \sum_i X_i^{(p-1)} X_i^{(1)} + b_2 \sum_i X_i^{(p-1)} X_i^{(2)} + \cdots + b_{p-1} \sum_i \left(X_i^{(p-1)}\right)^2 &= \sum_i X_i^{(p-1)} Y_i
\end{aligned} \]
Observe that we can express this system of \(p\) equations in the \(p\) unknowns \(b_0, b_1, \ldots, b_{p-1}\) as \(\mathbf{X}^T \mathbf{X}\, \mathbf{b} = \mathbf{X}^T \mathbf{Y}\), where \(\mathbf{b}\) is a \(p \times 1\) vector with \(\mathbf{b}^T = (b_0, b_1, \ldots, b_{p-1})\). When \(\mathbf{X}^T \mathbf{X}\) is invertible, the least squares estimate is therefore \(\mathbf{b} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y}\).
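Continuing the simulation sketch above, the estimate can be computed by solving the normal equations directly; a standard NumPy least squares call is shown alongside as the numerically safer route, since it avoids forming \(\mathbf{X}^T \mathbf{X}\) explicitly:

```python
# Continuing the sketch above: solve the normal equations X^T X b = X^T Y.
b = np.linalg.solve(X.T @ X, X.T @ Y)

# Equivalent but numerically preferable: least squares without forming X^T X.
b_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(b)        # approximately (1.0, 2.0, -0.5), the true beta
print(b_lstsq)  # agrees with b up to floating-point error
```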
Expected value and variance of random vectors
For an \(m \times 1\) random vector \(\mathbf{Z}\) with coordinates \(Z_1, \ldots, Z_m\), the expected value (or mean) and variance of \(\mathbf{Z}\) are defined as
\[ E(\mathbf{Z}) = E \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_m \end{bmatrix} = \begin{bmatrix} E(Z_1) \\ E(Z_2) \\ \vdots \\ E(Z_m) \end{bmatrix} \quad \mbox{and} \quad \mbox{Var}(\mathbf{Z}) = \begin{bmatrix} \mbox{Var}(Z_1) & \mbox{Cov}(Z_1,Z_2) & \cdots & \mbox{Cov}(Z_1,Z_m) \\ \mbox{Cov}(Z_2,Z_1) & \mbox{Var}(Z_2) & \cdots & \mbox{Cov}(Z_2,Z_m) \\ \vdots & \vdots & \ddots & \vdots \\ \mbox{Cov}(Z_m,Z_1) & \mbox{Cov}(Z_m,Z_2) & \cdots & \mbox{Var}(Z_m) \end{bmatrix}. \]
Observe that \(\mbox{Var}(\mathbf{Z})\) is an \(m \times m\) matrix. Also, since \(\mbox{Cov}(Z_i, Z_j) = \mbox{Cov}(Z_j, Z_i)\) for all \(1 \leq i, j \leq m\), \(\mbox{Var}(\mathbf{Z})\) is a symmetric matrix. Moreover, it can be checked, using the relationship \(\mbox{Cov}(Z_i, Z_j) = E(Z_i Z_j) - E(Z_i)E(Z_j)\), that \(\mbox{Var}(\mathbf{Z}) = E(\mathbf{Z}\mathbf{Z}^T) - (E(\mathbf{Z}))(E(\mathbf{Z}))^T\).
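As a quick numerical sanity check (a sketch assuming NumPy, with sample averages standing in for expectations; the dimensions and mixing matrix are arbitrary illustrative choices), the identity \(\mbox{Var}(\mathbf{Z}) = E(\mathbf{Z}\mathbf{Z}^T) - (E(\mathbf{Z}))(E(\mathbf{Z}))^T\) can be verified on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)

# N independent draws of an m-dimensional random vector Z with
# correlated coordinates (each row of Z is one draw).
N, m = 100_000, 3
A = rng.normal(size=(m, m))
Z = rng.normal(size=(N, m)) @ A.T

EZ = Z.mean(axis=0)          # sample analogue of E(Z)
EZZt = Z.T @ Z / N           # sample analogue of E(Z Z^T), an m x m matrix
var_Z = EZZt - np.outer(EZ, EZ)

# The sample covariance matrix (normalized by N via bias=True) is computed
# the same way algebraically, so the two agree up to floating-point error.
print(np.allclose(var_Z, np.cov(Z, rowvar=False, bias=True)))  # True
```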
Contributors
- Agnes Oshiro