Skip to main content
Statistics LibreTexts

Least squares principle

  • Page ID
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    Least squares principle is a widely used method for obtaining the estimates of the parameters in a statistical model based on observed data. Suppose that we have measurements \(Y_1,\ldots,Y_n\) which are noisy versions of known functions \(f_1(\beta),\ldots,f_n(\beta)\) of an unknown parameter \(\beta\). This means, we can write

    \[ Y_i = f_i(\beta) + \varepsilon_i, i=1,\ldots,n \]

    where \(\varepsilon_1,\ldots,\varepsilon_n\) are quantities that measure the departure of the observed measurements from the model, and are typically referred to as noise. Then the least squares estimate of \(\beta\) from this model is defined as

    \[ \widehat\beta = \min_{\beta} \sum_{i=1}^n(Y_i - f_i(\beta))^2 \]

    The quantity \(f_i(\widehat\beta)\) is then referred to as the fitted value of \(Y_i\), and the difference \(Y_i - f_i(\widehat\beta)\) is referred to as the corresponding residual. It should be noted that \(\widehat\beta\) may not be unique. Also, even if it is unique it may not be available in a closed mathematical form. Usually, if each \(f_i\) is a smooth function of \(\beta\), one can obtain the estimate \(\widehat\beta\) by using numerical optimization methods that rely on taking derivatives of the objective function. If the functions \(f_i(\beta)\) are linear functions of \(\beta\), as is the case in a linear regression problem, then one can obtain the estimate \(\widehat\beta\) in a closed form.


    • Debashis Paul

    This page titled Least squares principle is shared under a not declared license and was authored, remixed, and/or curated by Debashis Paul.

    • Was this article helpful?