
5.3: Confidence Intervals


In the previous section, we examined hypothesis testing, which required that we create a test statistic and determine its distribution. One can think of confidence intervals as the dual of test statistics: a test statistic is a function of the unknown population parameter and has a known distribution, while a confidence interval is a range for that unknown parameter with a known (assumed) probability of containing it. Once we have the test statistic and its distribution, the confidence interval can be determined by inverting the test statistic function, that is, by solving for the parameter.


    Figure \(\PageIndex{1}\): An illustration of a confidence interval seen from the standpoint of \(T\) or from \(\overline{X}\). The unshaded area constitutes 95% of the area under the curve. Thus, the vertical segments delimit the endpoints of a central 95% confidence interval.

    From your elementary statistics course, you learned that the distribution of

    \begin{equation}
    T = \frac{\bar{x} - \mu}{s/\sqrt{n}}
    \end{equation}

    followed a Student's t distribution with \(n-1\) degrees of freedom. Solving the formula for the parameter of interest, \(\mu\), gives

    \begin{equation}
    \mu = \bar{x} - T\ \frac{s}{\sqrt{n}} \label{lm3-eqn:confintT}
    \end{equation}

The interpretation of \(T\) here is that it contains the values (quantiles) that correspond to the confidence level claimed (Figure \(\PageIndex{1}\)). For instance, if you desire a 95% confidence interval for a sample of size 10, the central \(T\) values are \(\pm 2.262\) because the probability \(P[-2.262 < T < 2.262] = 0.95\).
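That quantile can be checked numerically. A minimal sketch using SciPy (the library choice is mine, not the text's):

```python
from scipy.stats import t

n = 10                    # sample size from the example above
df = n - 1                # degrees of freedom
t_crit = t.ppf(0.975, df) # upper 2.5% quantile -> central 95% interval

print(round(t_crit, 3))   # 2.262
# check the stated probability P[-2.262 < T < 2.262]
print(round(t.cdf(t_crit, df) - t.cdf(-t_crit, df), 2))  # 0.95
```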

    Thus, the interpretation of \(\mu\) in Equation \(\ref{lm3-eqn:confintT}\) is that it contains the values that correspond to the endpoints of the confidence level claimed for the distribution of the right-hand side of the formula.

    This interpretation holds for all confidence intervals.

    With this discussion, it is rather straightforward to calculate the endpoints of confidence intervals for all of the population parameters we have explored thus far.

When the distribution of the test statistic is unimodal and symmetric (like the Normal or the Student's t), the central confidence interval is also the narrowest. Obtaining the narrowest interval may be important if the researcher desires the most precise estimate of the population parameter.

    Theorem \(\PageIndex{1}\): Confidence Interval for β1

The endpoints of a central \((1-\alpha) \times 100\%\) confidence interval for \(\beta_1\) are defined by

    \begin{equation}
    b_1 \pm t_{\alpha/2,n-p}\sqrt{\ \mathrm{MSE}/S_{xx}\ }
    \end{equation}

    Proof.
From our theorem on the distribution of \(b_1\), we know \(T = \frac{b_1 - \beta_1}{\sqrt{\mathrm{MSE}/S_{xx}}}\) follows a \(t\) distribution with \(n-p\) degrees of freedom. Solving this for \(\beta_1\) gives

    \begin{equation}
    \beta_1 = b_1 - T \sqrt{\mathrm{MSE}/S_{xx}}
    \end{equation}

Because the distribution of \(T\) is symmetric and unimodal, the endpoints of the minimum-width interval for \(T\) correspond to the two quantiles \(t_{\alpha/2, n-p}\) and \(t_{1-\alpha/2, n-p}\). By symmetry, these two endpoints are equivalent to \(\pm t_{\alpha/2, n-p}\).

As such, the endpoints of a minimum-width \((1-\alpha) \times 100\%\) confidence interval for \(\beta_1\) are defined by

    \begin{equation}
    b_1 \pm t_{\alpha/2,n-p}\sqrt{\mathrm{MSE}/S_{xx}}
    \end{equation}

    \(\blacksquare\)

    This is a typical result when dealing with the Student's t distribution.
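As a sketch of this interval in practice, with invented data (nothing below comes from the text), the hand-computed standard error \(\sqrt{\mathrm{MSE}/S_{xx}}\) agrees with the slope standard error that scipy.stats.linregress reports:

```python
import numpy as np
from scipy.stats import linregress, t

# invented data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n, p = len(x), 2   # p = number of regression parameters (intercept and slope)

# least-squares estimates by hand
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
mse = np.sum(resid**2) / (n - p)   # MSE with n - p degrees of freedom
sxx = np.sum((x - x.mean())**2)

se_b1 = np.sqrt(mse / sxx)
t_crit = t.ppf(0.975, n - p)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)  # central 95% CI for beta_1

# the hand computation matches linregress's reported slope stderr
assert abs(se_b1 - linregress(x, y).stderr) < 1e-8
print(ci)
```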

    By the way, there is absolutely no reason we need a minimum-width confidence interval. It is, however, useful in maximizing the precision of the estimate. But, that's about all it is good for. When the distribution of the test statistic is unimodal and symmetric, the central interval and the minimum-width interval are identical. When the distribution is not symmetric, they are not. The following illustrates this.

Theorem \(\PageIndex{2}\): Confidence Interval for σ2

The endpoints of a central \((1-\alpha)100\%\) confidence interval for \(\sigma^2\) are defined by

    \begin{equation}
    \frac{(n-p)\ \mathrm{MSE}}{\chi^2_{1-\alpha/2,n-p}} \qquad \text{and} \qquad \frac{(n-p)\ \mathrm{MSE}}{\chi^2_{\alpha/2,n-p}}
    \end{equation}

    Proof.
    From the MSE Theorem, we know

\begin{align}
    \frac{(n-p)\ \mathrm{MSE}}{\sigma^2} &\sim \chi^2_{n-p} \\
    \Rightarrow \qquad \sigma^2 &= \frac{(n-p)\ \mathrm{MSE}}{\chi^2_{n-p}}
    \end{align}

Thus, a central \((1-\alpha)100\%\) confidence interval (see Figure \(\PageIndex{2}\), below) is defined by the endpoints

\begin{equation}
    \frac{(n-p)\ \mathrm{MSE}}{\chi^2_{1-\alpha/2,\,n-p}} \qquad \text{and} \qquad \frac{(n-p)\ \mathrm{MSE}}{\chi^2_{\alpha/2,\,n-p}}
    \end{equation}

    \(\blacksquare\)

    Figure \(\PageIndex{2}\): A plot of the chi-square distribution with 4 degrees of freedom. The unshaded area constitutes 90% of the area under the curve. Thus, the vertical segments delimit the endpoints of a central 90% confidence interval.
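A numerical sketch of this interval, with invented values of MSE, \(n\), and \(p\) (none of these numbers come from the text). Note that the upper chi-square quantile produces the lower endpoint, and vice versa:

```python
from scipy.stats import chi2

n, p = 14, 2    # hypothetical sample size and parameter count
mse = 3.5       # hypothetical MSE
alpha = 0.10
df = n - p

lower = df * mse / chi2.ppf(1 - alpha/2, df)  # divide by the UPPER quantile
upper = df * mse / chi2.ppf(alpha/2, df)      # divide by the LOWER quantile

print(lower, upper)      # central 90% CI for sigma^2
assert lower < mse < upper
```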
    Note

    This is not the minimum-width interval. It is, however, the usual confidence interval provided. Calculating the minimum-width interval takes a little calculus that is beyond the scope of this section... and the typical coverage of this topic.

The minimum-width interval is illustrated in Figure \(\PageIndex{3}\), below. Note that the shaded area to the right is not the same as the shaded area to the left. However, the two areas together still account for 10% of the area, leaving 90% unshaded in the middle.

    Figure \(\PageIndex{3}\): A plot of the chi-square distribution with 4 degrees of freedom. The shaded area constitutes 10% of the area under the curve. Thus, the vertical segments delimit the endpoints of a 90% confidence interval. This confidence interval, however, is the minimum-width interval.

The width of the central 90% confidence interval shown in Figure \(\PageIndex{2}\) is 8.777. This is wider than the minimum-width confidence interval shown in Figure \(\PageIndex{3}\), which has width 7.714. The minimum-width interval is thus about 12% narrower than the central interval, an increase in estimator efficiency.

    How do we obtain the minimum-width interval? Let's make that topic beyond the scope of this course. However, as a teaser, notice that the value of the density function for each of the two endpoints is the same in the minimum-width interval. If the distribution is unimodal, then that observation will be true. That's enough of a hint. Feel free to explore this on your own. Calculus will serve you well here.
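The teaser above can also be chased numerically instead of with calculus. The sketch below (my construction, not the text's) parameterizes a 90% interval for the \(\chi^2_4\) distribution by its left endpoint \(a\), chooses \(b\) so the coverage is exactly 0.90, and minimizes the width \(b - a\); at the optimum, the density values at the two endpoints agree, just as the hint promises:

```python
from scipy.stats import chi2
from scipy.optimize import minimize_scalar

df, cover = 4, 0.90

def width(a):
    # right endpoint b chosen so that P[a < X < b] = cover exactly
    b = chi2.ppf(chi2.cdf(a, df) + cover, df)
    return b - a

# central interval width, for comparison (~8.777, matching Figure 2)
central = chi2.ppf(0.95, df) - chi2.ppf(0.05, df)

# minimize the width over feasible left endpoints
res = minimize_scalar(width, bounds=(1e-6, chi2.ppf(1 - cover, df) - 1e-6),
                      method="bounded")
a = res.x
b = chi2.ppf(chi2.cdf(a, df) + cover, df)

# central ~8.777 vs the narrower minimum width (7.714 per the figures)
print(round(central, 3), round(b - a, 3))
# equal density at the two endpoints, per the hint
assert abs(chi2.pdf(a, df) - chi2.pdf(b, df)) < 1e-3
```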

    Theorem \(\PageIndex{3}\): Confidence Interval for β0

The endpoints of a central (and minimum-width) \((1-\alpha) \times 100\%\) confidence interval for \(\beta_0\) are defined by

    \begin{equation}
    b_0 \pm t_{\alpha/2,n-p}\sqrt{\ \mathrm{MSE}\ \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)}
    \end{equation}

    Proof. I leave this as an exercise.

    Theorem \(\PageIndex{4}\): Confidence Interval for Y

The endpoints of a \((1-\alpha) \times 100\%\) confidence interval for the mean response, \(\mathrm{E}[Y \mid x]\), are defined by

    \begin{equation}
    b_0 + b_1 x \pm t_{\alpha/2,n-p}\sqrt{\ \mathrm{MSE}\ \left( \frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}\right)}
    \end{equation}

    Proof. I leave this as an exercise.

    Theorem \(\PageIndex{5}\): Prediction Interval for Y

The endpoints of a \((1-\alpha) \times 100\%\) prediction interval for \(y_{\text{new}}\) are defined by

    \begin{equation}
    b_0 + b_1 x \pm t_{\alpha/2,n-p}\sqrt{\ \mathrm{MSE}\ \left( 1 + \frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}\right)}
    \end{equation}

    Proof. I leave this as an exercise.

    Note

    This interval (Theorem \(\PageIndex{5}\)) is termed a "prediction interval" because it is used to predict a new observation of \(y\). It is not used to estimate the expected value of \(y\) — or trends in \(y\). That would be the purpose of a confidence interval.
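A small sketch (invented data) contrasting the two intervals at the same \(x\): the extra 1 under the square root in Theorem \(\PageIndex{5}\) makes the prediction interval strictly wider than the confidence interval for the mean response.

```python
import numpy as np
from scipy.stats import t

# invented data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.1, 5.9, 8.2, 9.9])
n, p = len(x), 2

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
mse = np.sum((y - (b0 + b1 * x))**2) / (n - p)
sxx = np.sum((x - x.mean())**2)

x0 = 3.5                        # arbitrary point at which to interval-estimate
t_crit = t.ppf(0.975, n - p)

# half-widths of the two 95% intervals at x0
ci_half = t_crit * np.sqrt(mse * (1/n + (x0 - x.mean())**2 / sxx))      # mean response
pi_half = t_crit * np.sqrt(mse * (1 + 1/n + (x0 - x.mean())**2 / sxx))  # new observation

print(ci_half < pi_half)   # True: predicting one new y is harder than estimating the mean
```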


    This page titled 5.3: Confidence Intervals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Ole Forsberg.
