Statistics LibreTexts

5: Improved! Now with Probabilities


    The Bridge over the River Strešlau

    This chapter extends the mathematics from the last chapter by adding a probability distribution to the residuals. As a result, the dependent variable also has a probability distribution. Please keep in mind that the independent variables are not random variables: the researcher specifically selects their values. Adhering to this paradigm allows us to more easily determine the resulting distributions, so this chapter continues this requirement. Should we not adhere to it, the results of this chapter will technically be wrong, but they will be close if the independent variable is statistically independent of the error term.

    ✦•················• 🐟 •··················•✦

    [Figure: scatter plot of sample data]
    Figure \(\PageIndex{1}\): The basic scatter plot. This provides the observed values of the data as well as the line of best fit according to the Ordinary Least Squares method. The residuals are also indicated, represented by dotted segments.

     

    In the previous chapter, we explored the mathematical consequences of our choice of definition of "best." In this chapter, we will acknowledge that the residuals are observations from a random variable, specify its distribution, and see where that takes us.

    And so, let us return to our scalar model for our data (Figure \(\PageIndex{1}\)), above:

    \begin{equation}
    y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \label{eq:lm3-scalarModel}
    \end{equation}

    and see what we can learn if we make the assumption that the \(\varepsilon_i\) are generated from a Normal distribution. Specifically, in conjunction with our previous assumptions, let us assume:

    \begin{equation}
    \varepsilon_i \stackrel{\text{iid}}{\sim} N\left(0,\ \sigma^2 \right)
    \end{equation}

     

    That single probability statement actually contains four parts:

    1. The residuals follow a Normal distribution. No matter the values of the other variables, the residual follows a Normal distribution at that point.
    2. The expected value of \(\varepsilon_i\) is a constant 0. No matter the values of the other variables, the expected value of the residual is 0 at that point.
    3. The variance of the \(\varepsilon_i\) is a constant \(\sigma^2\). No matter the values of the other variables, the variance of the residual is \(\sigma^2\) at that point.
    4. The abbreviation "iid" on top of the distribution sign means "independent and identically distributed." It indicates that the \(\varepsilon_i\) are independent of each other, and that the distribution of each is the same, \(N\left(0,\ \sigma^2 \right)\).
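
    These four assumptions can be made concrete in a short simulation. The sketch below uses made-up parameter values (\(\beta_0 = 2\), \(\beta_1 = 0.5\), \(\sigma = 1\); none of these come from the text) to generate data from the scalar model with iid Normal residuals, then recovers the coefficients with the closed-form OLS estimators:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    # Hypothetical parameter values, chosen only for illustration
    beta0, beta1, sigma = 2.0, 0.5, 1.0

    # The researcher selects the x values; they are fixed, not random
    x = np.linspace(0, 10, 200)

    # The residuals are iid draws from N(0, sigma^2)
    eps = rng.normal(loc=0.0, scale=sigma, size=x.size)

    # The scalar model: y_i = beta0 + beta1 * x_i + eps_i
    y = beta0 + beta1 * x + eps

    # OLS slope and intercept via the usual closed-form estimators
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()

    print(b0, b1)  # estimates should land near 2.0 and 0.5
    ```

    Rerunning with a different seed redraws only the residuals; the systematic ingredients (the \(x_i\) and the true \(\beta\)s) never change.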

    On the right-hand side (RHS) of Equation \(\ref{eq:lm3-scalarModel}\), the \(\varepsilon_i\) is the only random variable. The \(\beta_0\) and \(\beta_1\) are population parameters we are trying to estimate. The \(x_i\) are values selected by the experimenter, so they are also not random variables. This last point is rather important for many of the calculations we make: the values of the independent variable are selected by the researcher; they are not realizations of a random variable.

    Since the only random variable on the right-hand side is the \(\varepsilon_i\), it is rather easy to determine the distribution of \(Y\). And, with that, we are able to determine the distributions of almost all of the estimators we find important.

    Important Idea

    The RHS of Equation \(\ref{eq:lm3-scalarModel}\) actually has two parts. The \(\varepsilon_i\) part is the source of the randomness; it is the "stochastic" part. The rest has no randomness associated with it; it is called the "systematic" part:

    \begin{equation}
    y_i = \underbrace{\beta_0 + \beta_1 x_i}_{\text{systematic}} + \underbrace{\varepsilon_i}_{\text{stochastic}}
    \end{equation}

    Note that the systematic part contains the effects of the variables you are measuring and including in your model. The stochastic part contains everything else.
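
    The decomposition can be illustrated with a brief simulation (the parameter values below are invented for this sketch). The systematic part is computed once from the chosen \(x\) values; repeating the "experiment" redraws only the stochastic part:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical values, for illustration only
    beta0, beta1, sigma = 2.0, 0.5, 1.0
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

    systematic = beta0 + beta1 * x               # no randomness: fixed once x is chosen
    stochastic = rng.normal(0.0, sigma, x.size)  # all of the randomness lives here
    y = systematic + stochastic

    # Repeating the experiment changes only the stochastic part;
    # the systematic part is identical every time
    stochastic2 = rng.normal(0.0, sigma, x.size)
    y2 = systematic + stochastic2
    ```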

     

    Let's continue exploring the consequences of making these additional assumptions.

     

    Learning Objectives

    By the end of this chapter, you should be able to:

    1. Probability Distributions of Regression Statistics
      • State the sampling distributions of the OLS estimator \(b_i\) and the estimated error variance \(\mathrm{MSE}\) under the Classical Linear Model assumptions.

      • Explain why \(\mathbf{b}\) follows a multivariate normal distribution and why \( (n-p)\mathrm{MSE}/\sigma^2\) follows a chi-square distribution (where \(n\) is the sample size and \(p\) is the number of parameters).

      • Derive that the standardized coefficient \( (b_i - \beta_i)/\mathrm{SE}(b_i)\) follows a t-distribution with \(n-p\) degrees of freedom.

    2. Hypothesis Testing for Regression Coefficients
      • Formulate null and alternative hypotheses about individual regression coefficients (e.g., \(H_0 : \beta_i = 0\) vs. \(H_1 : \beta_i \ne 0\)).

      • Calculate the t-test statistic for a regression coefficient using its estimate and standard error.

      • Interpret the p-value from a coefficient test in context, making appropriate conclusions about statistical significance while avoiding common misinterpretations.

      • Perform hypothesis tests for linear combinations of coefficients (contrasts) when relevant.

    3. Confidence Intervals for Parameters and Predictions
      • Construct and interpret a confidence interval for an individual regression coefficient \(\beta_i\).

      • Construct and interpret a confidence interval for the mean response (\(\mathrm{E}[Y | X = x]\)) at a given set of predictor values.

      • Distinguish between a confidence interval for the mean response and a prediction interval for an individual future observation, explaining why prediction intervals are necessarily wider.

    4. Simultaneous Inference with Working-Hotelling Bands
      • Identify the problem of multiple comparisons when making inferences about the regression line at many different \(X\) values.

      • Explain how Working-Hotelling confidence bands provide a confidence region for the entire regression line (all possible mean responses) with simultaneous coverage.

      • Compare and contrast pointwise confidence intervals with simultaneous confidence bands, recognizing when each is appropriate.

    5. Synthesis: From Estimation to Inference
      • Conduct a complete statistical inference procedure for a regression coefficient: from stating hypotheses, to computing the test statistic and p-value, to constructing a confidence interval, and interpreting results in substantive terms.

      • Critically evaluate regression output from statistical software, extracting and interpreting key inference statistics (coefficient estimates, standard errors, t-values, p-values, and confidence intervals).

      • Communicate inferential findings appropriately, distinguishing between statistical significance and practical importance.
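
    The complete inference workflow described in objective 5 can be sketched end-to-end. The illustration below runs on simulated data with invented parameter values and computes, using only NumPy and SciPy, the slope estimate, its standard error, the t statistic and p-value for \(H_0 : \beta_1 = 0\), and a 95% confidence interval:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    # Simulated data under hypothetical parameters (illustration only)
    beta0, beta1, sigma = 1.0, 0.75, 2.0
    n = 50
    x = np.linspace(0, 20, n)
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)

    # OLS estimates
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()

    # MSE with n - p degrees of freedom (p = 2 parameters here)
    resid = y - (b0 + b1 * x)
    df = n - 2
    mse = np.sum(resid ** 2) / df

    # Standard error, t statistic, and p-value for H0: beta1 = 0
    se_b1 = np.sqrt(mse / Sxx)
    t_stat = b1 / se_b1
    p_value = 2 * stats.t.sf(abs(t_stat), df)

    # 95% confidence interval for beta1
    t_crit = stats.t.ppf(0.975, df)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    ```

    The same quantities appear, under the same names, in standard regression output; being able to reproduce them by hand is good preparation for reading that output critically.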

      

    · · ─ ·✶· ─ · ·


    This page titled 5: Improved! Now with Probabilities is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Ole Forsberg.
