
6.6: End-of-Chapter Materials


    R Functions

    In this chapter, we were introduced to many R functions that are useful in regression. In fact, this chapter uses more R functions than any other chapter in this book. They are listed below by category.

    Packages

    • car
      This package provides several statistical tests used in the book "An R Companion to Applied Regression" by J. Fox and S. Weisberg. It is a great package that provides a lot of additional functionality for R.
    • lawstat
      This package provides several statistical tests used in law and public policy analysis. It provides the runs.test function for us.
    • lmtest
      This package provides many tests related to linear models. It provides an implementation of the Breusch-Pagan test, bptest, which tests for heteroskedasticity in the residuals.
    • KnoxStats
      This package was designed to make some regression analysis a bit easier. It contains an extension to the runs.test to make it easier to implement in the realm of regression diagnostics.

    Statistics

    • source(filename)
      This function runs an R script from a separate file. That file may be local or on the Internet.
    • runs.test(E, order)
      This KnoxStats alteration to the lawstat function tests whether the variable E, as ordered by order, exhibits fit issues.
    • shapiroTest(E)
      This tests the null hypothesis that the variable E comes from a Normal distribution. It is based on the shapiro.test() function in the basic R installation. It adds capabilities to test Normality in multiple groups.
    • lm(formula)
      This is the function that performs ordinary least squares estimation on linear models.
    • bptest(mod)
      This function from the lmtest package performs the Breusch-Pagan test for heteroskedasticity.
    • confint(mod)
      This calculates confidence intervals for the parameters in ordinary least squares regression.
    • mean(x)
      This calculates the mean of a sample.
    • summary(x)
      This produces the six-number summary or a frequency table of the provided variable, depending on the type of variable.
    • summary.lm(mod)
      When applied to a linear model fit using either the aov function or the lm function, this provides estimates of the effects of the numeric variables and the levels of the categorical variables in the model.
    • summary.aov(mod)
      When applied to a linear model fit using either the aov function or the lm function, this provides estimates of the statistical significance of the variables in the model.
    • predict(mod)
      This predicts the values of the dependent variable at each point in the dataset or for the values specified.
    • fligner.test(formula)
      This tests for heteroskedasticity when the independent variable is categorical.
    • aov(formula)
      This function performs ordinary least squares estimation on linear models.
    • vif(model)
      This function calculates the variance inflation factor (VIF) for each of the independent variables in the model.
    • set.base(var,level)
      This KnoxStats package function redefines the base category of the variable var to be the provided level. By default, the base category is the first level in alphabetical order.
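
    The base R functions in this list can be chained into a short regression workflow. The following is a minimal sketch on simulated data; the names x, y, g, mod, mod2, and mod3 are illustrative assumptions and do not come from the chapter. The calls to bptest and vif run only if the lmtest and car packages happen to be installed, and the KnoxStats functions are omitted.

```r
# Simulated data (illustrative only; not from the chapter)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 10)
g <- factor(rep(c("a", "b"), length.out = n))
y <- 2 + 3 * x + rnorm(n, mean = 0, sd = 1)

mod <- lm(y ~ x)                 # ordinary least squares fit
E   <- residuals(mod)            # observed minus predicted values

print(summary(mod))              # dispatches to summary.lm: estimates and tests
ci   <- confint(mod)             # 95% confidence intervals by default
yhat <- predict(mod, newdata = data.frame(x = c(2.5, 7.5)))  # predictions at new x

print(shapiro.test(E))           # base R Normality test on the residuals
print(fligner.test(y ~ g))       # equal-variance test across the levels of g

mod2 <- aov(y ~ g)               # same model class, ANOVA-style summary
print(summary(mod2))             # dispatches to summary.aov

# Package functions, run only if the packages are installed
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(mod))     # Breusch-Pagan test
}
if (requireNamespace("car", quietly = TRUE)) {
  mod3 <- lm(y ~ x + g)          # vif needs at least two terms in the model
  print(car::vif(mod3))
}
```

    Calling summary on the fitted object is usually enough; R selects summary.lm or summary.aov from the class of the object.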

    Probability

    • set.seed(x)
      This sets the random number seed. It is done to ensure the replicability of experiments.
    • rexp(n, rate)
      This generates n random values from an Exponential distribution with the specified rate parameter.
    • rnorm(n, mean, sd)
      This generates n random values from a Normal distribution with specified mean and standard deviation. By default, the mean is 0 and the standard deviation is 1.
    • runif(n, min, max)
      This generates n random values from a Uniform distribution with specified minimum and maximum values. By default, the minimum is 0 and the maximum is 1.
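
    The role of set.seed is easiest to see by drawing the same values twice. This is a small sketch; the seed value 370 and the variable names are arbitrary choices, not taken from the chapter.

```r
set.seed(370)              # any integer works; 370 is arbitrary
u1 <- runif(5)             # defaults: min = 0, max = 1
z  <- rnorm(5)             # defaults: mean = 0, sd = 1
w  <- rexp(5, rate = 2)    # Exponential with rate 2, so the mean is 1/2

set.seed(370)              # resetting the seed replays the same random stream
u2 <- runif(5)
identical(u1, u2)          # TRUE
```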

    Mathematics

    • head(x)
      This returns the first six values in the variable.
    • tail(x)
      This returns the last six values in the variable.
    • seq(from, to, by, length)
      This returns a vector of sequential values, where by indicates the step size and length specifies the vector length.
    • length(x)
      This calculates the length of a vector (variable), which is the sample size, n.
    • residuals(mod)
      This calculates the residuals in the model. Recall that each residual is the difference between the observed value and the predicted value of the dependent variable.
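
    A brief sketch of these helpers on made-up values follows. The data are invented for illustration, and the last six values are obtained with base R's tail function.

```r
x <- seq(from = 0, to = 10, by = 0.5)        # step size of 0.5
length(x)                                    # 21
head(x)                                      # first six: 0.0 0.5 1.0 1.5 2.0 2.5
tail(x)                                      # last six values
v <- seq(from = 0, to = 1, length.out = 5)   # five equally spaced values

# Residuals are observed minus predicted (tiny made-up data set)
xx  <- 1:5
y   <- c(1.1, 1.9, 3.2, 3.8, 5.1)
mod <- lm(y ~ xx)
same <- all.equal(as.numeric(residuals(mod)), y - as.numeric(predict(mod)))
same                                         # TRUE
```

    The full argument name is length.out; seq also accepts the abbreviation length through partial matching.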

    Graphics

    • qqnorm(x)
      This creates a Normal quantile-quantile plot for the given values.
    • qqline(x)
      This adds a reference line to the quantile-quantile plot. By default, the line passes through the first and third quartiles.
    • overlay(x)
      This, from the KnoxStats package, produces a histogram with a Normal curve overlaying it.
    • par(...)
      This sets the parameters for the next graphic to be started. Look through the help page for this function to see all you can specify.
    • plot(x,y)
      This produces a scatter plot of the y-values against the x-values.
    • axis(side)
      When a plot is already drawn, this adds an axis with tick marks and labels on the given side (1 = bottom, 2 = left, 3 = top, 4 = right).
    • title(...)
      When a plot is already drawn, this adds the main title and the x- and y-axis labels.
    • lines(x,y)
      When a plot is already drawn, this draws lines between each subsequent (x, y) pair.
    • points(x,y)
      When a plot is already drawn, this draws points at each (x, y) pair.
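
    The base graphics functions above are typically layered: a high-level call starts the plot, and the low-level calls add to it. The sketch below assumes simulated data and illustrative names (x, y, mod, E, op, ord); the KnoxStats overlay function is left out.

```r
set.seed(2)
x <- runif(40, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(40)
mod <- lm(y ~ x)
E <- residuals(mod)

op <- par(mfrow = c(1, 2))           # two panels side by side; keep old settings

qqnorm(E)                            # Normal quantile-quantile plot of residuals
qqline(E)                            # reference line through the quartiles

plot(x, y, axes = FALSE, xlab = "", ylab = "")   # bare scatter plot
axis(1)                              # side 1 is the bottom
axis(2)                              # side 2 is the left
title(main = "Simulated data", xlab = "x", ylab = "y")
ord <- order(x)
lines(x[ord], predict(mod)[ord])     # fitted line, drawn in increasing x order
points(mean(x), mean(y), pch = 19)   # the OLS line passes through the point of means

par(op)                              # restore the previous settings
```

    Saving the value returned by par and restoring it afterwards keeps later plots from inheriting the two-panel layout.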

    Programming

    • attach(dataframe)
      This allows you to access the variables in the dataframe without having to prefix each with dataframe$.
    • library(package)
      This loads an external package that you have already installed on your computer. It allows access to all functions and data sets in the package package.
    • as.character(x)
      This changes the values in variable x to be characters.
    • as.numeric(x)
      This changes the values in variable x to be numbers.
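
    A short sketch of the conversion functions and attach follows, using a hypothetical data frame named dat that is not part of the chapter. The factor example is included because as.numeric applied to a factor returns the internal level codes, not the printed labels.

```r
# Hypothetical data frame for illustration
dat <- data.frame(id = c("1", "2", "3"), score = c(90, 85, 77))

num_id    <- as.numeric(dat$id)        # "1" "2" "3" becomes 1 2 3
chr_score <- as.character(dat$score)   # 90 85 77 becomes "90" "85" "77"

# With a factor, convert to character first to recover the labels
f      <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                # 1 2 1 (level codes)
values <- as.numeric(as.character(f))  # 10 20 10

attach(dat)                            # score is now visible without dat$
m <- mean(score)                       # 84
detach(dat)                            # undo the attach when finished
```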

    Exercises

    1. Show that if the expected value of the residuals is constant, but non-zero, then the OLS estimator of \(\beta_1\) remains unbiased.
    2. Show that \(E[b_0] = \beta_0 + E[\varepsilon]\), if \(\bar{x} = 0\), regardless of whether the residuals are correlated with the independent variable.

    Applied Readings

    • Joshua D. Angrist and Alan B. Krueger (1991). "Does Compulsory School Attendance Affect Schooling and Earnings?" The Quarterly Journal of Economics 106(4): 979–1014.
      doi:10.2307/2937954
    • Eugene F. Fama and Kenneth R. French (1992). "The Cross-Section of Expected Stock Returns." The Journal of Finance 47(2): 427–65.
      doi:10.2307/2329112
    • Jerayr Haleblian and Sydney Finkelstein (1999). "The Influence of Organizational Acquisition Experience on Acquisition Performance: A Behavioral Learning Perspective." Administrative Science Quarterly 44(1): 29–56.
      doi:10.2307/2667030

    Theory Readings

    • George E. P. Box (1976). "Science and Statistics." Journal of the American Statistical Association 71(356): 791–799.
      doi:10.1080/01621459.1976.10480949
    • James V. Bradley (1968). Distribution-Free Statistical Tests. New York: Prentice-Hall.
    • John Fox and Harvey Sanford Weisberg (2019). An R Companion to Applied Regression, third edition. Thousand Oaks, CA: SAGE Publications.
      ISBN: 9781544336473
    • Frank J. Massey, Jr. (1951). "The Kolmogorov-Smirnov Test for Goodness of Fit." Journal of the American Statistical Association 46(253): 68–78.
      JSTOR: 2280095
    • Abraham Wald and Jack Wolfowitz (1940). "On a test whether two samples are from the same population." The Annals of Mathematical Statistics 11(2): 147–162.
      JSTOR: 2235872

    This page titled 6.6: End-of-Chapter Materials is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Ole Forsberg.
