
7.5: End-of-Chapter Materials


    Here are the expected materials to supplement the chapter.

    R Functions

    This chapter introduced many R functions that are useful in regression. In fact, it uses more R functions than any other chapter in this book. They are listed below by category.


    Packages

    • car
      This package provides several statistical tests used in the book An R Companion to Applied Regression by J. Fox and S. Weisberg. It is a great package that provides a lot of additional functionality for R.
    • lawstat
      This package provides several statistical tests used in law and public policy analysis. It provides the basic runs.test function for us.
    • lmtest
      This package provides many tests related to linear models. It provides an implementation of the Breusch-Pagan test, bptest, which tests for heteroskedasticity in the residuals.
    • KnoxStats
      This package adds much general functionality to R. Specifically, it improves upon the runs test in the lawstat package.
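
A minimal sketch of checking for and loading the packages above. It assumes that car, lawstat, and lmtest are installable in the usual way with install.packages(); KnoxStats is left out because the text does not say where it is distributed. The names pkgs and installed are illustrative only.

```r
# Packages from the list above (KnoxStats omitted; its source is not stated here).
pkgs <- c("car", "lawstat", "lmtest")

# requireNamespace() returns TRUE or FALSE without attaching anything,
# so it is a safe way to see which packages are already installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Attach only the packages that are present; any others would first
# need install.packages("name").
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

Checking before calling library() avoids an error that would stop a script partway through when a package is missing.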

    Statistics

    • source(filename)
      This function runs an R script from a separate file. That file may be local or on the Internet.
    • lm(formula)
      This is the function that performs ordinary least squares estimation on linear models.
    • aov(formula)
      This function fits an analysis of variance model. It performs the same ordinary least squares estimation as lm, but its default summary is an analysis of variance table.
    • summary(x)
      This produces the six-number summary or a frequency table of the provided variable, depending on the type of variable.
    • summary.lm(mod)
      When applied to a linear model fit using either the aov function or the lm function, provides estimates of the effects of the numeric variables and the levels of the categorical variables in the model.
    • summary.aov(mod)
      When applied to a linear model fit using either the aov function or the lm function, provides estimates of the statistical significance of the variables in the model.
    • shapiroTest(E)
      This tests the null hypothesis that the variable E comes from a Normal distribution. It is based on the shapiro.test function in the base R installation, and it adds the ability to test Normality in several groups.
    • fligner.test(formula)
      This tests for heteroskedasticity when the independent variable is categorical.
    • bptest(mod)
      This function from the lmtest package performs the Breusch-Pagan test for heteroskedasticity.
    • runs.test(E, order)
      This alteration to the lawstat function tests whether the variable E, as ordered by order, exhibits fit issues.
    • vif(model)
      This function calculates the variance inflation factor (VIF) for each of the independent variables in the model.
    • predict(mod)
      This predicts the values of the dependent variable at each point in the dataset or for the values specified.
    • confint(mod)
      This calculates confidence intervals for the parameters in ordinary least squares regression.
    • set.base(var,level)
      This function redefines the base category of the variable var to be the provided level. By default, the base category is the first level in alphabetical order.
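
The functions above are typically used together. The following is a sketch of one possible workflow on simulated data, not the chapter's own analysis. The data frame dat and the variables y, x, and grp are hypothetical. Base R's shapiro.test and relevel stand in for shapiroTest and set.base, and the lmtest and car calls run only if those packages are installed.

```r
set.seed(370)                        # arbitrary seed, for replication

# Hypothetical data: y depends linearly on x; grp is a three-level factor
n   <- 60
x   <- runif(n, min = 0, max = 10)
grp <- factor(rep(c("a", "b", "c"), length.out = n))
y   <- 2 + 0.5 * x + rnorm(n, mean = 0, sd = 1)
dat <- data.frame(y, x, grp)

mod <- lm(y ~ x + grp, data = dat)   # ordinary least squares fit
print(summary(mod))                  # coefficient estimates (summary.lm)
print(summary(aov(y ~ x + grp, data = dat)))  # analysis of variance table
print(confint(mod))                  # 95% intervals by default

# Predictions at new values rather than at the observed data
new <- data.frame(x = c(2, 8),
                  grp = factor(c("a", "c"), levels = levels(dat$grp)))
print(predict(mod, newdata = new))

# Residual checks available in base R
E <- residuals(mod)
print(shapiro.test(E))               # H0: the residuals are Normal
print(fligner.test(E, dat$grp))      # H0: equal variance across groups

# Checks from add-on packages, run only if they are installed
if (requireNamespace("lmtest", quietly = TRUE)) print(lmtest::bptest(mod))
if (requireNamespace("car", quietly = TRUE)) print(car::vif(mod))

# Base R stand-in for set.base: make "b" the base category
dat$grp <- relevel(dat$grp, ref = "b")
```

After relevel, refitting the model would report the effects of the other levels relative to "b" instead of "a".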

    Probability

    • set.seed(x)
      This sets the random number seed. Doing so makes replication possible.
    • rexp(n, rate)
      This generates \(n\) random values from an Exponential distribution with the specified rate parameter.
    • rnorm(n, mean, sd)
      This generates \(n\) random values from a Normal distribution with specified mean and standard deviation. By default the mean is 0 and the standard deviation is 1.
    • runif(n, min, max)
      This generates \(n\) random values from a Uniform distribution with specified minimum and maximum values. By default, the minimum is 0 and the maximum is 1.
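
A short sketch of why setting the seed makes replication possible. The seed value 42 and the variable names are arbitrary choices for illustration.

```r
# Same seed, same call: the two draws are identical
set.seed(42)
a <- rnorm(5, mean = 10, sd = 2)
set.seed(42)
b <- rnorm(5, mean = 10, sd = 2)
identical(a, b)                      # TRUE

# Uniform draws stay within [min, max]
u <- runif(1000, min = 0, max = 1)
range(u)

# Exponential draws are non-negative; the theoretical mean is 1 / rate
w <- rexp(1000, rate = 2)
mean(w)                              # should be near 0.5
```

Without the second set.seed call, b would continue the random number stream and differ from a.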

    Mathematics

    • head(x)
      This returns the first six values in the variable.
    • foot(x)
      This returns the last six values in the variable. Base R provides tail(x) for the same purpose.
    • seq(from, to, by, length)
      This returns a vector of sequential values, where by indicates the step size and length specifies the vector length. Only one of these two should be provided. If neither is provided, then by defaults to 1.
    • length(x)
      This calculates the length of a vector (variable), which is the sample size, \(n\).
    • residuals(mod)
      This returns the residuals of the model, which are the differences between the observed and the predicted values of the dependent variable.
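
A brief sketch of these helpers using only base R. It uses tail in place of foot, writes the length argument of seq under its full name length.out, and borrows the built-in cars data set purely for illustration.

```r
x <- seq(from = 0, to = 10, by = 2)          # step size given: 0 2 4 6 8 10
z <- seq(from = 0, to = 1, length.out = 5)   # vector length given instead
length(x)                                    # 6

v <- seq(1, 20)                              # neither given, so by = 1
head(v)                                      # first six values: 1 to 6
tail(v)                                      # last six values: 15 to 20

# Residuals are observed minus predicted values
mod <- lm(dist ~ speed, data = cars)
E   <- residuals(mod)
all.equal(unname(E), unname(cars$dist - fitted(mod)))   # TRUE
```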

    Graphics

    • qqnorm(x)
      This creates a Normal quantile-quantile plot for the given values.
    • qqline(x)
      This adds a reference line to the quantile-quantile plot. By default, the line passes through the first and third quartiles.
    • plot(x,y)
      This produces a scatter plot of the y-values against the x-values.
    • overlay(x)
      This produces a histogram with a Normal curve overlaying it. Technically, there are several possible overlays, but the Normal curve is the default.
    • par(...)
      This sets parameters for the next graphic started. Look through the help page for this function to see all you can specify.
    • plot.new()
      Creates a blank, new plot.
    • plot.window(xlim, ylim)
      Specifies the limits for the x- and y-axes.
    • axis(side)
      When a plot is already drawn, this adds values along axis number side.
    • title(...)
      When a plot is already drawn, this adds the x- and y-labels.
    • lines(x,y)
      When a plot is already drawn, this draws line segments connecting consecutive (x, y) pairs.
    • points(x,y)
      When a plot is already drawn, this draws points at each (x, y) pair.
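
The low-level graphics functions above can be combined to build a plot piece by piece. The sketch below uses simulated data with hypothetical names (x, y, mod, E) and leaves out overlay, which is not part of base R.

```r
set.seed(1)                          # arbitrary seed
x   <- seq(0, 10, length.out = 25)
y   <- 1 + 0.8 * x + rnorm(25)
mod <- lm(y ~ x)
E   <- residuals(mod)

par(mfrow = c(1, 2))                 # two panels side by side

# Panel 1: scatter plot assembled from low-level pieces
plot.new()
plot.window(xlim = range(x), ylim = range(y))
axis(1)                              # side 1 is the bottom axis
axis(2)                              # side 2 is the left axis
title(xlab = "x", ylab = "y")
points(x, y)
lines(x, fitted(mod))                # fitted line; x is already sorted

# Panel 2: Normal quantile-quantile plot of the residuals
qqnorm(E)
qqline(E)
```

Calling plot(x, y) would produce the first panel in one step; building it by hand shows what each function contributes.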

    Programming

    • library(package)
      This loads an external package that you have already installed on your computer. It gives access to all functions and data sets in the specified package.
    • attach(dataframe)
      This allows you to access the variables in the dataframe without having to prefix each with dataframe$.
    • as.character(x)
      This changes the values in variable x to be characters.
    • as.numeric(x)
      This changes the values in variable x to be numbers.
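
A small sketch of the conversion and attach functions, using made-up values and the built-in cars data set. The factor example illustrates a common pitfall in base R that the list above does not mention.

```r
codes <- c(10, 20, 30)
chr   <- as.character(codes)         # "10" "20" "30"
num   <- as.numeric(chr)             # back to 10 20 30

# A factor converts to its internal level codes, not its labels
f <- factor(c("10", "30", "20"))
as.numeric(f)                        # 1 3 2
as.numeric(as.character(f))          # 10 30 20

# attach() lets variables be used without the dataframe$ prefix
attach(cars)
mean(speed)                          # same value as mean(cars$speed)
detach(cars)                         # undo the attach when finished
```

Detaching afterwards keeps variable names from one data frame from silently masking those in another.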

    Exercises

    1. In the two panels in Figure 7.3.2, the lines of best fit do not go beyond the data. Why?
    2. Section 7.3 ended by stating that there was a really big problem with those results. Run the following code.
      mean(outcome>1) + mean(outcome<0)
      What value is given, what does it mean, and why does it imply there is something fundamentally wrong with the analysis?

    This page titled 7.5: End-of-Chapter Materials is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Ole Forsberg.