3.2: The Linear Model Function


We use regression models to predict a system’s behavior by extrapolating from output values that were measured when the system was tested with known input parameter values. The simplest regression model is a straight line. It has the mathematical form:

y = a₀ + a₁x₁

where x₁ is the input to the system, a₀ is the y-intercept of the line, a₁ is the slope, and y is the output value the model predicts.

R provides the function lm(), which generates a linear model from the data contained in a data frame. For this one-factor model, R computes the values of a₀ and a₁ using the method of least squares. This method finds the line that most closely fits the measured data by minimizing the sum of the squared vertical distances between the line and the individual data points. For the data frame int00.dat, we compute the model as follows:

> attach(int00.dat)
> int00.lm <- lm(perf ~ clock)

The first line in this example attaches the int00.dat data frame to R’s search path, so that its columns, such as perf and clock, can be referenced directly by name. The next line calls the lm() function and assigns the resulting linear model object to the variable int00.lm. We use the suffix .lm to emphasize that this variable contains a linear model. The argument in the lm() function, (perf ~ clock), says that we want to find a model where the predictor clock explains the output perf.
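To make the least-squares step concrete, the following is a minimal sketch that fits a one-factor model with lm() and then recomputes the slope and intercept from the standard closed-form least-squares formulas. The data frame toy.dat and its values are hypothetical, invented purely for illustration; they are not the int00.dat measurements. The sketch also passes the data frame through the data argument of lm() rather than calling attach(), which is an alternative to the approach shown above.

```r
# Hypothetical toy measurements, made up for this sketch
# (NOT the int00.dat data used in the text).
toy.dat <- data.frame(
  clock = c(500, 1000, 1500, 2000, 2500),
  perf  = c(340, 650, 920, 1240, 1500)
)

# Fit the one-factor model; data = toy.dat tells lm() where to
# find the columns, so attach() is not needed here.
toy.lm <- lm(perf ~ clock, data = toy.dat)

# Closed-form least-squares estimates for y = a0 + a1 * x.
x <- toy.dat$clock
y <- toy.dat$perf
a1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
a0 <- mean(y) - a1 * mean(x)                                      # intercept

# The two sets of coefficients should agree up to rounding.
print(coef(toy.lm))
print(c(a0 = a0, a1 = a1))
```

If the two printed vectors match, the hand computation and lm() are minimizing the same sum of squared residuals.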

Typing the variable’s name, int00.lm, by itself causes R to print the argument with which the function lm() was called, along with the computed coefficients for the regression model.

> int00.lm
Call:
lm(formula = perf ~ clock)
Coefficients:
(Intercept)        clock
    51.7871       0.5863

In this case, the y-intercept is a₀ = 51.7871 and the slope is a₁ = 0.5863. Thus, the final regression model is:

[Figure 3.1]

perf = 51.7871 + 0.5863 ∗ clock.
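As a small worked example of using this equation, the sketch below plugs the printed coefficients into the model to predict perf for a new clock value. The helper predict_perf() and the input value 1000 are hypothetical, chosen only for illustration; they do not come from the text or from R itself.

```r
# Coefficients exactly as printed by R in the text.
a0 <- 51.7871  # intercept
a1 <- 0.5863   # slope for clock

# Hypothetical helper (not part of R): evaluates the fitted line.
predict_perf <- function(clock) {
  a0 + a1 * clock
}

# Arbitrary illustrative input value, not taken from the data set.
new_clock <- 1000
pred <- predict_perf(new_clock)
print(pred)
```

When the fitted model object itself is available, R's predict() function with a newdata data frame, for example predict(int00.lm, newdata = data.frame(clock = 1000)), should give the same kind of result without retyping the rounded coefficients.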

The following code plots the original data along with the fitted line, as shown in Figure 3.2. The function abline() is short for (a,b)-line. It plots a line on the active plot window, using the slope and intercept of the linear model given in its argument.

> plot(clock,perf) 
> abline(int00.lm)
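The same plotting steps can be tried without the original data set. The sketch below is an illustration under stated assumptions: toy.dat is a hypothetical data frame with made-up values, and the figure is written to a temporary PDF file so that the code also runs in a non-interactive session. It draws the fitted line twice, once by passing the model object to abline() and once by passing the intercept and slope explicitly through its a and b arguments, to show that the two calls describe the same line.

```r
# Hypothetical toy measurements, made up for this sketch
# (NOT the int00.dat data used in the text).
toy.dat <- data.frame(
  clock = c(500, 1000, 1500, 2000, 2500),
  perf  = c(340, 650, 920, 1240, 1500)
)
toy.lm <- lm(perf ~ clock, data = toy.dat)

# Write the figure to a temporary PDF file; drop the pdf() and
# dev.off() calls to draw on the screen in an interactive session.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Scatter plot of the raw data.
plot(toy.dat$clock, toy.dat$perf, xlab = "clock", ylab = "perf")

# Fitted line taken directly from the model object.
abline(toy.lm)

# The same line, given as intercept (a) and slope (b); drawn dashed
# so that it visibly overlays the first line.
abline(a = coef(toy.lm)[["(Intercept)"]],
       b = coef(toy.lm)[["clock"]],
       lty = 2, col = "red")

invisible(dev.off())
```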