8.5: End-of-Chapter Materials
Here are the expected materials to supplement the chapter.
R Functions
In this chapter, we used several R functions that will be useful in the future. These are listed here.
Packages
KnoxStats
This package adds a great deal of general functionality to R.
Statistics
lm(formula)
This function fits a linear model to the data using the supplied formula. Because the fitted object contains a great deal of information, you will want to save the result in a variable so you can retrieve that information with the summary and names functions.
predict(model)
The predict function calculates the value of the dependent variable in the model, given the independent variables used to create the model. If predictions at new values are required, use the newdata= parameter, which takes a new data set as its argument. Make sure that every independent variable used in the model is defined in the newdata= data set; if not, an error will result. Finally, specifying the se.fit=TRUE option calculates the standard error at each prediction point.
summaryHCE(model)
This function, part of the KnoxStats package, allows us to easily calculate heteroskedasticity-consistent standard errors (White 1980).
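As a minimal sketch of this workflow (the data here are invented for illustration), a model is fit with lm, inspected with summary and names, and used for prediction with predict:

```r
# Invented example data: a perfect linear relationship, y = 2x
x <- 1:5
y <- c(2, 4, 6, 8, 10)

# Fit the linear model and save the result in a variable
model <- lm(y ~ x)

# summary() reports coefficients, standard errors, and fit statistics;
# names() lists the components stored inside the fitted object
summary(model)
names(model)

# Predict at a new value of x; newdata= must define every independent
# variable in the model, and se.fit=TRUE also returns standard errors
pred <- predict(model, newdata = data.frame(x = 6), se.fit = TRUE)
pred$fit  # the predicted value of y at x = 6
```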
Probability
pnorm(x, m, s)
This function is the cumulative distribution function (CDF) of the Normal distribution. It returns the probability that a Normally distributed variable will be less than or equal to x. The two additional parameters, m and s, specify the mean and standard deviation, removing the requirement that x first undergo the z-transformation.
rnorm(n, m, s)
This function returns n draws from a Normal distribution centered at m with standard deviation s. It is the cornerstone of much Monte Carlo analysis.
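A short sketch of both functions (the mean and standard deviation here are arbitrary illustrative values):

```r
# Probability that a standard Normal variable is at most 0
pnorm(0)  # 0.5

# Probability that a N(100, 15) variable is at most 110,
# without z-transforming 110 by hand
pnorm(110, 100, 15)

# Five draws from N(100, 15); set.seed makes the draws reproducible
set.seed(42)
draws <- rnorm(5, 100, 15)
draws
```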
Graphics
lines(x,y)
This is an extremely handy line-generating function, painting a line on the current plot (or returning an error if no plot exists). It takes the pairs of points (x, y) and connects them with drawn line segments. If the col parameter is not set, the line will be black; otherwise, the line will be the color specified. There are at least three ways of stating the color: by number, by name, and by RGB value. The following all refer to "red": col=2, col="red", and col="#ff0000".
plot()
This function produces a scatterplot of two-dimensional data. The call can be either plot(x,y) or plot(y~x); both give identical results. The graphs it produces can be customized extensively; the R help file for par is invaluable. Some important parameters include xlab (label for the x-axis), ylab (label for the y-axis), xlim and ylim (axis limits, minimum and maximum, for the x- and y-axes), and las=1 (draws axis values horizontally).
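A minimal sketch combining the two functions (the data are invented; the null device lines let it run without a display and would be skipped in interactive use):

```r
x <- 1:10
y <- x^2

# Open a null graphics device so this runs without a display;
# interactively, you would skip the pdf(NULL) and dev.off() calls
pdf(NULL)

# Scatterplot with axis labels and horizontal axis values
plot(x, y, xlab = "x", ylab = "x squared", las = 1)

# Overlay a red line connecting the points; col=2, col="red",
# and col="#ff0000" would all draw the same color
lines(x, y, col = "red")

dev.off()
```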
Mathematics
log(x, b)
This returns the logarithm of x with base b. If you omit b, the function returns the natural logarithm of x. To calculate the common logarithm, set b=10. The logarithm function is used to transform variables bounded on one side into variables bounded on neither side.
exp(x)
This function returns the exponential of its argument x; that is, it returns \(e^x\). The exponential function is the inverse of the logarithm function.
logit(x)
This function returns the logit of the provided number, which must be strictly between 0 and 1. The logit function is frequently used to transform proportions into unbounded data.
logistic(x)
This function returns the logistic of a given number. The range of the logistic function is 0 to 1, exclusive. It is the inverse of the logit function, and so is often used to transform predictions from logit units back into proportion units.
cloglog(x)
The complementary log-log function is a second appropriate transformation for proportion data. Unlike the logit, however, it is not a symmetric function.
cloglog.inv(x)
This function is the inverse of the complementary log-log function.
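The logit, logistic, and complementary log-log helpers come from the KnoxStats package; when that package is unavailable, they can be sketched with their standard textbook definitions (an assumption about how the package implements them):

```r
# Standard definitions, assumed to match the KnoxStats versions
logit <- function(p) log(p / (1 - p))
logistic <- function(x) 1 / (1 + exp(-x))
cloglog <- function(p) log(-log(1 - p))
cloglog.inv <- function(x) 1 - exp(-exp(x))

# log() and exp() are inverses of one another
exp(log(100))            # 100

# Common logarithm via the base argument
log(100, 10)             # 2

# Transform a proportion to the unbounded logit scale and back
logistic(logit(0.75))    # 0.75

# The complementary log-log round trip, using the asymmetric transform
cloglog.inv(cloglog(0.75))  # 0.75
```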
Programming
which(condition)
This function returns a vector of the indices at which the original vector's values meet the given condition. Thus, which(x==4) returns the indices of all elements of the vector x that equal 4. Note that equality is tested with a double equals sign, ==. Other comparison and logical operators include >, <, >=, <=, != ("not equal to"), & ("and"), | ("or"), and ! ("not").
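A quick sketch of which in action, using a small invented vector:

```r
x <- c(1, 4, 2, 4, 7)

# Indices of the elements equal to 4 (note the double equals sign)
which(x == 4)           # 2 4

# Combining conditions: elements greater than 1 and not equal to 4
which(x > 1 & x != 4)   # 3 5
```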
Exercises
This section offers suggestions on things you can practice from this chapter.
- Predict the Venkovský 1994 cow ballot measure vote using the transformed vote model. Is this prediction physically possible?
- Determine a 95% confidence interval, with the untransformed cow vote model, for predicting Děčín's vote. Is the actual outcome within the 95% confidence interval?
- Determine a 95% confidence interval, with the transformed cow vote model, for predicting Děčín's vote. Is the actual outcome within the 95% confidence interval?
- Determine if the assumptions of OLS are violated in the transformed cow vote model.
- The actual vote share for Děčín was 52.8%. Explain why both models failed in predicting the actual vote outcome. How bad was the error? What can be done to improve the predictions?
- The logit transformation is not the only possible choice. There is also the asymmetric complementary log-log transformation (cloglog). Use this function as the transformation to predict Děčín's vote, its 95% confidence interval, and the probability of the cow ballot measure passing. The inverse of the complementary log-log transform has no name, but the R function is cloglog.inv.
- Estimate the GDP per capita for Papua New Guinea using the untransformed model, as well as the 95% confidence interval. How close is this estimate to the real answer, and is the real answer within the predicted confidence interval?
Applied Readings
- James M. Avery (2009). "Political Mistrust among African Americans and Support for the Political System." Political Research Quarterly. 62(1): 132–45.
- Mark Andreas Kayser (2009). "Partisan Waves: International Business Cycles and Electoral Choice." American Journal of Political Science. 53(4): 950–70.
- Pamela A. Morris (2008). "Welfare Program Implementation and Parents' Depression." The Social Service Review. 82(4): 579–614.
- Kar Tean Tan, Christopher C. White, and Donald L. Hunston (2011). "An adhesion test method for spray-applied fire-resistive materials." Fire and Materials. 35(4): 245–59.
Theory Readings
- George Casella and Roger L. Berger (2001). Statistical Inference. New York: Duxbury Press.
- Annette J. Dobson and Adrian Barnett (2008). An Introduction to Generalized Linear Models, Third Edition. New York: Chapman & Hall.
- Friedhelm Eicker (1967). "Limit Theorems for Regression with Unequal and Dependent Errors." Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability: 59–82.
- Julian J. Faraway (2004). Linear Models with R. New York: Chapman & Hall.
- Julian J. Faraway (2005). Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. New York: Chapman & Hall.
- Peter J. Huber (1967). "The Behavior of Maximum Likelihood Estimates under Nonstandard Conditions." Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability: 221–233.
- John A. Nelder and Robert W. M. Wedderburn (1972). "Generalized Linear Models." Journal of the Royal Statistical Society. Series A (General) 135(3): 370–84.
- Henry Scheffé (1959). The Analysis of Variance. New York: Wiley.
- Shayle R. Searle (1997). Linear Models. New York: Wiley-Interscience.
- James H. Stapleton (2009). Linear Statistical Models. New York: John Wiley and Sons.
- Robert S. Stritchartz (2000). The Way of Analysis, Revised Edition. Boston: Jones and Bartlett Mathematics.
- Halbert White (1980). "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity." Econometrica. 48(4): 817–838.
- Simon N. Wood (2006). Generalized Additive Models: An Introduction with R. New York: Chapman & Hall.
- Holbrook Working and Harold Hotelling (1929). "Applications of the Theory of Error to the Interpretation of Trends." Journal of the American Statistical Association 24(1): 73–85.


