16.7: End-of-Chapter Materials
R Functions
In this chapter, we were introduced to several R functions that will be useful in the future. These are listed here.
Packages
KnoxStats
This package adds much general functionality to R.
VGAM
This package allows one to model dependent variables that are conditionally Beta-Binomial distributed. As it is not a part of the base installation of R, you will need to install it before you can load it with library(VGAM).
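A minimal sketch of the install-then-load step for VGAM follows. It assumes the package is fetched from CRAN with the standard install.packages() function; the requireNamespace() guard is an addition here, used only to avoid reinstalling a package that is already present.

```r
# Install VGAM only if it is not already available, then load it.
# (Assumes a working internet connection and a configured CRAN mirror.)
if (!requireNamespace("VGAM", quietly = TRUE)) {
  install.packages("VGAM")
}
library(VGAM)
```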
Statistics
lm(formula)
This function performs linear regression on the data, using the supplied formula. As the fitted model contains much information, you will want to save the results in a variable.
glm(formula)
This function performs generalized linear model estimation on the given formula. There are three additional parameters that can (and often should) be specified.
- The family parameter specifies the distributional family of the dependent variable. This chapter covered binomial and quasibinomial. If this parameter is not specified, R assumes gaussian.
- The link parameter specifies the link function for the distribution. If none is specified, the canonical link is assumed. In glm(), the link is given as an argument to the family function, as in binomial(link="logit").
- Finally, the data parameter specifies the data from which the formula variables come. This is the same parameter as in the lm() function.
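As a minimal sketch of how the family, link, and data parameters fit together in a glm() call, the following fits a binomial model to simulated data. The data frame votes and its columns x, n, and y are invented for illustration and are not the chapter's data.

```r
# Simulated data (hypothetical; not the chapter's data set)
set.seed(1)
votes <- data.frame(x = seq(-2, 2, length.out = 50))
votes$n <- 100                                   # trials per observation
votes$y <- rbinom(50, size = votes$n,
                  prob = plogis(0.5 + 1.2 * votes$x))

# family names the distribution; the link is passed to the family function;
# data names the data frame holding the formula variables
mod <- glm(cbind(y, n - y) ~ x,
           family = binomial(link = "logit"),
           data   = votes)

coef(mod)   # estimated intercept and slope on the logit scale
```

The two-column response cbind(successes, failures) is one of the standard ways to give glm() binomial counts; saving the fit in mod keeps the full model object available for later calls.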
vglm(formula)
This function performs vector generalized linear model estimation on the given formula. As with the glm() function, there are additional parameters that can (and often should) be specified. Note that the link option is not allowed in vglm models at this point.
predict(mod, newdata)
As with almost all statistical packages, R has a predict function. It takes two parameters: the model and a dataframe of the independent values from which you want to predict. If you omit newdata, then it will predict based on the independent variables of the data itself, which can be used to calculate residuals. The dataframe must list all independent variables with their associated new values. You can specify multiple new values for a single independent variable.
AIC(mod)
This function calculates the Akaike Information Criterion score for the provided model. The model needs to have been fit using Maximum Likelihood Estimation.
BIC(mod)
This function calculates Schwarz's Bayesian Information Criterion (BIC) for the provided model.
deviance(mod)
This function returns the deviance of the model. This value is useful in the Likelihood Ratio Test.
deviance.test(mod)
This function performs a chi-square test for whether the deviance differs from unity.
pchisq(x)
This gives the value of the cumulative distribution function (CDF) under the Chi-squared distribution. The necessary parameter is the number of degrees of freedom, df=. By default, it returns the lower-tail probability. Usually, we will want the upper-tail probability, so we will use the lower.tail=FALSE parameter.
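The following sketch shows how several of these functions can be combined, using only base R. The data frame d and the models mod0 and mod1 are simulated and hypothetical, and type = "response" is an option of predict() for glm fits that is used here (it is not discussed above) to return probabilities rather than values on the link scale.

```r
# Simulated data (hypothetical; not the chapter's data set)
set.seed(2)
d <- data.frame(x = runif(60, -2, 2))
d$n <- 50
d$y <- rbinom(60, size = d$n, prob = plogis(-0.3 + 0.8 * d$x))

mod0 <- glm(cbind(y, n - y) ~ 1, family = binomial, data = d)  # null model
mod1 <- glm(cbind(y, n - y) ~ x, family = binomial, data = d)  # adds x

# Predicted probabilities at two new values of x
predict(mod1, newdata = data.frame(x = c(-1, 1)), type = "response")

# Information criteria for each model (lower is better)
AIC(mod0); AIC(mod1)
BIC(mod0); BIC(mod1)

# Likelihood ratio test: the drop in deviance is compared with a chi-squared
# distribution whose df is the number of added parameters (here, 1)
lrt <- deviance(mod0) - deviance(mod1)
p   <- pchisq(lrt, df = 1, lower.tail = FALSE)
p
```

Because mod0 is nested in mod1, the difference in deviances is non-negative, and the upper-tail probability from pchisq() serves as the p-value of the test.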
Programming
for
This command is one of the basic control-constructs in the R language (as in most programming languages). The usual use is
for(var in seq) { expr }
where var is the looping variable (the variable that takes each value of seq in turn).
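A small illustration of this construct, with the names total and i chosen only for the example:

```r
# Sum the squares of 1 through 5 with a for loop
total <- 0
for (i in 1:5) {
  total <- total + i^2   # i takes the values 1, 2, 3, 4, 5 in turn
}
total   # 55
```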
Exercises
This section offers suggestions on things you can practice from this chapter.
- Show that the Binomial distribution is a member of the Exponential Class of distributions.
- Let \(Y \sim Binom(n, \pi)\). Calculate \(E[Y]\) and \(V[Y]\) using the pmf in Exponential Class form (see Section 14.2: The Requirements for GLMs).
- Explain why a negative (and statistically significant) slope indicates that the differential invalidation was in favor of that candidate (Section 16.4: Sri Lanka in 2010).
- Refit the Sri Lanka election data using the Beta-Binomial distribution. Comment on any differences.
Theory Readings
- Alan Agresti and Brent A. Coull (1998). "Approximate is better than 'exact' for interval estimation of binomial proportions." The American Statistician. 52(2): 119–126. doi: 10.2307/2685469.
- Peter McCullagh and John A. Nelder (1989). Generalized Linear Models. London: Chapman and Hall.
- John A. Nelder and Robert W. Wedderburn (1972). "Generalized Linear Models." Journal of the Royal Statistical Society, Series A (General). 135(3): 370–84.
- Thomas W. Yee (2010). "The VGAM Package for Categorical Data Analysis." Journal of Statistical Software. 32(10): 1–34. doi: 10.18637/jss.v032.i10.
- Thomas W. Yee (2015). Vector Generalized Linear and Additive Models: With an implementation in R. New York, USA: Springer.
- Thomas W. Yee and C. J. Wild (1996). "Vector Generalized Additive Models." Journal of the Royal Statistical Society, Series B. 58(3): 481–493.


