17.4: Body Counts
Using the above information, let us examine the problem of understanding terrorism. This extended example will also allow us to discuss a few things that are becoming important to our analyses, namely the bias-variance trade-off.
The Troubles in Ruritania lasted from 1969 until 2002. In that time, over 1800 people died as a result of terrorist actions by both republican and loyalist groups. Six prime ministers, both left- and right-leaning, had to deal with the terrorism. If we assume that the terrorist groups are rational actors, then they will act to maximize their chances of achieving their goals. Because of its hierarchical structure and large size, the Ruritanian Republican Army (RAvŘ) was best able to organize its actions to affect elections.
The question is whether it did so:
Did the RAvŘ react to the political ideology of the prime minister?
Unfortunately, the extant literature is divided on the direction of the effect. Some research suggests that the RAvŘ became more violent and killed more people when the Conservatives held power. Other research suggests that the RAvŘ became more violent under the Left party. Which is it?
The terrorism data set contains just three variables of import: pira (the total number of deaths under that prime minister for the year, or part of the year), days (the number of days during the year that the prime minister was in power), and riteleft (the level of conservatism of the prime minister). The second variable is necessary to control for the fact that some prime ministers ruled for only part of a year. The third variable is the research variable. The first variable is the response variable (dependent variable). The basic research model is
\[ \text{pira} \sim \text{riteleft} \]
However, we need to deal with days, the number of days the prime minister was in power. If we include days as a simple independent variable, we allow its effect to vary freely to fit the data.
It is almost always better to treat days as a divisor for the terrorist killings, ostensibly creating a variable of killings per day. But then this is no longer a count model (the values are not integers), nor is it a proportion model (the values can be greater than one). What should we do?
Fear not! Through the magic of mathematics, we can handle it.
Recall in the section on the mathematics of count models that the link function we used was the logarithm: \(\log[\lambda] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\). If, instead of the expected count, \(\lambda\), we wanted to model the expected ratio, \(\lambda/\text{days}\), we would have:
\begin{align}
\log\left[\frac{\lambda}{\text{days}}\right] &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \\[1em]
\log[\lambda] - \log[\text{days}] &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \\[1em]
\log[\lambda] &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \log[\text{days}]
\end{align}
As such, we still have a count model (\(\log[\lambda]\) is alone on the left and is a random variable) with an additional term (\(\log[\text{days}]\) on the right as a non-random variable). Note that there is no parameter to estimate for \(\log[\text{days}]\); its coefficient is fixed at 1. This is important in how we set up the model, as days is not a typical variable. Such a variable is called an offset variable.
Offset variables have no parameters to estimate. They are direct effects with no multipliers. One can think of them as being subsumed in the constant term (which would be literally true if the offset variable were constant). Most statistical programs have an offset option available when you specify the model to be fit. In R, the offset is specified in the model call with the keyword offset.
For the glm function,
glm(pira ~ riteleft, family = poisson, offset = log(days), data = terror)
For glm.nb (from the MASS package), the offset belongs inside the formula:
glm.nb(pira ~ riteleft + offset(log(days)), data = terror)
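To see that the offset behaves as claimed, here is a minimal sketch using simulated data (not the terror data set): if deaths occur at a constant rate per day, a Poisson model with log(days) as an offset should recover that rate through its intercept.

```r
# Simulated check of the offset logic (hypothetical data, not the terror data).
set.seed(42)
days   <- sample(200:365, 40, replace = TRUE)  # days in power each year
deaths <- rpois(40, lambda = 0.2 * days)       # true rate: 0.2 deaths per day
fit  <- glm(deaths ~ 1, family = poisson, offset = log(days))
rate <- exp(coef(fit)[1])                      # estimated deaths per day
rate                                           # should be close to 0.2
```

Because the offset's coefficient is fixed at 1, exponentiating the intercept returns the per-day rate directly.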
Option 1: Days as an independent variable
The first option is to treat the days variable as just another independent variable. This is not the best choice, as days has a specific meaning with respect to the number of terrorist deaths; the better approach is Option 2 (below). However, for pedagogical purposes, let us first enter days as an independent variable. Performing regressions for each of the three count-data families gives the summarized results in the table below.
| | Poisson | Quasi-Poisson | Negative Binomial |
|---|---|---|---|
| Constant Term | 2.2622 (0.1190) | 2.2622 (0.7071) | 2.0363 (0.6004) |
| Conservatism | -0.0115 (0.0011) | -0.0115 (0.0065) | -0.0142 (0.0093) |
| Days Served | 0.0050 (0.0004) | 0.0050 (0.0021) | 0.0058 (0.0019) |
| AIC | 1482.8 | (not defined) | 369.5 |

Standard errors are given in parentheses.
Note that the direction of each effect is the same across the three families. This is not always the case, especially when a variable has little effect or no statistical significance. However, if a significant variable changes effect direction between families, then something is severely wrong with your research model. Also note that the parameter estimates are identical between the Poisson and quasi-Poisson families; the only difference is the size of the standard errors. When the data are overdispersed, the quasi-Poisson will give a better estimate of the standard errors (and of the statistical significance) than the Poisson.
Note that the Poisson model is severely overdispersed: the residual deviance (1298) is much larger than the residual degrees of freedom (36), giving an overdispersion factor of 36.06. As such, the Poisson family would be (very) inappropriate for this model, and either the quasi-Poisson or the Negative Binomial model would be preferable.
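The overdispersion factor just described can be computed directly from any fitted Poisson model. A minimal sketch, using simulated overdispersed counts as a stand-in for the terror data:

```r
# Overdispersion factor = residual deviance / residual degrees of freedom.
set.seed(1)
x <- rnorm(40)
y <- rnbinom(40, mu = exp(2 + 0.5 * x), size = 0.5)  # deliberately overdispersed
fit <- glm(y ~ x, family = poisson)
phi <- deviance(fit) / df.residual(fit)
phi  # values far above 1 signal overdispersion
```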
If we had used only the Poisson family, we would have concluded that the level of conservatism of the prime minister is highly significant. However, looking at the more appropriate results from Maximum Quasi-Likelihood Estimation (or from the Negative Binomial family), we see no statistically discernible effect of conservatism. Since the effect of conservatism on deaths is the point of this research question, it is extremely important to reach sound conclusions about this variable.
As our research variable is not statistically significant at the usual level, we will not bother to produce and graph predictions here.
Option 2: Days as an offset variable
The second (and preferred) option uses days as an offset (or “exposure”) variable. This makes more sense than allowing it to freely enter the model as a typical independent variable. The results from fitting the data with the three model families are found in the table below.
| | Poisson | Quasi-Poisson | Negative Binomial |
|---|---|---|---|
| Constant Term | -1.8280 (0.0254) | -1.8280 (0.1495) | 3.8744 (0.0969) |
| Conservatism | -0.0106 (0.0011) | -0.0106 (0.0063) | -0.0069 (0.0041) |
| AIC | 1479.6 | (not defined) | 2080.2 |

Standard errors are given in parentheses.
According to the results, the Poisson family is not appropriate; the level of overdispersion is very high, on the order of 35. As such, the MQLE method or the Negative Binomial family would make a good substitute. In the quasi-Poisson model, the parameter estimates remain the same, but the estimates of the standard errors change to reflect the overdispersion. Thus, while the effect of conservatism was statistically significant in the Poisson model, it was not in the quasi-Poisson model (\(p=0.1013\)).
The Negative Binomial model echoes the qualitative conclusions of the quasi-Poisson: The level of conservatism has no statistically discernible effect on the level of deaths resulting from RAvŘ terrorism in Ruritania during the Troubles (\(p=0.0905\)).
Improving the Fit
Using the results from both the quasi-Poisson and the Negative Binomial model offers a way to strengthen your conclusions. Here, the two agree. If one had given statistical significance and the other had not, you would know that your conclusions depended on the assumptions you made about the underlying mechanism that produced the data, and not just on the variables you chose to include (or exclude). It is never a good place to find yourself when your substantive results depend on the choice between two acceptable models.
Perhaps we should not stop here. Our formula is rather simplistic: it states that one independent variable is all we need to explain the dependent variable. It also assumes that the effect of the independent variable on the dependent variable is linear (on the log scale). If we believe that extremist prime ministers suffer from higher (or lower) levels of terrorist killings, then our research formula cannot capture that effect. To capture it, we will have to use the square (and/or higher powers) of the riteleft variable.
In fact, let us examine the effects of conservatism (up to the fourth power), plus the effects of having Labour in power, plus an interaction between having Labour in power and the level of conservatism in the Labour government. Thus, the research model we wish to fit will be
\begin{align}
\text{pira} = &\ \beta_0 + \beta_1\ \texttt{riteleft} + \beta_2\ \texttt{riteleft}^2 + \beta_3\ \texttt{riteleft}^3 \\
& + \beta_4\ \texttt{riteleft}^4 + \beta_5\ \texttt{labour} \\ & + \beta_6\ \texttt{labour} \times \texttt{riteleft}
\end{align}
Of course, we would need good theory to support this model, but let us just have some fun with it.
In most statistical programs, one would have to create new variables for each of the powers (three new variables) and a new variable for the interaction term (\(\texttt{labour} \times \texttt{riteleft}\)). In R, however, we can just write the formula to reflect what we want without having to worry about the additional step of creating new variables. As such, in R, the formula will be
\begin{align}
\texttt{pira} & \sim \texttt{riteleft + labour + I(riteleft}^2\texttt{) +} \\
& \qquad \texttt{I(riteleft}^3\texttt{) + I(riteleft}^4\texttt{) +} \\
& \qquad \texttt{I(labour*riteleft)}
\end{align}
The use of I() tells R to evaluate the expression inside the parentheses arithmetically, creating a new variable, rather than interpreting operators such as ^ and * with their special formula meanings. Fitting this model using Maximum Quasi-Likelihood Estimation indicates that none of the terms has a statistically significant effect. This should not really surprise us, since there is a great deal of correlation among the independent variables in this model; in the presence of high correlation, standard errors tend to be larger than they should be.
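A quick illustration of the point about I(), using toy data (not the terror data set): without I(), the ^ operator is interpreted as a formula operator and the squared term silently disappears.

```r
# Without I(), y ~ x + x^2 collapses to y ~ x; with I(), the square enters
# as a genuine second regressor.
d  <- data.frame(x = 1:10, y = (1:10)^2 + rnorm(10))
m1 <- lm(y ~ x + x^2,    data = d)  # ^ is a formula operator: 2 coefficients
m2 <- lm(y ~ x + I(x^2), data = d)  # I() protects the arithmetic: 3 coefficients
length(coef(m1))  # 2
length(coef(m2))  # 3
```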
| | Quasi-Poisson | Negative Binomial |
|---|---|---|
| Intercept | -12.51 (4.478) | -6.980 (2.396) |
| Labour | -4.742 (1.553) | -4.8430 (0.2856) |
| Conservatism | 1.847 (0.07101) | 1.8660 (0.3778) |
| Conservatism\(^2\) | -0.03830 (0.01425) | -0.03833 (0.00750) |
| Conservatism\(^3\) | -0.002585 (0.0009421) | -0.0026070 (0.0005005) |
| Conservatism\(^4\) | -0.00007314 (0.00002642) | -0.00007361 (0.00001398) |

Standard errors are given in parentheses.
Since nothing was statistically significant, let us pare the model to reduce the effect of the correlation and get at the more basic effects. The natural first term to remove is the interaction. Doing so gives us the research model:
\begin{align}
\texttt{pira} = &\ \beta_0 + \beta_1\ \texttt{riteleft} + \beta_2\ \texttt{riteleft}^2 + \beta_3\ \texttt{riteleft}^3 \\
& + \beta_4\ \texttt{riteleft}^4 + \beta_5\ \texttt{labour}
\end{align}
Fitting this model using both the quasi-Poisson family and the Negative Binomial family gives us the results in Table \(\PageIndex{3}\) above.
Notice that all of our variables are now statistically significant at the \(\alpha=0.05\) level. It turns out that the interaction term was so highly correlated with the other variables that it made it impossible to estimate the effects of the individual research variables with any precision.
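The correlation problem is easy to demonstrate: powers of the same variable are typically highly correlated with one another (and with interactions built from them), which is what inflates the standard errors. A sketch with a hypothetical conservatism scale:

```r
# Powers of one variable are nearly collinear over a positive range.
ideology <- seq(5, 25, length.out = 40)  # hypothetical conservatism scores
cor(ideology,   ideology^2)  # close to 1
cor(ideology^3, ideology^4)  # also close to 1
```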
Now that we have two models that tell us, substantively, the same story, we should show the effect of the variables of interest. There are really only two independent variables involved here, with one being dichotomous. As such, we can show the effects on the same graph (one graph for each family), with two prediction curves per graph. The figure below shows the predictions from both the quasi-Poisson model (Left Panel) and the Negative Binomial model (Right Panel). The upper curve in both cases (red) corresponds to predictions when the Conservatives are in power.
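A sketch of how such prediction curves can be built with predict(). Simulated stand-in data are used here so the example is self-contained; with the real data, fit would be the quasi-Poisson or Negative Binomial model fitted above.

```r
# Build one prediction curve per party over a grid of conservatism values.
set.seed(7)
d <- data.frame(riteleft = runif(40, 5, 25), labour = rbinom(40, 1, 0.5))
d$y <- rpois(40, exp(2 - 0.05 * d$riteleft + 0.3 * d$labour))
fit  <- glm(y ~ riteleft + labour, family = poisson, data = d)
grid <- expand.grid(riteleft = seq(5, 25, length.out = 100), labour = c(0, 1))
grid$pred <- predict(fit, newdata = grid, type = "response")  # expected counts
# Plot: one curve per party on the same graph.
plot(pred ~ riteleft, data = subset(grid, labour == 0), type = "l", col = "red",
     xlab = "Conservatism", ylab = "Predicted deaths")
lines(pred ~ riteleft, data = subset(grid, labour == 1), col = "blue")
```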
[Figure: predicted deaths as a function of conservatism, from the quasi-Poisson model (left panel) and the Negative Binomial model (right panel); the upper (red) curve in each panel corresponds to the Conservatives being in power.]