# 22.7: Categorical Analysis Beyond the 2 X 2 Table

Categorical analysis can also be applied to contingency tables where there are more than two categories for each variable.

For example, let’s look at the NHANES data and consider the variable Depressed, which records the “self-reported number of days where participant felt down, depressed or hopeless”. This variable is coded as None, Several, or Most. Let’s test whether it is related to the SleepTrouble variable, which records whether the individual has reported sleeping problems to a doctor.

Table 22.5: Relationship between depression and sleep problems in the NHANES dataset

| Depressed | NoSleepTrouble | YesSleepTrouble |
|-----------|----------------|-----------------|
| None      | 2614           | 676             |
| Several   | 418            | 249             |
| Most      | 138            | 145             |
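
To make the pattern in Table 22.5 easier to see, the counts can be converted into row proportions. The following is a minimal sketch that enters the counts by hand rather than deriving them from the NHANES data frame; the object names `depressedSleepTroubleTable` and `rowProportions` are chosen for illustration.

```r
# Counts copied by hand from Table 22.5
# (rows: Depressed, columns: SleepTrouble); entering them manually
# is an assumption made to keep the example self-contained.
depressedSleepTroubleTable <- matrix(
  c(2614, 676,
     418, 249,
     138, 145),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Depressed = c("None", "Several", "Most"),
    SleepTrouble = c("No", "Yes")
  )
)

# Proportion of each depression group with and without sleep trouble
# (margin = 1 normalizes within rows, so each row sums to 1)
rowProportions <- prop.table(depressedSleepTroubleTable, margin = 1)
print(round(rowProportions, 3))
```

With these counts, the share reporting sleep trouble rises from about 21% in the None group to about 37% in the Several group and about 51% in the Most group.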

Simply by looking at these data, we can tell that there is likely a relationship between the two variables. Notably, while the total number of people with sleep trouble is much smaller than the number without, among people who report being depressed most days the number with sleep problems is greater than the number without. We can quantify this directly using the chi-squared test; if our data frame contains only two variables, then we can simply provide the data frame as the argument to the chisq.test() function:

##
##  Pearson's Chi-squared test
##
## data:  depressedSleepTroubleTable
## X-squared = 191, df = 2, p-value <2e-16
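
The result above can be reproduced from the counts in Table 22.5 alone. The sketch below assumes the table is entered by hand as a matrix (the original analysis presumably built it from the NHANES data frame), and the variable names other than `depressedSleepTroubleTable` are illustrative. It also recomputes the statistic directly from the observed and expected counts, to show where the value of about 191 and the 2 degrees of freedom come from.

```r
# Counts copied by hand from Table 22.5 (an assumption for self-containment)
depressedSleepTroubleTable <- matrix(
  c(2614, 676,
     418, 249,
     138, 145),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Depressed = c("None", "Several", "Most"),
    SleepTrouble = c("No", "Yes")
  )
)

# Pearson chi-squared test of independence
# (the continuity correction only applies to 2 x 2 tables, so it is not used here)
chisqResult <- chisq.test(depressedSleepTroubleTable)
print(chisqResult)

# The same statistic by hand:
# expected count = (row total * column total) / grand total
observed <- depressedSleepTroubleTable
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
chiSquaredByHand <- sum((observed - expected)^2 / expected)

# Degrees of freedom = (rows - 1) * (columns - 1)
degreesOfFreedom <- (nrow(observed) - 1) * (ncol(observed) - 1)

print(round(expected, 1))
print(chiSquaredByHand)
print(degreesOfFreedom)
```

The largest contribution to the statistic comes from the Most/Yes cell, where 145 people were observed but only about 71 would be expected under independence.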

This test provides strong evidence of a relationship between depression and sleep trouble. We can also compute the Bayes factor to quantify the strength of the evidence in favor of the alternative hypothesis:

## Bayes factor analysis
## --------------
## [1] Non-indep. (a=1) : 1.8e+35 ±0%
##
## Against denominator:
##   Null, independence, a = 1
## ---
## Bayes factor type: BFcontingencyTable, joint multinomial

Here we see that the Bayes factor is very large, showing that the evidence in favor of a relation between depression and sleep problems is very strong.
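
A Bayes factor of this kind can be computed with the `contingencyTableBF()` function from the BayesFactor package. The sketch below assumes that the package is installed and that the counts are entered by hand from Table 22.5. The arguments `sampleType = "jointMulti"` and `priorConcentration = 1` are inferred from the "joint multinomial" and "a = 1" labels in the output above, not taken from the original analysis code.

```r
library(BayesFactor)  # assumed to be installed

# Counts copied by hand from Table 22.5 (an assumption for self-containment)
depressedSleepTroubleTable <- matrix(
  c(2614, 676,
     418, 249,
     138, 145),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Depressed = c("None", "Several", "Most"),
    SleepTrouble = c("No", "Yes")
  )
)

# Bayes factor for non-independence versus independence.
# sampleType and priorConcentration are inferred from the printed output,
# which reports "joint multinomial" and "a = 1".
bayesFactorResult <- contingencyTableBF(
  depressedSleepTroubleTable,
  sampleType = "jointMulti",
  priorConcentration = 1
)
print(bayesFactorResult)

# Extract the numeric Bayes factor for further use
bayesFactorValue <- extractBF(bayesFactorResult)$bf
print(bayesFactorValue)
```

Other sampling schemes (for example, one with fixed row totals) would give a different Bayes factor, so the choice of `sampleType` should reflect how the data were collected.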