9.3: Mode

The mode is the most frequent value that occurs in a variable. R has a function called mode(), but if you look at its help page you will see that it doesn’t actually compute the mode (it reports how an object is stored instead). In fact, R doesn’t have a built-in function to compute the mode, so we need to create one. Let’s start with some toy data:

mode_test <- c('a', 'b', 'b', 'c', 'c', 'c')
mode_test
## [1] "a" "b" "b" "c" "c" "c"
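
Before computing the mode ourselves, it may help to see what the built-in mode() function returns for data like this. The following is a minimal sketch; the numeric vector is a hypothetical example added for contrast:

```r
# The built-in mode() reports how R stores an object,
# not its most frequent value
mode(c('a', 'b', 'b', 'c', 'c', 'c'))
## [1] "character"
mode(c(1, 2, 2))
## [1] "numeric"
```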

We can see by eye that the mode is “c”, since it occurs more often than the others. To find it computationally, we first need to count how often each unique value occurs.

To do this, we create a table with the counts for each value, using the table() function:

mode_table <- table(mode_test)
mode_table
## mode_test
## a b c
## 1 2 3

Now we need to find the entry with the maximum count. We do this by comparing each count to the maximum of the table and keeping the entries that match; this will work even if there are multiple values with the same frequency (i.e. a tie for the mode).

table_max <- mode_table[mode_table == max(mode_table)]
table_max
## c
## 3

This result behaves like a special kind of value called a named vector, and its name contains the value that we need to identify the mode. We can pull it out using the names() function, taking the first element so that we get a single value even if there is a tie:

my_mode <- names(table_max)[1]
my_mode
## [1] "c"
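
To see how a tie is handled, here is a minimal sketch using a hypothetical vector, tie_test, in which “a” and “b” occur equally often. The comparison against the maximum keeps both values, whereas indexing with [1] keeps only the first:

```r
# Hypothetical toy vector in which "a" and "b" are tied for the mode
tie_test <- c('a', 'a', 'b', 'b', 'c')
tie_table <- table(tie_test)

# The comparison is TRUE for every count equal to the maximum,
# so both tied values are kept
tie_max <- tie_table[tie_table == max(tie_table)]
names(tie_max)
## [1] "a" "b"

# Indexing with [1] keeps only the first of the tied values
names(tie_max)[1]
## [1] "a"
```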

Let’s wrap this up into our own custom function:

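getmode <- function(v, print_table = FALSE) {
  # count how often each unique value occurs
  mode_table <- table(v)
  if (print_table) {
    # kable() comes from the knitr package
    print(knitr::kable(mode_table))
  }
  # keep every value whose count equals the maximum (ties included)
  table_max <- mode_table[mode_table == max(mode_table)]
  # the names of the result are the modal value(s), returned as strings
  return(names(table_max))
}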

We can then apply this to real data, such as the MaritalStatus variable in the NHANES dataset:

getmode(NHANES$MaritalStatus)
## [1] "Married"
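
Because names() always returns character strings, getmode() gives back text even when the input is numeric. One possible alternative is sketched below. The function name getmode_typed is hypothetical, and it swaps the table() approach for unique(), match(), and tabulate() so that the result keeps the type of the input:

```r
# Hypothetical alternative to getmode(): returns the mode(s) in the
# original type of v rather than as character strings
getmode_typed <- function(v) {
  uniq <- unique(v)                   # distinct values, in order of first appearance
  counts <- tabulate(match(v, uniq))  # number of occurrences of each distinct value
  uniq[counts == max(counts)]         # every value tied for the highest count
}

getmode_typed(c(3, 1, 3, 2))
## [1] 3
getmode_typed(c('a', 'a', 'b', 'b', 'c'))
## [1] "a" "b"
```

Note that, unlike table() with its default settings, this sketch counts missing values (NA) as a value in their own right.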