5.6: Computing a Cumulative Distribution (Section 4.2.2)
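Let’s compute a cumulative distribution for the SleepHrsNight
variable in NHANES. The code looks very similar to what we saw in the previous section; the cumulative column comes from cumsum(), which keeps a running total of the relative frequencies.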
# create summary table of absolute, relative, and cumulative
# frequencies for the values of SleepHrsNight
# (assumes the tidyverse and knitr packages are loaded and that
# NHANES_unique was created as in the earlier sections)
SleepHrsNight_cumulative <-
  NHANES_unique %>%
  # drop NA values for SleepHrsNight variable
  drop_na(SleepHrsNight) %>%
  # remove other variables
  dplyr::select(SleepHrsNight) %>%
  # group by values
  group_by(SleepHrsNight) %>%
  # create summary table (one row per value, with its count)
  summarize(AbsoluteFrequency = n()) %>%
  # create relative and cumulative frequencies
  mutate(
    RelativeFrequency = AbsoluteFrequency / sum(AbsoluteFrequency),
    CumulativeDensity = cumsum(RelativeFrequency)
  )

# print the table, rounded to two decimal places
kable(SleepHrsNight_cumulative, digits = 2)
SleepHrsNight | AbsoluteFrequency | RelativeFrequency | CumulativeDensity |
---|---|---|---|
2 | 9 | 0.00 | 0.00 |
3 | 49 | 0.01 | 0.01 |
4 | 200 | 0.04 | 0.05 |
5 | 406 | 0.08 | 0.13 |
6 | 1172 | 0.23 | 0.36 |
7 | 1394 | 0.28 | 0.64 |
8 | 1405 | 0.28 | 0.92 |
9 | 271 | 0.05 | 0.97 |
10 | 97 | 0.02 | 0.99 |
11 | 15 | 0.00 | 1.00 |
12 | 17 | 0.00 | 1.00 |
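
To see how the table can be read, here is a minimal sketch in base R that rebuilds the cumulative column from the absolute frequencies listed above. The counts are typed in by hand, so the sketch runs without the NHANES data. The names sleep_hours, abs_freq, rel_freq, cum_freq, p_six_or_fewer, median_hours, and ecdf_check are chosen here for illustration and do not appear in the original code.

```r
# Values of SleepHrsNight and their absolute frequencies,
# copied by hand from the table above
sleep_hours <- 2:12
abs_freq <- c(9, 49, 200, 406, 1172, 1394, 1405, 271, 97, 15, 17)

# Relative frequency: each count divided by the total count
rel_freq <- abs_freq / sum(abs_freq)

# Cumulative frequency: running total of the relative frequencies
cum_freq <- cumsum(rel_freq)

# Proportion of respondents reporting 6 hours of sleep or fewer
p_six_or_fewer <- cum_freq[sleep_hours == 6]
round(p_six_or_fewer, 2)  # about 0.36

# Smallest value at which the cumulative frequency reaches 0.5
# (this is the median number of hours)
median_hours <- sleep_hours[which(cum_freq >= 0.5)[1]]
median_hours  # 7

# Cross-check against R's built-in empirical CDF, after expanding
# the counts back into one observation per respondent
ecdf_check <- ecdf(rep(sleep_hours, times = abs_freq))(sleep_hours)
all.equal(ecdf_check, cum_freq)
```

Each entry in the cumulative column answers the question "what proportion of respondents reported this many hours or fewer?", which is why the last entry is always 1. Note that several rows in the printed table show 0.00 or 1.00 only because of rounding to two decimal places.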