7.4: Plotting the Distribution of a Single Variable
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.04%3A_Plotting_the_Distribution_of_a_Single_Variable
We will use the built-in mpg dataset, which contains fuel efficiency data for a number of different cars. The histogram shows the overall distribution of the data. The default stat for geom_histogram is “bin”. What do you think would happen if you overrode the default and set stat="count"? Note that the geometry tells ggplot what kind of plot to use, and the statistic (stat) tells it what kind of summary to present.
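As a rough sketch of the histogram this section describes, the following assumes ggplot2 is installed. The choice of the hwy column and the binwidth value are illustrative assumptions, not taken from the section itself.

```r
library(ggplot2)

# Histogram of highway fuel efficiency from the built-in mpg dataset.
# The geometry (geom_histogram) picks the kind of plot; its default
# statistic groups the x values into bins and counts the rows per bin.
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)  # binwidth = 2 is an arbitrary illustrative choice

# Inspect which statistic the layer actually uses.
hist_stat <- class(p_hist$layers[[1]]$stat)[1]

# layer_data() returns the summary the statistic computed for the layer.
hist_summary <- layer_data(p_hist)

print(p_hist)
```

Printing hist_stat and hist_summary is one way to see what summary the stat hands to the geometry before anything is drawn.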
7.5: Plots with Two Variables
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.05%3A_Plots_with_Two_Variables
First, let’s make a bar plot by choosing the stat “summary” and picking the “mean” function to summarize the data. The upper and lower bounds of the box (the hinges) are the first and third quartiles (can you use them to approximate the interquartile range?). Now, let’s do something a bit more complex, but much more useful: let’s create our own summary of the data, so we can choose which summary statistic to plot and also compute a measure of dispersion of our choosing.
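The bar plot and box plot mentioned here might look like the sketch below. It assumes ggplot2 3.3.0 or later (where the summary function is passed as `fun`), and the class and hwy columns of mpg are illustrative choices rather than the section's own.

```r
library(ggplot2)

# Bar plot of mean highway mileage per car class: stat = "summary" applies
# the function named in `fun` to the y values within each x group.
p_bar <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_bar(stat = "summary", fun = "mean")

# Box plot of the same variables: the hinges of each box are the
# first and third quartiles of hwy within that class.
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# The distance between the hinges approximates the interquartile range.
box_summary <- layer_data(p_box)
hinge_spread <- box_summary$upper - box_summary$lower

print(p_bar)
print(p_box)
```

Comparing hinge_spread across classes gives a quick read on how dispersed each group is, which the bar plot of means alone does not show.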
7.6: Creating a More Complex Plot
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.06%3A_Creating_a_More_Complex_Plot
oringDf %>%
  ggplot(aes(x = Temperature, y = DamageIndex)) +
  geom_point() +
  geom_smooth(method = "loess", se = FALSE, span = 1) +
  ylim(0, 12) +
  geom_vline(xintercept = 27.5, size = 8, alpha = 0.3, color = "red") +
  labs(
    y = "Damage Index",
    x = "Temperature at time of launch"
  ) +
  scale_x_continuous(breaks = seq.int(25, 85, 5)) +
  annotate(
    "text",
    angle = 90,
    x = 27.5,
    y = 6,
    label = "Forecasted temperature on Jan 28",
    size = 5
  )
7.2: Getting Started
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.02%3A_Getting_Started
Load ggplot and choose a theme you like (see here for examples).
library(tidyverse)
theme_set(theme_bw()) # I like this fairly minimal one
7.3: Let’s Think Through a Visualization
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.03%3A_Lets_Think_Through_a_Visualization
Principles we want to keep in mind: use color, shape, and location to encourage comparisons, and minimize visual clutter (maximize your information-to-ink ratio). The two questions you want to ask yourself before getting started are: What type of variable(s) am I plotting? What comparison do I want to make salient for the viewer (possibly myself)? Figuring out how to highlight a comparison and include relevant variables usually benefits from sketching the plot out first.
7.1: The Grammar of Graphics
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.01%3A_The_Grammar_of_Graphics
Each language has a grammar consisting of types of words and the rules with which to string them together into sentences. The data are the actual variables we’re plotting, which we pass to ggplot through the data argument. Now we need to tell ggplot how to plot those variables, by mapping each variable to an axis of the plot. The plot still had two axes, x and y, but we didn’t need to specify what went on the y axis because ggplot knew by default that it should make a count variable.
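The pieces of the grammar described here can be sketched as follows, assuming ggplot2 is installed and using the built-in mpg dataset with illustrative column choices (displ, hwy, class) that are not named in this section.

```r
library(ggplot2)

# Data: the variables to plot, passed through the `data` argument.
# Aesthetics: aes() maps each variable to an axis of the plot.
# Geometry: the geom_*() layer chooses the kind of plot.
p_scatter <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# With only x mapped, geom_bar() fills the y axis with a count by default,
# so no y aesthetic needs to be specified.
p_counts <- ggplot(data = mpg, mapping = aes(x = class)) +
  geom_bar()

print(p_scatter)
print(p_counts)
```

Both plots have an x and a y axis; in the second, the y values come from the layer's default statistic rather than from a mapped variable.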
7.7: Additional Reading and Resources
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.07%3A_Additional_Reading_and_Resources
ggplot theme reference; knockoff tech themes
7: Data Visualization with R
https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)
There are many different tools for plotting data in R, but we will focus on the ggplot() function provided by a package called ggplot2. ggplot is very powerful, but using it requires getting one’s head around how it works.
