Search
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.04%3A_Plotting_the_Distribution_of_a_Single_Variable
  We will use the built-in mpg dataset, which contains fuel efficiency data for a number of different cars. The histogram shows the overall distribution of the data. The default stat for geom_histogram is “bin”. What do you think would happen if you overrode the default and set stat="count"? Note that the geometry tells ggplot what kind of plot to use, and the statistic (stat) tells it what kind of summary to present.
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.05%3A_Plots_with_Two_Variables
  First, let’s make a bar plot by choosing the stat “summary” and picking the “mean” function to summarize the data. The upper and lower bounds of the box (the hinges) are the first and third quartiles (can you use them to approximate the interquartile range?). Now, let’s do something a bit more complex, but much more useful: let’s create our own summary of the data, so we can choose which summary statistic to plot and also compute a measure of dispersion of our choosing.
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.06%3A_Creating_a_More_Complex_Plot
  oringDf %>% ggplot(aes(x = Temperature, y = DamageIndex)) + geom_point() + geom_smooth(method = "loess", se = FALSE, span = 1) + ylim(0, 12) + geom_vline(xintercept = 27.5, size = 8, alpha = 0.3, color = "red") + labs(y = "Damage Index", x = "Temperature at time of launch") + scale_x_continuous(breaks = seq.int(25, 85, 5)) + annotate("text", angle = 90, x = 27.5, y = 6, label = "Forecasted temperature on Jan 28", size = 5)
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.02%3A_Getting_Started
  Load ggplot and choose a theme you like (see here for examples). library(tidyverse) theme_set(theme_bw()) # I like this fairly minimal one
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.03%3A_Lets_Think_Through_a_Visualization
  Principles we want to keep in mind: use color, shape, and location to encourage comparisons; minimize visual clutter (maximize your information-to-ink ratio). The two questions you want to ask yourself before getting started are: What type of variable(s) am I plotting? What comparison do I want to make salient for the viewer (possibly myself)? Figuring out how to highlight a comparison and include relevant variables usually benefits from sketching the plot out first.
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.01%3A_The_Grammar_of_Graphics
  Each language has a grammar consisting of types of words and the rules with which to string them together into sentences. The data are the actual variables we’re plotting, which we pass to ggplot through the data argument. Now we need to tell ggplot how to plot those variables, by mapping each variable to an axis of the plot. The plot still had two axes, x and y, but we didn’t need to specify what went on the y axis because ggplot knew by default that it should make a count variable.
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)/7.07%3A_Additional_Reading_and_Resources
  ggplot theme reference; knockoff tech themes
- https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Statistical_Thinking_for_the_21st_Century_(Poldrack)/07%3A_Data_Visualization_with_R_(with_Anna_Khazenzon)
  There are many different tools for plotting data in R, but we will focus on the ggplot() function provided by a package called ggplot2. ggplot is very powerful, but using it requires getting one’s head around how it works.
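
Several of the results above describe how a geometry and a statistic combine in ggplot. The following is a minimal sketch of that idea, not the chapter's own code. It assumes the ggplot2 package is installed; the choice of the hwy and class columns of mpg, the bin width of 2, and the names p_hist and p_bar are illustrative assumptions.

```r
library(ggplot2)

# Histogram of highway fuel efficiency from the built-in mpg dataset.
# geom_histogram() bins the data before counting; binwidth = 2 is an
# arbitrary value chosen for this illustration.
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Bar plot of the mean highway mileage per vehicle class.
# stat_summary() computes the summary (here the mean) and draws it
# with the "bar" geometry.
p_bar <- ggplot(mpg, aes(x = class, y = hwy)) +
  stat_summary(fun = mean, geom = "bar")

print(p_hist)
print(p_bar)
```

In both plots the geometry decides how the result is drawn and the statistic decides what is computed from the raw rows first: binned counts for the histogram, a per-group mean for the bar plot.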