Statistics LibreTexts

5.2: Creating or Modifying Variables Using Mutate()


    Often we will want to either create a new variable based on an existing variable, or modify the value of an existing variable. Within the tidyverse, we do this using a function called mutate(). Let’s start with a toy example by creating a data frame containing a single variable.

    library(tidyverse)  # provides mutate(), glimpse(), and the %>% pipe

    toy_df <- data.frame(x = c(1,2,3,4))
    glimpse(toy_df)
    ## Observations: 4
    ## Variables: 1
    ## $ x <dbl> 1, 2, 3, 4

    Let’s say that we wanted to create a new variable called y that would contain the value of x multiplied by 10. We could do this using mutate() and then assign the result back to the same data frame:

    toy_df <- toy_df %>%
      # create a new variable called y that contains x*10
      mutate(y = x*10)
    glimpse(toy_df)
    ## Observations: 4
    ## Variables: 2
    ## $ x <dbl> 1, 2, 3, 4
    ## $ y <dbl> 10, 20, 30, 40
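
    A single mutate() call can also create several variables at once, and a later expression may refer to a variable created earlier in the same call. The following is a minimal sketch, assuming the dplyr package is installed; the names demo_df and z are hypothetical and used only for illustration.

```r
library(dplyr)

# hypothetical data frame mirroring toy_df above
demo_df <- data.frame(x = c(1, 2, 3, 4))

demo_df <- demo_df %>%
  mutate(
    y = x * 10,  # new variable computed from x
    z = y + x    # refers to y, which was created on the previous line
  )

print(demo_df)
```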

    We could also overwrite a variable with a new value:

    toy_df2 <- toy_df %>%
      # overwrite the existing variable y with its current value plus 1
      mutate(y = y + 1)
    glimpse(toy_df2)
    ## Observations: 4
    ## Variables: 2
    ## $ x <dbl> 1, 2, 3, 4
    ## $ y <dbl> 11, 21, 31, 41
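
    Note that mutate() returns a modified copy rather than changing its input, so the original data frame keeps its old values unless the result is assigned back to it. A short sketch of this behavior, assuming dplyr is installed; the names original and modified are hypothetical.

```r
library(dplyr)

# hypothetical data frame with the same values as toy_df above
original <- data.frame(x = c(1, 2, 3, 4), y = c(10, 20, 30, 40))

# the result is stored under a new name, so `original` is left untouched
modified <- original %>%
  mutate(y = y + 1)

print(original$y)
print(modified$y)
```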

    We will use mutate() often so it’s an important function to understand.

    Here we use it with our example data frame, myDataFrame, to create a new variable that is the sum of several other variables.

    myDataFrame <- 
      myDataFrame %>%
      mutate(total = x + y + z)
    
    kable(myDataFrame)
    n        x   y    z   total
    russ     1   4    7      12
    lucy     2   5    8      15
    jaclyn   3   6    9      18
    tyler    4   7   10      21

    In this case, mutate() creates a variable called total whose value in each row is the sum of the existing variables x, y, and z in that row.
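
    For readers who want to run this step on its own, here is a self-contained sketch. It assumes dplyr is installed, and the construction of myDataFrame is a hypothetical reconstruction from the values in the printed table, not necessarily how the data frame was originally built.

```r
library(dplyr)

# hypothetical reconstruction of myDataFrame from the table printed above
myDataFrame <- data.frame(
  n = c("russ", "lucy", "jaclyn", "tyler"),
  x = 1:4,
  y = 4:7,
  z = 7:10
)

# the sum is computed separately for each row
myDataFrame <- myDataFrame %>%
  mutate(total = x + y + z)

print(myDataFrame)
```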

    5.2.1 Remove a column using the select() function

    Placing a minus sign before the name of a variable within a call to select() removes that variable and keeps all of the others.

    myDataFrame <- 
      myDataFrame %>%
      dplyr::select(-total)
    
    kable(myDataFrame)
    n        x   y    z
    russ     1   4    7
    lucy     2   5    8
    jaclyn   3   6    9
    tyler    4   7   10
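
    The same pattern can be tried on a small stand-alone data frame. This is a sketch assuming dplyr is installed; the names scores and without_total are hypothetical.

```r
library(dplyr)

# hypothetical data frame that includes a column we want to drop
scores <- data.frame(
  n = c("russ", "lucy"),
  x = c(1, 2),
  total = c(12, 15)
)

# the minus sign drops `total`; all other columns are kept in order
without_total <- scores %>%
  dplyr::select(-total)

print(names(without_total))
```

    Writing dplyr::select() with the package prefix makes sure the dplyr version is used even when another loaded package, such as MASS, also defines a function named select().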

    This page titled 5.2: Creating or Modifying Variables Using Mutate() is shared under a not declared license and was authored, remixed, and/or curated by Russell A. Poldrack via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.