Statistics LibreTexts

2.10: Working with data files

    When we are doing statistics, we often need to load in the data that we will analyze. Those data will live in a file on one’s computer or on the internet. For this example, let’s use a file that is hosted on the internet, which contains the gross domestic product (GDP) values for a number of countries around the world. This file is stored as comma-delimited text, meaning that the values for each of the variables in the dataset are separated by commas. There are three variables: the relative rank of the countries, the name of the country, and its GDP value. Here is what the first few lines of the file look like:

    Rank,Country,GDP
    1,Liechtenstein,141100
    2,Qatar,104300
    3,Luxembourg,81100
    

    We can load a comma-delimited text file into R using the read.csv() function, which will accept either the location of a file on one’s computer, or a URL for files that are located on the web:

    url <- 'https://raw.githubusercontent.com/psych10/psych10/master/notebooks/Session03-IntroToR/gdp.csv'
    gdp_df <- read.csv(url)

    Once you have done this, take a look at the data frame using the View() function, and make sure that it looks right: it should have one column for each of the three variables.
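
The same check can be made from the console, without the RStudio viewer. This is a minimal sketch: so that it runs without a network connection, it reads the three sample rows shown earlier from a text string rather than from the URL, and the names `sample_text` and `gdp_sample` are illustrative only.

```r
# The three sample rows shown earlier, used in place of the download
# so that the sketch runs offline (an assumption for illustration).
sample_text <- "Rank,Country,GDP
1,Liechtenstein,141100
2,Qatar,104300
3,Luxembourg,81100"

# read.csv() can parse a character string through its text argument
gdp_sample <- read.csv(text = sample_text)

names(gdp_sample)  # the column names
str(gdp_sample)    # the type of each column
head(gdp_sample)   # the first few rows
```

If the column names or types are not what you expect, the file was probably not parsed as comma-delimited text.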

    Let’s say that we wanted to create a new file, which contained GDP values in Euros rather than US Dollars. We will use today’s exchange rate, which is 1 USD = 0.90 Euros. To convert from Dollars to Euros, we simply multiply the GDP values by the exchange rate and assign those values to a new variable within the data frame:

    > exchange_rate = 0.9
    > gdp_df$GDP_euros <- gdp_df$GDP * exchange_rate
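
The multiplication is applied to every row at once. As a worked check on the three sample rows shown earlier, 141100 × 0.9 = 126990, 104300 × 0.9 = 93870, and 81100 × 0.9 = 72990. The sketch below builds a small stand-in data frame by hand (the name `gdp_sample` is illustrative) and should reproduce those values:

```r
exchange_rate <- 0.9

# Hand-built stand-in for the first three rows of the GDP file
gdp_sample <- data.frame(
  Country = c("Liechtenstein", "Qatar", "Luxembourg"),
  GDP = c(141100, 104300, 81100)
)

# Multiplying a column by a single number scales each element
gdp_sample$GDP_euros <- gdp_sample$GDP * exchange_rate

print(gdp_sample)
```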
    

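    You should now see a new variable within the data frame, called “GDP_euros”, which contains the new values. Now let’s save this to a comma-delimited text file on our computer called “gdp_euro.csv”. We do this using the write.table() command. By default write.table() separates values with spaces, so we set sep=',' to get commas, and we set row.names=FALSE to leave out the row numbers:

    > write.table(gdp_df, file='gdp_euro.csv', sep=',', row.names=FALSE)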
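
One way to confirm that a saved file really is comma-delimited is to write it, look at its raw lines, and read it back. The sketch below is one possible approach, not the only one: it uses write.csv(), base R's wrapper around write.table() that sets the separator to a comma, and it writes to a temporary file so that nothing in your working directory is touched. The names `gdp_sample`, `out_file`, and `gdp_check` are illustrative.

```r
# Hand-built stand-in for the first three rows of the GDP file
gdp_sample <- data.frame(
  Rank = 1:3,
  Country = c("Liechtenstein", "Qatar", "Luxembourg"),
  GDP = c(141100, 104300, 81100)
)

# Temporary path used only for this illustration
out_file <- tempfile(fileext = ".csv")

# write.csv() always uses a comma as the separator
write.csv(gdp_sample, file = out_file, row.names = FALSE)

# Look at the raw text, then read the file back into a data frame
raw_lines <- readLines(out_file)
print(raw_lines)
gdp_check <- read.csv(out_file)
print(gdp_check)
```

If the raw lines show commas between the values and the data frame read back has the same columns as the one written, the file can be loaded again later with read.csv().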
    

    This file will be created in the working directory that RStudio is using. You can find this directory using the getwd() function:

    > getwd()
    [1] "/Users/me/MyClasses/Psych10/LearningR"
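
To see the full path at which the file is expected, the working directory can be combined with the file name. This is a small sketch using base R's file.path() and file.exists(); the name `out_path` is illustrative, and the result depends on your own working directory.

```r
# Build the full path from the working directory and the file name
out_path <- file.path(getwd(), "gdp_euro.csv")
print(out_path)

# TRUE only if the file has already been written to that location
file.exists(out_path)
```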