3.10: Working with Data Files
When we are doing statistics, we often need to load in the data that we will analyze. Those data will live in a file on one’s computer or on the internet. For this example, let’s use a file that is hosted on the internet, which contains the gross domestic product (GDP) values for a number of countries around the world. This file is stored as comma-delimited text, meaning that the values for each of the variables in the dataset are separated by commas. There are three variables: the relative rank of the countries, the name of the country, and its GDP value. Here is what the first few lines of the file look like:
Rank,Country,GDP
1,Liechtenstein,141100
2,Qatar,104300
3,Luxembourg,81100
We can load a comma-delimited text file into R using the read.csv() function, which will accept either the location of a file on one’s computer, or a URL for files that are located on the web:
url='https://raw.githubusercontent.com/psych10/psych10/master/notebooks/Session03-IntroToR/gdp.csv'
gdp_df <- read.csv(url)
Once you have done this, take a look at the data frame using the View() function, and make sure that it looks right; it should have a column for each of the three variables.
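The same check can also be done from the console, which is useful when View() is not available (for example, outside RStudio). The following is a minimal sketch: it assumes only the three data rows shown earlier and supplies them inline through the text argument of read.csv(), so it runs without an internet connection. The names csv_text and gdp_small are hypothetical.

```r
# Build a small data frame from the three rows shown earlier
# (inline text instead of the URL, so no network access is needed).
csv_text <- "Rank,Country,GDP
1,Liechtenstein,141100
2,Qatar,104300
3,Luxembourg,81100"
gdp_small <- read.csv(text = csv_text)

names(gdp_small)  # column names: Rank, Country, GDP
dim(gdp_small)    # number of rows and columns: 3 and 3
str(gdp_small)    # the type of each column
head(gdp_small)   # the first rows of the data frame
```

If names() does not return the three expected variables, the file was probably not read as comma-delimited text.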
Let’s say that we wanted to create a new file, which contained GDP values in Euros rather than US Dollars. We use today’s exchange rate, which is 1 USD = 0.90 Euros. To convert from Dollars to Euros, we simply multiply the GDP values by the exchange rate, and assign those values to a new variable within the data frame:
> exchange_rate = 0.9
> gdp_df$GDP_euros <- gdp_df$GDP * exchange_rate
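To see what this multiplication does, here is a small sketch that applies it to the three GDP values shown earlier. The vector gdp_usd is a hypothetical stand-in for gdp_df$GDP.

```r
exchange_rate <- 0.9
gdp_usd <- c(141100, 104300, 81100)  # stand-in for gdp_df$GDP

# Multiplying a vector by a single number scales every element,
# so no loop over the countries is needed.
gdp_euros <- gdp_usd * exchange_rate
print(gdp_euros)  # 126990 93870 72990
```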
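You should now see a new variable within the data frame, called “GDP_euros”, which contains the new values. Now let’s save this to a comma-delimited text file on our computer called “gdp_euro.csv”. We do this using the write.table() function, setting sep=',' so that the values are separated by commas (the default separator is a space) and row.names=FALSE so that R’s row numbers are not written as an extra column:
> write.table(gdp_df, file='gdp_euro.csv', sep=',', row.names=FALSE)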
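One way to confirm that a file was written as intended is to read it back in. The sketch below is only an illustration: it assumes a small hypothetical data frame (demo_df) and writes to a temporary file created with tempfile(), so it does not touch the working directory. The sep and row.names arguments are set explicitly so that the output is comma-delimited and has no row-name column.

```r
# A small hypothetical data frame standing in for gdp_df
demo_df <- data.frame(
  Rank = 1:3,
  Country = c("Liechtenstein", "Qatar", "Luxembourg"),
  GDP = c(141100, 104300, 81100)
)
demo_df$GDP_euros <- demo_df$GDP * 0.9

# Write to a temporary file rather than the working directory
tmp_file <- tempfile(fileext = ".csv")
write.table(demo_df, file = tmp_file, sep = ",", row.names = FALSE)

readLines(tmp_file, n = 2)        # header line and first data line
reloaded <- read.csv(tmp_file)    # read the file back in
identical(names(demo_df), names(reloaded))  # same columns as before?
```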
This file will be created in the working directory that RStudio is using. You can find this directory using the getwd() function:
> getwd()
[1] "/Users/me/MyClasses/Psych10/LearningR"
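Because a bare file name is interpreted relative to the working directory, the full path of the saved file can be built from getwd(). This is a minimal sketch that assumes the file name gdp_euro.csv used above; the variable out_path is hypothetical, and file.exists() returns TRUE only if the file has already been written.

```r
# Combine the working directory and the file name into a full path
out_path <- file.path(getwd(), "gdp_euro.csv")
print(out_path)

# Check whether a file with that name is present in the working directory
file.exists(out_path)
```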