Statistics LibreTexts

8.1: A.1- Starting...

  • If you download your data file from the Internet, skip to the read.table() step. Otherwise, proceed as described below.

    Create the working directory on the disk (using only lowercase English letters, numbers and underscore symbols for the name); inside the working directory, create the directory data. Copy into it the data file, saved with the *.txt extension and Tab delimiter (such a file can be made in Excel or similar via Save as...). Name the file bugs.txt.

    Open R. Using the setwd() command (with the full path and / slashes as the argument), change the working directory to the directory you created above, the one that contains the data subdirectory with bugs.txt.

    To check the location, type

    Code \(\PageIndex{1}\) (R):

    dir("data")

    ... and press the ENTER key (press it at the end of every command). Among other things, this command should output the name of the file, bugs.txt.
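    The setup steps above can be sketched in R itself. This is only an illustration: the directory name bugs_work is hypothetical, it is created inside R's temporary folder rather than at a path of your choice, and the toy rows are invented stand-ins for a real bugs.txt (only the column names SEX, COLOR, WEIGHT and LENGTH come from this appendix).

```r
# Hypothetical working directory inside R's temporary folder;
# in practice, use your own full path with "/" slashes.
workdir <- file.path(tempdir(), "bugs_work")
dir.create(file.path(workdir, "data"), recursive = TRUE, showWarnings = FALSE)

# Invented toy rows standing in for a real bugs.txt
toy <- data.frame(SEX = c(0, 1, 1), COLOR = c(1, 2, 3),
                  WEIGHT = c(10.1, 12.3, 9.8), LENGTH = c(9.5, 11.2, 10.4))
write.table(toy, file = file.path(workdir, "data", "bugs.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

old <- setwd(workdir)   # setwd() invisibly returns the previous directory
files <- dir("data")    # list the contents of the data subdirectory
print(files)
setwd(old)              # go back to where we started
```

    If the listing does not include bugs.txt, the working directory is most likely not the one that contains data.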

    Now read the data file and create in R memory the object data, which will be the working copy of the data file. Type:

    Code \(\PageIndex{2}\) (R):

    # h=TRUE is short for header=TRUE: the first line of the file holds column names
    data <- read.table("data/bugs.txt", h=TRUE)

    If you use the online approach, replace the data directory in the file path with the URL (see the foreword).

    Look at the data file:

    Code \(\PageIndex{3}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    head(data)  # show the first six rows

    Attention! If anything looks wrong, note that editing data from inside R is not very convenient. The more sensible approach is to change the initial text file (for example, in Excel) and then read it from disk again with read.table().

    Look at the data structure: how many characters (variables, columns) and how many observations there are, what the names of the characters are, and what their types and order are:

    Code \(\PageIndex{4}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    str(data)

    Please note that SEX and COLOR are represented with numbers even though they are categorical variables.
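    One common way to make R treat such columns as categorical is factor(). The following is a minimal sketch on invented toy rows, not a step of this appendix: the column names SEX.F and COLOR.F are hypothetical, the SEX coding (0 for females, 1 for males) follows the selections used in this appendix, and since the meaning of the COLOR codes is not given here, they are kept as plain levels.

```r
# Invented toy rows; only the column names come from the appendix
toy <- data.frame(SEX = c(0, 1, 1, 0), COLOR = c(1, 2, 3, 1))

# SEX coding used in this appendix: 0 = female, 1 = male
toy$SEX.F <- factor(toy$SEX, levels = c(0, 1), labels = c("female", "male"))

# COLOR labels are unknown here, so the numeric codes become the levels
toy$COLOR.F <- factor(toy$COLOR)

str(toy)                        # the new columns are reported as factors
sex.counts <- table(toy$SEX.F)  # counts per category
print(sex.counts)
```

    The original numeric columns are left untouched, so selections such as SEX == 0 keep working.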

    Create a new object which contains data only about females (SEX is 0):

    Code \(\PageIndex{5}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    data.f <- data[data$SEX == 0, ]

    Now create the object containing data about big (longer than 10 mm) males:

    Code \(\PageIndex{6}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    data.m.big <- data[data$SEX == 1 & data$LENGTH > 10, ]

    By the way, it is easier to build this command from the previous one than to type it from scratch (this way is preferable in R). To recall the previous command, press the “\(\uparrow\)” key on the keyboard and then edit the line.

    “==” and “&” are the logical operators “equal to” and “and”, respectively. They were used for data selection. Selection also requires square brackets, and if the data is tabular (like our data), there should be a comma inside the square brackets which separates statements about rows from statements concerning columns.
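    To see how the comma divides row conditions from column choices, here is a small sketch on invented toy rows (the object names rows, big.males, big.males.len and first.two are hypothetical):

```r
# Invented toy rows; only the column names come from the appendix
toy <- data.frame(SEX = c(0, 1, 1, 0),
                  LENGTH = c(9.5, 11.2, 10.4, 12.0),
                  WEIGHT = c(10.1, 12.3, 9.8, 13.0))

# The condition itself is a logical vector with one value per row
rows <- toy$SEX == 1 & toy$LENGTH > 10
print(rows)

big.males <- toy[rows, ]                    # before the comma: rows; nothing after: all columns
big.males.len <- toy[rows, "LENGTH"]        # after the comma: which column to keep
first.two <- toy[1:2, c("SEX", "LENGTH")]   # rows by number, columns by name
print(big.males)
```

    Leaving one side of the comma empty means "take everything" along that dimension.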

    Add a new character (column) to the data: the relative weight of a bug (the ratio between weight and length), WEIGHT.R:

    Code \(\PageIndex{7}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    data$WEIGHT.R <- data$WEIGHT/data$LENGTH

    Check the new character using str() (use “\(\uparrow\)”!).

    This new character was added only to the in-memory copy of your data. It will disappear when you close R. You may want to save the new version of the data file under the new name bugs_new.txt in your data subdirectory:

    Code \(\PageIndex{8}\) (R):

    data <- read.table("data/bugs.txt", h=TRUE)
    data$WEIGHT.R <- data$WEIGHT/data$LENGTH  # recreate the new column before saving
    write.table(data, file="data/bugs_new.txt", quote=FALSE)
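    To confirm that a saved file can be read back, a round trip can be sketched as follows. The toy rows are invented, and the file goes to R's temporary folder instead of your data subdirectory. Note that write.table() also writes row names by default, so the header line has one field fewer than the data lines; read.table() recognizes this layout and restores the first column as row names.

```r
# Invented toy rows with the derived column added
toy <- data.frame(WEIGHT = c(10, 12, 9), LENGTH = c(5, 6, 4))
toy$WEIGHT.R <- toy$WEIGHT / toy$LENGTH

# Temporary location used only for this illustration
out <- file.path(tempdir(), "bugs_new.txt")
write.table(toy, file = out, quote = FALSE)

# Read the saved file back and inspect it
toy.back <- read.table(out, header = TRUE)
str(toy.back)
```

    If the file is meant for other programs, passing row.names=FALSE to write.table() keeps the header and the data lines aligned.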