# 3.7: Storing Many Numbers As a Vector


At this point we’ve covered functions in enough detail to get us safely through the next couple of chapters (with one small exception: see Section 4.11), so let’s return to our discussion of variables. When I introduced variables in Section 3.4 I showed you how we can use variables to store a single number. In this section, we’ll extend this idea and look at how to store multiple numbers within the one variable. In R, the name for a variable that can store multiple values is a **vector**. So let’s create one.

# 3.7.1 Creating a vector

Let’s stick to my silly “get rich quick by textbook writing” example. Suppose the textbook company (if I actually had one, that is) sends me sales data on a monthly basis. Since my class starts in late February, we might expect most of the sales to occur towards the start of the year. Let’s suppose that I have 100 sales in February, 200 sales in March and 50 sales in April, and no other sales for the rest of the year. What I would like to do is have a variable – let’s call it `sales.by.month` – that stores all this sales data. The first number stored should be `0` since I had no sales in January, the second should be `100`, and so on. The simplest way to do this in R is to use the **combine** function, `c()`. To do so, all we have to do is type all the numbers we want to store in a comma separated list, like this:^{35}

`sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)`

`sales.by.month`

`## [1] 0 100 200 50 0 0 0 0 0 0 0 0`

To use the correct terminology, we have a single variable here called `sales.by.month`: this variable is a vector that consists of 12 **elements**.
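As a small aside, `c()` is not limited to bare numbers: it will also join existing vectors end to end. The sketch below rebuilds the same sales data from two pieces. The names `first.half`, `second.half` and `sales.rebuilt` are invented here purely for illustration.

```r
# Hypothetical split of the year into two 6-month vectors
first.half  <- c(0, 100, 200, 50, 0, 0)
second.half <- c(0, 0, 0, 0, 0, 0)

# c() combines the two vectors into one 12-element vector
sales.rebuilt <- c(first.half, second.half)
sales.rebuilt
```

The result holds the same 12 values, in the same order, as the `sales.by.month` vector created above.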

# 3.7.2 A handy digression

Now that we’ve learned how to put information into a vector, the next thing to understand is how to pull that information back out again. However, before I do so it’s worth taking a slight detour. If you’ve been following along, typing all the commands into R yourself, it’s possible that the output that you saw when we printed out the `sales.by.month` vector was slightly different to what I showed above. This would have happened if the window (or the RStudio panel) that contains the R console is really, really narrow. If that were the case, you might have seen output that looks something like this:

`sales.by.month`

`## [1] 0 100 200 50 0 0 0 0`

`## [9] 0 0 0 0`

Because there wasn’t much room on the screen, R has printed out the results over two lines. But that’s not the important thing to notice. The important point is that the first line has a `[1]` in front of it, whereas the second line starts with `[9]`. It’s pretty clear what’s happening here. For the first row, R has printed out the 1st element through to the 8th element, so it starts that row with a `[1]`. For the second row, R has printed out the 9th element of the vector through to the 12th one, and so it begins that row with a `[9]` so that you can tell where it’s up to at a glance. It might seem a bit odd to you that R does this, but in some ways it’s a kindness, especially when dealing with larger data sets!
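If you’d like to see this line-wrapping behaviour without resizing your window, one option is to temporarily shrink the `width` option, which controls how many characters R tries to fit on each line of printed output. The following is only a rough sketch: `long.vector` and `old.opts` are names made up for this example, and exactly where R breaks the lines depends on the width you choose, so no output is shown.

```r
# Temporarily pretend the console is only 40 characters wide
old.opts <- options(width = 40)

# A made-up vector of 30 numbers, long enough to wrap
long.vector <- 101:130
print(long.vector)   # each printed row starts with the index of its first element

# Put the original setting back
options(old.opts)
```

Each row of the printout begins with a bracketed number telling you which element of the vector comes first on that row.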

# 3.7.3 Getting information out of vectors

To get back to the main story, let’s consider the problem of how to get information out of a vector. At this point, you might have a sneaking suspicion that the answer has something to do with the `[1]` and `[9]` things that R has been printing out. And of course you are correct. Suppose I want to pull out the February sales data only. February is the second month of the year, so let’s try this:

`sales.by.month[2]`

`## [1] 100`

Yep, that’s the February sales all right. But there’s a subtle detail to be aware of here: notice that R outputs `[1] 100`, *not* `[2] 100`. This is because R is being extremely literal. When we typed in `sales.by.month[2]`, we asked R to find exactly *one* thing, and that one thing happens to be the second element of our `sales.by.month` vector. So, when it outputs `[1] 100` what R is saying is that the first number *that we just asked for* is `100`. This behaviour makes more sense when you realise that we can use this trick to create new variables. For example, I could create a `february.sales` variable like this:

```
february.sales <- sales.by.month[2]
february.sales
```

`## [1] 100`

Obviously, the new variable `february.sales` should only have one element, and so when I print out this new variable, the R output begins with a `[1]` because `100` is the value of the first (and only) element of `february.sales`. The fact that this also happens to be the value of the second element of `sales.by.month` is irrelevant. We’ll pick this topic up again shortly (Section 3.10).
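The same pattern works for any other position in the vector. As a quick illustration, here is a self-contained sketch that pulls out the March figure instead; `march.sales` is a name introduced just for this example.

```r
# The original sales data, re-created so this example stands alone
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)

# March is the 3rd month, so ask for the 3rd element
march.sales <- sales.by.month[3]
march.sales
```

`## [1] 200`

Once again the output starts with `[1]`, because `march.sales` is itself a vector with a single element.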

# 3.7.4 Altering the elements of a vector

Sometimes you’ll want to change the values stored in a vector. Imagine my surprise when the publisher rings me up to tell me that the sales data for May are wrong. There were actually an additional 25 books sold in May, but there was an error or something so they hadn’t told me about it. How can I fix my `sales.by.month` variable? One possibility would be to assign the whole vector again from the beginning, using `c()`. But that’s a lot of typing. Also, it’s a little wasteful: why should R have to redefine the sales figures for all 12 months, when only the 5th one is wrong? Fortunately, we can tell R to change only the 5th element, using this trick:

```
sales.by.month[5] <- 25
sales.by.month
```

`## [1] 0 100 200 50 25 0 0 0 0 0 0 0`
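To convince yourself that only the targeted element changes, you can try the same trick on a scratch copy of the data. The sketch below is purely hypothetical: `sales.copy` and the “corrected” April figure of 60 are invented for illustration and are not part of the running example.

```r
# A scratch copy of the original data, so the real vector is left alone
sales.copy <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)

# Overwrite only the 4th element (April) with a made-up value
sales.copy[4] <- 60

sales.copy[4]   # the element we changed
sales.copy      # every other element is untouched
```

Assigning to `sales.copy[4]` replaces that one value and leaves the other 11 elements exactly as they were.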

Another way to edit variables is to use the `edit()` or `fix()` functions. I won’t discuss them in detail right now, but you can check them out on your own.

# 3.7.5 Useful things to know about vectors

Before moving on, I want to mention a couple of other things about vectors. Firstly, you often find yourself wanting to know how many elements there are in a vector (usually because you’ve forgotten). You can use the `length()` function to do this. It’s quite straightforward:

`length( x = sales.by.month )`

`## [1] 12`

Secondly, you often want to alter all of the elements of a vector at once. For instance, suppose I wanted to figure out how much money I made in each month. Since I’m earning an exciting $7 per book (no seriously, that’s actually pretty close to what authors get on the very expensive textbooks that you’re expected to purchase), what I want to do is multiply each element in the `sales.by.month` vector by `7`. R makes this pretty easy, as the following example shows:

`sales.by.month * 7`

`## [1] 0 700 1400 350 175 0 0 0 0 0 0 0`

In other words, when you multiply a vector by a single number, all elements in the vector get multiplied. The same is true for addition, subtraction, division and taking powers. So that’s neat. On the other hand, suppose I wanted to know how much money I was making per day, rather than per month. Since not every month has the same number of days, I need to do something slightly different. Firstly, I’ll create two new vectors:

`days.per.month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)`

`profit <- sales.by.month * 7`

Obviously, the `profit` variable is the same one we created earlier, and the `days.per.month` variable is pretty straightforward. What I want to do is divide every element of `profit` by the *corresponding* element of `days.per.month`. Again, R makes this pretty easy:

`profit / days.per.month`

```
## [1] 0.000000 25.000000 45.161290 11.666667 5.645161 0.000000 0.000000
## [8] 0.000000 0.000000 0.000000 0.000000 0.000000
```

I still don’t like all those zeros, but that’s not what matters here. Notice that the second element of the output is 25, because R has divided the second element of `profit` (i.e. 700) by the second element of `days.per.month` (i.e. 28). Similarly, the third element of the output is equal to 1400 divided by 31, and so on. We’ll talk more about calculations involving vectors later on (and in particular a thing called the “recycling rule”; Section 7.12.2), but that’s enough detail for now.
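To make the two kinds of calculation in this section a little more concrete, here is a small sketch using shortened, four-month versions of the data. The names `sales` and `days` are made up for this example; the values are just the January to April figures from above.

```r
# Shortened (January to April) versions of the vectors, for illustration only
sales <- c(0, 100, 200, 50)
days  <- c(31, 28, 31, 30)

# Vector with a single number: the operation is applied to every element
sales + 10    # add 10 to each month
sales / 2     # halve each month
sales ^ 2     # square each month

# Vector with a vector of the same length: elements are paired up by position
(sales * 7) / days
```

In the last line, the second element works out to 700 divided by 28, which is 25, matching the full 12-month calculation shown earlier.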