3.5: Using Functions to Do Calculations
The symbols
+
,
-
,
*
and so on are examples of operators. As we’ve seen, you can do quite a lot of calculations just by using these operators. However, in order to do more advanced calculations (and later on, to do actual statistics), you’re going to need to start using
functions
.
[A technical aside: in R, operators are themselves functions. Thus
10+20
is equivalent to the function call
`+`(10, 20)
. Not surprisingly, no-one ever uses this version. Because that would be stupid.] I’ll talk in more detail about functions and how they work in Section 8.4, but for now let’s just dive in and use a few. To get started, suppose I wanted to take the square root of 225. The square root, in case your high school maths is a bit rusty, is just the opposite of squaring a number. So, for instance, since “5 squared is 25” I can say that “5 is the square root of 25”. The usual notation for this is
√25=5
though sometimes you’ll also see it written like this: 25^0.5 = 5. This second way of writing it is kind of useful to “remind” you of the mathematical fact that “square root of x” is actually the same as “raising x to the power of 0.5”. Personally, I’ve never found this to be terribly meaningful psychologically, though I have to admit it’s quite convenient mathematically. Anyway, it’s not important. What is important is that you remember what a square root is, since we’re going to need it later on.
To calculate the square root of 25, I can do it in my head pretty easily, since I memorised my multiplication tables when I was a kid. It gets harder when the numbers get bigger, and pretty much impossible if they’re not whole numbers. This is where something like R comes in very handy. Let’s say I wanted to calculate √225, the square root of 225. There are two ways I could do this using R. Firstly, since the square root of 225 is the same thing as raising 225 to the power of 0.5, I could use the power operator
^
, just like we did earlier:
225 ^ 0.5
## [1] 15
However, there’s a second way that we can do this, since R also provides a
square root function
,
sqrt()
. To calculate the square root of 225 using this function, what I do is insert the number
225
in the parentheses. That is, the command I type is this:
sqrt( 225 )
## [1] 15
and as you might expect from our previous discussion, the spaces in between the parentheses are purely cosmetic. I could have typed
sqrt(225)
or
sqrt(   225   )
and gotten the same result. When we use a function to do something, we generally refer to this as
calling
the function, and the values that we type into the function (there can be more than one) are referred to as the
arguments
of that function.
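If you’d like to convince yourself that the two approaches really do agree, here’s a minimal sketch. The variable names (root_power, root_func, same_answer) are my own choices for illustration, not anything built into R. I’m using all.equal() for the comparison because it tolerates the tiny rounding differences that computers sometimes introduce when working with decimals.

```r
# Two ways of asking for the square root of 225
root_power <- 225 ^ 0.5      # using the power operator
root_func  <- sqrt( 225 )    # calling the sqrt() function with one argument

print( root_power )
print( root_func )

# Check that both approaches agree, allowing for tiny rounding differences
same_answer <- isTRUE( all.equal( root_power, root_func ) )
print( same_answer )

# The same check for a number that is not a whole number
same_for_decimal <- isTRUE( all.equal( 10.5 ^ 0.5, sqrt( 10.5 ) ) )
print( same_for_decimal )
```

Both comparisons should come out as TRUE, which is just another way of saying that the choice between the two is a matter of taste.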
Obviously, the
sqrt()
function doesn’t really give us any new functionality, since we already knew how to do square root calculations by using the power operator
^
, though I do think it looks nicer when we use
sqrt()
. However, there are lots of other functions in R: in fact, almost everything of interest that I’ll talk about in this book is an R function of some kind. For example, one function that we will need to use in this book is the
absolute value function
. Compared to the square root function, it’s extremely simple: it just converts negative numbers to positive numbers, and leaves positive numbers alone. Mathematically, the absolute value of x is written |x| or sometimes abs(x). Calculating absolute values in R is pretty easy, since R provides the
abs()
function that you can use for this purpose. When you feed it a positive number…
abs( 21 )
## [1] 21
the absolute value function does nothing to it at all. But when you feed it a negative number, it spits out the positive version of the same number, like this:
abs( -13 )
## [1] 13
In all honesty, there’s nothing that the absolute value function does that you couldn’t do just by looking at the number and erasing the minus sign if there is one. However, there are a few places later in the book where we have to use absolute values, so I thought it might be a good idea to explain the meaning of the term early on.
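To give a rough sense of where absolute values tend to show up, here’s a small sketch in which we care about the size of a difference but not its direction. The two scores and all the variable names are invented purely for this example.

```r
# Hypothetical scores, made up for this illustration
score_a <- 3
score_b <- 10

# The raw difference depends on which score comes first...
diff_ab <- score_a - score_b    # -7
diff_ba <- score_b - score_a    #  7

# ...but the absolute difference is the same either way
gap_ab <- abs( diff_ab )
gap_ba <- abs( diff_ba )

print( gap_ab )
print( gap_ba )
```

Either way around, the gap between the two scores is 7.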
Before moving on, it’s worth noting that – in the same way that R allows us to put multiple operations together into a longer command, like
1 + 2*4
for instance – it also lets us put functions together and even combine functions with operators if we so desire. For example, the following is a perfectly legitimate command:
sqrt( 1 + abs(-8) )
## [1] 3
When R executes this command, it starts out by calculating the value of
abs(-8)
, which produces an intermediate value of
8
. Having done so, the command simplifies to
sqrt( 1 + 8 )
. To evaluate the square root, it first needs to add
1 + 8
to get
9
, at which point it evaluates
sqrt(9)
, and so it finally outputs a value of
3
.
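If it helps, you can mimic what R is doing by breaking the command into separate steps and storing each intermediate value as a variable. This is only a sketch of the order of evaluation: the variable names (inner, total, result) are my own, and R doesn’t literally create these variables behind the scenes.

```r
# Step 1: evaluate the innermost function call
inner <- abs( -8 )        # 8

# Step 2: evaluate the addition inside the outer parentheses
total <- 1 + inner        # 9

# Step 3: evaluate the outer function call
result <- sqrt( total )   # 3
print( result )

# The one-line version gives the same answer
print( sqrt( 1 + abs(-8) ) )
```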
Function Arguments, Their Names and Their Defaults
There are two more fairly important things that you need to understand about how functions work in R: the use of “named” arguments, and “default values” for arguments. That’s not to say that this is the last we’ll hear about how functions work, but they are the last things we desperately need to discuss in order to get you started. To understand what these two concepts are all about, I’ll introduce another function. The
round()
function can be used to round some value to the nearest whole number. For example, I could type this:
round( 3.1415 )
## [1] 3
Pretty straightforward, really. However, suppose I only wanted to round it to two decimal places: that is, I want to get
3.14
as the output. The
round()
function supports this, by allowing you to input a second argument to the function that specifies the number of decimal places that you want to round the number to. In other words, I could do this:
round( 3.1415, 2 )
## [1] 3.14
What’s happening here is that I’ve specified
two
arguments: the first argument is the number that needs to be rounded (i.e.,
3.1415
), the second argument is the number of decimal places that it should be rounded to (i.e.,
2
), and the two arguments are separated by a comma. In this simple example, it’s quite easy to remember which argument comes first and which one comes second, but for more complicated functions this is not easy. Fortunately, most R functions make use of
argument names
. For the
round()
function, for example, the number that needs to be rounded is specified using the
x
argument, and the number of decimal places that you want it rounded to is specified using the
digits
argument. Because we have these names available to us, we can specify the arguments to the function by name. We do so like this:
round( x = 3.1415, digits = 2 )
## [1] 3.14
Notice that this is kind of similar in spirit to variable assignment (discussed earlier), except that I used
=
here, rather than
<-
. In both cases we’re specifying specific values to be associated with a label. However, there are some differences between what I was doing earlier on when creating variables, and what I’m doing here when specifying arguments, and so as a consequence it’s important that you use
=
in this context.
As you can see, specifying the arguments by name involves a lot more typing, but it’s also a lot easier to read. Because of this, the commands in this book will usually specify arguments by name, since that makes it clearer to you what I’m doing. However, one important thing to note is that when specifying the arguments using their names, it doesn’t matter what order you type them in. But if you don’t use the argument names, then you have to input the arguments in the correct order. In other words, these three commands all produce the same output…
round( 3.1415, 2 )
## [1] 3.14
round( x = 3.1415, digits = 2 )
## [1] 3.14
round( digits = 2, x = 3.1415 )
## [1] 3.14
but this one does not…
round( 2, 3.1415 )
## [1] 2
How do you find out what the correct order is? There are a few different ways, but the easiest one is to look at the help documentation for the function (see Section 4.12). However, if you’re ever unsure, it’s probably best to actually type in the argument name.
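As one possible shortcut, base R also has an args() function that shows you a function’s arguments in order, along with any default values. The sketch below is just an illustration and not a substitute for reading the help page. The exact text that args() prints can differ slightly between versions of R, so I haven’t shown the output here, and the variable name arg_names is my own.

```r
# Display the arguments of round(), in order, with any defaults
args( round )

# Pull out just the argument names as a plain list of labels
arg_names <- names( formals( args( round ) ) )
print( arg_names )

# Opening the full help page would be done like this (not run here):
# help( "round" )
```

The first two names in the list should be x and digits, which is the order you need to follow if you leave the names out.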
Okay, so that’s the first thing I said you’d need to know: argument names. The second thing you need to know about is default values. Notice that the first time I called the
round()
function I didn’t actually specify the
digits
argument at all, and yet R somehow knew that this meant it should round to the nearest whole number. How did that happen? The answer is that the
digits
argument has a
default value
of
0
, meaning that if you decide not to specify a value for
digits
then R will act as if you had typed
digits = 0
. This is quite handy: the vast majority of the time when you want to round a number you want to round it to the nearest whole number, and it would be pretty annoying to have to specify the
digits
argument every single time. On the other hand, sometimes you actually do want to round to something other than the nearest whole number, and it would be even more annoying if R didn’t allow this! Thus, by having
digits = 0
as the default value, we get the best of both worlds.
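Here’s a brief sketch that puts the default value to the test. The variable names (implicit, explicit, overridden) are made up for this example. The idea is simply that leaving out digits and typing digits = 0 should give the same answer, while supplying a different value overrides the default.

```r
# Leaving out digits entirely, so R falls back on the default
implicit <- round( 3.1415 )

# Spelling out the default value by hand
explicit <- round( 3.1415, digits = 0 )

# These two should be exactly the same
print( identical( implicit, explicit ) )

# Supplying a different value overrides the default
overridden <- round( 3.1415, digits = 1 )
print( overridden )
```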