Statistics LibreTexts

21.3: R Functions

    There is a plethora of functions available in R. The best way to become familiar with them is to use them. Also, because you will always need new functions, it is very important to be comfortable with the R help files for those functions.

    Basic Functions

    The following are basic functions in R. You will use these frequently.

    Function Description
    source() Run an external R script
    head() Show the first elements of the vector (six by default)
    tail() Show the last elements of the vector (six by default)
    length() Calculate the length of the vector
    seq() Create a sequence of values
    summary() Print the six-number summary (minimum, first quartile, median, mean, third quartile, maximum)
    mean() Calculate the arithmetic mean
    median() Calculate the median
    sum() Calculate the sum of the elements in the vector
    sd() Calculate the sample standard deviation
    var() Calculate the sample variance

    Please know what each does. If necessary, use the help file for the given function. The more you familiarize yourself with the help files, the more they will tell you.
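    As a quick illustration of the table above, here is a minimal sketch that applies several of these functions to a small vector. The vector x and the other variable names are made up for this example.

```r
# A made-up vector: the even numbers from 2 to 40
x <- seq(from = 2, to = 40, by = 2)

n     <- length(x)   # 20
first <- head(x)     # first six values: 2 4 6 8 10 12
last  <- tail(x)     # last six values: 30 32 34 36 38 40
total <- sum(x)      # 420

center <- c(mean = mean(x), median = median(x))   # both are 21
spread <- c(sd = sd(x), var = var(x))             # var is 140; sd is its square root

print(summary(x))    # the six-number summary

# To read the help file for a function, type ?sd or help("sd") at the prompt.
```

    Note that sd() and var() use the sample (n - 1) denominator, which the help files confirm.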

    Matrix Functions

    R can also do arithmetic on matrices. Since the internal computer calculations are all based on matrices, it is important to be familiar with matrix operations so that you know what R is doing and can check its output.

    Element-wise Functions

    + usual matrix addition
    * Hadamard product
    ^ element-wise exponentiation

    Additional Matrix Functions

    %*% usual matrix product
    t() transpose
    solve() matrix inverse (also solves a linear system)
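    The difference between the element-wise operators and the matrix functions is easiest to see on a small example. The following is only a sketch; the matrices A and B are made up for illustration.

```r
# matrix() fills column by column
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # rows: (2, 1) and (1, 3)
B <- matrix(c(1, 0, 2, 1), nrow = 2)   # rows: (1, 2) and (0, 1)

added    <- A + B      # element-wise sum
hadamard <- A * B      # element-wise product: rows (2, 2) and (0, 3)
squared  <- A ^ 2      # each element squared, NOT A %*% A

product  <- A %*% B    # usual matrix product: rows (2, 5) and (1, 5)
Bt       <- t(B)       # transpose
Ainv     <- solve(A)   # inverse of A

check <- A %*% Ainv    # the identity matrix, up to rounding error
```

    Comparing hadamard with product is a quick way to confirm which operation you actually asked for.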

    Probability Functions

    At the core of statistics is probability and probability distributions. These will be important in helping you better understand the effects of randomness on the estimates... and the effects of violating procedure requirements on those estimates.

    Generic Functions

    set.seed() set the specified random number seed
    sample() randomly sample from a vector. This can also be used to sample from a discrete Uniform distribution
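    A minimal sketch of these two functions follows. The seed value 370 and the variable names are arbitrary choices for this example. Resetting the same seed reproduces the same "random" draws, which is what makes a simulation repeatable.

```r
# Ten rolls of a fair die: a discrete Uniform on 1, 2, ..., 6
set.seed(370)
rolls <- sample(1:6, size = 10, replace = TRUE)

# Resetting the seed reproduces exactly the same rolls
set.seed(370)
rolls_again <- sample(1:6, size = 10, replace = TRUE)

same <- identical(rolls, rolls_again)   # TRUE
```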

    Probability Function Naming Logic

    Each named probability function in R can be parsed into two parts, the stem and the prefix. The stem specifies the probability distribution, whereas the prefix specifies what aspect of the distribution you wish to access.

    This is an exhaustive list of the standard prefixes:

    d specifies the likelihood value. If the distribution is discrete, then this will also be the probability, otherwise it is the density.
    p specifies the cumulative probability, P[X ≤ x]. This prefix will rarely be available for multivariate functions.
    q specifies the quantile, the value of x for which P[X ≤ x] equals the specified probability. This prefix will rarely be available for multivariate functions.
    r specifies a random value. In simulation, this is the most important prefix, as it produces a random sample of a given size from the specified distribution.
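    To make the four prefixes concrete, here is a short sketch that attaches each of them to the norm stem, using the default standard Normal. The variable names and the seed are arbitrary.

```r
dens <- dnorm(0)        # density at x = 0, which is 1/sqrt(2*pi)
prob <- pnorm(1.96)     # P[X <= 1.96], approximately 0.975
quan <- qnorm(0.975)    # the x with P[X <= x] = 0.975, approximately 1.96

set.seed(1)
draws <- rnorm(5)       # a random sample of size 5

# The p and q prefixes undo each other
back <- qnorm(pnorm(1.96))   # 1.96 again, up to rounding error
```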

    The second part is the stem. This specifies the distribution involved. Here is a list of some of the more interesting stems available:

    binom Binomial distribution. One needs to specify the number of trials, size, and the success probability, prob.
    cauchy Cauchy distribution. Optionally, one can specify the location and the scale. The default is the standard Cauchy with location of 0 and scale of 1.
    chisq Chi-squared distribution. One needs to specify the number of degrees of freedom, df.
    exp Exponential distribution. One can specify the rate, λ, through rate. The default is a rate of 1.
    f Snedecor's F distribution. One needs to specify both degrees of freedom, df1 (numerator) and df2 (denominator), in that order.
    gamma Gamma distribution. One must specify the shape parameter, α, since there is no default. The rate parameter has a default value of β = 1.
    norm Normal (Gaussian) distribution. Optionally, one can specify the mean, mean, and the standard deviation, sd, of the Normal. This defaults to a standard Normal distribution with mean of 0 and sd of 1.
    pois Poisson distribution. One needs to specify the expected value, lambda, λ.
    t Student's t distribution. One needs to also specify the number of degrees of freedom, df.
    unif Continuous Uniform distribution. Optionally, one can specify the minimum and maximum value. The default is the standard Uniform with min of 0 and max of 1.
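    Combining a prefix with a stem, and supplying the parameters listed above, gives calls like the following. This is only a sketch; the parameter values are made up, and each result is compared against a closed form that can be checked by hand.

```r
# Binomial: P[X = 3] with 10 trials and success probability 0.5
b <- dbinom(3, size = 10, prob = 0.5)      # choose(10, 3) / 2^10

# Exponential: P[X <= 1] with rate 2
e <- pexp(1, rate = 2)                     # 1 - exp(-2)

# Poisson: P[X = 0] with expected value 3
p <- dpois(0, lambda = 3)                  # exp(-3)

# Continuous Uniform on [0, 4]: the first quartile
u <- qunif(0.25, min = 0, max = 4)         # 1

# Student's t and Cauchy are symmetric about 0
s <- pt(0, df = 5)                         # 0.5
k <- pcauchy(0)                            # 0.5, using the default location and scale

# Gamma with shape 1 (and the default rate of 1) is the Exponential with rate 1
g <- pgamma(1, shape = 1)                  # same as pexp(1, rate = 1)
```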

    Testing Functions

    Because R is a statistical program, it is able to perform all of the basic statistical tests and procedures. These are the related functions.

    aov Analysis of Variance procedure
    lm OLS regression
    summary.aov ANOVA table of a linear model
    summary.lm Regression table of a linear model
    residuals calculated residuals from a model
    confint confidence interval for estimated parameters
    predict estimation and prediction of a value
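    The functions above usually appear together. The following sketch uses two data sets that ship with R (cars and PlantGrowth). The model formulas and the new speed value of 15 are chosen only for illustration.

```r
# OLS regression of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)

reg_table <- summary(fit)                 # calls summary.lm: the regression table
ci        <- confint(fit, level = 0.95)   # confidence intervals for the coefficients
res       <- residuals(fit)               # one residual per observation

# Point prediction and prediction interval at a new speed
pred <- predict(fit, newdata = data.frame(speed = 15), interval = "prediction")

# One-way ANOVA of plant weight across treatment groups
fit_aov   <- aov(weight ~ group, data = PlantGrowth)
aov_table <- summary(fit_aov)             # calls summary.aov: the ANOVA table

print(reg_table)
print(aov_table)
```

    Calling summary() is enough in both cases; R dispatches to summary.lm or summary.aov according to the class of the fitted object.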

    Non-Base Testing Functions

    Function Package Description
    runs.test KnoxStats the runs test
    hetero.test KnoxStats univariate test of heteroskedasticity
    fligner.test stats test of heteroskedasticity across groups
    bptest lmtest Breusch-Pagan test of heteroskedasticity
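    Of the functions in this list, fligner.test needs no extra installation, so it makes a convenient sketch. The example below assumes the built-in PlantGrowth data; the other functions require their packages to be installed and loaded first and are not shown.

```r
# Fligner-Killeen test: are the variances of weight equal across the groups?
ft <- fligner.test(weight ~ group, data = PlantGrowth)

print(ft)
pval <- ft$p.value   # a small p-value suggests unequal variances
```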

    Control Functions

    for for-loops
    if if-then statements
    ifelse ternary operator (vectorized)
    numeric creates a numeric vector of zeros of the given length in memory
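    These control functions are the backbone of a simulation. Here is a minimal sketch, with made-up settings (1000 replications, samples of size 20 from an Exponential with rate 1), that uses all four.

```r
set.seed(42)
nsim  <- 1000
means <- numeric(nsim)          # set aside a vector of 1000 zeros

# for: repeat the experiment nsim times, storing one sample mean each time
for (i in 1:nsim) {
  means[i] <- mean(rexp(20, rate = 1))
}

# if: a single TRUE/FALSE decision
if (abs(mean(means) - 1) < 0.1) {
  verdict <- "close to the true mean of 1"
} else {
  verdict <- "far from the true mean of 1"
}

# ifelse: one decision per element of a vector
side <- ifelse(means > 1, "above", "below")
print(table(side))
```

    Creating means with numeric() before the loop avoids growing the vector one element at a time.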

    Graphical Functions

    R is a full-fledged graphical system. In fact, this is what set R apart from its competitors (and still does!). Every pixel of a graphic can be modified in R. This book relies on the basic R graphic engine. There are two other graphical engines (metaphors): grid and ggplot2. Base-R graphics will always serve you. With that being said, ggplot2 is the modern graphics engine for R. It is built on grid and makes some graphics much easier to create.

    qqnorm Q-Q plot for a Normal target
    qqline adds a reference line (through the quartiles) to a Normal Q-Q plot
    barplot bar chart
    boxplot box-and-whiskers plot
    hist histogram
    histogram histogram*
    overlay histogram with a distribution overlaying it*
    plot plots a basic graphic
    par sets the parameters for the future graphic
    plot.new starts a new graphic
    plot.window sets the coordinate system (axis limits) of the plotting region
    axis creates axes
    title adds a main title and axis labels to the graphic
    points plots points on the graphic
    lines plots lines on the graphic

    * Those graphical functions marked with an asterisk are found in the KnoxStats package.
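    The base graphics functions in the list can be sketched in two ways: high-level calls that draw a complete graphic, and low-level calls that build one piece by piece. The data and labels below are made up, and only functions from base R are used.

```r
set.seed(7)
x <- rnorm(100, mean = 50, sd = 10)

# A null device lets the sketch run without a screen or an output file.
# In an interactive session, drop this line and the dev.off() at the end.
pdf(NULL)

# High-level: two complete graphics side by side
par(mfrow = c(1, 2))
h <- hist(x, main = "Histogram", xlab = "x")
qqnorm(x)
qqline(x)

# Low-level: build a graphic from its pieces
par(mfrow = c(1, 1))
u <- 0:10
plot.new()
plot.window(xlim = c(0, 10), ylim = c(0, 100))
axis(1)                                   # horizontal axis
axis(2)                                   # vertical axis
title(main = "Built by hand", xlab = "u", ylab = "u squared")
points(u, u^2)
lines(u, u^2)

invisible(dev.off())
```

    A single call to plot(u, u^2, type = "b") produces a similar graphic; the piece-by-piece version shows where each element comes from.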


    This page titled 21.3: R Functions is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Ole Forsberg.
