Suppose you’ve become involved in a clinical trial in which you are testing a new antidepressant drug called Joyzepam. In order to construct a fair test of the drug’s effectiveness, the study involves three separate drugs to be administered. One is a placebo, and the other is an existing antidepressant / anti-anxiety drug called Anxifree. A collection of 18 participants with moderate to severe depression are recruited for your initial testing. Because the drugs are sometimes administered in conjunction with psychological therapy, your study includes 9 people undergoing cognitive behavioural therapy (CBT) and 9 who are not. Participants are randomly assigned (doubly blinded, of course) a treatment, such that there are 3 CBT people and 3 no-therapy people assigned to each of the 3 drugs. A psychologist assesses the mood of each person after a 3 month run with each drug: and the overall improvement in each person’s mood is assessed on a scale ranging from −5 to +5.
With that as the study design, let’s now look at what we’ve got in the data file:
So we have a single data frame called clin.trial, containing three variables; drug, therapy and mood.gain. Next, let’s print the data frame to get a sense of what the data actually look like.
For the purposes of this chapter, what we’re really interested in is the effect of drug on mood.gain. The first thing to do is calculate some descriptive statistics and draw some graphs. In Chapter 5 we discussed a variety of different functions that can be used for this purpose. For instance, we can use the xtabs() function to see how many people we have in each group:
xtabs( ~drug, clin.trial )
## drug
## placebo anxifree joyzepam
## 6 6 6
Similarly, we can use the aggregate() function to calculate means and standard deviations for the mood.gain variable broken down by which drug was administered:
Finally, we can use plotmeans() from the gplots package to produce a pretty picture.
library(gplots)
plotmeans( formula = mood.gain ~ drug, # plot mood.gain by drug
data = clin.trial, # the data frame
xlab = "Drug Administered", # x-axis label
ylab = "Mood Gain", # y-axis label
n.label = FALSE # don't display sample size
)
The results are shown in Figure 14.1, which plots the average mood gain for all three conditions; error bars show 95% confidence intervals. As the plot makes clear, there is a larger improvement in mood for participants in the Joyzepam group than for either the Anxifree group or the placebo group. The Anxifree group shows a larger mood gain than the control group, but the difference isn’t as large.
The question that we want to answer is: are these difference “real”, or are they just due to chance?
## Warning: package 'gplots' was built under R version 3.5.2
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess