3.4: Observational Studies
Many questions cannot be studied using designed experiments because the researcher does not control the conditions under which the data are observed. There are several reasons for this. Sometimes it is simply not possible to control the factors of interest. If you are interested in animal behavior prior to an earthquake, it is not possible to schedule an earthquake of a specified magnitude to occur next week so that you can observe how the animals behave beforehand. In other cases, there are moral and social prohibitions against controlling factors. If you are interested in how long-term exposure to nicotine affects the likelihood of lung cancer, it is simply not ethical to expose individuals to nicotine for the purpose of such an experiment. Similarly, if we are interested in how the poverty rate is related to the likelihood of incarceration, it is not ethical to deprive a specified population of economic advantage in the interest of the experiment.
In these cases what are known as observational studies are used instead. In observational studies the factors thought to influence the outcome, along with the outcome, are observed at the same time. Implicit in this structure is that the researcher does not have control over these factors.
An observational study is a study in which the researcher does not control or set the factors.
In observational studies the individuals in the study decide for themselves what their behavior will be, and the researchers simply observe the behavior. For example, studies on the effect of nicotine exposure through cigarette smoking are observational studies, usually conducted over large populations, in which the presence of lung cancer and the amount of exposure to cigarette smoking are observed in individuals through a health survey. Many of the studies presented thus far are observational studies. For example, if researchers are interested in how the poverty rate is related to the likelihood of incarceration, then they might look at survey data that record both the income and the past incarceration of survey participants.
The main problem with observational studies is the difficulty of attributing what is observed in the study to any single factor. This is particularly a problem when many possible factors may influence the observations, and the problem may be compounded when the factors themselves are related to or associated with one another. This problem is known as confounding.
Two or more factors are confounded when the effect of one of the factors cannot be distinguished from the effects of the other factors on the outcome.
In the case of attempting to link lung cancer to cigarette smoking, observational studies alone could only suggest cigarette smoking as a possible cause. For example, one could argue that there is a genetic condition that is responsible for an increased likelihood of contracting lung cancer, and that this same genetic condition is also linked to nicotine addiction. Under this argument, it would be the genetic effect that causes both the lung cancer and the addiction to cigarette smoking. Data from an observational study would nevertheless clearly indicate that nicotine exposure and the incidence of lung cancer are related, and concluding that one causes the other would be tempting.
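The genetic argument above can be made concrete with a small calculation. The following is a minimal sketch in Python under invented assumptions: the probabilities are hypothetical, chosen only for illustration, and the names `P_GENE`, `P_SMOKE_GIVEN_GENE`, `P_CANCER_GIVEN_GENE`, and `cancer_rate_among` do not come from any real study. In this made-up population the cancer risk depends only on the hypothetical gene and not on smoking at all, yet the observed cancer rate is much higher among smokers.

```python
# Hypothetical probabilities, invented for illustration only (not real data).
P_GENE = 0.3                                      # share of people with the hypothetical gene
P_SMOKE_GIVEN_GENE = {True: 0.8, False: 0.2}      # the gene makes smoking more likely
P_CANCER_GIVEN_GENE = {True: 0.20, False: 0.02}   # cancer risk depends on the gene only


def cancer_rate_among(smoker: bool) -> float:
    """Observed cancer rate among smokers (or non-smokers) in this made-up population."""
    joint_cancer = 0.0  # P(cancer and this smoking status)
    total = 0.0         # P(this smoking status)
    for gene in (True, False):
        p_gene = P_GENE if gene else 1.0 - P_GENE
        p_smoke = P_SMOKE_GIVEN_GENE[gene]
        p_status = p_smoke if smoker else 1.0 - p_smoke
        total += p_gene * p_status
        # Smoking status does not enter the cancer risk: only the gene matters here.
        joint_cancer += p_gene * p_status * P_CANCER_GIVEN_GENE[gene]
    return joint_cancer / total


rate_smokers = cancer_rate_among(True)
rate_nonsmokers = cancer_rate_among(False)
print(f"cancer rate among smokers:     {rate_smokers:.3f}")
print(f"cancer rate among non-smokers: {rate_nonsmokers:.3f}")
```

With these invented numbers the rate among smokers is about 0.134 and the rate among non-smokers is about 0.037, even though smoking has no effect in this hypothetical population. The sketch only shows why an observed association cannot by itself rule out a confounding factor; it says nothing about the real relationship between smoking and lung cancer.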
Early in the study of the link between lung cancer and cigarette smoking, even famous statisticians such as Joseph Berkson and Sir Ronald A. Fisher did not believe that there was a link (Freedman et al. 2007). Only very carefully designed studies can be used to try to separate out these types of confounding problems. Today it is well established in the scientific community that there is a link between cigarette use and the likelihood of developing lung cancer (Witschi 2001; Proctor 2012; Musk and De Klerk 2003; Doll 1998; Wynder 1997; Proctor 2004; de Groot et al. 2018).
Observational studies should be designed so that confounding is as small a problem as possible, though it must be acknowledged that confounding will always be an issue in these types of studies. One common practice is to limit the scope of the study to reduce the number of factors that could cause confounding problems. This is because any factor that differs between comparison groups has the potential to be the cause of any observed difference. For example, in a study of the effect of poverty on health, the scope of the study may be limited to individuals in a certain neighborhood, since there may be geographical issues, such as environmental factors, that could also affect health. One could well imagine that a neighborhood with a lower income level might also be closer to an industrial area where pollution may be a problem, while a comparatively more affluent neighborhood might not have that problem. If a difference in health is observed between the two areas, it cannot be determined whether the exposure to pollution or the poverty is to blame.
A major problem is that it is not always clear what potential confounding factors could be present, and it may be quite difficult to control for these factors even when they are known. As an example, consider the Coronary Drug Project, a carefully designed study to evaluate drugs and their potential for the prevention of heart attacks (Coronary Drug Project Research Group 1973). To reduce potential confounding problems the study focused on 8,341 middle-aged men with heart trouble. One potentially helpful drug included in the study was the cholesterol-reducing drug clofibrate. Over the course of the study the mortality rate of people not taking any drug (the control group) and the mortality rate of people taking clofibrate were roughly the same, so it appeared that the drug was not helpful. However, there was a problem with the data in that some individuals in the study did not take most of the drug, and hence were not fully participating. A closer look at the data revealed that those who took less of the drug had higher mortality rates than those who took more of the drug. Does this mean the drug works? Maybe not, as it is possible that those who took more of the drug lived generally healthier lifestyles than those who took less. These issues make it very difficult to determine whether the drug was effective (Coronary Drug Project Research Group 1980; Freedman et al. 2007).
Further examples of confounding problems arise when assessing treatments for substance abuse. To judge whether a treatment is effective, changes in an individual’s behavior, over and above what would have occurred without treatment, are usually used to quantify the change attributable to treatment (Pierce et al. 2017; Prochaska et al. 1993). This is usually accomplished by comparing the overall behavior of groups of individuals who had the treatment with that of groups of individuals who did not. The effectiveness of a treatment is also often quantified by comparing outcomes in periods when subjects are being treated with outcomes in periods when they are not. One potential problem is that individuals with substance problems often cycle in and out of treatment, and therefore the application of the treatment varies with time (Condron et al. 2013). This can cause a confounding issue with past substance abuse, in that the effectiveness of the current treatment cycle could very well be influenced by the frequency and type of substance abuse in the patient's history, as well as by the number and types of treatments the patient has received in the past (Pierce et al. 2017).
Turning to research in social justice, many studies point to an apparent relationship between race and domestic violence. Surveys, studies of calls to the police, and case studies of women’s shelters and emergency rooms consistently find a disproportionate share of African American offenders and victims (Stets 1991). The conclusion implied by this research carries the disturbing undertone that there is a link between race and domestic violence. This research often avoids addressing the issue of why these studies show these types of results (Benson et al. 2003, 2004; Benson and Greer 2002).
One study considered this problem with a more sophisticated analysis based on the idea that a variety of social forces destroyed the social fabric of African American urban communities, leading to a situation in which a large proportion of the African American population is trapped in socially isolated urban areas with high levels of unemployment. Such environments may feed into a culture that is more tolerant of violence and crime (Wilson 2012; Sampson and Wilson 2020). If this hypothesis is true, then an association between domestic violence and race could be explained by the fact that African Americans tend to live in areas whose social fabric has been destroyed by poverty and unemployment. In that case, race is confounded with the local economic climate.
To explore this idea, census and survey data were used to carefully compare the domestic violence rates of African Americans and whites who live in comparable economic situations (Benson et al. 2003). The data showed that the domestic violence rates were similar for both races in the same economic environment. This research supports the hypothesis that it is the economic environment, and not race, that is related to domestic violence rates.
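The logic of comparing groups within the same economic environment can be sketched with a small calculation. The counts below are hypothetical and invented purely for illustration; they are not taken from Benson et al. (2003), and the labels `group A`, `group B`, and the helpers `crude_rate` and `stratum_rate` are assumptions of this sketch. The two groups have identical rates within each type of neighborhood, but group A lives mostly in disadvantaged neighborhoods, so its overall rate looks higher.

```python
# Hypothetical counts, invented for illustration only (not from any real study).
# (group, neighborhood type) -> (households surveyed, households reporting violence)
counts = {
    ("group A", "disadvantaged"): (800, 80),
    ("group A", "advantaged"): (200, 8),
    ("group B", "disadvantaged"): (200, 20),
    ("group B", "advantaged"): (800, 32),
}


def crude_rate(group: str) -> float:
    """Overall rate for a group, ignoring the economic environment."""
    surveyed = sum(n for (g, _), (n, _) in counts.items() if g == group)
    reporting = sum(k for (g, _), (_, k) in counts.items() if g == group)
    return reporting / surveyed


def stratum_rate(group: str, neighborhood: str) -> float:
    """Rate for a group within one type of economic environment."""
    surveyed, reporting = counts[(group, neighborhood)]
    return reporting / surveyed


for group in ("group A", "group B"):
    print(f"{group}: crude rate = {crude_rate(group):.3f}")
    for neighborhood in ("disadvantaged", "advantaged"):
        print(f"  {neighborhood}: rate = {stratum_rate(group, neighborhood):.3f}")
```

In this made-up table the crude rates are 0.088 for group A and 0.052 for group B, while within each neighborhood type both groups have the same rate (0.100 in disadvantaged areas and 0.040 in advantaged areas). The apparent difference between the groups comes entirely from where they live, which is the pattern one would expect if group membership is confounded with the economic environment.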

