Statistics LibreTexts

5: Bringing Home the Data

    In this chapter, we start to get very practical on the matter of tracking down good data in the wild and bringing it home. This is actually a very large and important subject – there are entire courses and books on Experimental Design, Survey Methodology, and Research Methods specialized for a range of particular disciplines (medicine, psychology, sociology, criminology, manufacturing reliability, etc.) – so in this book we will only give a broad introduction to some of the basic issues and approaches.

    The first component of this introduction gives several of the important definitions for experimental design in the simplest, most direct context: collecting sample data in an attempt to understand a single number about an entire population. As we have mentioned before, a population is usually too large, or simply inaccessible, so to determine an important feature of a population of interest, a researcher must use the accessible, affordable data of a sample. If this approach is to work, the sample must be chosen carefully, so as to avoid the dreaded bias. The basic structure of such studies, the meaning of bias, and some of the methods for selecting bias-minimizing samples are the subject of the first section of this chapter.
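
    As a rough illustration of why the choice of sample matters, the following Python sketch compares two ways of estimating a population mean. Everything in it is an assumption made for illustration: the population is a made-up list of the numbers 1 to 1000, the function names `simple_random_sample` and `convenience_sample` are hypothetical, and the simple random sample is just one possible bias-minimizing method.

```python
import random
from statistics import mean

# Hypothetical population: the values 1..1000 stand in for some measurement
# on every individual. In a real study this full list would not be available.
population = list(range(1, 1001))
true_mean = mean(population)  # the single number we would like to know


def simple_random_sample(values, n, seed=None):
    """Draw n values without replacement, every subset of size n equally likely."""
    rng = random.Random(seed)
    return rng.sample(values, n)


def convenience_sample(values, n):
    """Take the first n values: easy to collect, but not representative here."""
    return values[:n]


srs = simple_random_sample(population, 100, seed=0)
easy = convenience_sample(population, 100)

print("population mean:         ", true_mean)
print("simple random sample mean:", mean(srs))
print("convenience sample mean:  ", mean(easy))
```

    In this toy setup the convenience sample contains only the smallest values, so its mean falls far below the population mean, while the randomly chosen sample will typically land much closer. The seed is fixed only so that the sketch is reproducible.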

    It is more complicated to collect data which will give evidence for a causal relationship between two variables under study. But we are often interested in such relationships – which drug is a more effective treatment for some illness, what advertisement will induce more people to buy a particular product, or what public policy leads to the strongest economy. In order to investigate causal relationships, it is necessary not merely to observe, but to do an actual experiment; for causal questions about human subjects, the gold standard is a randomized, placebo-controlled, double-blind experiment, sometimes called simply a randomized, controlled trial (RCT), which we describe in the second section.
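
    The "randomized" part of such an experiment can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the subject labels are invented, the helper name `randomly_assign` is hypothetical, and a real trial would also need the placebo and blinding procedures, which code like this does not capture.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups.

    Chance alone decides who lands in which group, so the experimenter's
    preferences cannot influence the assignment.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels, invented for this example.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(subjects, seed=1)

print("treatment group:", sorted(treatment))
print("control group:  ", sorted(control))
```

    Every subject ends up in exactly one of the two groups, and with an even number of subjects the groups are the same size.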

    There is something in the randomized, controlled experiment which makes many people nervous: those in the control group are not getting what the experimenter likely thinks is the best treatment. So, even though society as a whole may benefit from the knowledge we get through RCTs, it almost seems as if some test subjects are being mistreated. While the scientific research community has come to terms with this apparent injustice, there are definitely experiments which could go too far and cross important ethical lines. In fact, history has shown that a number of experiments have actually been done which we now consider to be clearly unethical. It is therefore important to state clearly some ethical guidelines which future investigations can follow in order to avoid mistreating test subjects. One particular set of such guidelines for ethical experimentation on human subjects is the topic of the third and last section of this chapter.


    This page titled 5: Bringing Home the Data is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Jonathan A. Poritz via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
