8.3: The Concept of Sampling


    Suppose that we have identified a population for a research study. As discussed in the previous section, if we could contact everyone in the population, and if we could observe all our measurements of interest on each individual or item in the population, then we would definitively know the characteristic of interest, or parameter, for the population. The process of observing an entire population is called a census.

    Definition: Census

    Observing every individual or item in a population is called a census.

    In the United States a type of census is conducted every ten years. Article I, Section 2, Clause 3 of the United States Constitution states:

    The actual enumeration shall be made within three Years after the first Meeting of the Congress of the United States, and within every subsequent Term of ten Years, in such Manner as they shall by Law direct.

    The census is required every ten years in the United States for the express purpose of apportioning representation in the House of Representatives, though the demographic data subsequently gathered from the census is also used for policymaking and social research. The key words in this clause with respect to the census are actual enumeration, which means that everyone should be counted in the census. This is a monumental task. According to the U.S. Census Bureau, in 2010 there were 635,000 individuals employed to obtain the census data at a cost of about thirteen billion dollars (see census.gov for more facts about the U.S. Census).

    On a much smaller scale, a census is conducted anytime a course instructor reports the average grade on a class exam. The population consists of all the students in the course, where the measurement of interest for each student is the grade on the exam. The parameter is the average of all the scores on the exam. An example of this type of hypothetical census is discussed earlier where the population parameters for two small classes are known because all the student grades have been observed.

    As an example, consider a major concern in education. Many studies have demonstrated that African American students have lower grade point averages than their white counterparts after the first year of college (Mannan et al. 1986; Spenner et al. 2004; Jay and D’Augelli 1991; Paige and Witty 2010; Condron et al. 2013; Lorah and Ndum 2013; Miksic 2014). This is known as the achievement gap. This problem is important as success during the first year of college is associated with measures of long-term success such as graduation rates (Gershenfeld et al. 2016; Oguntunde et al. 2018; Olneck 2005).

    This type of study is quite frequently achieved using a full census of the populations involved. Many universities and colleges provide real-time assessment of grade point averages so that the achievement gap between different racial and ethnic groups can be monitored. These studies are used to shape policy, assess programs, and design intervention methods to aid in student achievement. A census is practical in these cases because the university has easy electronic access to all the records of the students, and a statistician or data scientist can produce the relevant average of the grade point averages based on the complete census of all the students. However, the results of any study would only apply to the students at that college or university.

    Now consider a researcher attempting to do the same type of study on a much wider population. For example, suppose that we are interested in comparing the achievement gap between community colleges and universities at the end of the first year. The very thought of attempting to contact everyone and, moreover, obtaining sensitive information like grade point averages may seem overwhelming, and indeed it is. Not only would the process be time-consuming and expensive, but there would also be serious legal issues associated with attempting to get information on grades and race, unless the data were coded in some way. So how do researchers perform studies like this in a way that the results provide more universal conclusions?

    Let us examine a study of overall college performance, which includes observations on student demographics, pre-college academic and racial experiences, and in-college attitudinal and behavioral characteristics (Nettles et al. 1986). The conclusions of the study are based on a survey of 4,094 students and 706 faculty from 30 colleges and universities in the southern and eastern regions of the United States. The first thing we notice is that the researchers did not observe the characteristics of all the students and faculty at the colleges and universities. In fact, they only observed the characteristics of a relatively small proportion of these students and faculty. This is called a sample from the population.

    Definition: Sample

    A sample is any observed part of a population.

    Using a representative sample is how the researchers in this study were able to study these effects without getting data from every student from 30 colleges and universities in the southern and eastern regions of the United States. The researchers only observed a small part of the population.

    The concept of sampling is quite old (Stephan 1948). One can imagine that any chef, upon preparing a meal for guests, will try some of what was prepared to ensure quality. Many avid readers will scan a few pages of a book to see if they think they will enjoy reading it. These are both examples of sampling. While a census is mandated by the Constitution in the United States, sampling was often used to estimate populations in other countries. The first census in Great Britain occurred in 1801, and prior to that time the population was estimated using sampling techniques. Sampling continued to be used throughout the 19th century for agricultural crop and livestock estimates, economic statistics, social surveys, and health surveys. The formal mathematical development of statistical inference in the early part of the 20th century also formalized methods for sampling. A few of these methods will be highlighted in this chapter.

    Given that not all the individuals in the population were observed, how can the researchers conclude that the trends observed in the data can be generalized to the entire population? The short answer is that they can never be completely sure the data they observe in a sample from a population reflects what is going on in the entire population. How accurately the conclusions of the study describe the population depends on how well the sample reflects the overall characteristics of the population. If the characteristics of a sample are a good reflection of the characteristics of a population, then such a sample is said to be representative.

    Definition: Representative Sample

    A sample from a population is representative if the characteristics of the sample are like those of the population.

    Suppose that a friend brings you a puzzle composed of 50 pieces with a blank box. That is, you have no idea what picture the puzzle would make if you put all the pieces together. Your friend will let you look at ten pieces, from which you will try to describe the picture that the puzzle represents. Your friend hands you the ten pieces given in Figure \(\PageIndex{1}\). What do you guess? You can only use the information from the pieces you observe. They are all shades of blue or white, and a couple of the pieces show what appear to be clouds. At this point your best guess may be, since there is no other information available, that the picture is of a blue sky with clouds.

    A blue and white sky

    Figure \(\PageIndex{1}\): Ten pieces from the unknown puzzle picture (Public domain image created by Alan M. Polansky).

    Now suppose your friend had given you the ten pieces shown in Figure 8.2 instead. Now there is some variety in the pieces that you observe. It may be difficult to see, but two pieces show bare tree branches with a sky background, one piece shows trees with dense leaves at a distance, and two others show water ripples. Now we could guess that the picture is of a lake or pond with dense trees in the background and a bare tree closer in the foreground. Since the trees in the background have lush dark green leaves, we could guess that the tree in the foreground is, perhaps, dead.

    A tree with a blue sky

    Figure \(\PageIndex{2}\): Ten other pieces from the unknown puzzle picture (Image created by the author).

    The full picture of the puzzle is given in Figure 8.3. This is a dead tree in the middle of Shabbona Lake in northern Illinois, approximately seventy miles west of Chicago. The lake is artificial, created in 1975 by damming a tributary of the Fox River, and its name derives from the Potawatomi leader Shabbona. The tree was on the land when the land was flooded by the dam, and the preserving nature of the lake water has kept the tree standing for over 45 years. At this point you might rightly say that the pieces selected in Figure 8.1 were not selected fairly, as they were all taken from an area of the picture with just sky and no trees or water. This is what happens when a researcher chooses a non-representative sample from a population. You get a distorted picture of what the population looks like, and the conclusions that you make can be way off. Hence the puzzle pieces selected in Figure 8.1 are not a representative sample of the corresponding population.

    A tree in the water

    Figure \(\PageIndex{3}\): The actual picture that the puzzle pieces were taken from (photograph by author).

    The pieces selected in Figure 8.2 give us much more accurate information about what the picture looks like. Note that we cannot reconstruct the entire picture from those pieces. For example, none of the pieces we observe show the reflection of the dead tree in the water, so we would not be able to guess that the dead tree is close enough to the lake to have its reflection in the water. Hence, we can still be fooled, but nonetheless we were able to obtain quite a bit of information. One can even notice that only one piece shows the leafy trees in the distance, reflecting the fact that trees take up a small portion of the entire picture, whereas most of the observed pieces show either water or sky, which take up most of the picture. This is an example of a sample that could be representative.

    Now let us consider a slightly more practical example of sampling. Suppose that a professor is giving a final exam in a large lecture class of 250 students. As the students turn in the exam, the papers are stacked in the order in which they are turned in. That is, the students who turned the exam in first are at the bottom of the stack and the students who turned the exam in last are at the top of the stack. The department chair catches the professor in their office right after the exam and is anxious to know how the students did on the exam. The professor tells the chair that there are 250 exams, and it will take some time to grade them, but that they can take a sample of ten exams, grade them in an hour, and report to the chair how those ten students did on the exam. Now the professor has an important choice to make. Let us suppose that they want to give the chair a clear and fair indication of how the class did. Otherwise, reporting results that are too good or too bad will give a biased version of what the chair will find out later when all the exams have been graded. How should the professor choose the exams?

    One obvious choice is to take ten exams from the top of the stack and grade them. But is this a good idea? Specifically, will this method of choosing the exams provide a fair representation of how the class did as a whole? That is, would this sample be representative? The main question here is whether the order in which the exams were turned in is associated with the students’ performance on the exam. It may be argued that students who are less familiar with the material may take longer to complete the exam, while it could also be the case that some students who perform better may be more careful, may have more to write, and will take longer to complete the exam. The key here is that the professor really doesn't know if these two things are related, and therefore they need to be careful. If there is a relationship, any choice made using turn-in order may give a biased idea as to how the class did.

    To see how this could happen, consider the exam scores given in Table 8.1. This is a contrived example of a class of twenty-five students who took an exam with exam scores given in the order that they were turned in. The scores were simulated on a computer in such a way that the exam scores tend to be higher for exams that are turned in later. Suppose that the professor graded the first five exams that were turned in. The average score on the first five exams is equal to

    \[\frac{1}{5}\times(65+76+74+58+68)=68.2,\]

    which indicates an average score in the D range. Now suppose that we graded the last five exams that were turned in. The average score on the last five exams is equal to

    \[\frac{1}{5}\times(96+81+97+80+97)=90.2,\]

    which indicates an average score in the A- range. But neither of these samples provides a good indication of how the class did as a group, as neither sample is representative. If the average is computed on all the grades, then we get 78.0.

     

    Table 8.1. Twenty-five exam scores simulated in a way that the exam scores tend to increase with the time it took for the student to complete the exam.

    Order   Score
    1       65
    2       76
    3       74
    4       58
    5       68
    6       59
    7       74
    8       71
    9       67
    10      74
    11      89
    12      74
    13      75
    14      75
    15      74
    16      79
    17      88
    18      87
    19      82
    20      91
    21      96
    22      81
    23      97
    24      80
    25      97
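
    The three averages discussed above can be checked directly from Table 8.1. The following is a minimal sketch in Python; the variable and function names are chosen here for illustration and are not part of the original text.

```python
# Exam scores from Table 8.1, listed in the order the exams were turned in.
scores = [65, 76, 74, 58, 68, 59, 74, 71, 67, 74,
          89, 74, 75, 75, 74, 79, 88, 87, 82, 91,
          96, 81, 97, 80, 97]


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


first_five = mean(scores[:5])   # the first five exams turned in
last_five = mean(scores[-5:])   # the last five exams turned in
census = mean(scores)           # all twenty-five exams (a census of the class)

print(f"First five exams: {first_five:.1f}")   # 68.2
print(f"Last five exams:  {last_five:.1f}")    # 90.2
print(f"All exams:        {census:.2f}")       # 78.04
```

    Both five-exam samples miss the class average by roughly ten points or more, in opposite directions, because turn-in order is related to the score in this simulated data.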

    Of course, the professor could attempt to construct a representative sample in a very contrived way, so that the average comes out near the true average. But this would require the professor to know all the exam scores in advance, which makes such an approach impractical. In essence, the professor needs a way to pick the exams to grade using a method that is not related in any way to the trend in the exam scores. The answer that statisticians and data scientists have used for the last century is to use a random method of picking items for a sample. If the random mechanism is chosen carefully, such a method can be an effective and relatively simple method for choosing items to observe in a sample.
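
    As a rough illustration of random selection, the sketch below draws five exams at random from the scores in Table 8.1 using Python's standard `random` module. The seed value, the sample size of five, and the number of repetitions are assumptions made here for illustration; the text does not prescribe a particular random mechanism.

```python
import random

# Exam scores from Table 8.1, listed in the order the exams were turned in.
scores = [65, 76, 74, 58, 68, 59, 74, 71, 67, 74,
          89, 74, 75, 75, 74, 79, 88, 87, 82, 91,
          96, 81, 97, 80, 97]


def sample_mean(values, n, rng):
    """Mean of n exams drawn at random, without replacement."""
    chosen = rng.sample(values, n)  # selection ignores turn-in order
    return sum(chosen) / n


rng = random.Random(2024)  # arbitrary seed, used only to make the run repeatable

# One random sample of five exams, as the professor might draw it.
one_sample = rng.sample(scores, 5)
print("One random sample:", one_sample)
print("Its average:", sum(one_sample) / len(one_sample))

# Repeat the random draw many times to see how the method behaves on average.
sample_means = [sample_mean(scores, 5, rng) for _ in range(10_000)]
average_of_means = sum(sample_means) / len(sample_means)
print(f"Average of 10,000 sample means: {average_of_means:.2f}")
print(f"Average of all 25 scores:       {sum(scores) / len(scores):.2f}")
```

    Any single random sample can still land above or below the class average, but across many repetitions the sample averages center on the average of all 25 scores. Samples taken from one end of the stack do not have this property.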


    This page titled 8.3: The Concept of Sampling is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by .
