8.11: Collaborative Activity
This collaborative activity explores sampling from a small population. For this activity you will be supplied with a ten-sided die and a twenty-sided die. The population that you will be sampling from is represented by the entries in Table 8.5. There are 200 cells in this table, representing 200 individuals who work at a small manufacturing company. Each cell contains a designator for the type of job that the individual performs, followed by a salary coded in thousands of dollars. The job designators are M for management, E for engineering, A for assembly line worker, D for data processor, and H for human resources. For example, the entry A42 indicates an assembly line worker whose salary is $42,000 per year, rounded to the nearest thousand dollars. The data have been arranged in the table so that the first four rows are all management employees, the next four rows are all engineers, the next six rows are assembly line workers, the next four rows are data processors and analysts, and the final two rows are human resources workers.
Table 8.5 The hypothetical population to be used for the collaborative activity.
| Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | M102 | M98 | M99 | M98 | M114 | M99 | M98 | M116 | M105 | M111 |
| 2 | M106 | M127 | M100 | M111 | M117 | M111 | M117 | M107 | M130 | M119 |
| 3 | M109 | M97 | M119 | M114 | M123 | M110 | M119 | M100 | M117 | M105 |
| 4 | M213 | M125 | M98 | M99 | M105 | M111 | M99 | M97 | M111 | M107 |
| 5 | E106 | E91 | E94 | E110 | E106 | E97 | E105 | E112 | E117 | E104 |
| 6 | E107 | E109 | E94 | E98 | E99 | E91 | E98 | E98 | E97 | E118 |
| 7 | E96 | E100 | E98 | E99 | E111 | E100 | E112 | E122 | E104 | E96 |
| 8 | E91 | E104 | E91 | E104 | E107 | E111 | E112 | E110 | E109 | E110 |
| 9 | A45 | A48 | A45 | A47 | A47 | A48 | A44 | A46 | A47 | A47 |
| 10 | A45 | A47 | A42 | A44 | A47 | A46 | A51 | A54 | A48 | A48 |
| 11 | A49 | A44 | A46 | A50 | A52 | A43 | A46 | A49 | A50 | A49 |
| 12 | A48 | A44 | A47 | A49 | A46 | A46 | A50 | A51 | A46 | A42 |
| 13 | A45 | A55 | A46 | A46 | A50 | A49 | A49 | A48 | A45 | A45 |
| 14 | A46 | A43 | A47 | A48 | A48 | A45 | A51 | A50 | A48 | A58 |
| 15 | D85 | D89 | D72 | D87 | D78 | D73 | D88 | D96 | D80 | D89 |
| 16 | D91 | D101 | D73 | D72 | D90 | D76 | D71 | D87 | D80 | D79 |
| 17 | D72 | D84 | D93 | D76 | D95 | D89 | D88 | D78 | D79 | D95 |
| 18 | D75 | D101 | D88 | D75 | D75 | D75 | D72 | D85 | D73 | D76 |
| 19 | H64 | H61 | H70 | H77 | H77 | H78 | H64 | H72 | H72 | H79 |
| 20 | H80 | H101 | H71 | H67 | H72 | H77 | H78 | H70 | H83 | H68 |
Your group will first take a simple random sample of 20 of the 200 individuals at the company. To do this you will need one person to roll the dice and a second to write down the data. To take the simple random sample, use the following process:
1. Roll both dice.
2. Locate the individual in the column of Table 8.5 indicated by the ten-sided die and the row indicated by the twenty-sided die. This individual is the candidate for the sample.
3. Check this selection against the individuals already in the sample. If this individual has already been included, re-roll the dice and choose again. If this individual has not been selected yet, proceed to the next step.
4. Write down the data for the selected individual on a table like the one shown in Table 8.6.
5. Repeat this process until you have completed your sample of 20 individuals.
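The dice-rolling procedure above can also be simulated. The following is a minimal sketch, not part of the activity itself: the function names `roll` and `simple_random_sample` are hypothetical, and it assumes the ten-sided die shows faces 1 through 10 (a face marked 0 would be read as 10). It records the selected (row, column) positions rather than the salaries.

```python
import random

def roll(sides, rng):
    """Roll one fair die with faces 1..sides."""
    return rng.randint(1, sides)

def simple_random_sample(n=20, rng=None):
    """Select n distinct (row, column) cells by rolling a d20 and a d10."""
    rng = rng or random.Random()
    chosen = []  # keeps the order in which cells were selected
    while len(chosen) < n:
        cell = (roll(20, rng), roll(10, rng))  # (row, column)
        if cell in chosen:
            continue  # this individual is already in the sample, so re-roll
        chosen.append(cell)
    return chosen

# A fixed seed (an arbitrary choice) makes the run repeatable.
cells = simple_random_sample(rng=random.Random(1))
print(cells)
```

Each pair in `cells` can then be looked up in Table 8.5 by row and column, just as with physical dice.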
Table 8.6 Data Collection Sheet for the Collaborative Activity
| Sample Type | Cluster or Strata | Sample |
|---|---|---|
| Simple Random Sample | | |
| Cluster Sample | | |
| Stratified Sample | M | |
| | E | |
| | A | |
| | D | |
| | H | |
As an example, suppose that we begin the process by rolling an 8 on the twenty-sided die and a 9 on the ten-sided die. This corresponds to the E109 entry in row 8, column 9, which is added to the data sheet as shown in Table 8.7. Next, suppose that we roll a 2 on the twenty-sided die and a 7 on the ten-sided die. This corresponds to the M117 entry, which is also added to the data sheet. Writing each roll as (row, column), assume the next two rolls are (19,1) and (3,9). Note that we get the same data, M117, for the rolls (2,7) and (3,9). Because these rolls select two different individuals who happen to have the same job type and salary, we include both data points. Suppose the next five rolls are (12,9), (5,3), (18,2), (5,9), and (7,10), with the corresponding data shown in Table 8.7. At this point we roll (5,3) again. In this case, not only is the data the same, but the roll that selected it is the same as one we previously rolled. Because this individual has already been selected for the sample, we skip it and roll again, getting (1,3), which corresponds to the data M99. The process continues in a similar manner, yielding the example data shown in Table 8.7.
Table 8.7 Data Collection Sheet for the Collaborative Activity
| Sample Type | Cluster or Strata | Sample |
|---|---|---|
| Simple Random Sample | | E109, M117, H64, M117, A46, E94, D101, E117, E96, M99, M106, A45, A47, M102, M114, M213, M109, D87, M125, E111 |
| Cluster Sample | 10 | M105, D79, H79, E96 |
| | 2 | M125, A47, D101, D101 |
| | 5 | E107, H77, D75, M123 |
| | 9 | M130, A45, D80, M105 |
| | 3 | M98, M100, D93, D72 |
| Stratified Sample | M | M98, M100, M119, M109 |
| | E | E100, E110, E112, E122 |
| | A | A46, A46, A49, A42, A47, A58 |
| | D | D78, D73, D76, D88 |
| | H | H72, H70 |
Your group will next take a cluster sample of 20 of the 200 individuals at the company. To do this, you will need one person to roll the dice and a second to write down the data. The clusters will be taken to be the columns of Table 8.5. The process detailed below will take a simple random sample of five clusters, and then take a simple random sample of four individuals from each cluster. To take a cluster sample, use the following process:
1. Roll the ten-sided die.
2. Locate the column corresponding to the outcome of the roll of the ten-sided die. This is the sampled cluster. If this cluster has been previously sampled, return to Step 1. Otherwise, write the column number in the Cluster column of Table 8.6.
3. Roll the twenty-sided die and locate the row corresponding to this outcome in the column corresponding to the sampled cluster. This is the individual to be included in the sample.
4. Check this selection against the individuals that are already in the sample. If this individual has already been included in the sample, re-roll the twenty-sided die and choose again. If this individual has not been selected yet, proceed to the next step.
5. Write down the data for the selected individual on a table like the one shown in Table 8.6 in the row corresponding to the current cluster.
6. If you have selected fewer than four individuals from this cluster, return to Step 3. If you have selected four individuals from this cluster, and fewer than five clusters have been sampled, return to Step 1. If you have selected four individuals from this cluster and five clusters have been sampled, the process is complete.
For example, suppose that we roll the ten-sided die and the result is 10. This indicates that the tenth column is the first cluster used in our cluster sample, as recorded on the example data sheet in Table 8.7. We now roll the twenty-sided die four times. Suppose that these rolls are 3, 3, 16, and 19. Notice that the roll equal to 3 is repeated, so the second one is skipped and we need another roll. Suppose that this roll is equal to 7. The rolls 3, 16, 19, and 7 correspond to the data points M105, D79, H79, and E96 in the tenth column of Table 8.5. These values are recorded in the example data sheet in Table 8.7. Now that we have four observations for that cluster, we need to sample another cluster. Suppose that we roll the ten-sided die and observe a 2. This indicates that the second column is the second cluster used in our cluster sample. We now roll the twenty-sided die four times. Suppose that these rolls are 4, 10, 16, and 18, which correspond to the data points M125, A47, D101, and D101 in the second column of Table 8.5. Note that even though the data D101 is repeated, the data come from two different individuals.
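The cluster procedure can be sketched in code as well. This is an illustrative simulation under stated assumptions, not the activity's official method: the function name `cluster_sample` is hypothetical, die faces are assumed to run from 1 to 10 and from 1 to 20, and the result stores row numbers per sampled column instead of salaries.

```python
import random

def cluster_sample(n_clusters=5, per_cluster=4, rng=None):
    """Pick distinct columns (clusters), then distinct rows within each column."""
    rng = rng or random.Random()
    sample = {}  # column number -> list of row numbers drawn from that column
    while len(sample) < n_clusters:
        column = rng.randint(1, 10)   # roll the ten-sided die
        if column in sample:
            continue                  # cluster already sampled, so roll again
        rows = []
        while len(rows) < per_cluster:
            row = rng.randint(1, 20)  # roll the twenty-sided die
            if row not in rows:       # skip an individual already selected
                rows.append(row)
        sample[column] = rows
    return sample

# A fixed seed (an arbitrary choice) makes the run repeatable.
clusters = cluster_sample(rng=random.Random(2))
for column, rows in clusters.items():
    print(f"column {column}: rows {rows}")
```

Because every selected individual comes from one of only five columns, the sample can miss whole regions of the table that a simple random sample might reach.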
Finally, your group will take a stratified sample. The strata will correspond to the job types. Stratified sampling takes some additional work because the sizes of the strata are different. For example, there are 40 management employees but only 20 human resource employees, so we need to sample twice as many management employees as human resource employees. Because we need a sample of size 20 from a population of 200, we sample one individual for every ten in a stratum. Each row of Table 8.5 holds ten individuals, so the sample size for each stratum equals the number of rows in the table for that job type. Therefore, your group should sample 4 management employees, 4 engineers, 6 assembly line workers, 4 data analysts, and 2 human resource employees. To take the sample of management employees, use the following process:
1. Roll the ten-sided die twice. The first roll will select the row that the individual will be sampled from, and the second roll will select the column. If the first roll is greater than four, continue rolling until you get a number between 1 and 4 because there are only four rows of management employees.
2. Once the row and column have been selected, make sure that the individual has not already been selected for the sample. If they have, return to Step 1. If they have not, write the data for that individual in the M row of the Stratified Sample part of the data table.
3. If you have selected four individuals, you have completed the sample for the management employees and you will next sample from the engineering employees. If you have not yet selected four individuals, return to Step 1.
This process is then repeated for each of the remaining strata, yielding a stratified sample of size 20. Remember that the sample sizes differ depending on which stratum you are sampling from: 4 management employees, 4 engineers, 6 assembly line workers, 4 data analysts, and 2 human resource employees. Once the process is complete, the data table assembled by your group should appear similar to the one shown in Table 8.7.
Take the management stratum, for example. Suppose that we roll the ten-sided die and observe a 1. This indicates that we will sample an employee from the first row of management employees. The ten-sided die is rolled again, and this time we observe a 7. Then we pick the management employee from the first row and the seventh column, which is the data value M98, as indicated in Table 8.7. Next, we roll a 9, but because there are only four rows of management employees, we roll again and observe a 3. Therefore, we will sample an employee from the third row of management employees. Rolling the die again, suppose we observe an 8. Then we pick the data from the third row of management employees in the eighth column, which is the data point M100, as indicated in Table 8.7. We continue this process until we have sampled four individuals from the management employees, as indicated by the example data in Table 8.7.
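The stratum sample sizes and the stratified draw can be sketched as follows. This is a rough illustration under stated assumptions: the names `STRATA_ROWS`, `allocation`, and `stratified_sample` are hypothetical, and the code draws a row directly from the stratum's range of rows, which has the same effect as re-rolling a die until it lands on a valid row.

```python
import random

# First and last row of each stratum, taken from the layout of Table 8.5.
STRATA_ROWS = {"M": (1, 4), "E": (5, 8), "A": (9, 14), "D": (15, 18), "H": (19, 20)}

def allocation(total=20, population=200):
    """Proportional allocation: sample share matches population share."""
    sizes = {}
    for job, (first, last) in STRATA_ROWS.items():
        stratum_size = (last - first + 1) * 10  # ten individuals per row
        sizes[job] = total * stratum_size // population
    return sizes

def stratified_sample(rng=None):
    """Draw distinct (row, column) cells within each stratum."""
    rng = rng or random.Random()
    sample = {}
    for job, size in allocation().items():
        first, last = STRATA_ROWS[job]
        cells = []
        while len(cells) < size:
            cell = (rng.randint(first, last), rng.randint(1, 10))
            if cell not in cells:  # skip an individual already selected
                cells.append(cell)
        sample[job] = cells
    return sample

print(allocation())
# A fixed seed (an arbitrary choice) makes the run repeatable.
strat = stratified_sample(rng=random.Random(3))
print(strat)
```

The allocation works out to 4, 4, 6, 4, and 2 because each stratum contributes one sampled individual per row of ten.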
Calculations
- We will first calculate the percentage of individuals in each sample from each job type. Take the number of individuals for a job type, divide by the number of individuals in the sample, and then multiply by 100%. For example, using the data in Table 8.7, nine managers were selected for the simple random sample. Because the total number of individuals in the simple random sample is 20, the percentage of individuals in the sample who are managers is \(9\div 20\times 100\%=45\%\). These percentages can be reported in a table like Table 8.8. The percentages for the example data can be found in Table 8.9.
Table 8.8 Percentage of each job type in the observed samples.
| Sample Type | Job Type | Percentage | Average |
|---|---|---|---|
| Simple Random Sample | M | | |
| | E | | |
| | A | | |
| | D | | |
| | H | | |
| Cluster Sample | M | | |
| | E | | |
| | A | | |
| | D | | |
| | H | | |
| Stratified Sample | M | | |
| | E | | |
| | A | | |
| | D | | |
| | H | | |
Table 8.9 Percentage of each job type in the observed samples for the example data.
| Sample Type | Job Type | Percentage | Average |
|---|---|---|---|
| Simple Random Sample | M | 45 | 122.44 |
| | E | 25 | 105.40 |
| | A | 15 | 46.00 |
| | D | 10 | 94.00 |
| | H | 5 | 64.00 |
| Cluster Sample | M | 35 | 112.29 |
| | E | 10 | 101.50 |
| | A | 10 | 46.00 |
| | D | 35 | 85.86 |
| | H | 10 | 78.00 |
| Stratified Sample | M | 20 | 106.50 |
| | E | 20 | 111.00 |
| | A | 30 | 48.00 |
| | D | 20 | 78.75 |
| | H | 10 | 71.00 |
- We will next calculate the sample average salary for each job type and for each sample type. Beginning with the management employees in the simple random sample, first identify the management employees in the simple random sample and consider the salaries for those employees. For the example data, these salaries are: 117, 117, 99, 106, 102, 114, 213, 109, 125. To find the average of these values, add them up and divide by the number of data values:
\[ (117+117+99+106+102+114+213+109+125)\div 9=1102\div 9=122.44. \nonumber \]
Next, move to the engineering employees with the same process. Identify all the engineers in the sample and consider their salaries. Add these together and divide by the number of data values. For the example data, these values are 109, 94, 117, 96, and 111. To find the average of these values, add them up and divide by the number of data values:
\[(109+94+117+96+111)\div 5=527\div 5=105.4. \nonumber \]
This process then continues for the remaining job types in the simple random sample. Then apply the same process to the cluster sample and the stratified sample. The remaining averages for the example data are given in Table 8.9.
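Both calculations can be checked with a short script. The sketch below uses the example simple random sample from Table 8.7. The function name `summarize` is hypothetical, and it assumes every entry is a one-letter job designator followed by a whole-number salary in thousands.

```python
from collections import defaultdict

# Example simple random sample copied from Table 8.7.
srs = ["E109", "M117", "H64", "M117", "A46", "E94", "D101", "E117", "E96", "M99",
       "M106", "A45", "A47", "M102", "M114", "M213", "M109", "D87", "M125", "E111"]

def summarize(sample):
    """Return {job type: (percentage of sample, average salary in thousands)}."""
    salaries = defaultdict(list)
    for entry in sample:
        salaries[entry[0]].append(int(entry[1:]))  # "E109" -> job "E", salary 109
    return {job: (100 * len(vals) / len(sample), sum(vals) / len(vals))
            for job, vals in salaries.items()}

summary = summarize(srs)
for job in "MEADH":
    pct, avg = summary[job]
    print(f"{job}: {pct:.0f}% of sample, average {avg:.2f}")
# M: 45% of sample, average 122.44
# E: 25% of sample, average 105.40
# A: 15% of sample, average 46.00
# D: 10% of sample, average 94.00
# H: 5% of sample, average 64.00
```

Passing the cluster or stratified entries from your own data sheet to `summarize` gives the corresponding rows of Table 8.8.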
Questions
- Now that you have taken each type of sample, think about comparing them. Suppose that someone asked you why a cluster sample is not the same as a simple random sample. How would you explain the idea that they are different?
- Suppose that someone asked you why a stratified sample is not the same as a simple random sample. How would you explain the idea that they are different?
- Now consider the percentage of each job type represented in your three samples, and compare them to the percentage of each job type in the population. Which of your three samples seemed to do the best job of representing the population in terms of these percentages?
- Finally, consider the average salary for each job type represented in your three samples, and compare them to the average salary for each job type in the population. For the population, the average salary is 111.575 for managers, 103.450 for engineers, 47.416 for assembly line workers, 82.525 for data analysts, and 74.050 for human resource employees. Which of your three samples seemed to do the best job of representing the population in terms of these averages?

