4.6: Contingency Tables


    A contingency table provides a way of portraying data that can facilitate calculating probabilities. The table helps in determining conditional probabilities quite easily. The table displays sample values in relation to two different variables that may be dependent or contingent on one another. Later on, we will use contingency tables again, but in another manner.

    Example \(\PageIndex{1}\)

    Suppose a study of speeding violations and drivers who use cell phones produced the following fictional data:

    | | Speeding violation in the last year | No speeding violation in the last year | Total |
    |---|---|---|---|
    | Cell phone user | 25 | 280 | 305 |
    | Not a cell phone user | 45 | 405 | 450 |
    | Total | 70 | 685 | 755 |

    The total number of people in the sample is 755. The row totals are 305 and 450. The column totals are 70 and 685. Notice that 305 + 450 = 755 and 70 + 685 = 755.

    Calculate the following probabilities using the table.

    1. Find \(P(\text{Person is a cell phone user})\).
    2. Find \(P(\text{Person had no violation in the last year})\).
    3. Find \(P(\text{Person had no violation in the last year AND was a cell phone user})\).
    4. Find \(P(\text{Person is a cell phone user OR person had no violation in the last year})\).
    5. Find \(P(\text{Person is a cell phone user GIVEN person had a violation in the last year})\).
    6. Find \(P(\text{Person had no violation last year GIVEN person was not a cell phone user})\).

    Answer

    1. \(\dfrac{\text{number of cell phone users}}{\text{total number in study}}\) = \(\dfrac{305}{755}\)
    2. \(\dfrac{\text{number that had no violation}}{\text{total number in study}} = \dfrac{685}{755}\)
    3. \(\dfrac{280}{755}\)
    4. \(\left(\dfrac{305}{755} + \dfrac{685}{755}\right) - \dfrac{280}{755} = \dfrac{710}{755}\)
    5. \(\dfrac{25}{70}\) (The sample space is reduced to the number of persons who had a violation.)
    6. \(\dfrac{405}{450}\) (The sample space is reduced to the number of persons who were not cell phone users.)
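
    The six answers above can be reproduced mechanically from the four cell counts. The following is a minimal sketch, not part of the original text; the dictionary layout and the helper names `row_total` and `col_total` are illustrative assumptions. It uses exact fractions so the results can be compared directly with the fractions in the answer.

```python
from fractions import Fraction

# Cell counts from the speeding/cell-phone table.
# Keys are (row, column); the labels are hypothetical shorthand.
counts = {
    ("cell", "violation"): 25,
    ("cell", "no_violation"): 280,
    ("no_cell", "violation"): 45,
    ("no_cell", "no_violation"): 405,
}
total = sum(counts.values())  # 755 people in the sample


def row_total(row):
    """Sum every cell in the given row (cell phone use)."""
    return sum(n for (r, _), n in counts.items() if r == row)


def col_total(col):
    """Sum every cell in the given column (violation status)."""
    return sum(n for (_, c), n in counts.items() if c == col)


# 1. and 2. Marginal probabilities: a row or column total over the grand total.
p_cell = Fraction(row_total("cell"), total)                  # equals 305/755
p_no_violation = Fraction(col_total("no_violation"), total)  # equals 685/755

# 3. Joint probability: one cell over the grand total.
p_both = Fraction(counts[("cell", "no_violation")], total)   # equals 280/755

# 4. Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B).
p_either = p_cell + p_no_violation - p_both                  # equals 710/755

# 5. and 6. Conditional probabilities: the denominator shrinks to the
# total of the column or row named after GIVEN.
p_cell_given_violation = Fraction(
    counts[("cell", "violation")], col_total("violation")
)  # equals 25/70
p_no_violation_given_no_cell = Fraction(
    counts[("no_cell", "no_violation")], row_total("no_cell")
)  # equals 405/450
```

    Note that `Fraction` stores values in lowest terms, so printing `p_cell` shows the reduced form of \(\dfrac{305}{755}\) rather than the unreduced fraction.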
    Exercise \(\PageIndex{1}\)

    The table below shows the number of athletes who stretch before exercising and how many had injuries within the past year.

    | | Injury in last year | No injury in last year | Total |
    |---|---|---|---|
    | Stretches | 55 | 295 | 350 |
    | Does not stretch | 231 | 219 | 450 |
    | Total | 286 | 514 | 800 |
    1. What is \(P(\text{athlete stretches before exercising})\)?
    2. What is \(P(\text{athlete stretches before exercising|no injury in the last year})\)?

    Answer

    1. \(P(\text{athlete stretches before exercising}) = \dfrac{350}{800} = 0.4375\)
    2. \(P(\text{athlete stretches before exercising|no injury in the last year}) = \dfrac{295}{514} = 0.5739\)
    Example \(\PageIndex{2}\)

    The table below shows a random sample of 100 hikers and the areas of hiking they prefer.

    Hiking Area Preference

    | Sex | The Coastline | Near Lakes and Streams | On Mountain Peaks | Total |
    |---|---|---|---|---|
    | Female | 18 | 16 | ___ | 45 |
    | Male | ___ | ___ | 14 | 55 |
    | Total | ___ | 41 | ___ | ___ |
    1. Complete the table.
    2. Are the events "being female" and "preferring the coastline" independent events? Let F = being female and let C = preferring the coastline.
      1. Find P(F AND C).
      2. Find P(F)P(C)
      3. Are these two numbers the same? If they are, then F and C are independent. If they are not, then F and C are not independent.
    3. Find the probability that a person is male given that the person prefers hiking near lakes and streams. Let \(\text{M} =\) being male, and let \(\text{L} =\) prefers hiking near lakes and streams.
      1. What word tells you this is a conditional?
      2. Fill in the blanks and calculate the probability: \(P\)(___|___) = ___.
      3. Is the sample space for this problem all 100 hikers? If not, what is it?
    4. Find the probability that a person is female or prefers hiking on mountain peaks. Let \(\text{F} =\) being female, and let \(\text{P} =\) prefers mountain peaks.
      1. Find \(P(\text{F})\).
      2. Find \(P(\text{P})\).
      3. Find \(P(\text{F AND P})\).
      4. Find \(P(\text{F OR P})\).

    Answers

    a.

    Hiking Area Preference

    | Sex | The Coastline | Near Lakes and Streams | On Mountain Peaks | Total |
    |---|---|---|---|---|
    | Female | 18 | 16 | 11 | 45 |
    | Male | 16 | 25 | 14 | 55 |
    | Total | 34 | 41 | 25 | 100 |

    b.

    \(P(\text{F AND C}) = \dfrac{18}{100} = 0.18\)

    \(P(\text{F})P(\text{C}) = \left(\dfrac{45}{100}\right) \left(\dfrac{34}{100}\right) = (0.45)(0.34) = 0.153\)

    \(P(\text{F AND C}) \neq P(\text{F})P(\text{C})\), so the events \(\text{F}\) and \(\text{C}\) are not independent.
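
    The independence check in part b can be written as a small reusable function. This is a sketch under the assumption that the inputs are raw counts from a contingency table; the function name `independence_check` is hypothetical. It applies the textbook criterion of exact equality between \(P(\text{F AND C})\) and \(P(\text{F})P(\text{C})\).

```python
from fractions import Fraction


def independence_check(joint, row_total, col_total, n):
    """Compare P(A AND B) with P(A)P(B) using counts from a table.

    joint     -- count in the cell where both events occur
    row_total -- total count for event A
    col_total -- total count for event B
    n         -- grand total of the table
    Returns (P(A AND B), P(A)P(B), whether the two are exactly equal).
    """
    p_joint = Fraction(joint, n)
    p_product = Fraction(row_total, n) * Fraction(col_total, n)
    return p_joint, p_product, p_joint == p_product


# Hiker table: F = being female (45 of 100), C = prefers the coastline
# (34 of 100), and 18 hikers are in both groups.
p_fc, p_f_times_p_c, independent = independence_check(18, 45, 34, 100)
```

    Here `p_fc` is 0.18 and `p_f_times_p_c` is 0.153, so `independent` is `False`, matching the conclusion above.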

    c.

    1. The word 'given' tells you that this is a conditional.
    2. \(P(\text{M|L}) = \dfrac{25}{41}\)
    3. No, the sample space for this problem is the 41 hikers who prefer lakes and streams.

    d.

    1. \(P(\text{F}) = \dfrac{45}{100}\)
    2. \(P(\text{P}) = \dfrac{25}{100}\)
    3. \(P(\text{F AND P}) = \dfrac{11}{100}\)
    4. \(P(\text{F OR P}) = \dfrac{45}{100} + \dfrac{25}{100} - \dfrac{11}{100} = \dfrac{59}{100}\)
    Exercise \(\PageIndex{2}\)

    The table below shows a random sample of 200 cyclists and the routes they prefer. Let \(\text{M} =\) males and \(\text{H} =\) hilly path.

    | Gender | Lake Path | Hilly Path | Wooded Path | Total |
    |---|---|---|---|---|
    | Female | 45 | 38 | 27 | 110 |
    | Male | 26 | 52 | 12 | 90 |
    | Total | 71 | 90 | 39 | 200 |
    1. Out of the males, what is the probability that the cyclist prefers a hilly path?
    2. Are the events “being male” and “preferring the hilly path” independent events?

    Answer

    1. P(H|M) = \(\dfrac{52}{90}\) = 0.5778
    2. For M and H to be independent, show P(H|M) = P(H)
      P(H|M) = 0.5778, P(H) = \(\dfrac{90}{200}\) = 0.45
      P(H|M) does not equal P(H) so M and H are NOT independent.
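
    This answer tests independence by comparing \(P(\text{H|M})\) with \(P(\text{H})\), whereas Example 2 compared \(P(\text{F AND C})\) with \(P(\text{F})P(\text{C})\). As a brief sketch, assuming \(P(\text{M}) > 0\), the two criteria are equivalent because of the definition of conditional probability:

    \[ P(\text{H|M}) = \dfrac{P(\text{H AND M})}{P(\text{M})}, \quad \text{so} \quad P(\text{H|M}) = P(\text{H}) \iff P(\text{H AND M}) = P(\text{H})P(\text{M}). \]

    Applying the product form to the cyclist counts gives the same conclusion:

    \[ P(\text{H AND M}) = \dfrac{52}{200} = 0.26, \qquad P(\text{H})P(\text{M}) = \left(\dfrac{90}{200}\right)\left(\dfrac{90}{200}\right) = 0.2025. \]

    Since \(0.26 \neq 0.2025\), M and H are not independent under either criterion.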
    Example \(\PageIndex{3}\)

    Muddy Mouse lives in a cage with three doors. If Muddy goes out the first door, the probability that he gets caught by Alissa the cat is \(\dfrac{1}{5}\) and the probability he is not caught is \(\dfrac{4}{5}\). If he goes out the second door, the probability he gets caught by Alissa is \(\dfrac{1}{4}\) and the probability he is not caught is \(\dfrac{3}{4}\). The probability that Alissa catches Muddy coming out of the third door is \(\dfrac{1}{2}\) and the probability she does not catch Muddy is \(\dfrac{1}{2}\). It is equally likely that Muddy will choose any of the three doors so the probability of choosing each door is \(\dfrac{1}{3}\).

    Door Choice

    | Caught or Not | Door One | Door Two | Door Three | Total |
    |---|---|---|---|---|
    | Caught | \(\dfrac{1}{15}\) | \(\dfrac{1}{12}\) | \(\dfrac{1}{6}\) | ____ |
    | Not Caught | \(\dfrac{4}{15}\) | \(\dfrac{3}{12}\) | \(\dfrac{1}{6}\) | ____ |
    | Total | ____ | ____ | ____ | 1 |

    • The first entry \(\dfrac{1}{15} = \left(\dfrac{1}{5}\right) \left(\dfrac{1}{3}\right)\) is \(P(\text{Door One AND Caught})\)
    • The entry \(\dfrac{4}{15} = \left(\dfrac{4}{5}\right) \left(\dfrac{1}{3}\right)\) is \(P(\text{Door One AND Not Caught})\)

    Verify the remaining entries.

    1. Complete the probability contingency table. Calculate the entries for the totals. Verify that the lower-right corner entry is 1.
    2. What is the probability that Alissa does not catch Muddy?
    3. What is the probability that Muddy chooses Door One OR Door Two given that Muddy is caught by Alissa?

    Solution

    Door Choice

    | Caught or Not | Door One | Door Two | Door Three | Total |
    |---|---|---|---|---|
    | Caught | \(\dfrac{1}{15}\) | \(\dfrac{1}{12}\) | \(\dfrac{1}{6}\) | \(\dfrac{19}{60}\) |
    | Not Caught | \(\dfrac{4}{15}\) | \(\dfrac{3}{12}\) | \(\dfrac{1}{6}\) | \(\dfrac{41}{60}\) |
    | Total | \(\dfrac{5}{15}\) | \(\dfrac{4}{12}\) | \(\dfrac{2}{6}\) | 1 |

    b. \(\dfrac{41}{60}\)

    c. \(\dfrac{9}{19}\)
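
    The solution gives only the final values, so it may help to see how each table entry and answer arises. The sketch below is illustrative and not part of the original text; the dictionary names are assumptions. It builds each joint probability as \(P(\text{Door}) \cdot P(\text{Caught|Door})\), sums the rows, and then conditions on being caught.

```python
from fractions import Fraction

# Each door is equally likely, as stated in the example.
p_door = {"one": Fraction(1, 3), "two": Fraction(1, 3), "three": Fraction(1, 3)}

# Probability of being caught given the door chosen.
p_caught_given_door = {
    "one": Fraction(1, 5),
    "two": Fraction(1, 4),
    "three": Fraction(1, 2),
}

# Joint probabilities, for example P(Door One AND Caught) = (1/3)(1/5) = 1/15.
joint_caught = {d: p_door[d] * p_caught_given_door[d] for d in p_door}
joint_not_caught = {d: p_door[d] * (1 - p_caught_given_door[d]) for d in p_door}

# Row totals of the probability table.
p_caught = sum(joint_caught.values())          # equals 19/60
p_not_caught = sum(joint_not_caught.values())  # equals 41/60 (part b)

# Part c: Door One and Door Two are mutually exclusive, so their joint
# probabilities with Caught add; dividing by P(Caught) conditions on it.
p_one_or_two_given_caught = (
    joint_caught["one"] + joint_caught["two"]
) / p_caught                                   # equals 9/19
```

    The two row totals add to 1, which confirms the lower-right corner of the table.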

    Example \(\PageIndex{4}\)

    The table below contains the number of crimes per 100,000 inhabitants from 2008 to 2011 in the U.S.

    United States Crime Index Rates Per 100,000 Inhabitants 2008–2011

    | Year | Robbery | Burglary | Rape | Vehicle | Total |
    |---|---|---|---|---|---|
    | 2008 | 145.7 | 732.1 | 29.7 | 314.7 | |
    | 2009 | 133.1 | 717.7 | 29.1 | 259.2 | |
    | 2010 | 119.3 | 701 | 27.7 | 239.1 | |
    | 2011 | 113.7 | 702.2 | 26.8 | 229.6 | |
    | Total | | | | | |

    TOTAL each column and each row. Total data = 4,520.7

    1. Find \(P(\text{2009 AND Robbery})\).
    2. Find \(P(\text{2010 AND Burglary})\).
    3. Find \(P(\text{2010 OR Burglary})\).
    4. Find \(P(\text{2011|Rape})\).
    5. Find \(P(\text{Vehicle|2008})\).

    Answer

    a. 0.0294, b. 0.1551, c. 0.7165, d. 0.2365, e. 0.2575
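
    Because the answer lists only decimals, the following sketch shows one way the row totals, column totals, and five probabilities could be computed. The variable names are hypothetical, and results are rounded to four decimal places to match the answer.

```python
# Rates per 100,000 inhabitants, copied from the table above.
rates = {
    2008: {"Robbery": 145.7, "Burglary": 732.1, "Rape": 29.7, "Vehicle": 314.7},
    2009: {"Robbery": 133.1, "Burglary": 717.7, "Rape": 29.1, "Vehicle": 259.2},
    2010: {"Robbery": 119.3, "Burglary": 701.0, "Rape": 27.7, "Vehicle": 239.1},
    2011: {"Robbery": 113.7, "Burglary": 702.2, "Rape": 26.8, "Vehicle": 229.6},
}

# Total each row (year) and each column (crime type).
row_totals = {year: sum(row.values()) for year, row in rates.items()}
col_totals = {
    crime: sum(rates[year][crime] for year in rates) for crime in rates[2008]
}
grand_total = sum(row_totals.values())  # about 4,520.7

# a. and b. Joint probabilities: one cell over the grand total.
a = rates[2009]["Robbery"] / grand_total
b = rates[2010]["Burglary"] / grand_total

# c. Addition rule: row total + column total - the shared cell.
c = (row_totals[2010] + col_totals["Burglary"] - rates[2010]["Burglary"]) / grand_total

# d. P(2011|Rape): the sample space is the Rape column.
d = rates[2011]["Rape"] / col_totals["Rape"]

# e. P(Vehicle|2008): the sample space is the 2008 row.
e = rates[2008]["Vehicle"] / row_totals[2008]

answers = [round(x, 4) for x in (a, b, c, d, e)]
```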

    Exercise \(\PageIndex{3}\)

    The table below relates the weights and heights of a group of individuals participating in an observational study.

    | Weight/Height | Tall | Medium | Short | Totals |
    |---|---|---|---|---|
    | Obese | 18 | 28 | 14 | |
    | Normal | 20 | 51 | 28 | |
    | Underweight | 12 | 25 | 9 | |
    | Totals | | | | |
    1. Find the total for each row and column
    2. Find the probability that a randomly chosen individual from this group is Tall.
    3. Find the probability that a randomly chosen individual from this group is Obese and Tall.
    4. Find the probability that a randomly chosen individual from this group is Tall given that the individual is Obese.
    5. Find the probability that a randomly chosen individual from this group is Obese given that the individual is Tall.
    6. Find the probability a randomly chosen individual from this group is Tall and Underweight.
    7. Are the events Obese and Tall independent?

    Answer

    | Weight/Height | Tall | Medium | Short | Totals |
    |---|---|---|---|---|
    | Obese | 18 | 28 | 14 | 60 |
    | Normal | 20 | 51 | 28 | 99 |
    | Underweight | 12 | 25 | 9 | 46 |
    | Totals | 50 | 104 | 51 | 205 |
    1. Row Totals: 60, 99, 46. Column totals: 50, 104, 51.
    2. \(P(\text{Tall}) = \dfrac{50}{205} = 0.244\)
    3. \(P(\text{Obese AND Tall}) = \dfrac{18}{205} = 0.088\)
    4. \(P(\text{Tall|Obese}) = \dfrac{18}{60} = 0.3\)
    5. \(P(\text{Obese|Tall}) = \dfrac{18}{50} = 0.36\)
    6. \(P(\text{Tall AND Underweight}) = \dfrac{12}{205} = 0.0585\)
    7. No. \(P(\text{Tall})\) does not equal \(P(\text{Tall|Obese})\).
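
    Parts 4 and 5 use the same cell (18) but different denominators, which is a common source of confusion. The sketch below, with a hypothetical helper named `conditional`, makes the choice of denominator explicit: conditioning on a row event divides by the row total, and conditioning on a column event divides by the column total.

```python
rows = ["Obese", "Normal", "Underweight"]
cols = ["Tall", "Medium", "Short"]

# Cell counts in the same order as the table above.
table = [
    [18, 28, 14],
    [20, 51, 28],
    [12, 25, 9],
]

row_totals = [sum(r) for r in table]        # one total per weight class
col_totals = [sum(c) for c in zip(*table)]  # one total per height class
n = sum(row_totals)                         # grand total


def conditional(row, col, given):
    """Return the cell probability conditioned on its row or its column.

    given="row" computes P(col | row); given="col" computes P(row | col).
    """
    i, j = rows.index(row), cols.index(col)
    denominator = row_totals[i] if given == "row" else col_totals[j]
    return table[i][j] / denominator


p_tall = col_totals[cols.index("Tall")] / n               # part 2
p_tall_given_obese = conditional("Obese", "Tall", "row")  # part 4: 18/60
p_obese_given_tall = conditional("Obese", "Tall", "col")  # part 5: 18/50
```

    Comparing `p_tall` with `p_tall_given_obese` reproduces the check in part 7.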

    References

    1. “Blood Types.” American Red Cross, 2013. Available online at www.redcrossblood.org/learn-a...od/blood-types (accessed May 3, 2013).
    2. Data from the National Center for Health Statistics, part of the United States Department of Health and Human Services.
    3. Data from United States Senate. Available online at www.senate.gov (accessed May 2, 2013).
    4. Haiman, Christopher A., Daniel O. Stram, Lynn R. Wilkens, Malcolm C. Pike, Laurence N. Kolonel, Brien E. Henderson, and Loïc Le Marchand. “Ethnic and Racial Differences in the Smoking-Related Risk of Lung Cancer.” The New England Journal of Medicine, 2013. Available online at http://www.nejm.org/doi/full/10.1056/NEJMoa033250 (accessed May 2, 2013).
    5. “Human Blood Types.” United Blood Services, 2011. Available online at www.unitedbloodservices.org/learnMore.aspx (accessed May 2, 2013).
    6. Samuel, T. M. “Strange Facts about RH Negative Blood.” eHow Health, 2013. Available online at www.ehow.com/facts_5552003_st...ive-blood.html (accessed May 2, 2013).
    7. “United States: Uniform Crime Report – State Statistics from 1960–2011.” The Disaster Center. Available online at http://www.disastercenter.com/crime/ (accessed May 2, 2013).

    Review

    There are several tools you can use to help organize and sort data when calculating probabilities. Contingency tables help display data and are particularly useful when calculating probabilities that involve two variables that may depend on each other.

    Use the following information to answer the next four exercises. The table below shows a random sample of musicians and how they learned to play their instruments.

    | Gender | Self-taught | Studied in School | Private Instruction | Total |
    |---|---|---|---|---|
    | Female | 12 | 38 | 22 | 72 |
    | Male | 19 | 24 | 15 | 58 |
    | Total | 31 | 62 | 37 | 130 |
    Exercise 3.5.4

    Find P(musician is a female).

    Exercise 3.5.5

    Find \(P(\text{musician is a male AND had private instruction})\).

    Answer

    \(P(\text{musician is a male AND had private instruction}) = \dfrac{15}{130} = \dfrac{3}{26} = 0.12\)

    Exercise 3.5.6

    Find P(musician is a female OR is self taught).

    Exercise 3.5.7

    Are the events “being a female musician” and “learning music in school” mutually exclusive events?

    Answer

    The events are not mutually exclusive. It is possible to be a female musician who learned music in school.

    Bringing it Together

    Use the following information to answer the next seven exercises. An article in the New England Journal of Medicine reported on a study of smokers in California and Hawaii. In one part of the report, the self-reported ethnicity and smoking levels per day were given. Of the people smoking at most ten cigarettes per day, there were 9,886 African Americans, 2,745 Native Hawaiians, 12,831 Latinos, 8,378 Japanese Americans, and 7,650 Whites. Of the people smoking 11 to 20 cigarettes per day, there were 6,514 African Americans, 3,062 Native Hawaiians, 4,932 Latinos, 10,680 Japanese Americans, and 9,877 Whites. Of the people smoking 21 to 30 cigarettes per day, there were 1,671 African Americans, 1,419 Native Hawaiians, 1,406 Latinos, 4,715 Japanese Americans, and 6,062 Whites. Of the people smoking at least 31 cigarettes per day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 Whites.

    Exercise 3.5.8

    Complete the table using the data provided.

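    Smoking Levels by Ethnicity

    | Smoking Level | African American | Native Hawaiian | Latino | Japanese Americans | White | TOTALS |
    |---|---|---|---|---|---|---|
    | 1–10 | | | | | | |
    | 11–20 | | | | | | |
    | 21–30 | | | | | | |
    | 31+ | | | | | | |
    | TOTALS | | | | | | |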
    Exercise 3.5.9

    Suppose that one person from the study is randomly selected. Find the probability that person smoked 11 to 20 cigarettes per day.

    Answer

    \(\dfrac{35,065}{100,450}\)
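
    The numerator and denominator in this answer are a row total and the grand total of the completed table. As a sketch, assuming the counts are entered exactly as stated in the study description, the totals could be computed as follows; the variable names are illustrative.

```python
from fractions import Fraction

levels = ["1-10", "11-20", "21-30", "31+"]
groups = [
    "African American",
    "Native Hawaiian",
    "Latino",
    "Japanese American",
    "White",
]

# One row per smoking level, one column per group, in the order above.
counts = [
    [9886, 2745, 12831, 8378, 7650],
    [6514, 3062, 4932, 10680, 9877],
    [1671, 1419, 1406, 4715, 6062],
    [759, 788, 800, 2305, 3970],
]

# Row totals (per smoking level) and column totals (per group).
level_totals = dict(zip(levels, (sum(row) for row in counts)))
group_totals = dict(zip(groups, (sum(col) for col in zip(*counts))))
grand_total = sum(level_totals.values())

# Probability that a randomly selected person smoked 11 to 20 per day.
p_11_to_20 = Fraction(level_totals["11-20"], grand_total)
```

    Summing `group_totals` gives the same grand total, which is a quick check that the table was completed consistently.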

    Exercise 3.5.10

    Find the probability that the person was Latino.

    Exercise 3.5.11

    In words, explain what it means to pick one person from the study who is “Japanese American AND smokes 21 to 30 cigarettes per day.” Also, find the probability.

    Answer

    To pick one person from the study who is Japanese American AND smokes 21 to 30 cigarettes per day means that the person has to meet both criteria: both Japanese American and smokes 21 to 30 cigarettes. The sample space should include everyone in the study. The probability is \(\dfrac{4,715}{100,450}\).

    Exercise 3.5.12

    In words, explain what it means to pick one person from the study who is “Japanese American OR smokes 21 to 30 cigarettes per day.” Also, find the probability.

    Exercise 3.5.13

    In words, explain what it means to pick one person from the study who is “Japanese American GIVEN that person smokes 21 to 30 cigarettes per day.” Also, find the probability.

    Answer

    To pick one person from the study who is Japanese American given that the person smokes 21 to 30 cigarettes per day means that the person must fulfill both criteria and that the sample space is reduced to those who smoke 21 to 30 cigarettes per day. The probability is \(\dfrac{4,715}{15,273}\).

    Exercise 3.5.14

    Prove that smoking level/day and ethnicity are dependent events.

    Glossary

    contingency table
    the method of displaying a frequency distribution as a table with rows and columns to show how two variables may be dependent (contingent) upon each other; the table provides an easy way to calculate conditional probabilities.

    This page titled 4.6: Contingency Tables is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform.