Statistics LibreTexts

5.11: Hypergeometric Distribution


    Learning Objectives

    • To study the use of the hypergeometric distribution

    The hypergeometric distribution is used to calculate probabilities when sampling without replacement. For example, suppose you first randomly sample one card from a deck of \(52\). Then, without putting the card back in the deck, you sample a second and then (again without replacing cards) a third. Given this sampling procedure, what is the probability that exactly two of the sampled cards will be aces (\(4\) of the \(52\) cards in the deck are aces)? You can calculate this probability using the following formula based on the hypergeometric distribution:

    \[ p =\dfrac{ (_{k}C_{x})(_{(N-k)}C_{(n-x)}) }{ _{N}C_{n}}\]

    where

    • \(k\) is the number of "successes" in the population
    • \(x\) is the number of "successes" in the sample
    • \(N\) is the size of the population
    • \(n\) is the number sampled
    • \(p\) is the probability of obtaining exactly \(x\) successes
    • \(_kC_x\) is the number of combinations of \(k\) things taken \(x\) at a time

    In this example, \(k = 4\) because there are four aces in the deck, \(x = 2\) because the problem asks about the probability of getting two aces, \(N = 52\) because there are \(52\) cards in a deck, and \(n = 3\) because \(3\) cards were sampled. Therefore,

    \[\begin{align} p &=\dfrac{(_4C_2) (_{(52-4)}C_{(3-2)})}{_{52}C_3} \\[5pt] &= \dfrac{\dfrac{4!}{2!2!}\dfrac{48!}{47!1!}}{\dfrac{52!}{49!3!}} \\[5pt] &= \dfrac{(6)(48)}{22100} \approx 0.013 \end{align}\]
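
    The calculation above can be checked numerically. The following is a minimal sketch in Python, assuming the standard-library function `math.comb` for the combinations \(_kC_x\); the function name `hypergeom_pmf` is a hypothetical helper chosen for illustration and is not part of the original text.

```python
from math import comb


def hypergeom_pmf(x, N, k, n):
    """Probability of exactly x successes in n draws without replacement.

    N: population size, k: successes in the population,
    n: number sampled, x: successes in the sample.
    (Hypothetical helper; argument names follow the formula in the text.)
    """
    # Impossible outcomes have probability zero.
    if x < 0 or x > n:
        return 0.0
    # math.comb(a, b) returns 0 when b > a, which covers x > k
    # and n - x > N - k without extra checks.
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Card example from the text: 4 aces in 52 cards, 3 cards drawn, exactly 2 aces.
p = hypergeom_pmf(x=2, N=52, k=4, n=3)
print(round(p, 3))  # 0.013
```

    Summing `hypergeom_pmf(x, 52, 4, 3)` over \(x = 0, 1, 2, 3\) should give \(1\), which is a quick sanity check on the formula.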

    The mean and standard deviation of the hypergeometric distribution are:

    \[\text{mean} = \dfrac{n\,k}{N}\]

    \[\sigma_{\text{hypergeometric}} = \sqrt{\dfrac{n\,k(N-k)(N-n)}{N^2(N-1)}}\]
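
    As a rough check on these two formulas, the sketch below evaluates them for the card example (\(N = 52\), \(k = 4\), \(n = 3\)) and compares the results with values computed directly from the probabilities of each possible outcome. It assumes the last factor in the numerator of the standard deviation is \((N - n)\), where \(n\) is the number sampled; the variable names are illustrative only.

```python
from math import comb, sqrt

# Card example from the text: population, successes in population, sample size.
N, k, n = 52, 4, 3

# Closed-form mean and standard deviation.
# Assumption: the final factor in the numerator is (N - n).
mean = n * k / N
sd = sqrt(n * k * (N - k) * (N - n) / (N**2 * (N - 1)))

# Brute-force check: probability of each possible number of successes x.
probs = [comb(k, x) * comb(N - k, n - x) / comb(N, n) for x in range(n + 1)]
mean_check = sum(x * p for x, p in enumerate(probs))
var_check = sum((x - mean_check) ** 2 * p for x, p in enumerate(probs))

print(mean, mean_check)      # both approximately 0.231
print(sd, sqrt(var_check))   # both approximately 0.452
```

    For this example the expected number of aces in three cards is about \(0.231\), with a standard deviation of about \(0.452\).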


    This page titled 5.11: Hypergeometric Distribution is shared under a Public Domain license and was authored, remixed, and/or curated by David Lane via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.