# 5.11: Hypergeometric Distribution

Learning Objectives

- To study the use of the hypergeometric distribution

The hypergeometric distribution is used to calculate probabilities when *sampling without replacement*. For example, suppose you first randomly sample one card from a deck of \(52\). Then, without putting the card back in the deck, you sample a second and then (again without replacing cards) a third. Given this sampling procedure, what is the probability that exactly two of the sampled cards will be aces? (\(4\) of the \(52\) cards in the deck are aces.) You can calculate this probability using the following formula based on the hypergeometric distribution:

\[ p =\dfrac{ (_{k}C_{x})(_{(N-k)}C_{(n-x)}) }{ _{N}C_{n}}\]

where

- \(k\) is the number of "successes" in the population
- \(x\) is the number of "successes" in the sample
- \(N\) is the size of the population
- \(n\) is the number sampled
- \(p\) is the probability of obtaining exactly \(x\) successes
- \(_kC_x\) is the number of combinations of \(k\) things taken \(x\) at a time

In this example, \(k = 4\) because there are four aces in the deck, \(x = 2\) because the problem asks about the probability of getting two aces, \(N = 52\) because there are \(52\) cards in a deck, and \(n = 3\) because \(3\) cards were sampled. Therefore,

\[\begin{align} p &=\dfrac{(_4C_2) (_{(52-4)}C_{(3-2)})}{_{52}C_3} \\[5pt] &= \dfrac{\dfrac{4!}{2!2!}\dfrac{48!}{47!1!}}{\dfrac{52!}{49!3!}} = 0.013 \end{align}\]
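The same calculation can be sketched in Python as a quick check. This is only an illustration: the function name `hypergeom_pmf` and the variable `p_two_aces` are made up for this example, and `math.comb` (Python 3.8 or later) stands in for \(_kC_x\).

```python
from math import comb


def hypergeom_pmf(x, k, N, n):
    """Probability of exactly x successes when n items are drawn without
    replacement from a population of N items that contains k successes."""
    # Outcomes that cannot occur have probability zero.
    if x < 0 or x > n or x > k or n - x > N - k:
        return 0.0
    # (ways to pick x successes) * (ways to pick n - x failures)
    # divided by (ways to pick any n items from the population)
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Card example: k = 4 aces, N = 52 cards, n = 3 cards drawn, x = 2 aces.
p_two_aces = hypergeom_pmf(x=2, k=4, N=52, n=3)
print(round(p_two_aces, 3))  # 0.013
```

Here \(_4C_2 = 6\), \(_{48}C_1 = 48\), and \(_{52}C_3 = 22100\), so the probability is \(288/22100 \approx 0.013\).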

The mean and standard deviation of the hypergeometric distribution are:

\[mean = \dfrac{n\,k}{N}\]

\[\sigma_{hypergeometric} = \sqrt{\dfrac{n\,k(N-k)(N-n)}{N^2(N-1)}}\]
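As a rough check on these two formulas, the sketch below applies them to the card example (\(k = 4\), \(N = 52\), \(n = 3\)) and compares the results with the mean and standard deviation computed directly from the probabilities of \(0, 1, 2,\) and \(3\) aces. The helper `hypergeom_pmf` and the variable names are illustrative choices, not part of the text above.

```python
from math import comb, sqrt


def hypergeom_pmf(x, k, N, n):
    """Probability of exactly x successes in n draws without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


k, N, n = 4, 52, 3  # aces in the deck, deck size, cards drawn

# Closed-form mean and standard deviation.
mean_formula = n * k / N
sd_formula = sqrt(n * k * (N - k) * (N - n) / (N**2 * (N - 1)))

# The same quantities computed directly from the distribution.
xs = range(0, min(n, k) + 1)  # possible numbers of aces: 0, 1, 2, 3
mean_direct = sum(x * hypergeom_pmf(x, k, N, n) for x in xs)
var_direct = sum((x - mean_direct) ** 2 * hypergeom_pmf(x, k, N, n) for x in xs)
sd_direct = sqrt(var_direct)

print(round(mean_formula, 3), round(mean_direct, 3))  # 0.231 0.231
print(round(sd_formula, 3), round(sd_direct, 3))      # 0.452 0.452
```

On average, then, a three-card sample contains about \(0.23\) aces, with a standard deviation of about \(0.45\).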