# 3.4: Hypergeometric, Geometric, and Negative Binomial Distributions

In this section, we consider three more families of discrete probability distributions. There are some similarities between the three, which can make them hard to distinguish at times. So throughout this section we will compare the three to each other and the binomial distribution, and point out their differences.

## Hypergeometric Distribution

Consider the following example.

### Example \(\PageIndex{1}\)

An urn contains a total of \(N\) balls, where some number \(m\) of the balls are **orange** and the remaining \(N-m\) are **grey**. Suppose we draw \(n\) balls from the urn without replacement, meaning once we select a ball we do not place it back in the urn before drawing out the next one. Then some of the balls in our selection may be **orange** and some may be **grey**. We can define the discrete random variable \(X\) to give the number of **orange** balls in our selection. The probability distribution of \(X\) is referred to as the *hypergeometric distribution*, which we define next.

### Definition \(\PageIndex{1}\)

Suppose that in a collection of \(N\) objects, \(m\) are of type 1 and the remaining \(N-m\) are of type 2. Furthermore, suppose that \(n\) objects are randomly selected from the collection without replacement. Define the discrete random variable \(X\) to give the number of selected objects that are of type 1. Then \(X\) has a *hypergeometric distribution* with parameters \(N, m, n\). The probability mass function of \(X\) is given by

\begin{align*}
p(x) &= P(X=x) = P(x\ \text{type 1 objects and}\ n-x\ \text{type 2 objects}) \notag \\
&= \frac{(\text{number of ways to select}\ x\ \text{type 1 objects from}\ m) \times (\text{number of ways to select}\ n-x\ \text{type 2 objects from}\ N-m)}{\text{total number of ways to select}\ n\ \text{objects of any type from}\ N} \notag \\
&= \frac{\displaystyle{\binom{m}{x}\binom{N-m}{n-x}}}{\displaystyle{\binom{N}{n}}} \label{hyperpmf}
\end{align*}
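The counting formula above translates directly into code. The following is a minimal sketch using Python's standard `math.comb`; the function name `hypergeom_pmf` and the urn sizes are illustrative assumptions, not part of the text.

```python
from math import comb

def hypergeom_pmf(x, N, m, n):
    """P(X = x): x type 1 objects among n drawn without replacement
    from N objects, m of which are type 1."""
    # Values of x outside the feasible range have probability zero.
    if x < max(0, n - (N - m)) or x > min(n, m):
        return 0.0
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)

# Hypothetical urn: N = 20 balls, m = 8 orange, n = 5 drawn.
probs = [hypergeom_pmf(x, 20, 8, 5) for x in range(6)]
total = sum(probs)  # equals 1 up to floating-point rounding
```

Summing the pmf over every possible value of \(x\) gives 1, which is a quick sanity check on the formula.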

In some sense, the hypergeometric distribution is similar to the binomial, except that the method of sampling is *crucially* different. In each case, we are interested in the number of times a specific outcome occurs in a fixed number of repeated trials, where each selection of an object in the hypergeometric case counts as a trial. In the binomial case we count the number of "successes" in the trials, and in the hypergeometric case we count the number of selected objects of a certain type, each of which could be considered a "success". However, the trials in a binomial distribution are independent, while the trials in a hypergeometric distribution are not, because the objects are selected without replacement. If, in Example 3.4.1, the balls were drawn *with* replacement, then each draw would be an independent Bernoulli trial and the distribution of \(X\) would be binomial, since the composition of the urn would be the same each time a ball is drawn. However, when the balls are drawn *without* replacement, the draws are not independent, since both the total number of balls in the urn and the number of balls of the type just drawn decrease after each draw.
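To make the difference concrete, here is a small sketch comparing the two sampling schemes on a hypothetical urn (the values \(N=10\), \(m=4\), \(n=3\) are assumptions chosen for illustration). It computes the probability that all three drawn balls are orange, first as a product of changing conditional probabilities, then from the hypergeometric pmf, and finally under sampling with replacement.

```python
from fractions import Fraction
from math import comb

N, m, n = 10, 4, 3  # hypothetical urn: 10 balls, 4 orange, draw 3

# Without replacement: the chance of orange changes after every draw.
without = Fraction(m, N) * Fraction(m - 1, N - 1) * Fraction(m - 2, N - 2)

# The same probability from the hypergeometric pmf with x = 3.
hyper = Fraction(comb(m, 3) * comb(N - m, 0), comb(N, n))

# With replacement: every draw has the same success probability m/N.
with_repl = Fraction(m, N) ** n

print(without, hyper, with_repl)  # 1/30 1/30 8/125
```

The first two values agree, while the binomial value \(8/125\) is noticeably larger than \(1/30\) because replacing each orange ball keeps the chance of orange at \(4/10\) on every draw.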

### Exercise \(\PageIndex{1}\)

Suppose your friend has 10 cookies, 3 of which are chocolate chip. Your friend randomly divides the cookies equally between herself and you. What is the probability that you get all the chocolate chip cookies?

**Answer**

Let random variable \(X=\) number of chocolate chip cookies you get. Then \(X\) is hypergeometric with \(N=10\) total cookies, \(m=3\) chocolate chip cookies, and \(n=5\) cookies selected by your friend to give to you. We want the probability that you get all the chocolate chip cookies, i.e., \(P(X=3)\), which is

$$P(X=3) = \frac{\displaystyle{\binom{3}{3}\binom{7}{2}}}{\displaystyle{\binom{10}{5}}} = \frac{1\times 21}{252} \approx 0.083\notag$$

Note that \(X\) has a hypergeometric distribution and not a binomial distribution because the cookies are being selected (or divided) without replacement.
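The arithmetic in this answer can be checked with exact fractions. This is only a verification sketch; the variable name `p_all_three` is an assumption introduced here.

```python
from fractions import Fraction
from math import comb

# N = 10 cookies, m = 3 chocolate chip, n = 5 given to you, x = 3.
p_all_three = Fraction(comb(3, 3) * comb(7, 2), comb(10, 5))

print(p_all_three)         # 1/12
print(float(p_all_three))  # roughly 0.0833
```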

## Geometric Distribution & Negative Binomial Distribution

The geometric and negative binomial distributions are related to the binomial distribution in that the underlying probability experiment is the same, i.e., independent trials with two possible outcomes. However, the random variable defined in the geometric and negative binomial case highlights a different aspect of the experiment, namely the number of trials needed to obtain a specific number of "successes". We start with the geometric distribution.

### Definition \(\PageIndex{2}\)

Suppose that a sequence of independent Bernoulli trials is performed, with \(p = P(\text{"success"})\) for each trial. Define the random variable \(X\) to give the number of the trial on which the first success occurs. Then \(X\) has a *geometric distribution* with parameter \(p\). The probability mass function of \(X\) is given by

\begin{align}
p(x) = P(X=x) &= P(1^{st}\ \text{success on}\ x^{th}\ \text{trial}) \notag \\
&= P(1^{st}\ (x-1)\ \text{trials are failures and}\ x^{th}\ \text{trial is success}) \notag \\
&= (1-p)^{x-1}p, \quad\text{for}\ x = 1, 2, 3, \ldots \label{geompmf}
\end{align}
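As a small illustration, the pmf can be written as a one-line function. The name `geom_pmf` and the fair-coin value \(p = 0.5\) are assumptions chosen for the sketch.

```python
def geom_pmf(x, p):
    """P(X = x): the first success occurs on trial x, for x = 1, 2, 3, ..."""
    if x < 1:
        return 0.0
    # x - 1 failures, each with probability 1 - p, then one success.
    return (1 - p) ** (x - 1) * p

# Fair coin: first heads on the third toss means tails, tails, heads.
p3 = geom_pmf(3, 0.5)  # 0.5 * 0.5 * 0.5 = 0.125
```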

### Exercise \(\PageIndex{2}\)

Verify that the pmf for a geometric distribution (Equation \ref{geompmf}) satisfies the two properties for pmf's, i.e.,

- \(p(x) \geq 0\), for \(x=1, 2, 3, \ldots\)
- \(\displaystyle{\sum^{\infty}_{x=1} p(x) = 1}\)

*Hint: It's called "geometric" for a reason!*

**Answer**

- Note that \(0 < p\leq 1\), so that we also have \(0\leq (1-p) < 1\) and \(0\leq (1-p)^{x-1} \leq 1\), for \(x=1, 2, \ldots\). Thus, it follows that \(p(x) = (1-p)^{x-1}p \geq 0\).
- Recall the formula for the sum of a geometric series:

$$\sum^{\infty}_{x=1} ar^{x-1} = \frac{a}{1-r}, \quad\text{if}\ |r|<1.\notag$$

Note that the sum of the geometric pmf is a geometric series with \(a=p\) and \(r = 1-p\), where \(0 \leq r < 1\) because \(0 < p \leq 1\). Thus, we have

$$\sum_{x=1}^{\infty} p(x) = \sum^{\infty}_{x=1} (1-p)^{x-1}p = \frac{p}{1 - (1-p)} = \frac{p}{p} = 1\ \checkmark\notag$$
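A numerical check can make the convergence tangible. The sketch below sums the first few terms of the pmf and compares the result with the standard closed form for a finite geometric sum, \(1 - (1-p)^n\); the function name `geom_partial_sum` and the choice \(p = 1/36\) are assumptions for illustration.

```python
def geom_partial_sum(p, n_terms):
    """Sum of the geometric pmf over x = 1, ..., n_terms."""
    return sum((1 - p) ** (x - 1) * p for x in range(1, n_terms + 1))

p = 1 / 36
for n_terms in (10, 100, 1000):
    closed_form = 1 - (1 - p) ** n_terms  # finite geometric sum
    print(n_terms, geom_partial_sum(p, n_terms), closed_form)
```

The partial sums increase toward 1 as more terms are included, in agreement with the series computation above.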

### Example \(\PageIndex{2}\)

Each of the following is an example of a random variable with the geometric distribution.

- Toss a fair coin until the first heads occurs. In this case, a "success" is getting a heads ("failure" is getting tails) and so the parameter \(p = P(h) = 0.5\).
- Buy lottery tickets until getting the first win. In this case, a "success" is getting a lottery ticket that wins money, and a "failure" is not winning. The parameter \(p\) will depend on the odds of winning for a specific lottery.
- Roll a pair of fair dice until getting the first double 1's. In this case, a "success" is getting double 1's, and a "failure" is simply not getting double 1's (so anything else). To find the parameter \(p\), note that the underlying sample space consists of all possible rolls of a pair of fair dice, of which there are \(6\times6 = 36\) because each die has 6 possible sides. Each of these rolls is equally likely, so $$p = P(\text{double 1's}) = \frac{\text{# of ways to roll double 1's}}{36} = \frac{1}{36}.\notag$$
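The dice example can also be explored by simulation. The following sketch, which assumes Python's standard `random` module with an arbitrary seed and a hypothetical helper `rolls_until_double_ones`, repeats the experiment many times and compares the observed fraction of runs finishing within 10 rolls to the value \(1 - (35/36)^{10}\) obtained by summing the geometric pmf.

```python
import random

def rolls_until_double_ones(rng):
    """Roll two fair dice until both show 1; return the number of rolls."""
    rolls = 0
    while True:
        rolls += 1
        d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
        if d1 == 1 and d2 == 1:
            return rolls

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
samples = [rolls_until_double_ones(rng) for _ in range(10_000)]

# Empirical estimate of P(X <= 10) versus the exact geometric value.
freq = sum(s <= 10 for s in samples) / len(samples)
exact = 1 - (35 / 36) ** 10
print(freq, exact)
```

The two numbers should be close but not identical, since the simulated frequency is subject to sampling variation.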

The negative binomial distribution generalizes the geometric distribution by considering any number of successes.

### Definition \(\PageIndex{3}\)

Suppose that a sequence of independent Bernoulli trials is performed, with \(p = P(\text{"success"})\) for each trial. Fix an integer \(r\) greater than or equal to 2 and define the random variable \(X\) to give the number of the trial on which the \(r^{th}\) success occurs. Then \(X\) has a *negative binomial distribution* with parameters \(r\) and \(p\). The probability mass function of \(X\) is given by

\begin{align*}
p(x) = P(X=x) &= P(r^{th}\ \text{success is on}\ x^{th}\ \text{trial}) \\
&= \underbrace{P(\text{exactly}\ (r-1)\ \text{successes in}\ 1^{st}\ (x-1)\ \text{trials})}_{\text{binomial with}\ n=x-1} \times P(\text{success on}\ x^{th}\ \text{trial}) \\
&= \binom{x-1}{r-1}p^{r-1}(1-p)^{(x-1)-(r-1)}\times p \\
&= \binom{x-1}{r-1}p^r(1-p)^{x-r}, \quad\text{for}\ x=r, r+1, r+2, \ldots
\end{align*}
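The final line of the derivation can be sketched in code as follows. The function name `negbinom_pmf` and the sample parameter values are assumptions made for illustration.

```python
from math import comb

def negbinom_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x, for x = r, r+1, ..."""
    if x < r:
        return 0.0  # at least r trials are needed for r successes
    # Choose which r - 1 of the first x - 1 trials are successes,
    # then require a success on trial x.
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

# With a fair coin, the 8th head lands on toss 8 only if all 8 are heads.
p8 = negbinom_pmf(8, 8, 0.5)  # 0.5 ** 8

# Summing far enough into the tail should give a total close to 1.
total = sum(negbinom_pmf(x, 3, 0.4) for x in range(3, 400))
```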

### Example \(\PageIndex{3}\)

For examples of the negative binomial distribution, we can alter the geometric examples given in Example 3.4.2.

- Toss a fair coin until getting 8 heads. In this case, the parameter \(p\) is still given by \(p = P(h) = 0.5\), but now we also have the parameter \(r = 8\), the number of desired "successes", i.e., heads.
- Buy lottery tickets until winning 5 times. In this case, the parameter \(p\) is still given by the odds of winning the lottery, but now we also have the parameter \(r = 5\), the number of desired wins.
- Roll a pair of fair dice until getting 100 double 1's. In this case, the parameter \(p\) is still given by \(p = P(\text{double 1's}) = \frac{1}{36}\), but now we also have the parameter \(r = 100\), the number of desired "successes".

In general, note that a geometric distribution can be thought of as a negative binomial distribution with parameter \(r=1\).
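This relationship is easy to confirm numerically: setting \(r=1\) in the negative binomial pmf gives \(\binom{x-1}{0}p(1-p)^{x-1} = (1-p)^{x-1}p\). The sketch below checks this for a range of \(x\) values; both function names and the choice \(p = 1/36\) are assumptions for illustration.

```python
from math import comb

def geom_pmf(x, p):
    """Geometric pmf: first success on trial x."""
    return (1 - p) ** (x - 1) * p

def negbinom_pmf(x, r, p):
    """Negative binomial pmf: r-th success on trial x."""
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

p = 1 / 36
# Compare the two pmfs term by term for x = 1, ..., 49.
matches = all(
    abs(negbinom_pmf(x, 1, p) - geom_pmf(x, p)) < 1e-15
    for x in range(1, 50)
)
print(matches)  # True
```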

Note that for both the geometric and negative binomial distributions the number of possible values the random variable can take is infinite. These are still *discrete* distributions though, since we can "list" the values. In other words, the possible values are **countable**. This is in contrast to the Bernoulli, binomial, and hypergeometric distributions, where the number of possible values is finite.

We again note the distinction between the binomial distribution and the geometric and negative binomial distributions. In the binomial distribution, the number of *trials* is fixed, and we count the number of "successes". In the geometric and negative binomial distributions, by contrast, the number of *"successes"* is fixed, and we count the number of trials needed to obtain the desired number of "successes".