4.4: Negative Binomial Distribution

    The geometric distribution describes the probability of observing the first success on the \(n^{th}\) trial. The negative binomial distribution is more general: it describes the probability of observing the \(k^{th}\) success on the \(n^{th}\) trial.

    Example \(\PageIndex{1}\)

    Each day a high school football coach tells his star kicker, Brian, that he can go home after he successfully kicks four 35 yard field goals. Suppose we say each kick has a probability \(p\) of being successful. If \(p\) is small – e.g. close to 0.1 – would we expect Brian to need many attempts before he successfully kicks his fourth field goal?

    Solution

    We are waiting for the fourth success (\(k=4\)). If the probability of a success (\(p\)) is small, then the number of attempts (\(n\)) will probably be large. This means that Brian is more likely to need many attempts before he gets \(k=4\) successes. To put this another way, the probability of \(n\) being small is low.
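The intuition in this solution can be checked with a small simulation. The sketch below is only illustrative: the helper name `attempts_until_kth_success`, the seed, and the choice of 10,000 simulated days are assumptions made for this example, not part of the original text. It repeatedly simulates independent kicks with \(p = 0.1\) until the fourth success and reports how many attempts were typically needed.

```python
import random

def attempts_until_kth_success(k, p, rng):
    """Simulate independent kicks until the k-th success; return the attempt count."""
    successes = 0
    attempts = 0
    while successes < k:
        attempts += 1
        if rng.random() < p:  # this kick succeeds with probability p
            successes += 1
    return attempts

rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable
runs = 10_000           # number of simulated practice days (an assumption)
counts = [attempts_until_kth_success(4, 0.1, rng) for _ in range(runs)]

average_attempts = sum(counts) / runs
share_at_most_10 = sum(c <= 10 for c in counts) / runs

print(f"average attempts: {average_attempts:.1f}")
print(f"share of days needing at most 10 attempts: {share_at_most_10:.3f}")
```

With a small \(p\), the simulated share of days on which Brian finishes within ten attempts should come out very low, which matches the statement that the probability of \(n\) being small is low.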

    To identify a negative binomial case, we check four conditions. The first three are common to the binomial distribution.

    Is it negative binomial? Four conditions to check
    1. The trials are independent.
    2. Each trial outcome can be classified as a success or failure.
    3. The probability of a success (\(p\)) is the same for each trial.
    4. The last trial must be a success.
    Exercise \(\PageIndex{1}\)

    Suppose Brian is very diligent in his attempts and he makes each 35 yard field goal with probability \(p=0.8\). Take a guess at how many attempts he would need before making his fourth kick.

    Answer

    One possible answer: since he is likely to make each field goal attempt, it will take him at least 4 attempts but probably not more than 6 or 7.

    Example \(\PageIndex{1}\)

    In yesterday’s practice, it took Brian only 6 tries to get his fourth field goal. Write out each of the possible sequences of kicks.

    Solution

    Because it took Brian six tries to get the fourth success, we know the last kick must have been a success. That leaves three successful kicks and two unsuccessful kicks (we label these as failures) that make up the first five attempts. There are ten possible sequences of these first five kicks, which are shown in Figure 4.11. If Brian achieved his fourth success (\(k=4\)) on his sixth attempt (\(n=6\)), then his order of successes and failures must be one of these ten possible sequences.

    Exercise \(\PageIndex{1}\)

    Each sequence in Figure 4.11 has exactly two failures and four successes with the last attempt always being a success. If the probability of a success is \(p=0.8\), find the probability of the first sequence.

    Answer

    The first sequence is two failures followed by four successes: \(0.2^2 \times 0.8^4 = 0.0164\).

    Figure 4.11: The ten possible sequences when the fourth successful kick is on the sixth attempt.
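    Columns: sequence number, then kick attempts 1 through 6 (the number above each \(S\) counts the successes so far)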
    1 \(F\) \(F\) \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    2 \(F\) \(\stackrel{1}{S}\) \(F\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    3 \(F\) \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(F\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    4 \(F\) \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(F\) \(\stackrel{4}{S}\)
    5 \(\stackrel{1}{S}\) \(F\) \(F\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    6 \(\stackrel{1}{S}\) \(F\) \(\stackrel{2}{S}\) \(F\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    7 \(\stackrel{1}{S}\) \(F\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(F\) \(\stackrel{4}{S}\)
    8 \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(F\) \(F\) \(\stackrel{3}{S}\) \(\stackrel{4}{S}\)
    9 \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(F\) \(\stackrel{3}{S}\) \(F\) \(\stackrel{4}{S}\)
    10 \(\stackrel{1}{S}\) \(\stackrel{2}{S}\) \(\stackrel{3}{S}\) \(F\) \(F\) \(\stackrel{4}{S}\)
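The ten sequences in Figure 4.11 can also be generated programmatically. The following is a minimal sketch, assuming we encode each kick as the letter `S` or `F`; the variable names are hypothetical. It chooses which two of the first five attempts are failures and then appends the final, successful sixth kick.

```python
from itertools import combinations

first_five = 5   # attempts before the final, successful sixth kick
failures = 2     # failures that must fall among those first five attempts

sequences = []
for failure_positions in combinations(range(first_five), failures):
    # Mark the chosen positions as failures and the rest as successes.
    kicks = ["F" if i in failure_positions else "S" for i in range(first_five)]
    kicks.append("S")  # the sixth attempt is always the fourth success
    sequences.append("".join(kicks))

for number, seq in enumerate(sequences, start=1):
    print(number, seq)
```

Because `combinations` yields the failure positions in lexicographic order, the printed list should appear in the same order as the rows of the figure.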

    If the probability Brian kicks a 35 yard field goal is \(p=0.8\), what is the probability it takes Brian exactly six tries to get his fourth successful kick? We can write this as

    \[\begin{aligned} &P(\text{it takes Brian six tries to make four field goals}) \\ & \quad = P(\text{Brian makes three of his first five field goals, and he makes the sixth one}) \\ & \quad = P(\text{$1^{st}$ sequence OR $2^{nd}$ sequence OR ... OR $10^{th}$ sequence})\end{aligned}\]

    where the sequences are from Figure 4.11. We can break down this last probability into the sum of ten disjoint possibilities:

    \[\begin{aligned} &P(\text{$1^{st}$ sequence OR $2^{nd}$ sequence OR ... OR $10^{th}$ sequence}) \\ &\quad = P(\text{$1^{st}$ sequence}) + P(\text{$2^{nd}$ sequence}) + \cdots + P(\text{$10^{th}$ sequence})\end{aligned}\]

    The probability of the first sequence was identified in the exercise preceding Figure 4.11 as 0.0164, and each of the other sequences has the same probability. Since each of the ten sequences has the same probability, the total probability is ten times that of any individual sequence.
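The claim that all ten sequences share one probability, so that the total is ten times a single sequence's probability, can be verified directly. This is a sketch under the text's assumptions (independent kicks, \(p = 0.8\)); the function name `sequence_probability` is introduced here only for illustration.

```python
from itertools import permutations

p = 0.8  # probability of a successful kick, from the text

# Distinct orderings of three successes and two failures, each followed by
# the final successful kick.
sequences = sorted({"".join(first) + "S" for first in permutations("SSSFF")})

def sequence_probability(seq, p):
    """Multiply p for each success and 1 - p for each failure (independent kicks)."""
    prob = 1.0
    for kick in seq:
        prob *= p if kick == "S" else 1 - p
    return prob

probs = [sequence_probability(seq, p) for seq in sequences]
total = sum(probs)

print("each sequence:", [round(q, 4) for q in probs])
print("total:", round(total, 3))
```

Every entry in the printed list should round to the same value, since each sequence contains four factors of \(p\) and two factors of \(1-p\) in some order.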

    The way to compute this negative binomial probability is similar to how the binomial problems were solved in Section 3. The probability is broken into two pieces:

    \[\begin{aligned} &P(\text{it takes Brian six tries to make four field goals}) \\ &= [\text{Number of possible sequences}] \times P(\text{Single sequence})\end{aligned}\]

    Each part is examined separately, then we multiply to get the final result.

    We first identify the probability of a single sequence. One particular case is to first observe all the failures (\(n-k\) of them) followed by the \(k\) successes:

    \[\begin{aligned} &P(\text{Single sequence}) \\ &= P(\text{$n-k$ failures and then $k$ successes}) \\ &= (1-p)^{n-k} p^{k}\end{aligned}\]

    We must also identify the number of sequences for the general case. Above, ten sequences were identified where the fourth success came on the sixth attempt. These sequences were identified by fixing the last observation as a success and looking for all the ways to arrange the other observations. In other words, how many ways could we arrange \(k-1\) successes in \(n-1\) trials? This can be found using the \(n\) choose \(k\) coefficient but for \(n-1\) and \(k-1\) instead:

    \[\begin{aligned} {n-1 \choose k-1} = \frac{(n-1)!}{(k-1)! \left((n-1) - (k-1)\right)!} = \frac{(n-1)!}{(k-1)! \left(n - k\right)!}\end{aligned}\]

    This is the number of different ways we can order \(k-1\) successes and \(n-k\) failures in \(n-1\) trials. Here the exclamation point denotes a factorial: \(m! = m \times (m-1) \times \cdots \times 2 \times 1\), with \(0! = 1\).
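As a quick check of the counting formula, the sketch below compares Python's built-in `math.comb` with the factorial expression written out above, using Brian's case of \(n = 6\) and \(k = 4\). The helper name `count_sequences` is an assumption made for this example.

```python
from math import comb, factorial

def count_sequences(n, k):
    """Ways to place k-1 successes in the first n-1 trials (the n-th is a success)."""
    return comb(n - 1, k - 1)

n, k = 6, 4
# The same count written out with factorials, as in the formula above.
by_formula = factorial(n - 1) // (factorial(k - 1) * factorial(n - k))

print(count_sequences(n, k), by_formula)
```

Both values should agree with the ten sequences identified earlier.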

    Negative Binomial Distribution

    The negative binomial distribution describes the probability of observing the \(k^{th}\) success on the \(n^{th}\) trial, where all trials are independent:

    \[\begin{aligned} P(\text{the $k^{th}$ success on the $n^{th}$ trial}) = {n-1 \choose k-1} p^{k}(1-p)^{n-k} \end{aligned}\]

    The value \(p\) represents the probability that an individual trial is a success.

    Example \(\PageIndex{1}\)

    Show using the formula for the negative binomial distribution that the probability Brian kicks his fourth successful field goal on the sixth attempt is 0.164.

    Solution

    The probability of a single success is \(p=0.8\), the number of successes is \(k=4\), and the number of necessary attempts under this scenario is \(n=6\).

    \[\begin{aligned} {n-1 \choose k-1}p^k(1-p)^{n-k}\ =\ \frac{5!}{3!2!} (0.8)^4 (0.2)^2\ =\ 10\times 0.0164\ =\ 0.164\end{aligned}\]
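The calculation above can be reproduced numerically. The following is a minimal sketch of the negative binomial formula as stated in this section; the function name `negative_binomial_pmf` and its argument order are choices made for this example, not an established library interface.

```python
from math import comb

def negative_binomial_pmf(n, k, p):
    """P(the k-th success occurs on the n-th trial) for independent trials."""
    if n < k:
        return 0.0  # k successes cannot fit in fewer than k trials
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

# Brian's case from the text: fourth success on the sixth attempt, p = 0.8.
brian = negative_binomial_pmf(6, 4, 0.8)
print(round(brian, 3))

# Sanity check: summed over many possible values of n, the probabilities
# should be very close to 1.
print(sum(negative_binomial_pmf(n, 4, 0.8) for n in range(4, 200)))
```

Note that some software packages parameterize this distribution by the number of failures rather than the total number of trials, so it is worth checking a library's convention before comparing results.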

    Exercise \(\PageIndex{1}\)

    The negative binomial distribution requires that each kick attempt by Brian is independent. Do you think it is reasonable to suggest that each of Brian’s kick attempts are independent?

    Answer

    Answers may vary. We cannot conclusively say they are or are not independent. However, many statistical reviews of athletic performance suggest such attempts are very nearly independent.

    Exercise \(\PageIndex{1}\)

    Assume Brian’s kick attempts are independent. What is the probability that Brian will kick his fourth field goal within 5 attempts?

    Answer

    If his fourth field goal (\(k = 4\)) is within five attempts, it took him either four or five tries (\(n = 4\) or \(n = 5\)). We have \(p = 0.8\) from earlier. Use the negative binomial distribution to compute the probability of \(n = 4\) tries and \(n = 5\) tries, then add those probabilities together:

    \[\begin{aligned}
    & P(n=4 \text { OR } n=5)=P(n=4)+P(n=5) \\
    & \quad=\binom{4-1}{4-1} 0.8^4+\binom{5-1}{4-1}(0.8)^4(1-0.8)=1 \times 0.41+4 \times 0.082=0.41+0.33=0.74
    \end{aligned}\]
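The same "within so many attempts" calculation can be sketched as a sum over the possible values of \(n\). The helper names `negative_binomial_pmf` and `prob_within` are hypothetical and introduced only for this illustration.

```python
from math import comb

def negative_binomial_pmf(n, k, p):
    """P(the k-th success occurs on the n-th trial) for independent trials."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def prob_within(max_attempts, k, p):
    """P(the k-th success occurs on or before attempt max_attempts)."""
    # The k-th success cannot come before trial k, so the sum starts at n = k.
    return sum(negative_binomial_pmf(n, k, p) for n in range(k, max_attempts + 1))

within_five = prob_within(5, 4, 0.8)
print(round(within_five, 2))
```

Adding disjoint cases this way extends naturally to larger cutoffs, for example `prob_within(7, 4, 0.8)`.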

    Binomial versus negative binomial

    In the binomial case, we typically have a fixed number of trials and instead consider the number of successes. In the negative binomial case, we examine how many trials it takes to observe a fixed number of successes and require that the last observation be a success.
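The link between the two distributions can be illustrated with Brian's example: getting the fourth success on the sixth attempt means making exactly three of the first five kicks (a binomial event) and then making the sixth. The sketch below checks that both routes give the same number; the function names are assumptions made for this example.

```python
from math import comb

def binomial_pmf(successes, trials, p):
    """P(exactly `successes` successes in a fixed number of independent trials)."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)

def negative_binomial_pmf(n, k, p):
    """P(the k-th success occurs on the n-th trial) for independent trials."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

p, k, n = 0.8, 4, 6
# Three makes in the first five kicks, then a make on the sixth kick.
via_binomial = binomial_pmf(k - 1, n - 1, p) * p
direct = negative_binomial_pmf(n, k, p)

print(via_binomial, direct)
```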

    Exercise \(\PageIndex{1}\)

    On 70% of days, a hospital admits at least one heart attack patient. On 30% of the days, no heart attack patients are admitted. Identify each case below as a binomial or negative binomial case, and compute the probability.

    1. What is the probability the hospital will admit a heart attack patient on exactly three days this week?
    2. What is the probability the second day with a heart attack patient will be the fourth day of the week?
    3. What is the probability the fifth day of next month will be the first day with a heart attack patient?
    Answer

    In each part, \(p = 0.7\).

    1. The number of days is fixed, so this is binomial. The parameters are \(k = 3\) and \(n = 7\): 0.097.
    2. The last “success” (admitting a heart attack patient) is fixed to the last day, so we should apply the negative binomial distribution. The parameters are \(k = 2\), \(n = 4\): 0.132.
    3. This problem is negative binomial with \(k = 1\) and \(n = 5\): 0.006. Note that the negative binomial case when \(k = 1\) is the same as using the geometric distribution.
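The three answers can be reproduced with a few lines of code. This is a sketch under the exercise's assumptions (independent days, \(p = 0.7\)); the function names are hypothetical, and the geometric formula \((1-p)^{n-1}p\) is included only to confirm the remark about \(k = 1\).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def negative_binomial_pmf(n, k, p):
    """P(the k-th success occurs on the n-th trial) for independent trials."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def geometric_pmf(n, p):
    """P(the first success occurs on the n-th trial)."""
    return (1 - p) ** (n - 1) * p

p = 0.7
part_1 = binomial_pmf(3, 7, p)            # exactly three days out of seven
part_2 = negative_binomial_pmf(4, 2, p)   # second such day is the fourth day
part_3 = negative_binomial_pmf(5, 1, p)   # first such day is the fifth day

print(round(part_1, 3), round(part_2, 3), round(part_3, 3))
print(abs(part_3 - geometric_pmf(5, p)) < 1e-12)  # k = 1 matches the geometric case
```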

    This page titled 4.4: Negative Binomial Distribution is shared under a CC BY-SA 3.0 license and was authored, remixed, and/or curated by David Diez, Christopher Barr, & Mine Çetinkaya-Rundel via source content that was edited to the style and standards of the LibreTexts platform.