# 14.3: The Gamma Distribution

- Page ID
- 10268

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)## Basic Theory

We now know that the sequence of inter-arrival times \(\bs{X} = (X_1, X_2, \ldots)\) in the Poisson process is a sequence of independent random variables, each having the exponential distribution with rate parameter \(r\), for some \(r \gt 0\). No other distribution gives the strong renewal assumption that we want: the property that the process probabilistically restarts, independently of the past, at each arrival time and at each fixed time.

The \(n\)th arrival time is simply the sum of the first \(n\) inter-arrival times: \[ T_n = \sum_{i=0}^n X_i, \quad n \in \N \] Thus, the sequence of arrival times \(\bs{T} = (T_0, T_1, \ldots)\) is the partial sum process associated with the sequence of inter-arrival times \(\bs{X} = (X_1, X_2, \ldots)\).

### Distribution Functions

Recall that the common probability density function of the inter-arrival times is \[ f(t) = r e^{-r t}, \quad 0 \le t \lt \infty \] Our first goal is to describe the distribution of the \( n \)th arrival \( T_n \).

For \(n \in \N_+\), \(T_n\) has a continuous distribution with probability density function \( f_n \) given by \[ f_n(t) = r^n \frac{t^{n-1}}{(n - 1)!} e^{-r t}, \quad 0 \le t \lt \infty\]

- \( f_n \) increases and then decreases, with mode at \( (n - 1) / r \).
- \( f_1 \) is concave upward. \( f_2 \) is concave downward and then upward, with inflection point at \( t = 2 / r \). For \( n \ge 2 \), \( f_n \) is concave upward, then downward, then upward again with inflection points at \( t = \left[(n - 1) \pm \sqrt{n - 1}\right] \big/ r \).

## Proof

Since \(T_n\) is the sum of \(n\) independent variables, each with PDF \(f\), the PDF of \(T_n\) is the convolution power of \(f\) of order \(n\). That is, \(f_n = f^{*n}\). A simple induction argument shows that \(f_n\) has the form given above. For example, \[ f_2(t) = \int_0^t f(s) f(t - s) \, ds = \int_0^t r e^{-r s} r e^{-r(t-s)} \, ds = \int_0^t r^2 e^{-r t} \, ds = r^2 t \, e^{-r t}, \quad 0 \le t \lt \infty \] Parts (a) and (b) follow from standard calculus.

The distribution with this probability density function is known as the gamma distribution with shape parameter \(n\) and rate parameter \(r\). It is lso known as the Erlang distribution, named for the Danish mathematician Agner Erlang. Again, \(1 / r\) is the scale parameter, and that term will be justified below. The term *shape parameter* for \( n \) clearly makes sense in light of parts (a) and (b) of the last result. The term *rate parameter* for \( r \) is inherited from the inter-arrival times, and more generally from the underlying Poisson process itself: the random points are arriving at an average rate of \( r \) per unit time. A more general version of the gamma distribution, allowing non-integer shape parameters, is studied in the chapter on Special Distributions. Note that since the arrival times are continuous, the probability of an arrival at any given instant of time is 0.

In the gamma experiment, vary \(r\) and \(n\) with the scroll bars and watch how the shape of the probability density function changes. For various values of the parameters, run the experiment 1000 times and compare the empirical density function to the true probability density function.

The distribution function and the quantile function of the gamma distribution do not have simple, closed-form expressions. However, it's easy to write the distribution function as a sum.

For \( n \in \N_+ \), \( T_n \) has distribution function \( F_n \) given by \[ F_n(t) = 1 - \sum_{k=0}^{n-1} e^{-r t} \frac{(r t)^k}{k!}, \quad t \in [0, \infty) \]

## Proof

Note that \[ F_n(t) = \int_0^t f_n(s) ds = \int_0^n r^n \frac{s^{n-1}}{(n - 1)!} e^{-r s} \] The result follows by repeated integration by part.

Open the special distribution calculator, select the gamma distribution, and select CDF view. Vary the parameters and note the shape of the distribution and quantile functions. For selected values of the parameters, compute the quartiles.

### Moments

The mean, variance, and moment generating function of \(T_n\) can be found easily from the representation as a sum of independent exponential variables.

The mean and variance of \(T_n\) are.

- \(\E\left(T_n\right) = n / r\)
- \(\var\left(T_n\right) = n / r^2\)

## Proof

Recall that the exponential distribution with rate parameter \(r\) has mean \(1 / r\) and variance \(1 / r^2\).

- The expected value of a sum is the sum of the expected values, so \(\E\left(T_n\right) = n / r\).
- The variance of a sum of
*independent*variables is the sum of the variances, so \(\var\left(T_n\right) = n / r^2\).

For \(k \in \N\), the moment of order \(k\) of \(T_n\) is \[ \E\left(T_n^k\right) = \frac{(k + n - 1)!}{(n - 1)!} \frac{1}{r^k} \]

## Proof

Using the standard change of variables theorem, \[ \E\left(T_n^k\right) = \int_0^\infty t^k f_n(t) \, dt = \frac{r^{n-1}}{(n - 1)!} \int_0^\infty t^{k + n - 1} r e^{-r t} \, dt \] But the integral on the right is the moment of order \(k + n - 1\) for the exponential distribution, which we showed in the last section is \((k + n - 1)! \big/ r^{k + n - 1}\). Simplifying gives the result.

More generally, the moment of order \(k \gt 0\) (not necessarily an integer) is \[ \E\left(T_n^k\right) = \frac{\Gamma(k + n)}{\Gamma(n)} \frac{1}{r^k} \] where \(\Gamma\) is the gamma function.

In the gamma experiment, vary \(r\) and \(n\) with the scroll bars and watch how the size and location of the mean\( \pm \)standard deviation bar changes. For various values of \(r\) and \(n\), run the experiment 1000 times and compare the empirical moments to the true moments.

Our next result gives the skewness and kurtosis of the gamma distribution.

The skewness and kurtosis of \( T_n \) are

- \( \skw(X) = \frac{2}{\sqrt{n}} \)
- \( \kur(X) = 3 + \frac{6}{n} \)

## Proof

These results follows from the moment results above and the computational formulas for skewness and kurtosis.

In particular, note that the gamma distribution is positively skewed but \( \skw(X) \to 0 \) and as \( n \to \infty \). Recall also that the excess kurtosis is \( \kur(T_n) - 3 = \frac{6}{n} \to 0 \) as \( n \to \infty \). This result is related to the convergence of the gamma distribution to the normal, discussed below. Finally, note that the skewness and kurtosis do not depend on the rate parameter \( r \). This is because, as we show below, \( 1 / r \) is a scale parameter.

The moment generating function of \(T_n\) is \[ M_n(s) = \E\left(e^{s T_n}\right) = \left(\frac{r}{r - s}\right)^n, \quad -\infty \lt s \lt r \]

## Proof

Recall that the MGF of a sum of independent variables is the product of the corresponding MGFs. We showed in the last section that the exponential distribution with parameter \(r\) has MGF \(s \mapsto r / (r - s)\) for \(-\infty \lt s \lt r\).

The moment generating function can also be used to derive the moments of the gamma distribution given above—recall that \(M_n^{(k)}(0) = \E\left(T_n^k\right)\).

### Estimating the Rate

In many practical situations, the rate \(r\) of the process in unknown and must be estimated based on data from the process. We start with a natural estimate of the scale parameter \(1 / r\). Note that \[ M_n = \frac{T_n}{n} = \frac{1}{n} \sum_{i=1}^n X_i \] is the sample mean of the first \(n\) inter-arrival times \((X_1, X_2, \ldots, X_n)\). In statistical terms, this sequence is a random sample of size \(n\) from the exponential distribution with rate \(r\).

\(M_n\) satisfies the following properties:

- \(\E(M_n) = \frac{1}{r}\)
- \(\var(M_n) = \frac{1}{n r^2}\)
- \(M_n \to \frac{1}{r}\) as \(n \to \infty\) with probability 1

## Proof

Parts (a) and (b) follow from the expected value of \(T_n\) and standard properties. Part (c) is the strong law of large numbers.

In statistical terms, part (a) means that \(M_n\) is an unbiased estimator of \(1 / r\) and hence the variance in part (b) is the mean square error. Part (b) means that \(M_n\) is a consistent estimator of \(1 / r\) since \(\var(M_n) \downarrow 0\) as \(n \to \infty\). Part (c) is a stronger from of consistency. In general, the sample mean of a random sample from a distribution is an unbiased and consistent estimator of the distribution mean. On the other hand, a natural estimator of \(r\) itself is \(1 / M_n = n / T_n\). However, this estimator is positively biased.

\(\E(n / T_n) \ge r\).

## Proof

This follows immediately from Jensen's inequality since \(x \mapsto 1 / x\) is concave upward on \((0, \infty)\).

## Properties and Connections

### Scaling

As noted above, the gamma distribution is a scale family.

Suppose that \( T \) has the gamma distribution with rate parameter \( r \in (0, \infty) \) and shape parameter \( n \in \N_+ \). If \( c \in (0, \infty) \) then \( c T \) has the gamma distribution with rate parameter \( r / c \) and shape parameter \( n \).

## Proof

The moment generating function of \( c T \) is \[ \E[e^{s (c T)}] = \E[e^{(c s) T}] = \left(\frac{r}{r - cs}\right)^n = \left(\frac{r / c}{r / c - s}\right)^n, \quad s \lt \frac{r}{c} \]

The scaling property also follows from the fact that the gamma distribution governs the arrival times in the Poisson process. A time change in a Poisson process clearly does not change the strong renewal property, and hence results in a new Poisson process.

### General Exponential Family

The gamma distribution is also a member of the general exponential family of distributions.

Suppose that \( T \) has the gamma distribution with shape parameter \( n \in \N_+ \) and rate parameter \( r \in (0, \infty) \). Then \( T \) has a two parameter general exponential distribution with natural parameters \( n - 1 \) and \( -r \), and natural statistics \( \ln(T) \) and \( T \).

## Proof

This follows from the form of the PDF and the definition of the general exponential family: \[ f(t) = r^n \frac{t^{n-1}}{(n - 1)!} e^{-r t} = \frac{r^n}{(n - 1)!} \exp\left[(n - 1) \ln(t) - r t\right], \quad t \in (0, \infty) \]

### Increments

A number of important properties flow from the fact that the sequence of arrival times \(\bs{T} = (T_0, T_1, \ldots)\) is the partial sum process associated with the sequence of independent, identically distributed inter-arrival times \(\bs{X} = (X_1, X_2, \ldots)\).

The arrival time sequence \(\bs{T}\) has stationary, independent increments:

- If \(m \lt n\) then \(T_n - T_m\) has the same distribution as \(T_{n-m}\), namely the gamma distribution with shape parameter \(n - m\) and rate parameter \(r\).
- If \(n_1 \lt n_2 \lt n_3 \lt \cdots\) then \(\left(T_{n_1}, T_{n_2} - T_{n_1}, T_{n_3} - T_{n_2}, \ldots\right)\) is an independent sequence.

## Proof

The stationary and independent increments properties hold for any partial sum process associated with an independent, identically distributed sequence.

Of course, the stationary and independent increments properties are related to the fundamental renewal

assumption that we started with. If we fix \(n \in \N_+\), then \((T_n - T_n, T_{n+1} - T_n, T_{n+2} - T_n, \ldots)\) is independent of \((T_1, T_2, \ldots, T_n)\) and has the same distribution as \((T_0, T_1, T_2, \ldots)\). That is, if we restart the clock

at time \(T_n\), then the process in the future looks just like the original process (in a probabilistic sense) and is indpendent of the past. Thus, we have our second characterization of the Poisson process.

A process of random points in time is a Poisson process with rate \( r \in (0, \infty) \) if and only if the arrival time sequence \( \bs{T} \) has stationary, independent increments, and for \( n \in \N_+ \), \( T_n \) has the gamma distribution with shape parameter \( n \) and rate parameter \( r \).

### Sums

The gamma distribution is closed with respect to sums of independent variables, as long as the rate parameter is fixed.

Suppose that \(V\) has the gamma distribution with shape parameter \(m \in \N_+\) and rate parameter \(r \gt 0\), \(W\) has the gamma distribution with shape parameter \(n \in \N_+\) and rate parameter \(r\), and that \(V\) and \(W\) are independent. Then \(V + W\) has the gamma distribution with shape parameter \(m + n\) and rate parameter \(r\).

## Proof

There are at least three different proofs of this fundamental result. Perhaps the best is a probabilistic proof based on the Poisson process. We start with an IID sequence \(\bs{X}\) of independent exponentially distributed variables, each with rate parameter \(r\). Then we can associate \(V\) with \(T_m\) and \(W\) with \(T_{m + n} - T_m\) so that \(V + W\) becomes \(T_{m + n}\). The result now follows from the previous theorem.

Another simple proof uses moment generating functions. Recall again that the MGF of \(V + W\) is the product of the MGFs of \(V\) and of \(W\). A third, analytic proof uses convolution. Recall again that the PDF of \(V + W\) is the convolution of the PDFs of \(V\) and of \(W\).

### Normal Approximation

In the gamma experiment, vary \(r\) and \(n\) with the scroll bars and watch how the shape of the probability density function changes. Now set \(n = 10\) and for various values of \(r\) run the experiment 1000 times and compare the empirical density function to the true probability density function.

Even though you are restricted to relatively small values of \(n\) in the app, note that the probability density function of the \(n\)th arrival time becomes more bell shaped as \(n\) increases (for \(r\) fixed). This is yet another application of the central limit theorem, since \(T_n\) is the sum of \(n\) independent, identically distributed random variables (the inter-arrival times).

The distribution of the random variable \(Z_n\) below converges to the standard normal distribution as \(n \to \infty\): \[ Z_n = \frac{r\,T_n - n}{\sqrt{n}} \]

## Proof

\(Z_n\) is the standard score associated with \(T_n\), so the result follows from the central limit theorem.

### Connection to Bernoulli Trials

We return to the analogy between the Bernoulli trials process and the Poisson process that started in the Introduction and continued in the last section on the Exponential Distribution. If we think of the successes in a sequence of Bernoulli trials as random points in discrete time, then the process has the same strong renewal property as the Poisson process, but restricted to discrete time. That is, at each fixed time and at each arrival time, the process starts over

, independently of the past. In Bernoulli trials, the time of the \( n \)th arrival has the negative binomial distribution with parameters \( n \) and \( p \) (the success probability), while in the Poisson process, as we now know, the time of the \( n \)th arrival has the gamma distribution with parameters \( n \) and \( r \) (the rate). Because of this strong analogy, we expect a relationship between these two processes. In fact, we have the same type of limit as with the geometric and exponential distributions.

Fix \( n \in \N_+ \) and suppose that for each \( m \in \N_+ \) \( T_{m,n} \) has the negative binomial distribution with parameters \( n \) and \( p_m \in (0, 1) \), where \( m p_m \to r \in (0, \infty) \) as \( m \to \infty \). Then the distribution of \( T_{m,n} \big/ m \) converges to the gamma distribution with parameters \( n \) and \( r \) as \( m \to \infty \).

## Proof

Suppose that \( X_m \) has the geometric distribution on \( \N_+ \) with success parameter \( p_m \). We know from our convergence result in the last section that the distribution of \( X_m / m \) converges to the exponential distribution with rate parameter \( r \) as \( m \to \infty \). It follows that if \( M_m \) denotes the moment generating function of \( X_m / m \), then \( M_m(s) \to r / (r - s) \) as \( m \to \infty \) for \( s \lt r \). But then \( M_m^n \) is the MGF of \( T_{m,n} \big/ m \) and clearly \[ M_m^n(s) \to \left(\frac{r}{r - s}\right)^n \] as \( m \to \infty \) for \( s \lt r \). The expression on the right is the MGF of the gamma distribution with shape parameter \( n \) and rate parameter \( r \).

## Computational Exercises

Suppose that customers arrive at a service station according to the Poisson model, at a rate of \(r = 3\) per hour. Relative to a given starting time, find the probability that the second customer arrives sometime after 1 hour.

## Answer

0.1991

Defects in a type of wire follow the Poisson model, with rate 1 per 100 meter. Find the probability that the 5th defect is located between 450 and 550 meters.

## Answer

0.1746

Suppose that requests to a web server follow the Poisson model with rate \(r = 5\). Relative to a given starting time, compute the mean and standard deviation of the time of the 10th request.

## Answer

2, 0.6325

Suppose that \(Y\) has a gamma distribution with mean 40 and standard deviation 20. Find the shape parameter \(n\) and the rate parameter \(r\).

## Answer

\(r = 1 / 10\), \(n = 4\)

Suppose that accidents at an intersection occur according to the Poisson model, at a rate of 8 per year. Compute the normal approximation to the event that the 10th accident (relative to a given starting time) occurs within 2 years.

## Answer

0.5752

In the gamma experiment, set \(n = 5\) and \(r = 2\). Run the experiment 1000 times and compute the following:

- \(\P(1.5 \le T_t \le 3)\)
- The relative frequency of the event \(\{1.5 \le T_5 \le 3\}\)
- The normal approximation to \(\P(1.5 \le T_5 \le 3)\)

## Answer

- 0.5302
- 0.4871

Suppose that requests to a web server follow the Poisson model. Starting at 12:00 noon on a certain day, the requests are logged. The 100th request comes at 12:15. Estimate the rate of the process.

## Answer

\(r = 6.67\) hits per minute