# 5.30: The Extreme Value Distribution

Extreme value distributions arise as limiting distributions for maximums or minimums (extreme values) of a sample of independent, identically distributed random variables, as the sample size increases. Thus, these distributions are important in probability and mathematical statistics.

## The Standard Distribution for Maximums

### Distribution Functions

The standard extreme value distribution (for maximums) is a continuous distribution on \(\R\) with distribution function \( G \) given by \[ G(v) = \exp\left(-e^{-v}\right), \quad v \in \R \]

## Proof

Note that \( G \) is continuous, increasing, and satisfies \( G(v) \to 0 \) as \( v \to -\infty \) and \( G(v) \to 1 \) as \( v \to \infty \).

The distribution is also known as the standard Gumbel distribution in honor of Emil Gumbel. As we will show below, it arises as the limit of the maximum of \(n\) independent random variables, each with the standard exponential distribution (when this maximum is appropriately centered). This fact is the main reason that the distribution is *special*, and is the reason for the name. For the remainder of this discussion, suppose that random variable \( V \) has the standard Gumbel distribution.

The probability density function \( g \) of \( V \) is given by \[ g(v) = e^{-v} \exp\left(-e^{-v}\right) = \exp\left[-\left(e^{-v} + v\right)\right], \quad v \in \R \]

- \(g\) increases and then decreases with mode \( v = 0 \)
- \(g\) is concave upward, then downward, then upward again, with inflection points at \( v = \ln\left[\left(3 \pm \sqrt{5}\right) \big/ 2\right] \approx \pm 0.9624\).

## Proof

These results follow from standard calculus. The PDF is \( g = G^\prime \).

- The first derivative of \( g \) satisfies \(g^\prime(v) = g(v)\left(e^{-v} - 1\right)\) for \( v \in \R \).
- The second derivative of \( g \) satisfies \( g^{\prime \prime}(v) = g(v) \left(e^{-2 v} - 3 e^{-v} + 1\right)\) for \( v \in \R \).
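These derivative formulas are easy to check numerically. The following Python sketch (with illustrative function names `g` and `g2`, not from the text) verifies that the mode is at \( v = 0 \) and that the second derivative vanishes at the inflection points:

```python
import math

def g(v):
    # standard Gumbel PDF: g(v) = exp(-(e^{-v} + v))
    return math.exp(-(math.exp(-v) + v))

def g2(v):
    # second derivative: g''(v) = g(v) * (e^{-2v} - 3 e^{-v} + 1)
    return g(v) * (math.exp(-2 * v) - 3 * math.exp(-v) + 1)

inflect = math.log((3 + math.sqrt(5)) / 2)  # ≈ 0.9624

print(g(0.0))                      # maximum value e^{-1} ≈ 0.3679
print(g2(inflect), g2(-inflect))   # both ≈ 0
```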

In the special distribution simulator, select the extreme value distribution. Keep the default parameter values and note the shape and location of the probability density function. In particular, note the lack of symmetry. Run the simulation 1000 times and compare the empirical density function to the probability density function.

The quantile function \( G^{-1} \) of \( V \) is given by \[ G^{-1}(p) = -\ln[-\ln(p)], \quad p \in (0, 1) \]

- The first quartile is \(-\ln(\ln 4) \approx -0.3266\).
- The median is \(-\ln(\ln 2) \approx 0.3665\).
- The third quartile is \(-\ln(\ln 4 - \ln 3) \approx 1.2459\).
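The quartiles above can be computed directly from the quantile function. A minimal Python sketch (the function name `gumbel_quantile` is illustrative):

```python
import math

def gumbel_quantile(p):
    # standard Gumbel quantile function: G^{-1}(p) = -ln(-ln p), 0 < p < 1
    return -math.log(-math.log(p))

print(gumbel_quantile(0.25))  # first quartile ≈ -0.3266
print(gumbel_quantile(0.50))  # median ≈ 0.3665
print(gumbel_quantile(0.75))  # third quartile ≈ 1.2459
```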

## Proof

The formula for \( G^{-1} \) follows from solving \( p = G(v) \) for \( v \) in terms of \( p \).

In the special distribution calculator, select the extreme value distribution. Keep the default parameter values and note the shape and location of the probability density and distribution functions. Compute the quantiles of order 0.1, 0.3, 0.6, and 0.9.

### Moments

Suppose again that \( V \) has the standard Gumbel distribution. The moment generating function of \( V \) has a simple expression in terms of the gamma function \( \Gamma \).

The moment generating function \( m \) of \( V \) is given by \[ m(t) = \E\left(e^{t V}\right) = \Gamma(1 - t), \quad t \in (-\infty, 1) \]

## Proof

Note that \[ m(t) = \int_{-\infty}^\infty e^{t v} \exp\left(-e^{-v}\right) e^{-v} dv \] The substitution \( x = e^{-v} \), \( dx = -e^{-v} dv \) gives \(m(t) = \int_0^\infty x^{-t} e^{-x} dx = \Gamma(1 - t)\) for \(t \in (-\infty, 1)\).

Next we give the mean and variance. First, recall that the Euler constant, named for Leonhard Euler, is defined by \[ \gamma = -\Gamma^\prime(1) = -\int_0^\infty e^{-x} \ln x \, dx \approx 0.5772156649 \]

The mean and variance of \( V \) are

- \(\E(V) = \gamma\)
- \(\var(V) = \frac{\pi^2}{6}\)

## Proof

These results follow from the moment generating function.

- \( m^\prime(t) = -\Gamma^\prime(1 - t) \) and so \( \E(V) = m^\prime(0) = - \Gamma^\prime(1) = \gamma \).
- \( m^{\prime \prime}(t) = \Gamma^{\prime \prime}(1 - t) \) and \[ \E(V^2) = m^{\prime \prime}(0) = \Gamma^{\prime \prime}(1) = \int_0^\infty (\ln x)^2 e^{-x} dx = \gamma^2 + \frac{\pi^2}{6} \] Hence \( \var(V) = \E(V^2) - [\E(V)]^2 = \frac{\pi^2}{6} \)
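Since \( m(t) = \Gamma(1 - t) \), the mean and variance can be recovered numerically from the MGF by finite differences. A rough Python sketch, assuming `math.gamma` as the implementation of \( \Gamma \) (the step size `h` is an illustrative choice):

```python
import math

def m(t):
    # MGF of the standard Gumbel distribution: m(t) = Gamma(1 - t), t < 1
    return math.gamma(1 - t)

h = 1e-5
mean = (m(h) - m(-h)) / (2 * h)             # m'(0) = gamma ≈ 0.5772
second = (m(h) - 2 * m(0) + m(-h)) / h**2   # m''(0) = gamma^2 + pi^2/6
var = second - mean**2                      # ≈ pi^2/6 ≈ 1.6449

print(mean, var)
```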

In the special distribution simulator, select the extreme value distribution and keep the default parameter values. Note the shape and location of the mean \( \pm \) standard deviation bar. Run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.

Next we give the skewness and kurtosis of \( V \). The skewness involves a value of the Riemann zeta function \( \zeta \), named of course for Bernhard Riemann. Recall that \( \zeta \) is defined by \[ \zeta(n) = \sum_{k=1}^\infty \frac{1}{k^n}, \quad n \gt 1 \]

The skewness and kurtosis of \( V \) are

- \( \skw(V) = 12 \sqrt{6} \zeta(3) \big/ \pi^3 \approx 1.13955 \)
- \( \kur(V) = \frac{27}{5} \)

The particular value of the zeta function, \( \zeta(3) \), is known as Apéry's constant. From the kurtosis, it follows that the excess kurtosis is \( \kur(V) - 3 = \frac{12}{5} \).

### Related Distributions

The standard Gumbel distribution has the usual connections to the standard uniform distribution by means of the distribution function and quantile function given above. Recall that the standard uniform distribution is the continuous uniform distribution on the interval \( (0, 1) \).

The standard Gumbel and standard uniform distributions are related as follows:

- If \( U \) has the standard uniform distribution then \( V = G^{-1}(U) = -\ln(-\ln U) \) has the standard Gumbel distribution.
- If \( V \) has the standard Gumbel distribution then \( U = G(V) = \exp\left(-e^{-V}\right) \) has the standard uniform distribution.

So we can simulate the standard Gumbel distribution using the usual random quantile method.
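The random quantile method is one line of code. Here is a minimal Python sketch, using the standard library only (the seed and sample size are arbitrary choices):

```python
import math
import random
import statistics

random.seed(17)

# random quantile method: V = -ln(-ln U) with U standard uniform
sample = [-math.log(-math.log(random.random())) for _ in range(100_000)]

print(statistics.mean(sample))      # ≈ Euler's constant, 0.5772
print(statistics.variance(sample))  # ≈ pi^2 / 6 ≈ 1.6449
```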

Open the random quantile experiment and select the extreme value distribution. Keep the default parameter values and note again the shape and location of the probability density and distribution functions. Run the simulation 1000 times and compare the empirical density function, mean, and standard deviation to their distributional counterparts.

The standard Gumbel distribution also has simple connections with the standard exponential distribution (the exponential distribution with rate parameter 1).

The standard Gumbel and standard exponential distributions are related as follows:

- If \(X\) has the standard exponential distribution then \(V = -\ln X\) has the standard Gumbel distribution.
- If \(V\) has the standard Gumbel distribution then \(X = e^{-V}\) has the standard exponential distribution.

## Proof

These results follow from the usual change of variables theorem. The transformations are \( v = -\ln x \) and \( x = e^{-v} \) for \( x \in (0, \infty) \) and \( v \in \R \), and these are inverses of each other. Let \( f \) and \( g \) denote PDFs of \( X \) and \( V \) respectively.

- We start with \( f(x) = e^{-x} \) for \( x \in (0, \infty) \) and then \[ g(v) = f(x) \left|\frac{dx}{dv}\right| = \exp\left(-e^{-v}\right) e^{-v}, \quad v \in \R \] so \( V \) has the standard Gumbel distribution.
- We start with \( g(v) = \exp\left(-e^{-v}\right) e^{-v} \) for \( v \in \R \) and then \[ f(x) = g(v) \left|\frac{dv}{dx}\right| = \exp\left[-\exp(\ln x)\right] \exp(\ln x) \frac{1}{x} = e^{-x}, \quad x \in (0, \infty) \] so \( X \) has the standard exponential distribution.

As noted in the introduction, the following theorem provides the motivation for the name *extreme value distribution*.

Suppose that \( (X_1, X_2, \ldots) \) is a sequence of independent random variables, each with the standard exponential distribution. The distribution of \(Y_n = \max\{X_1, X_2, \ldots, X_n\} - \ln n \) converges to the standard Gumbel distribution as \( n \to \infty \).

## Proof

Let \( X_{(n)} = \max\{X_1, X_2, \ldots, X_n\} \), so that \( X_{(n)} \) is the \( n \)th order statistic of the random sample \( (X_1, X_2, \ldots, X_n) \). Let \( H \) denote the standard exponential CDF, so that \( H(x) = 1 - e^{-x} \) for \( x \in [0, \infty) \). Since the variables are independent, \( X_{(n)} \) has CDF \( H^n \). Let \( F_n \) denote the CDF of \( Y_n \). For \( x \in \R \) \[ F_n(x) = \P(Y_n \le x) = \P\left[X_{(n)} \le x + \ln n\right] = H^n(x + \ln n) = \left[1 - e^{-(x + \ln n)}\right]^n = \left(1 - \frac{e^{-x}}{n} \right)^n \] By a famous limit from calculus, \( F_n(x) \to e^{-e^{-x}} \) as \( n \to \infty \).
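The rate of this convergence is easy to see numerically. The sketch below (illustrative, not from the text) compares the exact CDF of the centered maximum to the Gumbel CDF over a grid; note that the closed form for \( F_n \) is valid for \( x \gt -\ln n \), which the grid respects for the values of \( n \) used:

```python
import math

def F_n(x, n):
    # exact CDF of max(X_1, ..., X_n) - ln n, X_i standard exponential
    # (valid for x > -ln n; the grid below stays in that range)
    return (1 - math.exp(-x) / n) ** n

def G(x):
    # standard Gumbel CDF
    return math.exp(-math.exp(-x))

grid = [i / 10 for i in range(-20, 51)]  # x from -2.0 to 5.0
gaps = {n: max(abs(F_n(x, n) - G(x)) for x in grid) for n in (10, 100, 10_000)}
print(gaps)  # maximum gap shrinks roughly like 1/n
```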

## The General Extreme Value Distribution

As with many other distributions we have studied, the standard extreme value distribution can be generalized by applying a linear transformation to the standard variable. First, if \( V \) has the standard Gumbel distribution (the standard extreme value distribution for maximums), then \( -V \) has the standard extreme value distribution for minimums. Here is the general definition.

Suppose that \(V\) has the standard Gumbel distribution, and that \( a, \, b \in \R \) with \( b \ne 0 \). Then \( X = a + b V \) has the extreme value distribution with location parameter \( a \) and scale parameter \( |b| \).

- If \( b \gt 0 \), then the distribution corresponds to maximums.
- If \( b \lt 0 \), then the distribution corresponds to minimums.

So the family of distributions with \( a \in \R \) and \( b \in (0, \infty) \) is the location-scale family associated with the standard distribution for maximums, and the family of distributions with \( a \in \R \) and \( b \in (-\infty, 0) \) is the location-scale family associated with the standard distribution for minimums. The distributions are also referred to more simply as Gumbel distributions rather than extreme value distributions. The web apps in this project use only the extreme value distributions for maximums. As you will see below, the differences between the distribution for maximums and the distribution for minimums are minor. For the remainder of this discussion, suppose that \( X \) has the form given in the definition.

### Distribution Functions

Let \( F \) denote the distribution function of \( X \).

- If \( b \gt 0 \) then \[ F(x) = \exp\left[-\exp\left(-\frac{x - a}{b}\right)\right], \quad x \in \R \]
- If \( b \lt 0 \) then \[ F(x) = 1 - \exp\left[-\exp\left(-\frac{x - a}{b}\right)\right], \quad x \in \R \]

## Proof

Let \( G \) denote the CDF of \( V \). Then

- \( F(x) = G\left(\frac{x - a}{b}\right) \) for \( x \in \R \)
- \( F(x) = 1 - G\left(\frac{x - a}{b}\right) \) for \( x \in \R \)

Let \( f \) denote the probability density function of \( X \). Then \[ f(x) = \frac{1}{|b|} \exp\left(-\frac{x - a}{b}\right) \exp\left[-\exp\left(-\frac{x - a}{b}\right)\right], \quad x \in \R \]

## Proof

Let \( g \) denote the PDF of \( V \). By the change of variables formula, \[ f(x) = \frac{1}{|b|} g\left(\frac{x - a}{b}\right), \quad x \in \R\]

Open the special distribution simulator and select the extreme value distribution. Vary the parameters and note the shape and location of the probability density function. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function.

The quantile function \( F^{-1} \) of \( X \) is given as follows

- If \( b \gt 0 \) then \(F^{-1}(p) = a - b \ln(-\ln p)\) for \(p \in (0, 1)\).
- If \( b \lt 0 \) then \( F^{-1}(p) = a - b \ln[-\ln(1 - p)],\) for \(p \in (0, 1) \)

## Proof

Let \( G^{-1} \) denote the quantile function of \( V \). Then

- \( F^{-1}(p) = a + b G^{-1}(p) \) for \( p \in (0, 1) \).
- \( F^{-1}(p) = a - b G^{-1}(1 - p) \) for \( p \in (0, 1) \).
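The two cases of the quantile function can be combined into a single helper; the function name below is illustrative, not part of the text:

```python
import math

def ev_quantile(p, a, b):
    # quantile function of the extreme value distribution with
    # location a and scale |b|; the sign of b selects max vs min
    if b > 0:
        return a - b * math.log(-math.log(p))      # maximums
    return a - b * math.log(-math.log(1 - p))      # minimums

print(ev_quantile(0.5, 0.0, 1.0))   # median for maximums ≈ 0.3665
print(ev_quantile(0.5, 0.0, -1.0))  # median for minimums ≈ -0.3665
```

Note the symmetry: with \( a = 0 \), the quantile of order \( p \) for minimums is the negative of the quantile of order \( 1 - p \) for maximums.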

Open the special distribution calculator and select the extreme value distribution. Vary the parameters and note the shape and location of the probability density and distribution functions. For selected values of the parameters, compute a few values of the quantile function and the distribution function.

### Moments

Suppose again that \( X = a + b V \) where \( V \) has the standard Gumbel distribution, and that \( a, \, b \in \R \) with \( b \ne 0 \).

The moment generating function \( M \) of \( X \) is given by \(M(t) = e^{a t} \Gamma(1 - b t)\).

- With domain \( t \in (-\infty, 1 / b) \) if \( b \gt 0 \)
- With domain \( t \in (1 / b, \infty) \) if \( b \lt 0 \)

## Proof

Let \( m \) denote the MGF of \( V \). Then \( M(t) = e^{a t} m(b t) \) for \( b t \lt 1 \)

The mean and variance of \( X \) are

- \(\E(X) = a + b \gamma\)
- \(\var(X) = b^2 \frac{\pi^2}{6}\)

## Proof

These results follow from the mean and variance of \( V \) and basic properties of expected value and variance.

- \( \E(X) = a + b \E(V) \)
- \( \var(X) = b^2 \var(V) \)

Open the special distribution simulator and select the extreme value distribution. Vary the parameters and note the size and location of the mean \( \pm \) standard deviation bar. For selected values of the parameters, run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.

The skewness of \( X \) is

- \( \skw(X) = 12 \sqrt{6} \zeta(3) \big/ \pi^3 \approx 1.13955 \) if \( b \gt 0 \).
- \( \skw(X) = -12 \sqrt{6} \zeta(3) \big/ \pi^3 \approx -1.13955 \) if \( b \lt 0 \)

## Proof

Recall that skewness is defined in terms of the standard score, and hence is invariant under linear transformations with positive slope. A linear transformation with negative slope changes the sign of the skewness. Hence these results follow from the skewness of \( V \).

The kurtosis of \( X \) is \( \kur(X) = \frac{27}{5} \)

## Proof

Recall that kurtosis is defined in terms of the standard score and is invariant under linear transformations with nonzero slope. Hence this result follows from the kurtosis of \( V \).

Once again, the excess kurtosis is \( \kur(X) - 3 = \frac{12}{5} \).

### Related Distributions

Since the general extreme value distributions are location-scale families, they are trivially closed under linear transformations of the underlying variables (with nonzero slope).

Suppose that \( X \) has the extreme value distribution with parameters \( a, \, b \) with \( b \ne 0 \) and that \( c, \, d \in \R \) with \( d \ne 0 \). Then \( Y = c + d X \) has the extreme value distribution with parameters \( a d + c \) and \( b d \).

## Proof

By definition, we can write \( X = a + b V \) where \( V \) has the standard Gumbel distribution. Hence \( Y = c + d X = (ad + c) + (b d) V \).

Note if \( d \gt 0 \) then \( X \) and \( Y \) have the same association (max, max) or (min, min). If \( d \lt 0 \) then \( X \) and \( Y \) have opposite associations (max, min) or (min, max).

As with the standard Gumbel distribution, the general Gumbel distribution has the usual connections with the standard uniform distribution by means of the distribution and quantile functions. Since the quantile function has a simple closed form, the latter connection leads to the usual random quantile method of simulation. We state the result for maximums.

Suppose that \( a, \, b \in \R \) with \( b \ne 0 \). Let \( F \) denote the distribution function and \( F^{-1} \) the quantile function given above.

- If \( U \) has the standard uniform distribution then \( X = F^{-1}(U) \) has the extreme value distribution with parameters \( a \) and \( b \).
- If \( X \) has the extreme value distribution with parameters \( a \) and \( b \) then \( U = F(X) \) has the standard uniform distribution.

Open the random quantile experiment and select the extreme value distribution. Vary the parameters and note again the shape and location of the probability density and distribution functions. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function, mean, and standard deviation to their distributional counterparts.

The extreme value distribution for maximums has a simple connection to the Weibull distribution, and this generalizes the connection between the standard Gumbel and exponential distributions above. There is a similar result for the extreme value distribution for minimums.

The extreme value and Weibull distributions are related as follows:

- If \(X\) has the extreme value distribution with parameters \(a \in \R\) and \(b \in (0, \infty)\), then \(Y = e^{-X}\) has the Weibull distribution with shape parameter \(\frac{1}{b}\) and scale parameter \(e^{-a}\).
- If \(Y\) has the Weibull distribution with shape parameter \(k \in (0, \infty)\) and scale parameter \(b \in (0, \infty)\) then \(X = -\ln Y\) has the extreme value distribution with parameters \(-\ln b\) and \(\frac{1}{k}\).

## Proof

As before, these results can be obtained using the change of variables theorem for probability density functions. We give an alternate proof using special forms of the random variables.

- We can write \( X = a + b V \) where \( V \) has the standard Gumbel distribution. Hence \[ Y = e^{-X} = e^{-a} \left(e^{-V}\right)^b \] As shown above, \( e^{-V} \) has the standard exponential distribution and therefore \( Y \) has the Weibull distribution with shape parameter \( 1/b \) and scale parameter \( e^{-a} \).
- We can write \( Y = b U^{1/k} \) where \( U \) has the standard exponential distribution. Hence \[ X = -\ln Y = -\ln b + \frac{1}{k}(-\ln U) \] As shown above, \( -\ln U \) has the standard Gumbel distribution and hence \( X \) has the Gumbel distribution with location parameter \( -\ln b \) and scale parameter \( \frac{1}{k} \).
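The Weibull connection can be checked by simulation. The sketch below (with arbitrary illustrative parameter values \( k = 2 \), \( b = 3 \)) simulates Weibull variates from standard exponentials, applies \( X = -\ln Y \), and compares the sample mean and variance to the Gumbel values \( -\ln b + \gamma / k \) and \( \pi^2 / (6 k^2) \):

```python
import math
import random
import statistics

random.seed(29)
k, b = 2.0, 3.0  # illustrative Weibull shape and scale

xs = []
for _ in range(100_000):
    u = random.expovariate(1.0)  # standard exponential
    y = b * u ** (1 / k)         # Weibull with shape k, scale b
    xs.append(-math.log(y))      # should be Gumbel(-ln b, 1/k)

gamma = 0.5772156649
print(statistics.mean(xs))      # ≈ -ln 3 + gamma/2 ≈ -0.810
print(statistics.variance(xs))  # ≈ pi^2 / 24 ≈ 0.4112
```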