5.25: The Irwin-Hall Distribution
The Irwin-Hall distribution, named for Joseph Irwin and Philip Hall, is the distribution that governs the sum of independent random variables, each with the standard uniform distribution. It is also known as the uniform sum distribution. Since the standard uniform is one of the simplest and most basic distributions (and corresponds in computer science to a random number), the Irwin-Hall is a natural family of distributions. It also serves as a nice example of the central limit theorem, conceptually easy to understand.
Basic Theory
Definition
Suppose that \( \bs{U} = (U_1, U_2, \ldots) \) is a sequence of independent random variables, each with the uniform distribution on the interval \( [0, 1] \) (the standard uniform distribution). For \( n \in \N_+ \), let \[ X_n = \sum_{i=1}^n U_i \] Then \( X_n \) has the Irwin-Hall distribution of order \( n \).
So \( X_n \) has a continuous distribution on the interval \( [0, n] \) for \( n \in \N_+ \).
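The definition translates directly into code. The following is a minimal sketch in Python, assuming the standard library `random` module as the source of random numbers; the function name `sample_irwin_hall` is just an illustrative choice.

```python
import random

def sample_irwin_hall(n, rng=random):
    """Return one draw of X_n = U_1 + ... + U_n, with each U_i standard uniform."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Sum n independent random numbers in [0, 1)
    return sum(rng.random() for _ in range(n))
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the draws reproducible.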
Distribution Functions
Let \( f \) denote the probability density function of the standard uniform distribution, so that \( f(x) = 1 \) for \( 0 \le x \le 1 \) (and is 0 otherwise). It follows immediately that the probability density function \( f_n \) of \( X_n \) satisfies \( f_n = f^{*n} \), where of course \( f^{*n} \) is the \( n \)-fold convolution power of \( f \). We can compute \( f_2 \) and \( f_3 \) by hand.
The probability density function \( f_2 \) of \( X_2 \) is given by \[ f_2(x) = \begin{cases} x, & 0 \le x \le 1 \\ x - 2 (x - 1), & 1 \le x \le 2 \end{cases} \]
Proof
Note that \( X_2 \) takes values in \( [0, 2] \) and \( f_2(x) = \int_\R f(u) f(x - u) \, du \) for \( x \in [0, 2] \). The integral reduces to \( \int_0^x 1 \, du = x \) for \( 0 \le x \le 1 \) and the integral reduces to \( \int_{x-1}^1 1 \, du = 2 - x \) for \( 1 \le x \le 2 \).
Note that the graph of \( f_2 \) on \( [0, 2] \) consists of two lines, pieced together in a continuous way at \( x = 1 \). The form given above is not the simplest, but makes the continuity clear, and will be helpful when we generalize.
In the special distribution simulator, select the Irwin-Hall distribution and set \( n = 2 \). Note the shape of the probability density function. Run the simulation 1000 times and compare the empirical density function to the probability density function.
The probability density function \( f_3 \) of \( X_3 \) is given by \[ f_3(x) = \begin{cases} \frac{1}{2} x^2, & 0 \le x \le 1 \\ \frac{1}{2} x^2 - \frac{3}{2}(x - 1)^2, & 1 \le x \le 2 \\ \frac{1}{2} x^2 - \frac{3}{2}(x - 1)^2 + \frac{3}{2}(x - 2)^2, & 2 \le x \le 3 \end{cases} \]
Note that the graph of \( f_3 \) on \( [0, 3] \) consists of three parabolas pieced together in a continuous way at \( x = 1 \) and \( x = 2 \). The expressions for \( f_3(x) \) for \( 1 \le x \le 2 \) and \( 2 \le x \le 3 \) can be expanded and simplified, but again the form given above makes the continuity clear, and will be helpful when we generalize.
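For reference, here is what the expansion mentioned above looks like when carried out (a routine simplification, not needed for the general formula): \[ f_3(x) = \begin{cases} \frac{1}{2} x^2, & 0 \le x \le 1 \\ -x^2 + 3 x - \frac{3}{2}, & 1 \le x \le 2 \\ \frac{1}{2} (3 - x)^2, & 2 \le x \le 3 \end{cases} \] The two pieces that meet at \( x = 1 \) both give \( \frac{1}{2} \), as do the two pieces that meet at \( x = 2 \). Replacing \( x \) by \( 3 - x \) swaps the first and third pieces and leaves the middle piece unchanged, so \( f_3 \) is symmetric about \( x = \frac{3}{2} \), where it takes its maximum value \( \frac{3}{4} \).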
In the special distribution simulator, select the Irwin-Hall distribution and set \( n = 3 \). Note the shape of the probability density function. Run the simulation 1000 times and compare the empirical density function to the probability density function.
Naturally, we don't want to perform the convolutions one at a time; we would like a general formula. To state the formula succinctly, we need to recall the floor function: \[ \lfloor x \rfloor = \max\{ n \in \Z: n \le x\}, \quad x \in \R \] so that \( \lfloor x \rfloor = j \) if \( j \in \Z \) and \( j \le x \lt j + 1 \).
For \( n \in \N_+ \), the probability density function \( f_n \) of \( X_n \) is given by \[ f_n(x) = \frac{1}{(n - 1)!} \sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x - k)^{n-1}, \quad x \in \R \]
Proof
Let \( f_n \) denote the function given by the formula above. Clearly \( X_n \) takes values in \( [0, n] \), so first let's note that \( f_n \) gives the correct value outside of this interval. If \( x \lt 0 \), the sum is over an empty index set and hence is 0. Suppose \( x \gt n \). Since \( \binom{n}{k} = 0 \) for \( k \gt n \), we have \[ f_n(x) = \frac{1}{(n - 1)!} \sum_{k=0}^n (-1)^k \binom{n}{k} (x - k)^{n-1}, \quad x \gt n \] Using the binomial theorem, \begin{align*} \sum_{k=0}^n (-1)^k \binom{n}{k} (x - k)^{n-1} & = \sum_{k=0}^n (-1)^k \binom{n}{k} \sum_{j=0}^{n-1} \binom{n - 1}{j} x^j (-k)^{n - 1 - j} \\ & = \sum_{j=0}^{n-1} (-1)^{n - 1 - j} \binom{n - 1}{j} x^j \sum_{k=0}^n (-1)^k \binom{n}{k} k^{n - 1 - j} \end{align*} The second sum in the last expression is 0 for \( j \in \{0, 1, \ldots, n - 1\} \) by the alternating series identity for binomial coefficients, so \( f_n(x) = 0 \) for \( x \gt n \). We will see this identity again.
To show that the formula is correct on \( [0, n] \) we use induction on \( n \). Suppose that \( n = 1 \). If \( 0 \lt x \lt 1 \), then \( \lfloor x \rfloor = 0 \) so \[ f_1(x) = \frac{1}{0!} (-1)^0 \binom{1}{0} x^0 = 1 = f(x) \] Suppose now that the formula is correct for a given \( n \in \N_+ \). We need to show that \( f_n * f = f_{n+1} \). Note that \[ (f_n * f)(x) = \int_\R f_n(y) f (x - y) d y = \int_{x-1}^x f_n(y) dy \] As often with convolutions, we must take cases. Suppose that \( j \le x \lt j + 1 \) where \( j \in \{0, 1, \ldots, n\} \). Then \[ (f_n * f)(x) = \int_{x-1}^x f_n(y) dy = \int_{x-1}^j f_n(y) dy + \int_j^x f_n(y) dy \] Substituting the formula for \( f_n(y) \) and integrating gives \begin{align*} & \int_{x-1}^j f_n(y) dy = \frac{1}{n!} \sum_{k=0}^{j-1} (-1)^k \binom{n}{k}(j - k)^n - \frac{1}{n!} \sum_{k=0}^{j-1} (-1)^k \binom{n}{k}(x - 1 - k)^n \\ & \int_j^x f_n(y) dy = \frac{1}{n!} \sum_{k=0}^j (-1)^k \binom{n}{k} (x - k)^n - \frac{1}{n!} \sum_{k=0}^j (-1)^k \binom{n}{k}(j - k)^n \end{align*} Adding these together, note that the first sum in the first equation cancels the second sum in the second equation (the extra term with \( k = j \) is 0). Re-indexing the second sum in the first equation we have \[ (f_n * f)(x) = \frac{1}{n!}\sum_{k=1}^j (-1)^k \binom{n}{k - 1}(x - k)^n + \frac{1}{n!} \sum_{k=0}^j (-1)^k \binom{n}{k} (x - k)^n \] Finally, using the famous binomial identity \( \binom{n}{k - 1} + \binom{n}{k} = \binom{n+1}{k} \) for \( k \in \{1, 2, \ldots, n\} \) we have \[ (f_n * f)(x) = \frac{1}{n!} \sum_{k=0}^j (-1)^k \binom{n+1}{k} (x - k)^n = f_{n+1}(x) \]
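The general formula is easy to evaluate numerically. Here is a minimal sketch in Python using only the standard library; the name `irwin_hall_pdf` is an illustrative choice, and the explicit return of 0 outside \( [0, n] \) is a convenience that avoids floating-point round-off in the alternating sum.

```python
from math import comb, factorial, floor

def irwin_hall_pdf(x, n):
    """Density f_n(x) of the Irwin-Hall distribution, floor-function form."""
    if x < 0 or x > n:
        return 0.0  # the formula gives 0 here; return it exactly
    # Sum over k = 0, 1, ..., floor(x)
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)
```

For example, `irwin_hall_pdf(1.5, 2)` evaluates \( f_2(1.5) = 1.5 - 2 (0.5) = 0.5 \), in agreement with the hand computation of \( f_2 \). For large \( n \) the alternating sum involves large terms that nearly cancel, so floating-point accuracy degrades; the sketch is intended for moderate \( n \).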
Note that for \( n \in \N_+ \), the graph of \( f_n \) on \( [0, n] \) consists of \( n \) polynomials of degree \( n - 1 \) pieced together in a continuous way. Such a construction is known as a polynomial spline. The points where the polynomials are connected are known as knots. So \( f_n \) is a polynomial spline of degree \( n - 1 \) with knots at \( x \in \{1, 2, \ldots, n - 1\} \). There is another representation of \( f_n \) as a sum. To state this one succinctly, we need to recall the sign function: \[ \sgn(x) = \begin{cases} -1, & x \lt 0 \\ 0, & x = 0 \\ 1, & x \gt 0 \end{cases} \]
For \( n \in \N_+ \), the probability density function \( f_n \) of \( X_n \) is given by \[ f_n(x) = \frac{1}{2 (n - 1)!} \sum_{k=0}^n (-1)^k \binom{n}{k} \sgn(x - k) (x - k)^{n-1}, \quad x \in \R \]
Direct Proof
Let \( g_n \) denote the function defined in the theorem. We will show directly that \( g_n = f_n \), the probability density function given in the previous theorem. Suppose that \( j \le x \lt j + 1 \), so that \( \lfloor x \rfloor = j \). Note that \( \sgn(x - k) = 1 \) for \( k \lt j \) and \( \sgn(x - k) = - 1 \) for \( k \gt j \). Hence \[ g_n(x) = \frac{1}{2(n - 1)!} \sum_{k=0}^j (-1)^k \binom{n}{k} (x - k)^{n-1} - \frac{1}{2(n - 1)!} \sum_{k=j+1}^n (-1)^k \binom{n}{k} (x - k)^{n-1} \] Adding and subtracting a copy of the first term gives \begin{align*} g_n(x) & = \frac{1}{(n - 1)!} \sum_{k=0}^j (-1)^k \binom{n}{k} (x - k)^{n-1} - \frac{1}{2(n - 1)!} \sum_{k=0}^n (-1)^k \binom{n}{k} (x - k)^{n-1}\\ & = f_n(x) - \frac{1}{2(n - 1)!}\sum_{k=0}^n (-1)^k \binom{n}{k} (x - k)^{n-1} \end{align*} The last sum is identically 0, from the proof of the previous theorem.
Proof by induction
For \( n = 1 \) the displayed formula is \[ \frac{1}{2}[\sgn(x) x^0 - \sgn(x - 1) (x - 1)^0] = \frac{1}{2}[\sgn(x) - \sgn(x - 1)] = \begin{cases} 1, & 0 \lt x \lt 1 \\ 0, & \text{otherwise} \end{cases} \] So the formula is correct for \( n = 1 \). Assume now that the formula is correct for \( n \in \N_+ \). Then \begin{align} f_{n+1}(x) & = (f_n * f)(x) = \int_\R \frac{1}{2(n - 1)!} \sum_{k=0}^n (-1)^k \binom{n}{k} \sgn(u - k) (u - k)^{n-1} f(x - u) \, du \\ & = \frac{1}{2(n - 1)!} \sum_{k=0}^n (-1)^k \binom{n}{k} \int_{x-1}^x \sgn(u - k) (u - k)^{n-1} \, du \end{align} But \( \int_{x-1}^x \sgn(u - k) (u - k)^{n-1} \, du = \frac{1}{n}\left[\sgn(x - k) (x - k)^n - \sgn(x - k - 1) (x - k - 1)^n\right] \) for \( k \in \{0, 1, \ldots, n\} \). So substituting and re-indexing one of the sums gives \[ f_{n+1}(x) = \frac{1}{2 n!} \sum_{k=0}^n (-1)^k \binom{n}{k} \sgn(x - k) (x - k)^n + \frac{1}{2 n!} \sum_{k=1}^{n+1} (-1)^k \binom{n}{k-1} \sgn(x - k) (x - k)^n \] Using the famous identity \( \binom{n}{k} + \binom{n}{k-1} = \binom{n + 1}{k} \) for \( k \in \{1, 2, \ldots, n\} \) we finally get \[ f_{n+1}(x) = \frac{1}{2 n!} \sum_{k=0}^{n+1} (-1)^k \binom{n+1}{k} \sgn(x - k) (x - k)^n \] which verifies the formula for \( n + 1 \).
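As a numerical sanity check (not a substitute for either proof), the two forms of the density can be compared at sample points. The sketch below is in Python with the standard library only; the function names are illustrative.

```python
from math import comb, factorial, floor

def sgn(x):
    """Sign function: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def pdf_floor_form(x, n):
    """f_n(x) via the sum over k = 0, ..., floor(x)."""
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def pdf_sign_form(x, n):
    """f_n(x) via the sum over k = 0, ..., n with the sign function."""
    total = sum((-1) ** k * comb(n, k) * sgn(x - k) * (x - k) ** (n - 1)
                for k in range(n + 1))
    return total / (2 * factorial(n - 1))
```

Evaluating both functions at points such as \( x = 0.3, 1.3, 2.3, \ldots \) for small \( n \) should give values that agree up to floating-point round-off.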
Open the special distribution simulator and select the Irwin-Hall distribution. Start with \( n = 1 \) and increase \( n \) successively to the maximum \( n = 10 \). Note the shape of the probability density function. For various values of \( n \), run the simulation 1000 times and compare the empirical density function to the probability density function.
For \( n \in \{2, 3, \ldots\} \), the Irwin-Hall distribution is symmetric and unimodal, with mode at \( n / 2 \).
The distribution function \( F_n \) of \( X_n \) is given by \[ F_n(x) = \frac{1}{n!} \sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x - k)^n, \quad x \in [0, n] \]
Proof
This follows from the first form of the PDF and integration.
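The first form of the distribution function can be coded in the same way as the density. A minimal sketch in Python, with the illustrative name `irwin_hall_cdf`; the values 0 for \( x \le 0 \) and 1 for \( x \ge n \) are returned directly since \( X_n \) takes values in \( [0, n] \).

```python
from math import comb, factorial, floor

def irwin_hall_cdf(x, n):
    """Distribution function F_n(x) of the Irwin-Hall distribution."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    # Sum over k = 0, 1, ..., floor(x)
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(floor(x) + 1))
    return total / factorial(n)
```

For example, `irwin_hall_cdf(1.5, 3)` returns \( \frac{1}{6}\left[(1.5)^3 - 3 (0.5)^3\right] = 0.5 \), as expected from the symmetry of the distribution about \( n / 2 \).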
So \( F_n \) is a polynomial spline of degree \( n \) with knots at \( \{1, 2, \ldots, n - 1\} \). The alternate form of the probability density function leads to an alternate form of the distribution function.
The distribution function \( F_n \) of \( X_n \) is given by \[ F_n(x) = \frac{1}{2} + \frac{1}{2 n!} \sum_{k=0}^n (-1)^k \binom{n}{k} \sgn(x - k) (x - k)^n, \quad x \in [0, n] \]
Proof
The result follows from the second form of the PDF and integration.
The quantile function \( F_n^{-1} \) does not have a simple representation, but of course by symmetry, the median is \( n/2 \).
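Although there is no simple closed form, quantiles are easy to approximate numerically because \( F_n \) is continuous and strictly increasing on \( [0, n] \). The sketch below uses bisection in Python; this is one possible numerical approach (an assumption of this example, not a method from the text), and the function names are illustrative.

```python
from math import comb, factorial, floor

def irwin_hall_cdf(x, n):
    """Distribution function F_n(x), floor-function form."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(floor(x) + 1))
    return total / factorial(n)

def irwin_hall_quantile(p, n, iterations=60):
    """Approximate the quantile of order p in (0, 1) by bisection on [0, n]."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, float(n)
    for _ in range(iterations):  # each step halves the bracketing interval
        mid = (lo + hi) / 2
        if irwin_hall_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For \( n = 2 \), the first quartile solves \( x^2 / 2 = 1 / 4 \), so `irwin_hall_quantile(0.25, 2)` should be close to \( \sqrt{1/2} \approx 0.7071 \).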
Open the special distribution calculator and select the Irwin-Hall distribution. Vary \( n \) from 1 to 10 and note the shape of the distribution function. For each value of \( n \) compute the first and third quartiles.
Moments
The moments of the Irwin-Hall distribution are easy to obtain from the representation as a sum of independent standard uniform variables. Once again, we assume that \( X_n \) has the Irwin-Hall distribution of order \( n \in \N_+ \).
The mean and variance of \( X_n \) are
- \( \E(X_n) = n / 2 \)
- \( \var(X_n) = n / 12 \)
Proof
This follows immediately from the representation \( X_n = \sum_{i=1}^n U_i \) where \( \bs U = (U_1, U_2, \ldots) \) is a sequence of independent, standard uniform variables, since \( \E(U_i) = 1/2 \) and \( \var(U_i) = 1/12 \).
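The mean and variance can be checked empirically by simulation. Here is a minimal sketch in Python, assuming the standard library `random` and `statistics` modules; the function name `simulate_moments` and the seed parameter are illustrative.

```python
import random
from statistics import fmean, pvariance

def simulate_moments(n, runs, seed=None):
    """Simulate X_n `runs` times; return the empirical mean and variance."""
    rng = random.Random(seed)
    draws = [sum(rng.random() for _ in range(n)) for _ in range(runs)]
    return fmean(draws), pvariance(draws)
```

With \( n = 5 \) and a large number of runs, the two returned values should be close to \( 5 / 2 \) and \( 5 / 12 \), respectively.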
Open the special distribution simulator and select the Irwin-Hall distribution. Vary \( n \) and note the shape and location of the mean \( \pm \) standard deviation bar. For selected values of \( n \) run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.
The skewness and kurtosis of \( X_n \) are
- \( \skw(X_n) = 0 \)
- \( \kur(X_n) = 3 - \frac{6}{5 n} \)
Proof
The fact that the skewness is 0 follows immediately from the symmetry of the distribution (once we know that \( X_n \) has moments of all orders). The kurtosis result follows from the usual formula and the moments of the standard uniform distribution.
Note that \( \kur(X_n) \to 3 \), the kurtosis of the normal distribution, as \( n \to \infty \). That is, the excess kurtosis \( \kur(X_n) - 3 \to 0 \) as \( n \to \infty \).
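The kurtosis formula can be verified with exact rational arithmetic. The sketch below, in Python, relies on two standard facts that are assumptions of this example rather than statements from the text: the standard uniform distribution has fourth central moment \( 1 / 80 \), and a sum of \( n \) independent, identically distributed variables with variance \( \sigma^2 \) and fourth central moment \( \mu_4 \) has fourth central moment \( n \mu_4 + 3 n (n - 1) \sigma^4 \).

```python
from fractions import Fraction

def irwin_hall_kurtosis(n):
    """Exact kurtosis of X_n from the central moments of the standard uniform."""
    sigma2 = Fraction(1, 12)  # variance of the standard uniform
    mu4 = Fraction(1, 80)     # fourth central moment of the standard uniform
    # Fourth central moment of a sum of n i.i.d. variables
    fourth = n * mu4 + 3 * n * (n - 1) * sigma2 ** 2
    return fourth / (n * sigma2) ** 2
```

For each \( n \), the value returned should equal `3 - Fraction(6, 5 * n)` exactly; for \( n = 1 \) this is \( 9 / 5 \), the kurtosis of the standard uniform distribution.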
Open the special distribution simulator and select the Irwin-Hall distribution. Vary \( n \) and note the shape of the probability density function in light of the previous results on skewness and kurtosis. For selected values of \( n \) run the simulation 1000 times and compare the empirical density function, mean, and standard deviation to their distributional counterparts.
The moment generating function \( M_n \) of \( X_n \) is given by \( M_n(0) = 1 \) and \[ M_n(t) = \left(\frac{e^t - 1}{t}\right)^n, \quad t \in \R \setminus\{0\} \]
Proof
This follows immediately from the representation \( X_n = \sum_{i=1}^n U_i \) where \( \bs{U} = (U_1, U_2, \ldots) \) is a sequence of independent standard uniform variables. Recall that the standard uniform distribution has MGF \( t \mapsto (e^t - 1) / t \), and the MGF of a sum of independent variables is the product of the MGFs.
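The moment generating function is simple to evaluate. In the Python sketch below, the use of `math.expm1` (which computes \( e^t - 1 \) accurately for small \( t \)) is a numerical design choice of this example; the function name is illustrative.

```python
from math import expm1

def irwin_hall_mgf(t, n):
    """Moment generating function M_n(t) of the Irwin-Hall distribution."""
    if t == 0:
        return 1.0
    # expm1(t) = e^t - 1, computed without cancellation near t = 0
    return (expm1(t) / t) ** n
```

As a check, a central difference quotient such as `(irwin_hall_mgf(h, n) - irwin_hall_mgf(-h, n)) / (2 * h)` with small `h` approximates \( M_n^\prime(0) = \E(X_n) = n / 2 \).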
Related Distributions
The most important connection is to the standard uniform distribution in the definition: The Irwin-Hall distribution of order \( n \in \N_+ \) is the distribution of the sum of \( n \) independent variables, each with the standard uniform distribution. The Irwin-Hall distribution of order 2 is also a triangle distribution:
The Irwin-Hall distribution of order 2 is the triangle distribution with location parameter 0, scale parameter 2, and shape parameter \( \frac{1}{2} \).
Proof
This follows immediately from the PDF \( f_2 \).
The Irwin-Hall distribution is connected to the normal distribution via the central limit theorem.
Suppose that \( X_n \) has the Irwin-Hall distribution of order \( n \) for each \( n \in \N_+ \). Then the distribution of \[ Z_n = \frac{X_n - n/2}{\sqrt{n/12}} \] converges to the standard normal distribution as \( n \to \infty \).
Proof
This follows immediately from the central limit theorem, since \( X_n = \sum_{i=1}^n U_i \) where \( (U_1, U_2, \ldots) \) is a sequence of independent variables, each with the standard uniform distribution. Note that \( Z_n \) is the standard score of \( X_n \).
Thus, if \( n \) is large, \( X_n \) has approximately a normal distribution with mean \( n/2 \) and variance \( n/12 \).
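To get a feel for the quality of the normal approximation, the exact distribution function can be compared with the approximating normal distribution function. The following Python sketch uses only the standard library, with the normal distribution function written in terms of `math.erf`; the function names are illustrative.

```python
from math import comb, erf, factorial, floor, sqrt

def irwin_hall_cdf(x, n):
    """Exact distribution function F_n(x), floor-function form."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n
                for k in range(floor(x) + 1))
    return total / factorial(n)

def normal_approx_cdf(x, n):
    """Normal approximation to F_n(x), with mean n/2 and variance n/12."""
    z = (x - n / 2) / sqrt(n / 12)  # standard score of x
    return 0.5 * (1 + erf(z / sqrt(2)))
```

Comparing `irwin_hall_cdf(x, 12)` with `normal_approx_cdf(x, 12)` over a grid of points in \( [0, 12] \) gives a rough picture of the approximation error for \( n = 12 \), a case where the variance is exactly 1.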
Open the special distribution simulator and select the Irwin-Hall distribution. Start with \( n = 1 \) and increase \( n \) successively to the maximum \( n = 10 \). Note how the probability density function becomes more normal as \( n \) increases. For various values of \( n \), run the simulation 1000 times and compare the empirical density function to the probability density function.
The Irwin-Hall distribution of order \( n \) is trivial to simulate, as the sum of \( n \) random numbers. Since the probability density function is bounded on a bounded support interval, the distribution can also be simulated via the rejection method. Computationally, this is a dumb thing to do, of course, but it can still be a fun exercise.
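For the exercise, here is a minimal sketch of the rejection method in Python. It assumes the bounding rectangle \( [0, n] \times [0, f_n(n/2)] \), which is valid because the density is maximized at \( n / 2 \); the function names are illustrative, and the sketch is meant for moderate \( n \), where the floating-point evaluation of \( f_n \) is reliable.

```python
import random
from math import comb, factorial, floor

def irwin_hall_pdf(x, n):
    """Density f_n(x), floor-function form; 0 outside [0, n]."""
    if x < 0 or x > n:
        return 0.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def rejection_sample(n, rng):
    """Draw one value from the Irwin-Hall distribution by rejection."""
    bound = irwin_hall_pdf(n / 2, n)  # maximum height of the density
    while True:
        x = rng.uniform(0, n)      # candidate point on the support
        y = rng.uniform(0, bound)  # uniform height under the bounding box
        if y <= irwin_hall_pdf(x, n):
            return x               # accept points under the density curve
```

A call such as `rejection_sample(3, random.Random(2))` returns a single draw; the expected acceptance rate is \( 1 / [n f_n(n/2)] \), which is \( 4 / 9 \) when \( n = 3 \).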
Open the rejection method experiment and select the Irwin-Hall distribution. For various values of \( n \), run the simulation 2000 times. Compare the empirical density function, mean, and standard deviation to their distributional counterparts.