7.1: Sums of Discrete Random Variables

Last updated
Save as PDF

Page ID: 3149

Charles M. Grinstead & J. Laurie Snell
Swarthmore College and Dartmouth College via American Mathematical Society

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

In this chapter we turn to the important question of determining the distribution of a sum of independent random variables in terms of the distributions of the individual constituents. In this section we consider only sums of discrete random variables, reserving the case of continuous random variables for the next section.

We consider here only random variables whose values are integers. Their distribution functions are then defined on these integers. We shall find it convenient to assume here that these distribution functions are defined for all integers, by defining them to be 0 where they are not otherwise defined.

Convolutions

Suppose \(X\) and \(Y\) are two independent discrete random variables with distribution functions \(m_{1}(x)\) and \(m_{2}(x)\). Let \(Z=X+Y\). We would like to determine the distribution function \(m_{3}(x)\) of \(Z\). To do this, it is enough to determine the probability that \(Z\) takes on the value \(z\), where \(z\) is an arbitrary integer. Suppose that \(X=k\), where \(k\) is some integer. Then \(Z=z\) if and only if \(Y=z-k\). So the event \(Z=z\) is the union of the pairwise disjoint events

\[
(X=k) \text { and }(Y=z-k),
\]

where \(k\) runs over the integers. Since these events are pairwise disjoint, we have

\[
P(Z=z)=\sum_{k=-\infty}^{\infty} P(X=k) \cdot P(Y=z-k)
\]

Thus, we have found the distribution function of the random variable \(Z\). This leads to the following definition.

Definition: Term

Let \(X\) and \(Y\) be two independent integer-valued random variables, with distribution functions \(m_{1}(x)\) and \(m_{2}(x)\) respectively. Then the convolution of \(m_{1}(x)\) and \(m_{2}(x)\) is the distribution function \(m_{3}=m_{1} * m_{2}\) given by

\[
m_{3}(j)=\sum_{k} m_{1}(k) \cdot m_{2}(j-k)
\]

for \(j=\ldots,-2,-1,0,1,2, \ldots\). The function \(m_{3}(x)\) is the distribution function of the random variable \(Z=X+Y\).

It is easy to see that the convolution operation is commutative, and it is straightforward to show that it is also associative.

Now let \(S_{n}=X_{1}+X_{2}+\cdots+X_{n}\) be the sum of \(n\) independent random variables of an independent trials process with common distribution function \(m\) defined on the integers. Then the distribution function of \(S_{1}\) is \(m\). We can write

\[
S_{n}=S_{n-1}+X_{n} .
\]

Thus, since we know the distribution function of \(X_{n}\) is \(m\), we can find the distribution function of \(S_{n}\) by induction.

Example \(\PageIndex{1}\)

A die is rolled twice. Let \(X_{1}\) and \(X_{2}\) be the outcomes, and let \(S_{2}=X_{1}+X_{2}\) be the sum of these outcomes. Then \(X_{1}\) and \(X_{2}\) have the common distribution function:

\[
m=\left(\begin{array}{cccccc}
1 & 2 & 3 & 4 & 5 & 6 \\
1 / 6 & 1 / 6 & 1 / 6 & 1 / 6 & 1 / 6 & 1 / 6
\end{array}\right) .
\]

The distribution function of \(S_{2}\) is then the convolution of this distribution with itself. Thus,

\[
\begin{aligned}
P\left(S_{2}=2\right) & =m(1) m(1) \\
& =\frac{1}{6} \cdot \frac{1}{6}=\frac{1}{36}, \\
P\left(S_{2}=3\right) & =m(1) m(2)+m(2) m(1) \\
& =\frac{1}{6} \cdot \frac{1}{6}+\frac{1}{6} \cdot \frac{1}{6}=\frac{2}{36}, \\
P\left(S_{2}=4\right) & =m(1) m(3)+m(2) m(2)+m(3) m(1) \\
& =\frac{1}{6} \cdot \frac{1}{6}+\frac{1}{6} \cdot \frac{1}{6}+\frac{1}{6} \cdot \frac{1}{6}=\frac{3}{36} .
\end{aligned}
\]

Continuing in this way we would find \(P\left(S_{2}=5\right)=4 / 36, P\left(S_{2}=6\right)=5 / 36\), \(P\left(S_{2}=7\right)=6 / 36, P\left(S_{2}=8\right)=5 / 36, P\left(S_{2}=9\right)=4 / 36, P\left(S_{2}=10\right)=3 / 36\), \(P\left(S_{2}=11\right)=2 / 36\), and \(P\left(S_{2}=12\right)=1 / 36\).

The distribution for \(S_{3}\) would then be the convolution of the distribution for \(S_{2}\) with the distribution for \(X_{3}\). Thus

\[
P\left(S_{3}=3\right)=P\left(S_{2}=2\right) P\left(X_{3}=1\right)
\]

\[
\begin{aligned}
& =\frac{1}{36} \cdot \frac{1}{6}=\frac{1}{216}, \\
P\left(S_{3}=4\right) & =P\left(S_{2}=3\right) P\left(X_{3}=1\right)+P\left(S_{2}=2\right) P\left(X_{3}=2\right) \\
& =\frac{2}{36} \cdot \frac{1}{6}+\frac{1}{36} \cdot \frac{1}{6}=\frac{3}{216},
\end{aligned}
\]

and so forth.

This is clearly a tedious job, and a program should be written to carry out this calculation. To do this we first write a program to form the convolution of two densities \(p\) and \(q\) and return the density \(r\). We can then write a program to find the density for the sum \(S_{n}\) of \(n\) independent random variables with a common density \(p\), at least in the case that the random variables have a finite number of possible values.

Running this program for the example of rolling a die \(n\) times for \(n=10,20,30\) results in the distributions shown in Figure 7.1. We see that, as in the case of Bernoulli trials, the distributions become bell-shaped. We shall discuss in Chapter 9 a very general theorem called the Central Limit Theorem that will explain this phenomenon.

Example \(\PageIndex{1}\)

A well-known method for evaluating a bridge hand is: an ace is assigned a value of 4 , a king 3 , a queen 2 , and a jack 1 . All other cards are assigned a value of 0 . The point count of the hand is then the sum of the values of the cards in the hand. (It is actually more complicated than this, taking into account voids in suits, and so forth, but we consider here this simplified form of the point count.) If a card is dealt at random to a player, then the point count for this card has distribution

\[
p_{X}=\left(\begin{array}{ccccc}
0 & 1 & 2 & 3 & 4 \\
36 / 52 & 4 / 52 & 4 / 52 & 4 / 52 & 4 / 52
\end{array}\right)
\]

Let us regard the total hand of 13 cards as 13 independent trials with this common distribution. (Again this is not quite correct because we assume here that we are always choosing a card from a full deck.) Then the distribution for the point count \(C\) for the hand can be found from the program NFoldConvolution by using the distribution for a single card and choosing \(n=13\). A player with a point count of 13 or more is said to have an opening bid. The probability of having an opening bid is then

\[
P(C \geq 13)
\]

Since we have the distribution of \(C\), it is easy to compute this probability. Doing this we find that

\[
P(C \geq 13)=.2845
\]

so that about one in four hands should be an opening bid according to this simplified model. A more realistic discussion of this problem can be found in Epstein, The Theory of Gambling and Statistical Logic. \({ }^{1}\)

\({ }^{1}\) R. A. Epstein, The Theory of Gambling and Statistical Logic, rev. ed. (New York: Academic Press, 1977).

Figure \(\PageIndex{1}\): Density of \(S_{n}\) for rolling a die \(n\) times.

For certain special distributions it is possible to find an expression for the distribution that results from convoluting the distribution with itself \(n\) times.

The convolution of two binomial distributions, one with parameters \(m\) and \(p\) and the other with parameters \(n\) and \(p\), is a binomial distribution with parameters \((m+n)\) and \(p\). This fact follows easily from a consideration of the experiment which consists of first tossing a coin \(m\) times, and then tossing it \(n\) more times.

The convolution of \(k\) geometric distributions with common parameter \(p\) is a negative binomial distribution with parameters \(p\) and \(k\). This can be seen by considering the experiment which consists of tossing a coin until the \(k\) th head appears.

Exercises

Exercise \(\PageIndex{1}\): A die is rolled three times. Find the probability that the sum of the outcomes is

(a) greater than 9.

(b) an odd number.

Exercise \(\PageIndex{2}\): The price of a stock on a given trading day changes according to the distribution

\[
p_{X}=\left(\begin{array}{cccc}
-1 & 0 & 1 & 2 \\
1 / 4 & 1 / 2 & 1 / 8 & 1 / 8
\end{array}\right) .
\]

Find the distribution for the change in stock price after two (independent) trading days.

Exercise \(\PageIndex{3}\): Let \(X_{1}\) and \(X_{2}\) be independent random variables with common distribution

\[
p_{X}=\left(\begin{array}{ccc}
0 & 1 & 2 \\
1 / 8 & 3 / 8 & 1 / 2
\end{array}\right)
\]

Find the distribution of the sum \(X_{1}+X_{2}\).

Exercise \(\PageIndex{4}\): In one play of a certain game you win an amount \(X\) with distribution

\[
p_{X}=\left(\begin{array}{ccc}
1 & 2 & 3 \\
1 / 4 & 1 / 4 & 1 / 2
\end{array}\right)
\]

Using the program NFoldConvolution find the distribution for your total winnings after ten (independent) plays. Plot this distribution.

Exercise \(\PageIndex{5}\): Consider the following two experiments: the first has outcome \(X\) taking on the values 0,1 , and 2 with equal probabilities; the second results in an (independent) outcome \(Y\) taking on the value 3 with probability \(1 / 4\) and 4 with probability \(3 / 4\). Find the distribution of
(a) \(Y+X\).
(b) \(Y-X\).

Exercise \(\PageIndex{6}\): People arrive at a queue according to the following scheme: During each minute of time either 0 or 1 person arrives. The probability that 1 person arrives is \(p\) and that no person arrives is \(q=1-p\). Let \(C_{r}\) be the number of customers arriving in the first \(r\) minutes. Consider a Bernoulli trials process with a success if a person arrives in a unit time and failure if no person arrives in a unit time. Let \(T_{r}\) be the number of failures before the \(r\) th success.

(a) What is the distribution for \(T_{r}\) ?

(b) What is the distribution for \(C_{r}\) ?

Exercise \(\PageIndex{7}\): (a) A die is rolled three times with outcomes \(X_{1}, X_{2}\), and \(X_{3}\). Let \(Y_{3}\) be the maximum of the values obtained. Show that

\[
P\left(Y_{3} \leq j\right)=P\left(X_{1} \leq j\right)^{3} .
\]

Use this to find the distribution of \(Y_{3}\). Does \(Y_{3}\) have a bell-shaped distribution?

(b) Now let \(Y_{n}\) be the maximum value when \(n\) dice are rolled. Find the distribution of \(Y_{n}\). Is this distribution bell-shaped for large values of \(n\) ?

Exercise \(\PageIndex{8}\): A baseball player is to play in the World Series. Based upon his season play, you estimate that if he comes to bat four times in a game the number of hits he will get has a distribution

\[
p_{X}=\left(\begin{array}{ccccc}
0 & 1 & 2 & 3 & 4 \\
.4 & .2 & .2 & .1 & .1
\end{array}\right)
\]

Assume that the player comes to bat four times in each game of the series.

(a) Let \(X\) denote the number of hits that he gets in a series. Using the program NFoldConvolution, find the distribution of \(X\) for each of the possible series lengths: four-game, five-game, six-game, seven-game.

(b) Using one of the distribution found in part (a), find the probability that his batting average exceeds .400 in a four-game series. (The batting average is the number of hits divided by the number of times at bat.)

Exercise \(\PageIndex{9}\): Prove that you cannot load two dice in such a way that the probabilities for any sum from 2 to 12 are the same. (Be sure to consider the case where one or more sides turn up with probability zero.)

Exercise \(\PageIndex{10}\):\(10\left(\right.\) Lévy \(\left.^{2}\right)\) Assume that \(n\) is an integer, not prime. Show that you can find two distributions \(a\) and \(b\) on the nonnegative integers such that the convolution of

\({ }^{2}\) See M. Krasner and B. Ranulae, "Sur une Proprieté des Polynomes de la Division du Circle"; and the following note by J. Hadamard, in C. R. Acad. Sci., vol. 204 (1937), pp. 397-399. \(a\) and \(b\) is the equiprobable distribution on the set \(0,1,2, \ldots, n-1\). If \(n\) is prime this is not possible, but the proof is not so easy. (Assume that neither \(a\) nor \(b\) is concentrated at 0 .)

Exercise \(\PageIndex{11}\): Assume that you are playing craps with dice that are loaded in the following way: faces two, three, four, and five all come up with the same probability \((1 / 6)+r\). Faces one and six come up with probability \((1 / 6)-2 r\), with \(0<\) \(r<.02\). Write a computer program to find the probability of winning at craps with these dice, and using your program find which values of \(r\) make craps a favorable game for the player with these dice.

Search

Text Color

Text Size

Margin Size

Font Type

Definition: Term

Example \(\PageIndex{1}\)

Example \(\PageIndex{1}\)