12.1: Random Walks in Euclidean Space
In the last several chapters, we have studied sums of random variables with the goal being to describe the distribution and density functions of the sum. In this chapter, we shall look at sums of discrete random variables from a different perspective. We shall be concerned with properties which can be associated with the sequence of partial sums, such as the number of sign changes of this sequence, the number of terms in the sequence which equal 0, and the expected size of the maximum term in the sequence.
We begin with the following definition.
Let \(\{X_k\}_{k = 1}^\infty\) be a sequence of independent, identically distributed discrete random variables. For each positive integer \(n\), we let \(S_n\) denote the sum \(X_1 + X_2 + \cdots + X_n\). The sequence \(\{S_n\}_{n = 1}^\infty\) is called a random walk. If the common range of the \(X_k\)’s is \({\mathbf R}^m\), then we say that \(\{S_n\}\) is a random walk in \({\mathbf R}^m\).
We view the sequence of \(X_k\)’s as being the outcomes of independent experiments. Since the \(X_k\)’s are independent, the probability of any particular (finite) sequence of outcomes can be obtained by multiplying the probabilities that each \(X_k\) takes on the specified value in the sequence. Of course, these individual probabilities are given by the common distribution of the \(X_k\)’s. We will typically be interested in finding probabilities for events involving the related sequence of \(S_n\)’s. Such events can be described in terms of the \(X_k\)’s, so their probabilities can be calculated using the above idea.
There are several ways to visualize a random walk. One can imagine that a particle is placed at the origin in \({\mathbf R}^m\) at time \(n = 0\). The sum \(S_n\) represents the position of the particle at the end of \(n\) seconds. Thus, in the time interval \([n-1, n]\), the particle moves (or jumps) from position \(S_{n-1}\) to \(S_{n}\). The vector representing this motion is just \(S_n - S_{n-1}\), which equals \(X_n\). This means that in a random walk, the jumps are independent and identically distributed. If \(m = 1\), for example, then one can imagine a particle on the real line that starts at the origin, and at the end of each second, jumps one unit to the right or the left, with probabilities given by the distribution of the \(X_k\)’s. If \(m = 2\), one can visualize the process as taking place in a city in which the streets form square city blocks. A person starts at one corner (i.e., at an intersection of two streets) and goes in one of the four possible directions according to the distribution of the \(X_k\)’s. If \(m = 3\), one might imagine being in a jungle gym, where one is free to move in any one of six directions (left, right, forward, backward, up, and down). Once again, the probabilities of these movements are given by the distribution of the \(X_k\)’s.
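The particle picture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the symmetric lattice walk, in which each of the \(2m\) unit moves is equally likely (one particular choice for the distribution of the \(X_k\)'s), and the function name `simulate_walk` and its `seed` parameter are hypothetical.

```python
import random

def simulate_walk(m, n, seed=None):
    """Return the positions S_0, S_1, ..., S_n of a symmetric lattice walk in R^m.

    Assumption: each jump X_k moves one unit along a uniformly chosen
    coordinate axis, in a uniformly chosen direction.
    """
    rng = random.Random(seed)
    position = [0] * m
    path = [tuple(position)]          # S_0 is the origin
    for _ in range(n):
        axis = rng.randrange(m)       # which coordinate changes
        step = rng.choice((-1, 1))    # direction along that coordinate
        position[axis] += step        # S_k = S_{k-1} + X_k
        path.append(tuple(position))
    return path

# Ten steps of a walk on the street grid (m = 2).
print(simulate_walk(m=2, n=10, seed=0))
```

Each consecutive pair of positions differs by exactly one unit in one coordinate, which is the jump \(X_n = S_n - S_{n-1}\) described above.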
Another model of a random walk (used mostly in the case where the range is \({\mathbf R}^1\)) is a game, involving two people, which consists of a sequence of independent, identically distributed moves. The sum \(S_n\) represents the score of the first person, say, after \(n\) moves, with the assumption that the score of the second person is \(-S_n\). For example, two people might be flipping coins, with a match or non-match representing \(+1\) or \(-1\), respectively, for the first player. Or, perhaps one coin is being flipped, with a head or tail representing \(+1\) or \(-1\), respectively, for the first player.
Random Walks on the Real Line
We shall first consider the simplest non-trivial case of a random walk in \({\mathbf R}^1\), namely the case where the common distribution function of the random variables \(X_n\) is given by \[f_X(x) = \left \{ \begin{array}{ll} 1/2, & \mbox{if $x = \pm 1,$} \\ 0, & \mbox{otherwise.} \end{array} \right.\] This situation corresponds to a fair coin being flipped, with \(S_n\) representing the number of heads minus the number of tails which occur in the first \(n\) flips. We note that in this situation, all paths of length \(n\) have the same probability, namely \(2^{-n}\).
It is sometimes instructive to represent a random walk as a polygonal line, or path, in the plane, where the horizontal axis represents time and the vertical axis represents the value of \(S_n\). Given a sequence \(\{S_n\}\) of partial sums, we first plot the points \((n, S_n)\), and then for each \(k < n\), we connect \((k, S_k)\) and \((k+1, S_{k+1})\) with a straight line segment. The length of a path is just the difference in the time values of the beginning and ending points on the path. The reader is referred to Figure \(\PageIndex{1}\). This figure, and the process it illustrates, are identical with the example, given in Chapter 1, of two people playing heads or tails.
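As a small sketch of the path representation, the following Python code turns a sequence of \(\pm 1\) steps into the vertices \((k, S_k)\) of its polygonal path and enumerates all paths of a given length. The helper names `path_points` and `all_paths` are hypothetical, and the starting point \((0, 0)\) is included as a vertex.

```python
from itertools import accumulate, product

def path_points(steps):
    """Vertices (k, S_k), k = 0, ..., n, of the path for a sequence of +1/-1 steps."""
    sums = [0] + list(accumulate(steps))   # S_0 = 0, then the partial sums
    return list(enumerate(sums))

def all_paths(n):
    """Every +1/-1 step sequence of length n; for a fair coin each has probability 2**-n."""
    return product((1, -1), repeat=n)

print(path_points((1, -1, -1, 1, 1)))
print(sum(1 for _ in all_paths(4)))        # number of paths of length 4
```

Because there are \(2^n\) equally likely paths of length \(n\), probabilities of events about \(\{S_n\}\) reduce to counting paths, which is the approach used throughout this section.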
Returns and First Returns
We say that an equalization has occurred, or there is a return to the origin at time \(n\), if \(S_n = 0\). We note that this can only occur if \(n\) is an even integer. To calculate the probability of an equalization at time \(2m\), we need only count the number of paths of length \(2m\) which begin and end at the origin. The number of such paths is clearly
\[{2m \choose m}\ .\]
Since each path has probability \(2^{-2m}\), we have the following theorem.
Theorem \(\PageIndex{1}\)
The probability of a return to the origin at time \(2m\) is given by
\[u_{2m} = {2m \choose m}2^{-2m}\ .\]
The probability of a return to the origin at an odd time is 0.
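The counting argument behind this formula can be checked by brute force for small \(m\). The sketch below enumerates all \(2^{2m}\) paths and compares the fraction that end at the origin with \({2m \choose m}2^{-2m}\); the function names are hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def u_formula(m):
    """u_{2m} = C(2m, m) * 2**(-2m)."""
    return Fraction(comb(2 * m, m), 2 ** (2 * m))

def u_brute(m):
    """Fraction of all +1/-1 paths of length 2m that end at the origin."""
    n = 2 * m
    hits = sum(1 for steps in product((1, -1), repeat=n) if sum(steps) == 0)
    return Fraction(hits, 2 ** n)

for m in range(1, 6):
    print(m, u_brute(m), u_formula(m))
```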
A random walk is said to have a first return to the origin at time \(2m\) if \(m > 0\), \(S_{2m} = 0\), and \(S_{2k} \ne 0\) for all \(0 < k < m\). In Figure \(\PageIndex{1}\), the first return occurs at time 2. We define \(f_{2m}\) to be the probability of this event. (We also define \(f_0 = 0\).) One can think of the expression \(f_{2m}2^{2m}\) as the number of paths of length \(2m\) between the points \((0, 0)\) and \((2m, 0)\) that do not touch the horizontal axis except at the endpoints. Using this idea, it is easy to prove the following theorem.
Theorem \(\PageIndex{2}\)
For \(n \ge 1\), the probabilities \(\{u_{2k}\}\) and \(\{f_{2k}\}\) are related by the equation
\[u_{2n} = f_0 u_{2n} + f_2 u_{2n-2} + \cdots + f_{2n}u_0\ .\]
Proof. There are \(u_{2n}2^{2n}\) paths of length \(2n\) which have endpoints \((0, 0)\) and \((2n, 0)\). The collection of such paths can be partitioned into \(n\) sets, depending upon the time of the first return to the origin. A path in this collection which has a first return to the origin at time \(2k\) consists of an initial segment from \((0, 0)\) to \((2k, 0)\), in which no interior points are on the horizontal axis, and a terminal segment from \((2k, 0)\) to \((2n, 0)\), with no further restrictions on this segment. Thus, the number of paths in the collection which have a first return to the origin at time \(2k\) is given by
\[f_{2k}2^{2k}u_{2n-2k}2^{2n-2k} = f_{2k}u_{2n-2k}2^{2n}\ .\]
If we sum over \(k\), we obtain the equation
\[u_{2n}2^{2n} = f_0u_{2n} 2^{2n} + f_2u_{2n-2}2^{2n} + \cdots + f_{2n}u_0 2^{2n}\ .\]
Dividing both sides of this equation by \(2^{2n}\) completes the proof.
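The relation just proved can be verified numerically for small \(n\). In this sketch, \(f_{2k}\) is computed directly from its definition by enumerating paths, and the right-hand side of the theorem is compared with \(u_{2n}\); the names `f_brute` and `convolution_rhs` are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate, product
from math import comb

def u(m):
    """u_{2m} = C(2m, m) * 2**(-2m), with u_0 = 1."""
    return Fraction(comb(2 * m, m), 4 ** m)

def f_brute(m):
    """Probability that the first return to the origin occurs exactly at time 2m."""
    if m == 0:
        return Fraction(0)                 # f_0 = 0 by definition
    n = 2 * m
    count = 0
    for steps in product((1, -1), repeat=n):
        sums = list(accumulate(steps))
        # ends at 0 and never touches 0 at an interior time
        if sums[-1] == 0 and all(s != 0 for s in sums[:-1]):
            count += 1
    return Fraction(count, 2 ** n)

def convolution_rhs(n):
    """f_0 u_{2n} + f_2 u_{2n-2} + ... + f_{2n} u_0."""
    return sum(f_brute(k) * u(n - k) for k in range(n + 1))

for n in range(1, 6):
    print(n, u(n), convolution_rhs(n))
```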
The expression in the right-hand side of the above theorem should remind the reader of a sum that appeared in Definition 7.1.1 of the convolution of two distributions. The convolution of two sequences is defined in a similar manner. The above theorem says that the sequence \(\{u_{2n}\}\) is the convolution of itself and the sequence \(\{f_{2n}\}\). Thus, if we represent each of these sequences by an ordinary generating function, then we can use the above relationship to determine the value \(f_{2n}\).
Theorem \(\PageIndex{3}\)
For \(m \ge 1\), the probability of a first return to the origin at time \(2m\) is given by
\[f_{2 m}=\frac{u_{2 m}}{2 m-1}=\frac{\left(\begin{array}{c} 2 m \\ m \end{array}\right)}{(2 m-1) 2^{2 m}}\]
Proof. We begin by defining the generating functions
\[U(x) = \sum_{m = 0}^\infty u_{2m}x^m\] and \[F(x) = \sum_{m = 0}^\infty f_{2m}x^m\ .\]
Theorem \(\PageIndex{2}\) says that
\[U(x) = 1 + U(x)F(x)\ . \label{eq 12.1.1}\]
(The 1 on the right-hand side appears because \(u_0\) is defined to be 1, while the relation in Theorem \(\PageIndex{2}\) holds only for \(n \ge 1\).) We note that both generating functions certainly converge on the interval \((-1, 1)\), since all of the coefficients are at most 1 in absolute value. Thus, we can solve the above equation for \(F(x)\), obtaining
\[F(x) = \dfrac{U(x) - 1}{U(x)}\ .\]
Now, if we can find a closed-form expression for the function \(U(x)\), we will also have a closed-form expression for \(F(x)\). From Theorem \(\PageIndex{1}\), we have
\[ U(x)=\sum_{m=0}^{\infty}\left(\begin{array}{c}2 m \\ m\end{array}\right) 2^{-2 m} x^m\]
In Wilf,1 we find that
\[{1\over{\sqrt {1 - 4x}}} = \sum_{m = 0}^\infty {2m \choose m} x^m\ .\]
The reader is asked to prove this statement in Exercise \(\PageIndex{1}\). If we replace \(x\) by \(x/4\) in the last equation, we see that
\[U(x) = {1\over{\sqrt {1-x}}}\ .\]
Therefore, we have
\[\begin{aligned} F(x) & =\frac{U(x)-1}{U(x)} \\ & =\frac{(1-x)^{-1 / 2}-1}{(1-x)^{-1 / 2}} \\ & =1-(1-x)^{1 / 2} .\end{aligned}\]
Although it is possible to compute the value of \(f_{2m}\) using the Binomial Theorem, it is easier to note that \(F'(x) = U(x)/2\), so that the coefficients \(f_{2m}\) can be found by integrating the series for \(U(x)\). We obtain, for \(m \ge 1\),
\[\begin{aligned} f_{2 m} & =\frac{u_{2 m-2}}{2 m} \\ & =\frac{\left(\begin{array}{c}2 m-2 \\ m-1\end{array}\right)}{m 2^{2 m-1}} \\ & =\frac{\left(\begin{array}{c}2 m \\ m\end{array}\right)}{(2 m-1) 2^{2 m}} \\ & =\frac{u_{2 m}}{2 m-1},\end{aligned}\]
since
\[{2m-2 \choose m-1} = {m\over{2(2m-1)}}{2m\choose m}\ .\]
This completes the proof of the theorem.
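As a sanity check on the theorem, one can bypass generating functions entirely: solving the relation of Theorem \(\PageIndex{2}\) for \(f_{2n}\) gives a recursion, and its output can be compared with both closed forms obtained in the proof. The sketch below does this with exact fractions; the function name `first_return_probs` is hypothetical.

```python
from fractions import Fraction
from math import comb

def u(m):
    """u_{2m} = C(2m, m) * 2**(-2m), with u_0 = 1."""
    return Fraction(comb(2 * m, m), 4 ** m)

def first_return_probs(N):
    """Return [f_0, f_2, ..., f_{2N}] computed from the convolution relation."""
    f = [Fraction(0)]                      # f_0 = 0 by definition
    for n in range(1, N + 1):
        # u_{2n} = sum_{k=1}^{n} f_{2k} u_{2n-2k} and u_0 = 1, so isolate f_{2n}
        f.append(u(n) - sum(f[k] * u(n - k) for k in range(1, n)))
    return f

f = first_return_probs(8)
for m in range(1, 9):
    # recursion, u_{2m}/(2m-1), and u_{2m-2}/(2m) should all agree
    print(m, f[m], u(m) / (2 * m - 1), u(m - 1) / (2 * m))
```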
Probability of Eventual Return
In the symmetric random walk process in \({\mathbf R}^m\), what is the probability that the particle eventually returns to the origin? We first examine this question in the case that \(m = 1\), and then we consider the general case. The results in the next two examples are due to Pólya.2
(Eventual Return in \({\mathbf R}^1\)) One has to approach the idea of eventual return with some care, since the sample space seems to be the set of all walks of infinite length, and this set is non-denumerable. To avoid difficulties, we will define \(w_n\) to be the probability that a first return has occurred no later than time \(n\). Thus, \(w_n\) concerns the sample space of all walks of length \(n\), which is a finite set. In terms of the \(w_n\)’s, it is reasonable to define the probability that the particle eventually returns to the origin to be
\[w_* = \lim_{n \rightarrow \infty} w_n\ .\]
This limit clearly exists and is at most one, since the sequence \(\{w_n\}_{n = 1}^\infty\) is an increasing sequence, and all of its terms are at most one.
In terms of the \(f_n\) probabilities, we see that
\[w_{2n} = \sum_{i = 1}^n f_{2i}\ .\]
Thus,
\[w_* = \sum_{i = 1}^\infty f_{2i}\ .\]
In the proof of Theorem \(\PageIndex{3}\), the generating function
\[F(x) = \sum_{m = 0}^\infty f_{2m}x^m\]
was introduced. There it was noted that this series converges for \(x \in (-1, 1)\). In fact, it is possible to show that this series also converges for \(x = \pm 1\) by using Exercise \(\PageIndex{4}\), together with the fact that
\[f_{2m} = \dfrac{u_{2m}}{2m-1} .\]
(This fact was proved in the proof of Theorem \(\PageIndex{3}\).) Since we also know that
\[F(x) = 1 - (1-x)^{1/2}\ ,\] we see that \[w_* = F(1) = 1\ .\]
Thus, with probability one, the particle returns to the origin.
An alternative proof of the fact that \(w_* = 1\) can be obtained by using the results in Exercise \(\PageIndex{12}\).
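To see how slowly \(w_{2n}\) approaches 1, the partial sums can be computed directly from \(f_{2m} = u_{2m}/(2m-1)\). The sketch below also compares them with \(1 - u_{2n}\), the closed form that appears in Exercise \(\PageIndex{2}\); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def u(m):
    """u_{2m} = C(2m, m) * 2**(-2m)."""
    return Fraction(comb(2 * m, m), 4 ** m)

def f(m):
    """f_{2m} = u_{2m} / (2m - 1) for m >= 1, and f_0 = 0."""
    return u(m) / (2 * m - 1) if m >= 1 else Fraction(0)

def w(n):
    """w_{2n}: probability that a first return has occurred by time 2n."""
    return sum((f(i) for i in range(1, n + 1)), Fraction(0))

for n in (1, 10, 100, 1000):
    print(n, float(w(n)), float(1 - u(n)))
```

Since \(u_{2n} \sim 1/\sqrt{\pi n}\), the gap \(1 - w_{2n}\) shrinks only like \(n^{-1/2}\), so a return is certain but may take a long time.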
(Eventual Return in \({\mathbf R}^m\)) We now turn our attention to the case that the random walk takes place in more than one dimension. We define \(f^{(m)}_{2n}\) to be the probability that the first return to the origin in \({\mathbf R}^m\) occurs at time \(2n\). The quantity \(u^{(m)}_{2n}\) is defined in a similar manner. Thus, \(f^{(1)}_{2n}\) and \(u^{(1)}_{2n}\) equal \(f_{2n}\) and \(u_{2n}\), which were defined earlier. If, in addition, we define \(u^{(m)}_0 = 1\) and \(f^{(m)}_0 = 0\), then one can mimic the proof of Theorem \(\PageIndex{2}\), and show that for all \(m \ge 1\),
\[u^{(m)}_{2n} = f^{(m)}_0 u^{(m)}_{2n} + f^{(m)}_2 u^{(m)}_{2n-2} + \cdots + f^{(m)}_{2n}u^{(m)}_0\ . \label{eq 12.1.1.5}\]
We continue to generalize previous work by defining
\[U^{(m)}(x) = \sum_{n = 0}^\infty u^{(m)}_{2n} x^n\]
and
\[F^{(m)}(x) = \sum_{n = 0}^\infty f^{(m)}_{2n} x^n\ .\]
Then, by using Equation \(\PageIndex{2}\), we see that
\[U^{(m)}(x) = 1 + U^{(m)}(x) F^{(m)}(x)\ ,\]
as before. These functions will always converge in the interval \((-1, 1)\), since all of their coefficients are at most one in magnitude. In fact, since
\[w^{(m)}_* = \sum_{n = 0}^\infty f^{(m)}_{2n} \le 1\]
for all \(m\), the series for \(F^{(m)}(x)\) converges at \(x = 1\) as well, and \(F^{(m)}(x)\) is left-continuous at \(x = 1\), i.e.,
\[\lim_{x \uparrow 1} F^{(m)}(x) = F^{(m)}(1)\ .\]
Thus, we have
\[w^{(m)}_* = \lim_{x \uparrow 1} F^{(m)}(x) = \lim_{x \uparrow 1} \frac{U^{(m)}(x) - 1}{U^{(m)}(x)}\ , \label{eq 12.1.1.6}\]
so to determine \(w^{(m)}_*\), it suffices to determine
\[\lim_{x \uparrow 1} U^{(m)}(x)\ .\] We let \(u^{(m)}\) denote this limit.
We claim that
\[u^{(m)} = \sum_{n = 0}^\infty u^{(m)}_{2n}\ .\]
(This claim is reasonable; it says that to find out what happens to the function \(U^{(m)}(x)\) at \(x = 1\), just let \(x = 1\) in the power series for \(U^{(m)}(x)\).) To prove the claim, we note that the coefficients \(u^{(m)}_{2n}\) are non-negative, so \(U^{(m)}(x)\) increases monotonically on the interval \([0, 1)\). Thus, for each \(K\), we have
\[\sum_{n = 0}^K u^{(m)}_{2n} \le \lim_{x \uparrow 1} U^{(m)}(x) = u^{(m)} \le \sum_{n = 0}^\infty u^{(m)}_{2n}\ .\]
By letting \(K \rightarrow \infty\), we see that \[u^{(m)} = \sum_{n = 0}^\infty u^{(m)}_{2n}\ .\]
This establishes the claim.
From Equation \(\PageIndex{3}\), we see that if \(u^{(m)} < \infty\), then the probability of an eventual return is
\[\frac {u^{(m)} - 1}{u^{(m)}}\ ,\]
while if \(u^{(m)} = \infty\), then the probability of eventual return is 1.
To complete the example, we must estimate the sum
\[\sum_{n = 0}^\infty u^{(m)}_{2n}\ .\]
In Exercise \(\PageIndex{12}\), the reader is asked to show that
\[u^{(2)}_{2n} = \frac 1 {4^{2n}} {{2n}\choose n}^2\ .\]
Using Stirling’s Formula, it is easy to show that (see Exercise \(\PageIndex{13}\))
\[\left(\begin{array}{c}2 n \\ n\end{array}\right) \sim \frac{2^{2 n}}{\sqrt{\pi n}} ,\]
so
\[u_{2 n}^{(2)} \sim \frac{1}{\pi n}\]
From this it follows easily that
\[\sum_{n = 0}^\infty u^{(2)}_{2n}\]
diverges, so \(w^{(2)}_* = 1\), i.e., in \({\mathbf R}^2\), the probability of an eventual return is 1.
When \(m = 3\), Exercise \(\PageIndex{12}\) shows that
\[u^{(3)}_{2n} = \frac 1{2^{2n}}{{2n}\choose n} \sum_{j,k} \biggl(\frac 1{3^n}\frac{n!}{j!k!(n-j-k)!}\biggr)^2\ .\]
Let \(M\) denote the largest value of
\[\frac 1{3^n}\frac {n!}{j!k!(n - j - k)!}\ ,\]
over all non-negative values of \(j\) and \(k\) with \(j + k \le n\). It is easy, using Stirling’s Formula, to show that
\[M \sim \frac cn\ ,\]
for some constant \(c\). Thus, we have
\[u^{(3)}_{2n} \le \frac 1{2^{2n}}{{2n}\choose n} \sum_{j,k} \biggl(\frac M{3^n}\frac{n!}{j!k!(n-j-k)!}\biggr)\ .\]
Using Exercise \(\PageIndex{14}\), one can show that the right-hand expression is at most
\[\frac {c'}{n^{3/2}}\ ,\]
where \(c'\) is a constant. Thus,
\[\sum_{n = 0}^\infty u^{(3)}_{2n}\]
converges, so \(w^{(3)}_*\) is strictly less than one. This means that in \({\mathbf R}^3\), the probability of an eventual return to the origin is strictly less than one (in fact, it is approximately .34).
One may summarize these results by stating that one should not get drunk in more than two dimensions.
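The contrast between the dimensions can be explored numerically. The sketch below evaluates partial sums of \(u^{(m)}_{2n}\) for \(m = 1, 2, 3\), using the formulas quoted in this example, and turns a partial sum \(s\) into the quantity \((s-1)/s\). Because the partial sums increase to \(u^{(m)}\), this quantity is only a lower bound on the return probability, and for \(m = 3\) it approaches the limiting value slowly. All function names are hypothetical.

```python
from math import comb

def u1(n):
    """u_{2n} in R^1: C(2n, n) / 2**(2n)."""
    return comb(2 * n, n) / 4 ** n

def u2(n):
    """u_{2n} in R^2: C(2n, n)**2 / 4**(2n)."""
    return comb(2 * n, n) ** 2 / 16 ** n

def u3(n):
    """u_{2n} in R^3: C(2n, n) * sum_{j,k} (n!/(j! k! (n-j-k)!))**2 / 6**(2n)."""
    total = 0
    for j in range(n + 1):
        for k in range(n - j + 1):
            # comb(n, j) * comb(n - j, k) is the trinomial coefficient n!/(j! k! (n-j-k)!)
            total += (comb(n, j) * comb(n - j, k)) ** 2
    return comb(2 * n, n) * total / 36 ** n

def partial_sum(u, N):
    """u_0 + u_2 + ... + u_{2N} for the given sequence."""
    return sum(u(n) for n in range(N + 1))

def return_probability_lower_bound(u, N):
    """(s - 1)/s for the partial sum s; increases toward the return probability."""
    s = partial_sum(u, N)
    return (s - 1) / s

for N in (10, 20, 40):
    print(N, partial_sum(u1, N), partial_sum(u2, N), partial_sum(u3, N))
print(return_probability_lower_bound(u3, 40))
```

The partial sums for \(m = 1\) and \(m = 2\) keep growing as \(N\) increases, while those for \(m = 3\) level off, in line with the divergence and convergence established above.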
Expected Number of Equalizations
We now give another example of the use of generating functions to find a general formula for terms in a sequence, where the sequence is related by recursion relations to other sequences. Exercise \(\PageIndex{9}\) gives still another example.
(Expected Number of Equalizations) In this example, we will derive a formula for the expected number of equalizations in a random walk of length \(2m\). As in the proof of Theorem \(\PageIndex{3}\), the method has four main parts. First, a recursion is found which relates the \(m\)th term in the unknown sequence to earlier terms in the same sequence and to terms in other (known) sequences. An example of such a recursion is given in Theorem \(\PageIndex{2}\). Second, the recursion is used to derive a functional equation involving the generating functions of the unknown sequence and one or more known sequences. Equation \(\PageIndex{1}\) is an example of such a functional equation. Third, the functional equation is solved for the unknown generating function. Last, using a device such as the Binomial Theorem, integration, or differentiation, a formula for the \(m\)th coefficient of the unknown generating function is found.
We begin by defining \(g_{2m}\) to be the number of equalizations among all of the random walks of length \(2m\). (For each random walk, we disregard the equalization at time 0.) We define \(g_0 = 0\). Since the number of walks of length \(2m\) equals \(2^{2m}\), the expected number of equalizations among all such random walks is \(g_{2m}/2^{2m}\). Next, we define the generating function \(G(x)\):
\[G(x) = \sum_{k = 0}^\infty g_{2k}x^k\ .\]
Now we need to find a recursion which relates the sequence \(\{g_{2k}\}\) to one or both of the known sequences \(\{f_{2k}\}\) and \(\{u_{2k}\}\). We consider \(m\) to be a fixed positive integer, and consider the set of all paths of length \(2m\) as the disjoint union
\[E_2 \cup E_4 \cup \cdots \cup E_{2m} \cup H\ ,\]
where \(E_{2k}\) is the set of all paths of length \(2m\) with first equalization at time \(2k\), and \(H\) is the set of all paths of length \(2m\) with no equalization. It is easy to show (see Exercise \(\PageIndex{3}\)) that
\[|E_{2k}| = f_{2k} 2^{2m}\ .\]
We claim that the number of equalizations among all paths belonging to the set \(E_{2k}\) is equal to
\[|E_{2k}| + 2^{2k} f_{2k} g_{2m - 2k}\ . \label{eq 12.1.2}\]
Each path in \(E_{2k}\) has one equalization at time \(2k\), so the total number of such equalizations is just \(|E_{2k}|\). This is the first summand in Expression \(\PageIndex{4}\). There are \(2^{2k} f_{2k}\) different initial segments of length \(2k\) among the paths in \(E_{2k}\). Each of these initial segments can be augmented to a path of length \(2m\) in \(2^{2m-2k}\) ways, by adjoining all possible paths of length \(2m - 2k\). The number of equalizations obtained by adjoining all of these paths to any one initial segment is \(g_{2m - 2k}\), by definition. This gives the second summand in Expression \(\PageIndex{4}\). Since \(k\) can range from 1 to \(m\), we obtain the recursion
\[g_{2m} = \sum_{k = 1}^m \Bigl(|E_{2k}| + 2^{2k}f_{2k}g_{2m - 2k}\Bigr)\ . \label{eq 12.1.3}\]
The second summand in the typical term above should remind the reader of a convolution. In fact, if we multiply the generating function \(G(x)\) by the generating function
\[F(4x) = \sum_{k = 0}^\infty 2^{2k}f_{2k} x^k\ ,\]
the coefficient of \(x^m\) equals
\[\sum_{k = 0}^m 2^{2k}f_{2k}g_{2m-2k}\ .\]
Thus, the product \(G(x)F(4x)\) is part of the functional equation that we are seeking. The first summand in the typical term in Equation \(\PageIndex{5}\) gives rise to the sum
\[2^{2m}\sum_{k = 1}^m f_{2k}\ .\]
From Exercise \(\PageIndex{2}\), we see that this sum is just \((1 - u_{2m})2^{2m}\). Thus, we need to create a generating function whose \(m\)th coefficient is this term; this generating function is
\[\sum_{m = 0}^\infty (1- u_{2m})2^{2m} x^m\ ,\]
or
\[\sum_{m = 0}^\infty 2^{2m} x^m - \sum_{m = 0}^\infty u_{2m}2^{2m} x^m\ .\]
The first sum is just \((1-4x)^{-1}\), and the second sum is \(U(4x)\). So, the functional equation which we have been seeking is
\[G(x) = F(4x)G(x) + {1\over{1-4x}} - U(4x)\ .\]
If we solve this equation for \(G(x)\), and simplify, we obtain
\[G(x) = {1\over{(1-4x)^{3/2}}} - {1\over{(1-4x)}}\ . \label{eq 12.1.4}\]
We now need to find a formula for the coefficient of \(x^m\). The first summand in Equation \(\PageIndex{6}\) is \(\frac{1}{2}\frac{d}{dx}U(4x)\), that is, one-half the derivative of the function \(x \mapsto U(4x)\), so the coefficient of \(x^m\) in this function is
\[u_{2m+2} 2^{2m+1}(m+1)\ .\]
The second summand in Equation \(\PageIndex{6}\) is the sum of a geometric series with common ratio \(4x\), so the coefficient of \(x^m\) is \(2^{2m}\). Thus, we obtain
\[\begin{aligned} g_{2 m} & =u_{2 m+2} 2^{2 m+1}(m+1)-2^{2 m} \\ & =\frac{1}{2}\left(\begin{array}{c}2 m+2 \\ m+1\end{array}\right)(m+1)-2^{2 m}\end{aligned}\]
We recall that the quotient \(g_{2m}/2^{2m}\) is the expected number of equalizations among all paths of length \(2m\). Using Exercise \(\PageIndex{4}\), it is easy to show that
\[\frac{g_{2 m}}{2^{2 m}} \sim \sqrt{\frac{2}{\pi}} \sqrt{2 m}\]
In particular, this means that the average number of equalizations among all paths of length \(4m\) is not twice the average number of equalizations among all paths of length \(2m\). In order for the average number of equalizations to double, one must quadruple the lengths of the random walks.
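The closed form for \(g_{2m}\) and its square-root growth can both be checked numerically. The sketch below counts equalizations over all paths by brute force for small \(m\), compares with the formula, and then evaluates the ratio of \(g_{2m}/2^{2m}\) to \(\sqrt{2/\pi}\sqrt{2m}\) for larger \(m\). The function names are hypothetical.

```python
from itertools import accumulate, product
from math import comb, pi, sqrt

def g_formula(m):
    """g_{2m} = (1/2) * C(2m+2, m+1) * (m+1) - 2**(2m)."""
    # C(2m+2, m+1) is even, so the integer division is exact
    return (m + 1) * comb(2 * m + 2, m + 1) // 2 - 4 ** m

def g_brute(m):
    """Total number of equalizations (time 0 excluded) over all paths of length 2m."""
    total = 0
    for steps in product((1, -1), repeat=2 * m):
        total += sum(1 for s in accumulate(steps) if s == 0)
    return total

def expected_equalizations(m):
    """Average number of equalizations in a walk of length 2m."""
    return g_formula(m) / 4 ** m

for m in range(1, 7):
    print(m, g_brute(m), g_formula(m))

for m in (100, 400, 1600):
    print(m, expected_equalizations(m), sqrt(2 / pi) * sqrt(2 * m))
```

Quadrupling \(m\) in the second loop roughly doubles the expected number of equalizations, which is the point made above.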
It is interesting to note that if we define
\[M_n = \max_{0 \le k \le n} S_k\ ,\]
then we have
\[E(M_n) \sim \sqrt{2\over \pi}\sqrt n\ .\]
This means that the expected number of equalizations and the expected maximum value for random walks of length \(n\) are asymptotically equal as \(n \rightarrow \infty\). (In fact, it can be shown that the two expected values differ by at most \(1/2\) for all positive integers \(n\). See Exercise \(\PageIndex{9}\).)
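The closeness of these two expected values can be observed by exhaustive enumeration for short walks. The sketch below assumes \(M_n\) is taken over \(0 \le k \le n\) with \(S_0 = 0\), as in the definition just given, and counts equalizations at positive times only; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate, product

def expected_max(n):
    """E(M_n), where M_n = max(S_0, S_1, ..., S_n) and S_0 = 0."""
    total = 0
    for steps in product((1, -1), repeat=n):
        total += max(accumulate(steps, initial=0))   # partial sums including S_0
    return Fraction(total, 2 ** n)

def expected_equalizations(n):
    """Expected number of times k in 1..n with S_k = 0."""
    total = 0
    for steps in product((1, -1), repeat=n):
        total += sum(1 for s in accumulate(steps) if s == 0)
    return Fraction(total, 2 ** n)

for n in range(1, 13):
    diff = expected_max(n) - expected_equalizations(n)
    print(n, expected_max(n), expected_equalizations(n), diff)
```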
Exercises
Exercise \(\PageIndex{1}\)
Using the Binomial Theorem, show that
\[{1\over{\sqrt {1 - 4x}}} = \sum_{m = 0}^\infty {2m \choose m} x^m\ .\]
What is the interval of convergence of this power series?
Exercise \(\PageIndex{2}\)
- Show that for \(m \ge 1\), \[f_{2m} = u_{2m-2} - u_{2m}\ .\]
- Using part (a), find a closed-form expression for the sum \[f_2 + f_4 + \cdots + f_{2m}\ .\]
- Using part (b), show that \[\sum_{m = 1}^\infty f_{2m} = 1\ .\] (One can also obtain this statement from the fact that \[F(x) = 1 - (1-x)^{1/2}\ .)\]
- Using parts (a) and (b), show that the probability of no equalization in the first \(2m\) outcomes equals the probability of an equalization at time \(2m\).
Exercise \(\PageIndex{3}\)
Using the notation of the example on the expected number of equalizations, show that
\[|E_{2k}| = f_{2k} 2^{2m}\ .\]
Exercise \(\PageIndex{4}\)
Using Stirling’s Formula, show that
\[u_{2m} \sim {1\over{\sqrt {\pi m}}}\ .\]
Exercise \(\PageIndex{5}\)
A lead change in a random walk occurs at time \(2k\) if \(S_{2k-1}\) and \(S_{2k+1}\) are of opposite sign.
- Give a rigorous argument which proves that among all walks of length \(2m\) that have an equalization at time \(2k\), exactly half have a lead change at time \(2k\).
- Deduce that the total number of lead changes among all walks of length \(2m\) equals \[{1\over 2}(g_{2m} - u_{2m})\ .\]
- Find an asymptotic expression for the average number of lead changes in a random walk of length \(2m\).
Exercise \(\PageIndex{6}\)
- Show that the probability that a random walk of length \(2m\) has a last return to the origin at time \(2k\), where \(0 \le k \le m\), equals \[{2k \choose k}{2m-2k \choose m-k}2^{-2m} = u_{2k}u_{2m - 2k}\ .\] (The case \(k = 0\) consists of all paths that do not return to the origin at any positive time.) Hint: A path whose last return to the origin occurs at time \(2k\) consists of two paths glued together, one path of which is of length \(2k\) and which begins and ends at the origin, and the other path of which is of length \(2m - 2k\) and which begins at the origin but never returns to the origin. Both types of paths can be counted using quantities which appear in this section.
- Using part (a), show that if \(m\) is odd, the probability that a walk of length \(2m\) has no equalization in the last \(m\) outcomes is equal to \(1/2\), regardless of the value of \(m\). Hint: The answer to part (a) is symmetric in \(k\) and \(m-k\).
Exercise \(\PageIndex{7}\)
Show that the probability of no equalization in a walk of length \(2m\) equals \(u_{2m}\).
Exercise \(\PageIndex{8}\)
Show that \[P(S_1 \ge 0,\ S_2 \ge 0,\ \ldots,\ S_{2m} \ge 0) = u_{2m}\ .\] Hint: First explain why \[P(S_1 > 0,\ S_2 > 0,\ \ldots,\ S_{2m} > 0) = {1\over 2}P(S_1 \ne 0,\ S_2 \ne 0,\ \ldots,\ S_{2m} \ne 0)\ .\] Then use Exercise \(\PageIndex{7}\), together with the observation that if no equalization occurs in the first \(2m\) outcomes and the first step is upward, then the path goes through the point \((1,1)\) and remains on or above the horizontal line at height 1.
Exercise \(\PageIndex{9}\)
In Feller,3 one finds the following theorem: Let \(M_n\) be the random variable which gives the maximum value of \(S_k\), for \(1 \le k \le n\). Define
\[p_{n, r} = {n\choose{\frac 12 (n+r)}}2^{-n}\ ;\]
then
\[P(M_n = r) = \left \{ \begin{array}{ll} p_{n, r}\,,&\mbox{if $r \equiv n\, (\mbox{mod}\ 2)$}, \\ p_{n, r+1}\,,&\mbox{if $r \not\equiv n\,(\mbox{mod}\ 2)$}. \end{array} \right.\]
- Using this theorem, show that \[E(M_{2m}) = {1\over{2^{2m}}}\sum_{k = 1}^m (4k-1){2m \choose m+k}\ ,\] and if \(n = 2m+1\), then \[E(M_{2m+1}) = {1\over {2^{2m+1}}} \sum_{k = 0}^m (4k+1){2m+1\choose m+k+1}\ .\]
- For \(m \ge 1\), define \[r_m = \sum_{k = 1}^m k {2m\choose m+k}\] and \[s_m = \sum_{k = 1}^m k {2m+1\choose m+k+1}\ .\] By using the identity \[{n\choose k} = {n-1\choose k-1} + {n-1\choose k}\ ,\] show that \[s_m = 2r_m - {1\over 2}\biggl(2^{2m} - {2m \choose m}\biggr)\] and \[r_m = 2s_{m-1} + {1\over 2}2^{2m-1}\ ,\] if \(m \ge 2\).
- Define the generating functions \[R(x) = \sum_{k = 1}^\infty r_k x^k\] and \[S(x) = \sum_{k = 1}^\infty s_k x^k\ .\] Show that \[S(x) = 2 R(x) - {1\over 2}\biggl({1\over{1- 4x}}\biggr) + {1\over 2}\biggl({1\over{\sqrt{1-4x}}}\biggr)\] and \[R(x) = 2xS(x) + x\biggl({1\over{1-4x}}\biggr)\ .\]
- Show that \[R(x) = {x\over{(1-4x)^{3/2}}}\ ,\] and \[S(x) = {1\over 2}\biggl({1\over{(1- 4x)^{3/2}}}\biggr) - {1\over 2}\biggl({1\over{1- 4x}}\biggr)\ .\]
- Show that \[r_m = m{2m-1\choose m-1}\ ,\] and \[s_m = {1\over 2}(m+1){2m+1\choose m} - {1\over 2}(2^{2m})\ .\]
- Show that \[E(M_{2m}) = {m\over{2^{2m-1}}}{2m\choose m} + {1\over{2^{2m+1}}}{2m\choose m} - {1\over 2}\ ,\] and \[E(M_{2m+1}) = {{m+1}\over{2^{2m+1}}}{2m+2\choose m+1} - {1\over 2}\ .\] The reader should compare these formulas with the expression for \(g_{2m}/2^{2m}\) in the example on the expected number of equalizations.
Exercise \(\PageIndex{10}\)
(from K. Levasseur4) A parent and his child play the following game. A deck of \(2n\) cards, \(n\) red and \(n\) black, is shuffled. The cards are turned up one at a time. Before each card is turned up, the parent and the child guess whether it will be red or black. Whoever makes more correct guesses wins the game. The child is assumed to guess each color with the same probability, so she will have a score of \(n\), on average. The parent keeps track of how many cards of each color have already been turned up. If more black cards, say, than red cards remain in the deck, then the parent will guess black, while if an equal number of each color remain, then the parent guesses each color with probability 1/2. What is the expected number of correct guesses that will be made by the parent? Hint: Each of the \({{2n}\choose n}\) possible orderings of red and black cards corresponds to a random walk of length \(2n\) that returns to the origin at time \(2n\). Show that between each pair of successive equalizations, the parent will be right exactly once more than he will be wrong. Explain why this means that the average number of correct guesses by the parent is greater than \(n\) by exactly one-half the average number of equalizations. Now define the random variable \(X_i\) to be 1 if there is an equalization at time \(2i\), and 0 otherwise. Then, among all relevant paths, we have
\[E(X_i) = P(X_i = 1) = \frac{{{2n-2i}\choose{n-i}}{{2i}\choose i}}{{{2n}\choose n}}\ .\]
Thus, the expected number of equalizations equals
\[E\biggl(\sum_{i = 1}^n X_i\biggr) = \frac 1{{{2n}\choose n}} \sum_{i = 1}^n {{2n-2i}\choose{n-i}}{{2i}\choose i}\ .\]
One can now use generating functions to find the value of the sum.
It should be noted that in a game such as this, a more interesting question than the one asked above is: what is the probability that the parent wins the game? For this game, this question was answered by D. Zagier.\(^5\) He showed that the probability of winning is asymptotic (for large \(n\)) to the quantity \[\frac 12 + \frac 1{2\sqrt 2}\ .\]
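The claim in the hint, that the parent's expected score exceeds \(n\) by one-half the expected number of equalizations, can be checked exactly for small decks. The following is a minimal sketch, not a solution of the exercise: `expected_correct` is a hypothetical helper that computes the parent's expected score by recursion on the numbers of red and black cards remaining, and `expected_equalizations` evaluates the sum over \(i\) directly.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def expected_correct(red, black):
    """Expected correct guesses by the parent with `red` and `black` cards left."""
    total = red + black
    if total == 0:
        return Fraction(0)
    if red == black:
        # Coin-flip guess: right with probability 1/2 whichever card appears.
        right_if_red = right_if_black = Fraction(1, 2)
    else:
        # The parent guesses the colour that is in the majority.
        right_if_red = Fraction(int(red > black))
        right_if_black = Fraction(int(black > red))
    value = Fraction(0)
    if red:
        value += Fraction(red, total) * (right_if_red + expected_correct(red - 1, black))
    if black:
        value += Fraction(black, total) * (right_if_black + expected_correct(red, black - 1))
    return value


def expected_equalizations(n):
    """(1 / C(2n, n)) * sum over i = 1..n of C(2n-2i, n-i) * C(2i, i)."""
    total = sum(comb(2 * n - 2 * i, n - i) * comb(2 * i, i) for i in range(1, n + 1))
    return Fraction(total, comb(2 * n, n))


for n in range(1, 9):
    lhs = expected_correct(n, n)
    rhs = n + Fraction(1, 2) * expected_equalizations(n)
    print(n, lhs, rhs)
```

The recursion makes no use of the random-walk correspondence, so agreement of the two columns is an independent check of the hint.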
Exercise \(\PageIndex{11}\)
Prove that
\[u^{(2)}_{2n} = \frac 1{4^{2n}} \sum_{k = 0}^n \frac {(2n)!}{k!k!(n-k)!(n-k)!}\ ,\] and \[u^{(3)}_{2n} = \frac 1{6^{2n}} \sum_{j,k} \frac {(2n)!}{j!j!k!k!(n-j-k)!(n-j-k)!}\ ,\]
where the last sum extends over all non-negative \(j\) and \(k\) with \(j+k \le n\). Also show that this last expression may be rewritten as
\[\frac 1{2^{2n}}{{2n}\choose n} \sum_{j,k} \biggl(\frac 1{3^n}\frac{n!}{j!k!(n-j-k)!}\biggr)^2\ .\]
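Before attempting the proof, it may help to confirm the three expressions numerically. The sketch below assumes that \(u^{(d)}_{2n}\) is the probability that the simple random walk on the integer lattice in \(d\) dimensions is at the origin at time \(2n\); the helper names are illustrative. Return paths are counted directly by propagating path counts step by step and then compared with the stated sums.

```python
from collections import Counter
from fractions import Fraction
from math import factorial as f


def return_probability_by_counting(dim, n):
    """P(walk is at the origin at time 2n) on the dim-dimensional lattice."""
    unit_steps = []
    for axis in range(dim):
        for sign in (1, -1):
            step = [0] * dim
            step[axis] = sign
            unit_steps.append(tuple(step))
    origin = (0,) * dim
    counts = Counter({origin: 1})  # number of paths ending at each position
    for _ in range(2 * n):
        nxt = Counter()
        for pos, ways in counts.items():
            for step in unit_steps:
                nxt[tuple(p + s for p, s in zip(pos, step))] += ways
        counts = nxt
    return Fraction(counts[origin], (2 * dim) ** (2 * n))


def u2(n):
    total = sum(f(2 * n) // (f(k) ** 2 * f(n - k) ** 2) for k in range(n + 1))
    return Fraction(total, 4 ** (2 * n))


def u3(n):
    total = sum(
        f(2 * n) // (f(j) ** 2 * f(k) ** 2 * f(n - j - k) ** 2)
        for j in range(n + 1)
        for k in range(n + 1 - j)
    )
    return Fraction(total, 6 ** (2 * n))


def u3_rewritten(n):
    squares = sum(
        Fraction(f(n), 3 ** n * f(j) * f(k) * f(n - j - k)) ** 2
        for j in range(n + 1)
        for k in range(n + 1 - j)
    )
    return Fraction(f(2 * n), 2 ** (2 * n) * f(n) ** 2) * squares


for n in range(1, 5):
    print(n, u2(n), u3(n), u3_rewritten(n))
```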
Exercise \(\PageIndex{12}\)
Prove that if \(n \ge 0\), then
\[\sum_{k = 0}^n {n \choose k}^2 = {{2n} \choose n}\ .\]
Write the sum as
\[\sum_{k = 0}^n {n \choose k}{n \choose {n-k}}\]
and explain why this is a coefficient in the product
\[(1 + x)^n (1 + x)^n\ .\]
Use this, together with Exercise \(\PageIndex{11}\), to show that
\[u^{(2)}_{2n} = \frac 1{4^{2n}} {{2n}\choose n} \sum_{k = 0}^n {n \choose k}^2 = \frac 1{4^{2n}} {{2n}\choose n}^2\ .\]
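The coefficient argument suggested in this exercise can be illustrated numerically. As a minimal sketch (the helper names are hypothetical), the code below multiplies the coefficient list of \((1+x)^n\) by itself, reads off the coefficient of \(x^n\), and compares it with both sides of the identity.

```python
from math import comb


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def middle_coefficient(n):
    """Coefficient of x^n in (1 + x)^n * (1 + x)^n."""
    one_plus_x_to_n = [comb(n, k) for k in range(n + 1)]
    return poly_mul(one_plus_x_to_n, one_plus_x_to_n)[n]


for n in range(8):
    sum_of_squares = sum(comb(n, k) ** 2 for k in range(n + 1))
    print(n, middle_coefficient(n), sum_of_squares, comb(2 * n, n))
```

A numerical check of this kind is not a proof, but it shows which coefficient of the product is the relevant one.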
Exercise \(\PageIndex{13}\)
Using Stirling’s Formula, prove that
\[{{2n}\choose n} \sim \frac{2^{2n}}{\sqrt{\pi n}}\ .\]
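The standard consequence of Stirling's Formula for the central binomial coefficient, \({{2n}\choose n} \sim 2^{2n}/\sqrt{\pi n}\), can be observed numerically. The sketch below, with an illustrative function name, prints the ratio of the exact value to the approximation; if the asymptotic statement holds, the ratio should approach 1 as \(n\) grows.

```python
from math import comb, pi, sqrt


def stirling_ratio(n):
    """C(2n, n) divided by the approximation 2^(2n) / sqrt(pi * n)."""
    # Python divides the two large integers exactly before rounding to a float.
    return comb(2 * n, n) / 4 ** n * sqrt(pi * n)


for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))
```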
Exercise \(\PageIndex{14}\)
Prove that
\[\sum_{j,k} \biggl(\frac 1{3^n}\frac{n!}{j!k!(n-j-k)!}\biggr) = 1\ ,\]
where the sum extends over all non-negative \(j\) and \(k\) such that \(j + k \le n\). Hint: Count how many ways one can place \(n\) labelled balls in 3 labelled urns.
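The counting idea in the hint can be tried out for small \(n\). In the sketch below (names are illustrative), every assignment of \(n\) labelled balls to urns 0, 1, 2 is enumerated and grouped by \((j, k)\), the numbers of balls in urns 0 and 1; the group sizes are then compared with \(n!/(j!\,k!\,(n-j-k)!)\).

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial as f


def urn_occupancy_counts(n):
    """Number of assignments of n labelled balls to 3 urns, keyed by (j, k)."""
    counts = Counter()
    for assignment in product(range(3), repeat=n):
        counts[(assignment.count(0), assignment.count(1))] += 1
    return counts


def trinomial_sum(n):
    """Sum over j + k <= n of (1 / 3^n) * n! / (j! k! (n-j-k)!)."""
    return sum(
        Fraction(f(n), 3 ** n * f(j) * f(k) * f(n - j - k))
        for j in range(n + 1)
        for k in range(n + 1 - j)
    )


for n in range(1, 7):
    counts = urn_occupancy_counts(n)
    print(n, sum(counts.values()), 3 ** n, trinomial_sum(n))
```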
Exercise \(\PageIndex{15}\)
Using the result proved for the random walk in \({\mathbf R}^3\) in Example [exam 12.1.0.6], explain why the probability of an eventual return in \({\mathbf R}^n\) is strictly less than one, for all \(n \ge 3\). Hint: Consider a random walk in \({\mathbf R}^n\) and disregard all but the first three coordinates of the particle’s position.