6.2: Variance of Discrete Random Variables
The usefulness of the expected value as a prediction for the outcome of an experiment is increased when the outcome is not likely to deviate too much from the expected value. In this section we shall introduce a measure of this deviation, called the variance.
Variance
Let \(X\) be a numerically valued random variable with expected value \(\mu = E(X)\). Then the variance of \(X\), denoted by \(V(X)\), is \[V(X) = E((X - \mu)^2)\ .\]
Note that, by Theorem 6.1.1, \(V(X)\) is given by
\[V(X) = \sum_x (x - \mu)^2 m(x)\ , \label{eq 6.1}\] where \(m\) is the distribution function of \(X\).
Standard Deviation
The standard deviation of \(X\), denoted by \(D(X)\), is \(D(X) = \sqrt {V(X)}\). We often write \(\sigma\) for \(D(X)\) and \(\sigma^2\) for \(V(X)\).
Consider one roll of a die. Let \(X\) be the number that turns up. To find \(V(X)\), we must first find the expected value of \(X\).
Solution
This is
\[\begin{align} \mu &= E(X) = 1\Bigl(\frac{1}{6}\Bigr) + 2\Bigl(\frac{1}{6}\Bigr) + 3\Bigl(\frac{1}{6}\Bigr) + 4\Bigl(\frac{1}{6}\Bigr) + 5\Bigl(\frac{1}{6}\Bigr) + 6\Bigl(\frac{1}{6}\Bigr) \\ &= \frac{7}{2}\ .\end{align}\]
To find the variance of \(X\), we form the new random variable \((X - \mu)^2\) and compute its expectation. We can easily do this using the following table.
\[\begin{array}{ccc} x & m(x) & (x - 7/2)^2 \\ \hline 1 & 1/6 & 25/4 \\ 2 & 1/6 & 9/4 \\ 3 & 1/6 & 1/4 \\ 4 & 1/6 & 1/4 \\ 5 & 1/6 & 9/4 \\ 6 & 1/6 & 25/4 \end{array}\]
From this table we find \(E((X - \mu)^2)\) is \[\begin{align} V(X) &= \frac{1}{6} \left( \frac{25}{4} + \frac{9}{4} + \frac{1}{4} + \frac{1}{4} + \frac{9}{4} + \frac{25}{4} \right) \\ &= \frac{35}{12}\ , \end{align}\]
and the standard deviation \(D(X) = \sqrt{35/12} \approx 1.708\).
Calculation of Variance
We next prove a theorem that gives us a useful alternative form for computing the variance.
If \(X\) is any random variable with \(E(X) = \mu\), then \[V(X) = E(X^2) - \mu^2\ .\]
 Proof

We have \[\begin{aligned} V(X) &= E((X - \mu)^2) = E(X^2 - 2\mu X + \mu^2) \\ &= E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2\ .\end{aligned}\]
Using Theorem \(\PageIndex{1}\), we can compute the variance of the outcome of a roll of a die by first computing \[\begin{align} E(X^2) &= 1\Bigl(\frac{1}{6}\Bigr) + 4\Bigl(\frac{1}{6}\Bigr) + 9\Bigl(\frac{1}{6}\Bigr) + 16\Bigl(\frac{1}{6}\Bigr) + 25\Bigl(\frac{1}{6}\Bigr) + 36\Bigl(\frac{1}{6}\Bigr) \\ &= \frac{91}{6}\ ,\end{align}\] and then \[V(X) = E(X^2) - \mu^2 = \frac{91}{6} - \Bigl(\frac{7}{2}\Bigr)^2 = \frac{35}{12}\ ,\] in agreement with the value obtained directly from the definition of \(V(X)\).
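The two computations for the die can be checked with a short Python sketch that uses exact fractions. The representation of a distribution as a dictionary and the helper names `mean`, `variance_by_definition`, and `variance_by_shortcut` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

# Distribution of one roll of a fair die: value -> probability.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance_by_definition(dist):
    """V(X) = E((X - mu)^2), computed directly from the definition."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_by_shortcut(dist):
    """V(X) = E(X^2) - mu^2, the alternative form."""
    mu = mean(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

print(mean(die))                    # 7/2
print(variance_by_definition(die))  # 35/12
print(variance_by_shortcut(die))    # 35/12
```

Working with `Fraction` rather than floating point keeps the results exact, so the two forms can be compared with `==`.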
Properties of Variance
The variance has properties very different from those of the expectation. If \(c\) is any constant, \(E(cX) = cE(X)\) and \(E(X + c) = E(X) + c\). These two statements imply that the expectation is a linear function. However, the variance is not linear, as seen in the next theorem.
If \(X\) is any random variable and \(c\) is any constant, then \[V(cX) = c^2 V(X)\] and \[V(X + c) = V(X)\ .\]
 Proof

Let \(\mu = E(X)\). Then \(E(cX) = c\mu\), and \[\begin{aligned} V(cX) &= E((cX - c\mu)^2) = E(c^2(X - \mu)^2) \\ &= c^2 E((X - \mu)^2) = c^2 V(X)\ .\end{aligned}\]
To prove the second assertion, we note that, to compute \(V(X + c)\), we would replace \(x\) by \(x + c\) and \(\mu\) by \(\mu + c\) in Equation [eq 6.1]. Then the \(c\)’s would cancel, leaving \(V(X)\).
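As an illustration of the two identities, the following Python sketch checks them for a fair die with the arbitrarily chosen constant \(c = 3\). The helper `transform`, which builds the distribution of \(f(X)\) from that of \(X\), is a hypothetical name introduced only for this example.

```python
from fractions import Fraction

# Distribution of one roll of a fair die: value -> probability.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def variance(dist):
    """V(X) = E((X - mu)^2) for a distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def transform(dist, f):
    """Distribution of f(X), merging probabilities of values that coincide."""
    out = {}
    for x, p in dist.items():
        out[f(x)] = out.get(f(x), 0) + p
    return out

c = 3  # arbitrary constant chosen for the check
v = variance(die)
print(variance(transform(die, lambda x: c * x)) == c ** 2 * v)  # True
print(variance(transform(die, lambda x: x + c)) == v)           # True
```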
We turn now to some general properties of the variance. Recall that if \(X\) and \(Y\) are any two random variables, \(E(X + Y) = E(X) + E(Y)\). This is not always true for the case of the variance. For example, let \(X\) be a random variable with \(V(X) \ne 0\), and define \(Y = -X\). Then \(V(X) = V(Y)\), so that \(V(X) + V(Y) = 2V(X)\). But \(X + Y\) is always 0 and hence has variance 0. Thus \(V(X + Y) \ne V(X) + V(Y)\).
In the important case of mutually independent random variables, however, the variance of the sum is the sum of the variances.
Let \(X\) and \(Y\) be two independent random variables. Then \[V(X + Y) = V(X) + V(Y)\ .\]
 Proof

Let \(E(X) = a\) and \(E(Y) = b\). Then \[\begin{aligned} V(X + Y) &= E((X + Y)^2) - (a + b)^2 \\ &= E(X^2) + 2E(XY) + E(Y^2) - a^2 - 2ab - b^2\ .\end{aligned}\] Since \(X\) and \(Y\) are independent, \(E(XY) = E(X)E(Y) = ab\). Thus, \[V(X + Y) = E(X^2) - a^2 + E(Y^2) - b^2 = V(X) + V(Y)\ .\]
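The role of independence can be seen in a small Python sketch. It builds the distribution of \(X + Y\) from a joint distribution in two cases: two independent rolls of a fair die, and the dependent pair \(Y = -X\). The function `sum_distribution` and the dictionary layout are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Distribution of one roll of a fair die: value -> probability.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def variance(dist):
    """V(X) = E((X - mu)^2) for a distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def sum_distribution(joint):
    """Distribution of X + Y from a joint distribution {(x, y): probability}."""
    out = {}
    for (x, y), p in joint.items():
        out[x + y] = out.get(x + y, 0) + p
    return out

# Independent case: the joint probability is the product of the marginals.
independent = {(x, y): die[x] * die[y] for x, y in product(die, die)}
# Dependent case: Y = -X, so X + Y is always 0.
dependent = {(x, -x): p for x, p in die.items()}

print(variance(sum_distribution(independent)))  # 35/6
print(variance(sum_distribution(dependent)))    # 0
```

In the independent case the result is \(35/12 + 35/12 = 35/6\); in the dependent case the variance of the sum is 0 even though each summand has variance \(35/12\).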
It is easy to extend this proof, by mathematical induction, to show that the variance of the sum of any number of mutually independent random variables is the sum of the individual variances. Thus we have the following theorem.
Let \(X_1\), \(X_2\), …, \(X_n\) be an independent trials process with \(E(X_j) = \mu\) and \(V(X_j) = \sigma^2\). Let \[S_n = X_1 + X_2 +\cdots+ X_n\] be the sum, and \[A_n = \frac{S_n}{n}\] be the average. Then \[\begin{aligned} E(S_n) &= n\mu\ , \\ V(S_n) &= n\sigma^2\ , \\ \sigma(S_n) &= \sigma \sqrt{n}\ , \\ E(A_n) &= \mu\ , \\ V(A_n) &= \frac{\sigma^2}{n}\ , \\ \sigma(A_n) &= \frac{\sigma}{\sqrt n}\ .\end{aligned}\]
 Proof

Since all the random variables \(X_j\) have the same expected value, we have \[E(S_n) = E(X_1) +\cdots+ E(X_n) = n\mu\ ,\] \[V(S_n) = V(X_1) +\cdots+ V(X_n) = n\sigma^2\ ,\] and \[\sigma(S_n) = \sigma \sqrt{n}\ .\]
We have seen that, if we multiply a random variable \(X\) with mean \(\mu\) and variance \(\sigma^2\) by a constant \(c\), the new random variable has expected value \(c\mu\) and variance \(c^2\sigma^2\). Thus,
\[E(A_n) = E\left(\frac {S_n}n \right) = \frac {n\mu}n = \mu\ ,\]
and
\[V(A_n) = V\left( \frac {S_n}n \right) = \frac {V(S_n)}{n^2} = \frac {n\sigma^2}{n^2} = \frac {\sigma^2}n\ .\]
Finally, the standard deviation of \(A_n\) is given by
\[\sigma(A_n) = \frac {\sigma}{\sqrt n}\ .\]
The last equation in the above theorem implies that in an independent trials process, if the individual summands have finite variance, then the standard deviation of the average goes to 0 as \(n \rightarrow \infty\). Since the standard deviation tells us something about the spread of the distribution around the mean, we see that for large values of \(n\), the value of \(A_n\) is usually very close to the mean of \(A_n\), which equals \(\mu\), as shown above. This statement is made precise in Chapter 8 where it is called the Law of Large Numbers. For example, let \(X\) represent the roll of a fair die. In Figure [fig 6.4.5], we show the distribution of a random variable \(A_n\) corresponding to \(X\), for \(n = 10\) and \(n = 100\).
Consider \(n\) rolls of a die. We have seen that, if \(X_j\) is the outcome of the \(j\)th roll, then \(E(X_j) = 7/2\) and \(V(X_j) = 35/12\). Thus, if \(S_n\) is the sum of the outcomes, and \(A_n = S_n/n\) is the average of the outcomes, we have \(E(A_n) = 7/2\) and \(V(A_n) = (35/12)/n\). Therefore, as \(n\) increases, the expected value of the average remains constant, but the variance tends to 0. If the variance is a measure of the expected deviation from the mean, this would indicate that, for large \(n\), we can expect the average to be very near the expected value. This is in fact the case, and we shall justify it in Chapter 8.
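The shrinking spread of \(A_n\) can be examined exactly for moderate \(n\). The Python sketch below counts the roll sequences giving each total, converts the counts to the distribution of \(A_n\), and reports its mean, its variance, and the probability that \(A_n\) lies within \(1/2\) of \(7/2\). The function names and the tolerance \(1/2\) are choices made for this illustration.

```python
from fractions import Fraction

def average_distribution(n):
    """Exact distribution of A_n = S_n / n for n rolls of a fair die."""
    ways = {0: 1}  # ways[t] = number of roll sequences with total t
    for _ in range(n):
        new = {}
        for total, w in ways.items():
            for face in range(1, 7):
                new[total + face] = new.get(total + face, 0) + w
        ways = new
    return {Fraction(t, n): Fraction(w, 6 ** n) for t, w in ways.items()}

def mean(dist):
    """Expected value of a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X) = E((X - mu)^2)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def prob_near_mean(dist, eps):
    """P(|A_n - 7/2| <= eps)."""
    return sum(p for x, p in dist.items() if abs(x - Fraction(7, 2)) <= eps)

for n in (1, 10, 100):
    a = average_distribution(n)
    print(n, mean(a), variance(a), float(prob_near_mean(a, Fraction(1, 2))))
```

The printed variances are \(35/12\), \(7/24\), and \(7/240\), that is, \((35/12)/n\), while the mean stays at \(7/2\) and the probability of landing near the mean grows with \(n\).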
Bernoulli Trials
Consider next the general Bernoulli trials process. As usual, we let \(X_j = 1\) if the \(j\)th outcome is a success and 0 if it is a failure. If \(p\) is the probability of a success, and \(q = 1 - p\), then \[\begin{aligned} E(X_j) &= 0q + 1p = p\ , \\ E(X_j^2) &= 0^2q + 1^2p = p\ ,\end{aligned}\] and \[V(X_j) = E(X_j^2) - (E(X_j))^2 = p - p^2 = pq\ .\]
Thus, for Bernoulli trials, if \(S_n = X_1 + X_2 +\cdots+ X_n\) is the number of successes, then \(E(S_n) = np\), \(V(S_n) = npq\), and \(D(S_n) = \sqrt{npq}\). If \(A_n = S_n/n\) is the average number of successes, then \(E(A_n) = p\), \(V(A_n) = pq/n\), and \(D(A_n) = \sqrt{pq/n}\). We see that the expected proportion of successes remains \(p\) and the variance tends to 0. This suggests that the frequency interpretation of probability is a correct one. We shall make this more precise in Chapter 8.
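These formulas can be checked against the binomial distribution itself. The Python sketch below, with the arbitrarily chosen values \(n = 20\) and \(p = 1/5\), computes the mean and variance of \(S_n\) and \(A_n\) from their exact distributions; the function `binomial_distribution` is a helper written for this example.

```python
from fractions import Fraction
from math import comb

def binomial_distribution(n, p):
    """Distribution of S_n, the number of successes in n Bernoulli trials."""
    q = 1 - p
    return {k: comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)}

def mean(dist):
    """Expected value of a distribution given as {value: probability}."""
    return sum(x * pr for x, pr in dist.items())

def variance(dist):
    """V(X) = E((X - mu)^2)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * pr for x, pr in dist.items())

n, p = 20, Fraction(1, 5)  # illustrative parameters
q = 1 - p
s = binomial_distribution(n, p)
a = {Fraction(k, n): pr for k, pr in s.items()}  # distribution of A_n

print(mean(s), n * p)          # 4 4
print(variance(s), n * p * q)  # 16/5 16/5
print(mean(a), p)              # 1/5 1/5
print(variance(a), p * q / n)  # 1/125 1/125
```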
Let \(T\) denote the number of trials until the first success in a Bernoulli trials process. Then \(T\) is geometrically distributed. What is the variance of \(T\)?
 Answer

In Example [exam 5.7], we saw that \[m_T = \pmatrix{1 & 2 & 3 & \cdots \cr p & qp & q^2p & \cdots \cr}.\] In Example [exam 6.8], we showed that \[E(T) = 1/p\ .\] Thus, \[V(T) = E(T^2) - 1/p^2\ ,\] so we need only find \[\begin{aligned} E(T^2) &= 1p + 4qp + 9q^2p + \cdots \\ &= p(1 + 4q + 9q^2 + \cdots )\ .\end{aligned}\] To evaluate this sum, we start again with \[1 + x + x^2 +\cdots= \frac{1}{1 - x}\ .\] Differentiating, we obtain \[1 + 2x + 3x^2 +\cdots= \frac{1}{(1 - x)^2}\ .\] Multiplying by \(x\), \[x + 2x^2 + 3x^3 +\cdots= \frac{x}{(1 - x)^2}\ .\] Differentiating again gives \[1 + 4x + 9x^2 +\cdots= \frac{1 + x}{(1 - x)^3}\ .\] Thus, \[E(T^2) = p\frac{1 + q}{(1 - q)^3} = \frac{1 + q}{p^2}\] and \[\begin{aligned} V(T) &= E(T^2) - (E(T))^2 \\ &= \frac{1 + q}{p^2} - \frac{1}{p^2} = \frac{q}{p^2}\ .\end{aligned}\]
For example, the variance for the number of tosses of a coin until the first head turns up is \((1/2)/(1/2)^2 = 2\). The variance for the number of rolls of a die until the first six turns up is \((5/6)/(1/6)^2 = 30\). Note that, as \(p\) decreases, the variance increases rapidly. This corresponds to the increased spread of the geometric distribution as \(p\) decreases (noted in Figure [fig 5.4]).
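The formula \(V(T) = q/p^2\) can be checked numerically by truncating the series for \(E(T)\) and \(E(T^2)\). The Python sketch below does this in floating point; the function name `geometric_mean_and_variance` and the cutoff of 2000 terms are assumptions made for the illustration, and the results are approximations rather than exact values.

```python
def geometric_mean_and_variance(p, terms=2000):
    """Approximate E(T) and V(T) by summing the first `terms` terms of the series."""
    q = 1.0 - p
    prob = p  # P(T = 1)
    mean = 0.0
    second_moment = 0.0
    for k in range(1, terms + 1):
        mean += k * prob
        second_moment += k * k * prob
        prob *= q  # P(T = k + 1) = q * P(T = k)
    return mean, second_moment - mean ** 2

for p in (1 / 2, 1 / 6):
    m, v = geometric_mean_and_variance(p)
    q = 1 - p
    # Compare the truncated sums with 1/p and q/p^2.
    print(p, m, 1 / p, v, q / p ** 2)
```

For \(p = 1/2\) and \(p = 1/6\) the truncated sums agree with \(2\) and \(30\) to within floating-point error, since the neglected tail of the series is negligibly small.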
Poisson Distribution
Just as in the case of expected values, it is easy to guess the variance of the Poisson distribution with parameter \(\lambda\). We recall that the variance of a binomial distribution with parameters \(n\) and \(p\) equals \(npq\). We also recall that the Poisson distribution could be obtained as a limit of binomial distributions, if \(n\) goes to \(\infty\) and \(p\) goes to 0 in such a way that their product is kept fixed at the value \(\lambda\). In this case, \(npq = \lambda q\) approaches \(\lambda\), since \(q\) goes to 1. So, given a Poisson distribution with parameter \(\lambda\), we should guess that its variance is \(\lambda\). The reader is asked to show this in Exercise \(\PageIndex{29}\).
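The guess can be supported numerically, though this is not a proof. The Python sketch below, with the arbitrarily chosen value \(\lambda = 3\), first prints the binomial variance \(npq\) for increasing \(n\) with \(np = \lambda\) held fixed, and then approximates the Poisson mean and variance by truncating the defining series. The function name and the cutoff of 200 terms are assumptions made for this illustration.

```python
import math

def poisson_mean_and_variance(lam, terms=200):
    """Approximate the Poisson mean and variance from the first `terms` terms."""
    prob = math.exp(-lam)  # P(X = 0)
    mean = 0.0
    second_moment = 0.0
    for k in range(terms):
        mean += k * prob
        second_moment += k * k * prob
        prob *= lam / (k + 1)  # P(X = k + 1) = P(X = k) * lam / (k + 1)
    return mean, second_moment - mean ** 2

lam = 3.0  # illustrative parameter

# Binomial variance npq with np = lam fixed: equals lam * (1 - lam / n).
for n in (10, 100, 1000):
    p = lam / n
    print(n, n * p * (1 - p))

# Truncated Poisson series: both values should be close to lam.
print(poisson_mean_and_variance(lam))
```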
Exercises
Exercise \(\PageIndex{1}\)
A number is chosen at random from the set \(S = \{-1,0,1\}\). Let \(X\) be the number chosen. Find the expected value, variance, and standard deviation of \(X\).
Exercise \(\PageIndex{2}\)
A random variable \(X\) has the distribution \[p_X = \pmatrix{ 0 & 1 & 2 & 4 \cr 1/3 & 1/3 & 1/6 & 1/6 \cr}\ .\] Find the expected value, variance, and standard deviation of \(X\).
Exercise \(\PageIndex{3}\)
You place a 1-dollar bet on the number 17 at Las Vegas, and your friend places a 1-dollar bet on black (see Exercises 1.1.6 and 1.1.7). Let \(X\) be your winnings and \(Y\) be her winnings. Compare \(E(X)\), \(E(Y)\), and \(V(X)\), \(V(Y)\). What do these computations tell you about the nature of your winnings if you and your friend make a sequence of bets, with you betting each time on a number and your friend betting on a color?
Exercise \(\PageIndex{4}\)
\(X\) is a random variable with \(E(X) = 100\) and \(V(X) = 15\). Find
 \(E(X^2)\).
 \(E(3X + 10)\).
 \(E(-X)\).
 \(V(-X)\).
 \(D(-X)\).
Exercise \(\PageIndex{5}\)
In a certain manufacturing process, the (Fahrenheit) temperature never varies by more than \(2^\circ\) from \(62^\circ\). The temperature is, in fact, a random variable \(F\) with distribution \[P_F = \pmatrix{ 60 & 61 & 62 & 63 & 64 \cr 1/10 & 2/10 & 4/10 & 2/10 & 1/10 \cr}\ .\]
 Find \(E(F)\) and \(V(F)\).
 Define \(T = F - 62\). Find \(E(T)\) and \(V(T)\), and compare these answers with those in part (a).
 It is decided to report the temperature readings on a Celsius scale, that is, \(C = (5/9)(F - 32)\). What are the expected value and variance for the readings now?
Exercise \(\PageIndex{6}\)
Write a computer program to calculate the mean and variance of a distribution which you specify as data. Use the program to compare the variances for the following densities, both having expected value 0: \[p_X = \pmatrix{ -2 & -1 & 0 & 1 & 2 \cr 3/11 & 2/11 & 1/11 & 2/11 & 3/11 \cr}\ ;\] \[p_Y = \pmatrix{ -2 & -1 & 0 & 1 & 2 \cr 1/11 & 2/11 & 5/11 & 2/11 & 1/11 \cr}\ .\]
Exercise \(\PageIndex{7}\)
A coin is tossed three times. Let \(X\) be the number of heads that turn up. Find \(V(X)\) and \(D(X)\).
Exercise \(\PageIndex{8}\)
A random sample of 2400 people are asked if they favor a government proposal to develop new nuclear power plants. If 40 percent of the people in the country are in favor of this proposal, find the expected value and the standard deviation for the number \(S_{2400}\) of people in the sample who favored the proposal.
Exercise \(\PageIndex{9}\)
A die is loaded so that the probability of a face coming up is proportional to the number on that face. The die is rolled with outcome \(X\). Find \(V(X)\) and \(D(X)\).
Exercise \(\PageIndex{10}\)
Prove the following facts about the standard deviation.
 \(D(X + c) = D(X)\).
 \(D(cX) = |c|D(X)\).
Exercise \(\PageIndex{11}\)
A number is chosen at random from the integers 1, 2, 3, …, \(n\). Let \(X\) be the number chosen. Show that \(E(X) = (n + 1)/2\) and \(V(X) = (n - 1)(n + 1)/12\). Hint: The following identity may be useful: \[1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}\ .\]
Exercise \(\PageIndex{12}\)
Let \(X\) be a random variable with \(\mu = E(X)\) and \(\sigma^2 = V(X)\). Define \(X^* = (X - \mu)/\sigma\). The random variable \(X^*\) is called the standardized random variable associated with \(X\). Show that this standardized random variable has expected value 0 and variance 1.
Exercise \(\PageIndex{13}\)
Peter and Paul play Heads or Tails (see Example [exam 1.3]). Let \(W_n\) be Peter’s winnings after \(n\) matches. Show that \(E(W_n) = 0\) and \(V(W_n) = n\).
Exercise \(\PageIndex{14}\)
Find the expected value and the variance for the number of boys and the number of girls in a royal family that has children until there is a boy or until there are three children, whichever comes first.
Exercise \(\PageIndex{15}\)
Suppose that \(n\) people have their hats returned at random. Let \(X_i = 1\) if the \(i\)th person gets his or her own hat back and 0 otherwise. Let \(S_n = \sum_{i = 1}^n X_i\). Then \(S_n\) is the total number of people who get their own hats back. Show that
 \(E(X_i^2) = 1/n\).
 \(E(X_i \cdot X_j) = 1/(n(n - 1))\) for \(i \ne j\).
 \(E(S_n^2) = 2\) (using (a) and (b)).
 \(V(S_n) = 1\).
Exercise \(\PageIndex{16}\)
Let \(S_n\) be the number of successes in \(n\) independent trials. Use the program BinomialProbabilities (Section [sec 3.2]) to compute, for given \(n\), \(p\), and \(j\), the probability \[P(-j\sqrt{npq} < S_n - np < j\sqrt{npq})\ .\]
 Let \(p = .5\), and compute this probability for \(j = 1\), 2, 3 and \(n = 10\), 30, 50. Do the same for \(p = .2\).
 Show that the standardized random variable \(S_n^* = (S_n - np)/\sqrt{npq}\) has expected value 0 and variance 1. What do your results from (a) tell you about this standardized quantity \(S_n^*\)?
Exercise \(\PageIndex{17}\)
Let \(X\) be the outcome of a chance experiment with \(E(X) = \mu\) and \(V(X) = \sigma^2\). When \(\mu\) and \(\sigma^2\) are unknown, the statistician often estimates them by repeating the experiment \(n\) times with outcomes \(x_1\), \(x_2\), …, \(x_n\), estimating \(\mu\) by the sample mean
\[\bar{x} = \frac{1}{n} \sum_{i = 1}^n x_i\ ,\]
and \(\sigma^2\) by the sample variance
\[s^2 = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar x)^2\ .\]
Then \(s\) is the sample standard deviation. These formulas should remind the reader of the definitions of the theoretical mean and variance. (Many statisticians define the sample variance with the coefficient \(1/n\) replaced by \(1/(n-1)\). If this alternative definition is used, the expected value of \(s^2\) is equal to \(\sigma^2\). See Exercise \(\PageIndex{18}\), part (d).)
Write a computer program that will roll a die \(n\) times and compute the sample mean and sample variance. Repeat this experiment several times for \(n = 10\) and \(n = 1000\). How well do the sample mean and sample variance estimate the true mean 7/2 and variance 35/12?
Exercise \(\PageIndex{18}\)
Show that, for the sample mean \(\bar x\) and sample variance \(s^2\) as defined in Exercise \(\PageIndex{17}\),
 \(E(\bar x) = \mu\).
 \(E\bigl((\bar x - \mu)^2\bigr) = \sigma^2/n\).
 \(E(s^2) = \frac{n-1}{n}\sigma^2\). Hint: For (c) write \[\begin{aligned} \sum_{i = 1}^n (x_i - \bar x)^2 &= \sum_{i = 1}^n \bigl((x_i - \mu) - (\bar x - \mu)\bigr)^2 \\ &= \sum_{i = 1}^n (x_i - \mu)^2 - 2(\bar x - \mu) \sum_{i = 1}^n (x_i - \mu) + n(\bar x - \mu)^2 \\ &= \sum_{i = 1}^n (x_i - \mu)^2 - n(\bar x - \mu)^2,\end{aligned}\] and take expectations of both sides, using part (b) when necessary.
 Show that if, in the definition of \(s^2\) in Exercise \(\PageIndex{17}\), we replace the coefficient \(1/n\) by the coefficient \(1/(n-1)\), then \(E(s^2) = \sigma^2\). (This shows why many statisticians use the coefficient \(1/(n-1)\). The number \(s^2\) is used to estimate the unknown quantity \(\sigma^2\). If an estimator has an average value which equals the quantity being estimated, then the estimator is said to be unbiased. Thus, the statement \(E(s^2) = \sigma^2\) says that \(s^2\) is an unbiased estimator of \(\sigma^2\).)
Exercise \(\PageIndex{19}\)
Let \(X\) be a random variable taking on values \(a_1\), \(a_2\), …, \(a_r\) with probabilities \(p_1\), \(p_2\), …, \(p_r\) and with \(E(X) = \mu\). Define the spread of \(X\) as follows: \[\bar\sigma = \sum_{i = 1}^r |a_i - \mu| p_i\ .\] This, like the standard deviation, is a way to quantify the amount that a random variable is spread out around its mean. Recall that the variance of a sum of mutually independent random variables is the sum of the individual variances. The square of the spread corresponds to the variance in a manner similar to the correspondence between the spread and the standard deviation. Show by an example that it is not necessarily true that the square of the spread of the sum of two independent random variables is the sum of the squares of the individual spreads.
Exercise \(\PageIndex{20}\)
We have two instruments that measure the distance between two points. The measurements given by the two instruments are random variables \(X_1\) and \(X_2\) that are independent with \(E(X_1) = E(X_2) = \mu\), where \(\mu\) is the true distance. From experience with these instruments, we know the values of the variances \(\sigma_1^2\) and \(\sigma_2^2\). These variances are not necessarily the same. From two measurements, we estimate \(\mu\) by the weighted average \(\bar \mu = wX_1 + (1  w)X_2\). Here \(w\) is chosen in \([0,1]\) to minimize the variance of \(\bar \mu\).
 What is \(E(\bar \mu)\)?
 How should \(w\) be chosen in \([0,1]\) to minimize the variance of \(\bar \mu\)?
Exercise \(\PageIndex{21}\)
Let \(X\) be a random variable with \(E(X) = \mu\) and \(V(X) = \sigma^2\). Show that the function \(f(x)\) defined by \[f(x) = \sum_\omega (X(\omega) - x)^2 p(\omega)\] has its minimum value when \(x = \mu\).
Exercise \(\PageIndex{22}\)
Let \(X\) and \(Y\) be two random variables defined on the finite sample space \(\Omega\). Assume that \(X\), \(Y\), \(X + Y\), and \(X - Y\) all have the same distribution. Prove that \(P(X = Y = 0) = 1\).
Exercise \(\PageIndex{23}\)
If \(X\) and \(Y\) are any two random variables, then the covariance of \(X\) and \(Y\) is defined by Cov\((X,Y) = E((X - E(X))(Y - E(Y)))\). Note that Cov\((X,X) = V(X)\). Show that, if \(X\) and \(Y\) are independent, then Cov\((X,Y) = 0\); and show, by an example, that we can have Cov\((X,Y) = 0\) and \(X\) and \(Y\) not independent.
Exercise \(\PageIndex{24}\)
A professor wishes to make up a true-false exam with \(n\) questions. She assumes that she can design the problems in such a way that a student will answer the \(j\)th problem correctly with probability \(p_j\), and that the answers to the various problems may be considered independent experiments. Let \(S_n\) be the number of problems that a student will get correct. The professor wishes to choose \(p_j\) so that \(E(S_n) = .7n\) and so that the variance of \(S_n\) is as large as possible. Show that, to achieve this, she should choose \(p_j = .7\) for all \(j\); that is, she should make all the problems have the same difficulty.
Exercise \(\PageIndex{25}\)
(Lamperti^{20}) An urn contains exactly 5000 balls, of which an unknown number \(X\) are white and the rest red, where \(X\) is a random variable with a probability distribution on the integers 0, 1, 2, …, 5000.
 Suppose we know that \(E(X) = \mu\). Show that this is enough to allow us to calculate the probability that a ball drawn at random from the urn will be white. What is this probability?
 We draw a ball from the urn, examine its color, replace it, and then draw another. Under what conditions, if any, are the results of the two drawings independent; that is, does \[P(\text{white},\text{white}) = P(\text{white})^2\ ?\]
 Suppose the variance of \(X\) is \(\sigma^2\). What is the probability of drawing two white balls in part (b)?
Exercise \(\PageIndex{26}\)
For a sequence of Bernoulli trials, let \(X_1\) be the number of trials until the first success. For \(j \geq 2\), let \(X_j\) be the number of trials after the \((j - 1)\)st success until the \(j\)th success. It can be shown that \(X_1\), \(X_2\), … is an independent trials process.
 What is the common distribution, expected value, and variance for \(X_j\)?
 Let \(T_n = X_1 + X_2 + \cdots + X_n\). Then \(T_n\) is the time until the \(n\)th success. Find \(E(T_n)\) and \(V(T_n)\).
 Use the results of (b) to find the expected value and variance for the number of tosses of a coin until the \(n\)th occurrence of a head.
Exercise \(\PageIndex{27}\)
Referring to Exercise 6.1.30, find the variance for the number of boxes of Wheaties bought before getting half of the players’ pictures and the variance for the number of additional boxes needed to get the second half of the players’ pictures.
Exercise \(\PageIndex{28}\)
In Example 5.1.3, assume that the book in question has 1000 pages. Let \(X\) be the number of pages with no mistakes. Show that \(E(X) = 905\) and \(V(X) = 86\). Using these results, show that the probability is \({} \leq .05\) that there will be more than 924 pages without errors or fewer than 866 pages without errors.
Exercise \(\PageIndex{29}\)
Let \(X\) be Poisson distributed with parameter \(\lambda\). Show that \(V(X) = \lambda\).