# 3.7: Variance of Discrete Random Variables


We now look at our second numerical characteristic associated to random variables.

Definition \(\PageIndex{1}\)

The **variance** of a random variable \(X\) is given by

$$\sigma^2 = Var(X) = E[(X-\mu)^2],\notag$$

where \(\mu\) denotes the expected value of \(X\). The **standard deviation** of \(X\) is given by

$$\sigma = \text{SD}(X) = \sqrt{Var(X)}.\notag$$

In words, the variance of a random variable is the average of the squared deviations of the random variable from its mean (or expected value). Notice that the variance of a random variable will result in a number with units squared, but the standard deviation will have the same units as the random variable. Thus, the standard deviation is easier to interpret, which is why we make a point to define it. The variance and standard deviation give us a *measure of spread* for random variables. The standard deviation is interpreted as a measure of how "spread out'' the possible values of \(X\) are with respect to the mean of \(X\).

As with expected values, for many of the common probability distributions, the variance is given by a parameter or a function of the parameters for the distribution. For example, if continuous random variable \(X\) has a normal distribution with parameters \(\mu\) and \(\sigma\), then \(Var(X) = \sigma^2\), i.e., the parameter \(\sigma\) gives the standard deviation. Again, the normal case explains the notation used for variance and standard deviation.

Example \(\PageIndex{1}\)

Suppose \(X_1\sim\text{normal}(0, 2^2)\) and \(X_2\sim\text{normal}(0, 3^2)\). So, \(X_1\) and \(X_2\) are both normally distributed random variables with the same mean, but \(X_2\) has a larger standard deviation. Given our interpretation of standard deviation, this implies that the possible values of \(X_2\) are more "spread out'' from the mean. This is easily seen by looking at the graphs of the pdf's corresponding to \(X_1\) and \(X_2\) given in Figure 1.

Figure 1: Graph of normal pdf's: \(X_1\sim\text{normal}(0,2^2)\) in blue, \(X_2\sim\text{normal}(0,3^2)\) in red
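The spread described above can also be seen numerically. The following sketch (our illustration, not part of the text) samples from the two normal distributions and compares sample standard deviations, which should land near the true values \(2\) and \(3\):

```python
# Quick simulation: sample from X1 ~ normal(0, 2^2) and X2 ~ normal(0, 3^2)
import random
import statistics

random.seed(1)

x1 = [random.gauss(0, 2) for _ in range(100_000)]   # X1 ~ normal(0, 2^2)
x2 = [random.gauss(0, 3) for _ in range(100_000)]   # X2 ~ normal(0, 3^2)

# Sample standard deviations should be close to the true values 2 and 3
print(round(statistics.stdev(x1), 1), round(statistics.stdev(x2), 1))
```

With 100,000 samples, the sample standard deviation of \(X_2\) comes out larger than that of \(X_1\), reflecting the greater spread visible in Figure 1.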

Theorem 3.7.1 tells us how to compute variance, since it is given by finding the expected value of a function applied to the random variable. First, if \(X\) is a discrete random variable with possible values \(x_1, x_2, \ldots, x_i, \ldots\), and frequency function \(p(x_i)\), then the variance of \(X\) is given by

$$Var(X) = \sum_{i} (x_i - \mu)^2\cdot p(x_i).\notag$$

If \(X\) is continuous with pdf \(f(x)\), then

$$Var(X) = \int\limits^{\infty}_{-\infty}\! (x-\mu)^2\cdot f(x)\, dx.\notag$$

The above formulas follow directly from Definition \(\PageIndex{1}\). However, there is an alternate formula for calculating variance, given by the following theorem, that is often easier to use.
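The discrete sum above translates directly into code. As a concrete illustration (the fair-die example below is ours, not the text's), with the frequency function stored as a dictionary mapping values to probabilities:

```python
# Sketch: Var(X) = sum_i (x_i - mu)^2 * p(x_i), with pmf as {value: probability}
def variance_from_definition(pmf):
    mu = sum(x * p for x, p in pmf.items())               # E[X]
    return sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

# Fair six-sided die: the known variance is 35/12 ≈ 2.9167
die = {x: 1 / 6 for x in range(1, 7)}
print(variance_from_definition(die))
```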

Theorem \(\PageIndex{1}\)

\(Var(X) = E[X^2] - \mu^2\)

**Proof**

By the definition of *variance* and the linearity of expected value,

\begin{align*}
Var(X) &= E[(X-\mu)^2]\\
&= E[X^2+\mu^2-2\mu X]\\
&= E[X^2]+E[\mu^2]-E[2\mu X]\\
&= E[X^2] + \mu^2-2\mu E[X] \quad (\text{since \(\mu\) is a constant, it can be taken out of the expected value})\\
&= E[X^2] + \mu^2-2\mu^2\\
&= E[X^2] -\mu^2
\end{align*}

Example \(\PageIndex{2}\)

Continuing in the context of Example 23, we calculate the variance and standard deviation of the random variable \(X\) denoting the number of heads obtained in two tosses of a fair coin. Using the alternate formula for variance, we first need to calculate \(E[X^2]\), for which we use Theorem 3.7.1:

$$E[X^2] = 0^2\cdot p(0) + 1^2\cdot p(1) + 2^2\cdot p(2) = 0 + 0.5 + 1 = 1.5.\notag$$

In Example 23, we found that \(\mu = E[X] = 1\). Thus, we find

\begin{align*}

Var(X) &= E[X^2] - \mu^2 = 1.5 - 1 = 0.5 \\

\Rightarrow\ \text{SD}(X) &= \sqrt{Var(X)} = \sqrt{0.5} \approx 0.707

\end{align*}
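The arithmetic in this example can be checked directly. A minimal sketch, using the coin-toss frequency function from the text (\(p(0)=0.25\), \(p(1)=0.5\), \(p(2)=0.25\)):

```python
# Check of Example 2's numbers: X = number of heads in two fair coin tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

mu = sum(x * p for x, p in pmf.items())       # E[X] = 1.0
ex2 = sum(x**2 * p for x, p in pmf.items())   # E[X^2] = 1.5
var = ex2 - mu**2                             # Var(X) = E[X^2] - mu^2 = 0.5
sd = var ** 0.5                               # SD(X) = sqrt(0.5) ≈ 0.707

print(mu, ex2, var, round(sd, 3))  # → 1.0 1.5 0.5 0.707
```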

Example \(\PageIndex{3}\)

Continuing with Example 24, we calculate the variance and standard deviation of the random variable \(X\) denoting the time a person waits for an elevator to arrive. Again, we use the alternate formula for variance and first find \(E[X^2]\) using Theorem 3.7.1:

$$E[X^2] = \int\limits^1_0\! x^2\cdot x\, dx + \int\limits^2_1\! x^2\cdot (2-x)\, dx = \int\limits^1_0\! x^3\, dx + \int\limits^2_1\! (2x^2 - x^3)\, dx = \frac{1}{4} + \frac{11}{12} = \frac{7}{6}.\notag$$

In Example 24, we found that \(\mu = E[X] = 1\). Thus, we have

\begin{align*}

Var(X) &= E[X^2] - \mu^2 = \frac{7}{6} - 1 = \frac{1}{6} \\

\Rightarrow\ \text{SD}(X) &= \sqrt{Var(X)} = \frac{1}{\sqrt{6}} \approx 0.408

\end{align*}
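The integral for \(E[X^2]\) can be approximated numerically as a sanity check. The sketch below (our illustration) uses a midpoint Riemann sum with the pdf from the text, \(f(x)=x\) on \([0,1]\) and \(f(x)=2-x\) on \([1,2]\):

```python
# Midpoint Riemann-sum check of Example 3's integral
def f(x):
    return x if x <= 1 else 2 - x   # elevator-wait pdf on [0, 2]

n = 100_000
dx = 2 / n
mids = ((i + 0.5) * dx for i in range(n))
ex2 = sum(m**2 * f(m) * dx for m in mids)   # approximates E[X^2] = 7/6

var = ex2 - 1**2                            # mu = E[X] = 1 from the text
print(round(ex2, 4), round(var, 4))
```

The printed values agree with the exact answers \(E[X^2] = 7/6 \approx 1.1667\) and \(Var(X) = 1/6 \approx 0.1667\).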

Given that the variance of a random variable is defined to be the expected value of *squared* deviations from the mean, variance is not linear in the way that expected value is. However, variance does satisfy the following useful property.

Theorem \(\PageIndex{2}\)

Let \(X\) be a random variable, and \(a, b\) be constants. Then the following holds:

$$Var(aX + b) = a^2Var(X).\notag$$

**Proof**

Let \(\mu = E[X]\). By the linearity of expected value, \(E[aX+b] = a\mu + b\). Then

\begin{align*}
Var(aX+b) &= E\!\left[\big((aX+b) - (a\mu+b)\big)^2\right]\\
&= E\!\left[a^2(X-\mu)^2\right]\\
&= a^2E\!\left[(X-\mu)^2\right] = a^2Var(X)
\end{align*}

Theorem \(\PageIndex{2}\) follows from a little algebraic manipulation. Note that the "\(+\ b\)'' disappears in the formula. There is an intuitive reason for this. Namely, the "\(+\ b\)'' corresponds to a *horizontal shift* of the frequency function or pdf of the random variable. Such a transformation does not affect the *spread* of either of these functions, i.e., the variance does not change.
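The property can be verified numerically. The sketch below (our illustration, with arbitrary constants \(a=3\), \(b=7\)) applies the transformation to the coin-toss random variable from Example 2, whose variance is \(0.5\):

```python
# Numerical illustration of Var(aX + b) = a^2 * Var(X)
def variance(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum(x * x * p for x, p in pmf.items()) - mu * mu

pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # coin-toss X from Example 2, Var(X) = 0.5
a, b = 3, 7                        # arbitrary constants for the demo

# pmf of Y = aX + b: scale and shift the values; probabilities are unchanged
shifted = {a * x + b: p for x, p in pmf.items()}

print(variance(shifted), a**2 * variance(pmf))  # → 4.5 4.5
```

As expected, shifting by \(b\) has no effect, while scaling by \(a\) multiplies the variance by \(a^2\).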