# 12.1: Variance


In the treatment of the mathematical expectation of a real random variable \(X\), we noted that the mean value locates the center of the probability mass distribution induced by \(X\) on the real line. In this unit, we examine how expectation may be used to characterize the distribution further. In particular, we deal with the concept of *variance* and its square root, the *standard deviation*. In subsequent units, we show how these ideas may be used to characterize the distribution for a pair \(\{X, Y\}\) considered jointly, with the concepts of *covariance* and *linear regression*.

## Variance

Location of the center of mass for a distribution is important, but provides limited information. Two markedly different random variables may have the same mean value. It would be helpful to have a measure of the spread of the probability mass about the mean. Among the possibilities, the variance and its square root, the standard deviation, have been found particularly useful.

Definition: Variance & Standard Deviation

The *variance* of a random variable \(X\) is the mean square of its variation about the mean value:

\(\text{Var } [X] = \sigma_X^2 = E[(X - \mu_X)^2]\) where \(\mu_X = E[X]\)

The *standard deviation* for *X* is the positive square root \(\sigma_X\) of the variance.

**Remarks**

- If \(X(\omega)\) is the observed value of \(X\), its variation from the mean is \(X(\omega) - \mu_X\). The variance is the probability weighted average of the square of these variations.
- The square of the error treats positive and negative variations alike, and it weights large variations more heavily than smaller ones.
- As in the case of mean value, the variance is a property of the distribution, rather than of the random variable.
- We show below that the standard deviation is a “natural” measure of the variation from the mean.
- In the treatment of mathematical expectation, we show that

\(E[(X - c)^2]\) is a minimum iff \(c = E[X]\), in which case \(E[(X - E[X])^2] = E[X^2] - E^2[X]\)

This shows that the mean value is the constant which best approximates the random variable, in the mean square sense.
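This minimizing property is easy to check numerically. The following sketch (in Python/NumPy, though the computational examples in this unit use MATLAB) scans candidate constants \(c\) and confirms that the mean square error is smallest at the sample mean; the exponential sample is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # an arbitrary sample distribution

def mse(c):
    """Mean square error of approximating the sample by the constant c."""
    return np.mean((x - c) ** 2)

# Scan a grid of candidate constants around the sample mean.
cs = np.linspace(x.mean() - 1, x.mean() + 1, 201)
best = cs[np.argmin([mse(c) for c in cs])]
print(abs(best - x.mean()) < 0.02)  # the minimizer sits at the mean
```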

**Basic patterns for variance**

Since variance is the expectation of a function of the random variable *X*, we utilize properties of expectation in computations. In addition, we find it expedient to identify several patterns for variance which are frequently useful in performing calculations. For one thing, while the variance is defined as \(E[(X - \mu_X)^2]\), this is usually not the most convenient form for computation. The result quoted above gives an alternate expression.

(V1): *Calculating formula*. \(\text{Var } [X] = E[X^2] - E^2[X]\)

(V2): *Shift property*. \(\text{Var } [X + b] = \text{Var } [X]\). Adding a constant \(b\) to \(X\) shifts the distribution (hence its center of mass) by that amount. The variation of the shifted distribution about the shifted center of mass is the same as the variation of the original, unshifted distribution about the original center of mass.

(V3): *Change of scale*. \(\text{Var } [aX] = a^2\text{Var }[X]\). Multiplication of \(X\) by the constant \(a\) changes the scale by a factor \(|a|\). The squares of the variations are multiplied by \(a^2\). So also is the mean of the squares of the variations.

(V4): *Linear combinations*.

a. \(\text{Var }[aX \pm bY] = a^2\text{Var }[X] + b^2 \text{Var } [Y] \pm 2ab(E[XY] - E[X]E[Y])\)

b. More generally,

\(\text{Var } [\sum_{k = 1}^{n} a_k X_k] = \sum_{k = 1}^{n} a_k^2 \text{Var }[X_k] + 2\sum_{i < j} a_i a_j (E[X_i X_j] - E[X_i] E[X_j])\)

The term \(c_{ij} = E[X_i X_j] - E[X_i] E[X_j]\) is the covariance of the pair \(\{X_i, X_j\}\), whose role we study in the unit on that topic. If the \(c_{ij}\) are all zero, we say the class is *uncorrelated.*

*Remarks*

- If the pair \(\{X, Y\}\) is independent, it is uncorrelated. The converse is not true, as examples in the next section show.
- If the \(a_i = \pm 1\) and all pairs are uncorrelated, then

\(\text{Var }[\sum_{i = 1}^{n} a_i X_i] = \sum_{i = 1}^{n} \text{Var } [X_i]\)

The variances add even if some of the coefficients are negative.
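Pattern (V4a) may be verified directly on a small joint distribution. In the Python/NumPy sketch below, the values and joint probabilities are made up for illustration; both sides of the identity are computed from the same joint distribution and must agree.

```python
import numpy as np

# A small made-up joint distribution; rows index X values, columns Y values.
X = np.array([0.0, 1.0, 2.0])
Y = np.array([-1.0, 3.0])
P = np.array([[0.10, 0.15],
              [0.20, 0.25],
              [0.05, 0.25]])  # joint probabilities, sum to 1

PX, PY = P.sum(axis=1), P.sum(axis=0)       # marginal distributions
EX, EY = X @ PX, Y @ PY
VX = X**2 @ PX - EX**2
VY = Y**2 @ PY - EY**2
cov = X @ P @ Y - EX * EY                   # E[XY] - E[X]E[Y]

a, b = 2.0, -3.0
# Left side of (V4a): Var[aX + bY] computed directly from the joint distribution.
Z = a * X[:, None] + b * Y[None, :]
VZ = (Z**2 * P).sum() - (Z * P).sum()**2
# Right side: a^2 Var[X] + b^2 Var[Y] + 2ab Cov[X,Y]
rhs = a**2 * VX + b**2 * VY + 2 * a * b * cov
print(np.isclose(VZ, rhs))  # True
```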

We calculate variances for some common distributions. Some details are omitted—usually details of algebraic manipulation or the straightforward evaluation of integrals. In some cases we use well known sums of infinite series or values of definite integrals. A number of pertinent facts are summarized in __Appendix B: Some Mathematical Aids__. The results below are included in the table in __Appendix C__.

**Variances of some discrete distributions**

*Indicator function* \(X = I_E\), \(P(E) = p\), \(q = 1 - p\), \(E[X] = p\)

\(E[X^2] - E^2[X] = E[I_E^2] - p^2 = E[I_E] - p^2 = p - p^2 = p(1 - p) = pq\)

*Simple random variable* \(X = \sum_{i = 1}^{n} t_i I_{A_i}\) (primitive form) \(P(A_i) = p_i\).

\(\text{Var }[X] = \sum_{i = 1}^{n} t_i^2 p_i q_i - 2 \sum_{i < j} t_i t_j p_i p_j\), since \(E[I_{A_i} I_{A_j}] = 0\) for \(i \ne j\)

*Binomial*(\(n, p\)). \(X = \sum_{i = 1}^{n} I_{E_i}\) with \(\{I_{E_i}: 1 \le i \le n\}\) iid \(P(E_i) = p\)

\(\text{Var }[X] = \sum_{i = 1}^{n} \text{Var }[I_{E_i}] = \sum_{i = 1}^{n} pq = npq\)
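As a check, the binomial variance can be computed directly from the pmf and compared with \(npq\); a Python sketch with arbitrarily chosen \(n\) and \(p\):

```python
from math import comb

n, p = 10, 0.3
q = 1 - p
pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
EX = sum(k * pk for k, pk in enumerate(pmf))       # = np
EX2 = sum(k * k * pk for k, pk in enumerate(pmf))
var = EX2 - EX**2
print(abs(var - n * p * q) < 1e-10)  # True: Var[X] = npq = 2.1
```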

*Geometric*(\(p\)). \(P(X = k) = pq^k\) \(\forall k \ge 0\) \(E[X] = q/p\)

We use a trick: \(E[X^2] = E[X(X - 1)] + E[X]\)

\(E[X^2] = p\sum_{k = 0}^{\infty} k(k - 1)q^k + q/p = pq^2 \sum_{k = 2}^{\infty} k(k - 1)q^{k - 2} + q/p = pq^2 \dfrac{2}{(1 - q)^3} + q/p = 2\dfrac{q^2}{p^2} + q/p\)

\(\text{Var }[X] = 2\dfrac{q^2}{p^2} + q/p - (q/p)^2 = q/p^2\)
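The same trick, \(E[X^2] = E[X(X - 1)] + E[X]\), can be checked by summing the geometric series numerically (Python sketch; the series is truncated where the tail is negligible):

```python
p = 0.25
q = 1 - p
# P(X = k) = p q^k for k = 0, 1, 2, ...; terms beyond k = 500 are negligible
ks = range(500)
EX = sum(k * p * q**k for k in ks)
EX2 = sum(k * (k - 1) * p * q**k for k in ks) + EX  # E[X(X-1)] + E[X]
var = EX2 - EX**2
print(abs(var - q / p**2) < 1e-9)  # True: Var[X] = q/p^2 = 12
```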

*Poisson*(\(\mu\)). \(P(X = k) = e^{-\mu} \dfrac{\mu^k}{k!}\) \(\forall k \ge 0\)

Using \(E[X^2] = E[X(X - 1)] + E[X]\), we have

\(E[X^2] = e^{-\mu} \sum_{k = 2}^{\infty} k(k - 1) \dfrac{\mu^k}{k!} + \mu = e^{-\mu} \mu^2 \sum_{k = 2}^{\infty} \dfrac{\mu^{k - 2}}{(k - 2)!} + \mu = \mu^2 + \mu\)

Thus, \(\text{Var }[X] = \mu^2 + \mu - \mu^2 = \mu\). Note that both the mean and the variance have common value \(\mu\).
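A numerical check of the Poisson result, using the same \(E[X(X-1)]\) trick and truncating the series where the tail is negligible (Python sketch, \(\mu\) chosen arbitrarily):

```python
from math import exp, factorial

mu = 3.5
pmf = [exp(-mu) * mu**k / factorial(k) for k in range(60)]  # tail beyond 60 negligible
EX = sum(k * pk for k, pk in enumerate(pmf))
EX2 = sum(k * (k - 1) * pk for k, pk in enumerate(pmf)) + EX  # E[X(X-1)] + E[X]
var = EX2 - EX**2
print(abs(EX - mu) < 1e-9 and abs(var - mu) < 1e-9)  # True: mean = variance = mu
```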

**Some absolutely continuous distributions**

*Uniform* on \((a, b)\). \(f_X(t) = \dfrac{1}{b - a}\), \(a < t < b\), \(E[X] = \dfrac{a + b}{2}\)

\(E[X^2] = \dfrac{1}{b - a} \int_a^b t^2\ dt = \dfrac{b^3 - a^3}{3(b - a)}\) so \(\text{Var }[X] = \dfrac{b^3 - a^3}{3(b - a)} - \dfrac{(a + b)^2}{4} = \dfrac{(b - a)^2}{12}\)
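The algebraic reduction to \((b - a)^2/12\) is easy to confirm numerically for particular endpoints (Python sketch; \(a, b\) arbitrary):

```python
a, b = 2.0, 7.0
EX2 = (b**3 - a**3) / (3 * (b - a))   # E[X^2] for uniform on (a, b)
mean = (a + b) / 2
var = EX2 - mean**2
print(abs(var - (b - a)**2 / 12) < 1e-12)  # True
```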

*Symmetric triangular* \((a, b)\). Because of the shift property __(V2)__, we may center the distribution at the origin. Then the distribution is symmetric triangular \((-c, c)\), where \(c = (b - a)/2\). Because of the symmetry,

\(\text{Var }[X] = E[X^2] = \int_{-c}^{c} t^2f_X(t)\ dt = 2\int_{0}^{c} t^2 f_X (t)\ dt\)

Now, in this case,

\(f_X (t) = \dfrac{c - t}{c^2}\), \(0 \le t \le c\), so that \(E[X^2] = \dfrac{2}{c^2} \int_{0}^{c} (ct^2 - t^3)\ dt = \dfrac{c^2}{6} = \dfrac{(b - a)^2}{24}\)

*Exponential*(\(\lambda\)). \(f_X (t) = \lambda e^{-\lambda t}\), \(t \ge 0\), \(E[X] = 1/\lambda\)

\(E[X^2] = \int_{0}^{\infty} \lambda t^2 e^{-\lambda t} \ dt = \dfrac{2}{\lambda^2}\) so that \(\text{Var }[X] = 1/\lambda^2\)

*Gamma*(\(\alpha, \lambda\)). \(f_{X} (t) = \dfrac{1}{\Gamma(\alpha)} \lambda^{\alpha} t^{\alpha - 1} e^{-\lambda t}\), \(t \ge 0\), \(E[X] = \dfrac{\alpha}{\lambda}\)

\(E[X^2] = \dfrac{1}{\Gamma (\alpha)} \int_{0}^{\infty} \lambda^{\alpha} t^{\alpha + 1} e^{-\lambda t}\ dt = \dfrac{\Gamma (\alpha + 2)}{\lambda^2 \Gamma(\alpha)} = \dfrac{\alpha (\alpha + 1)}{\lambda^2}\)

Hence \(\text{Var } [X] = \alpha/\lambda^2\).
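Both the exponential and gamma variances follow from the moment formula \(E[X^n] = \Gamma(\alpha + n)/(\lambda^n \Gamma(\alpha))\); the exponential is the case \(\alpha = 1\). A Python check with arbitrary parameters:

```python
from math import gamma

lam, alpha = 0.3, 2.5
# Gamma(alpha, lam) moments: E[X^n] = Gamma(alpha + n) / (lam^n Gamma(alpha))
EX = gamma(alpha + 1) / (lam * gamma(alpha))      # = alpha/lam
EX2 = gamma(alpha + 2) / (lam**2 * gamma(alpha))  # = alpha(alpha + 1)/lam^2
var = EX2 - EX**2
print(abs(var - alpha / lam**2) < 1e-6)  # True: Var[X] = alpha/lam^2
# With alpha = 1 this reduces to the exponential case, Var[X] = 1/lam^2.
```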

*Normal*(\(\mu, \sigma^2\)) \(E[X] = \mu\)

Consider \(Y\) ~ \(N(0, 1)\), \(E[Y] = 0\), \(\text{Var }[Y] = \dfrac{2}{\sqrt{2\pi}} \int_{0}^{\infty} t^2 e^{-t^2/2} \ dt = 1\).

\(X = \sigma Y + \mu\) implies \(\text{Var }[X] = \sigma^2 \text{Var }[Y] = \sigma^2\)

**Extensions of some previous examples**

In the unit on expectations, we calculate the mean for a variety of cases. We revisit some of those examples and calculate the variances.

Example \(\PageIndex{1}\) Expected winnings (Example 8 from "Mathematical Expectation: Simple Random Variables")

A bettor places three bets at $2.00 each. The first pays $10.00 with probability 0.15, the second $8.00 with probability 0.20, and the third $20.00 with probability 0.10.

**Solution**

The net gain may be expressed

\(X = 10 I_A + 8I_B + 20I_C - 6\), with \(P(A) = 0.15, P(B) = 0.20, P(C) = 0.10\)

We may reasonably suppose the class \(\{A, B, C\}\) is independent (this assumption is not necessary in computing the mean). Then

\(\text{Var }[X] = 10^2 P(A) [1 - P(A)] + 8^2 P(B)[1 - P(B)] + 20^2 P(C) [1 - P(C)]\)

Calculation is straightforward. We may use MATLAB to perform the arithmetic.

```matlab
c = [10 8 20];
p = 0.01*[15 20 10];
q = 1 - p;
VX = sum(c.^2.*p.*q)
VX = 58.9900
```
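For readers without access to the MATLAB m-files, the same arithmetic in Python/NumPy:

```python
import numpy as np

c = np.array([10.0, 8.0, 20.0])   # winnings on the three bets
p = np.array([0.15, 0.20, 0.10])  # win probabilities
q = 1 - p
# Independence lets the variances of the c_i I_{A_i} add; the constant -6
# does not affect the variance (shift property V2).
VX = np.sum(c**2 * p * q)
print(round(VX, 4))  # 58.99
```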

Example \(\PageIndex{2}\) A function of \(X\) (Example 9 from "Mathematical Expectation: Simple Random Variables")

Suppose \(X\) in a primitive form is

\(X = -3I_{C_1} - I_{C_2} + 2I_{C_3} - 3I_{C_4} + 4I_{C_5} - I_{C_6} + I_{C_7} + 2I_{C_8} + 3I_{C_9} + 2I_{C_{10}}\)

with probabilities \(P(C_i) = 0.08, 0.11, 0.06, 0.13, 0.05, 0.08, 0.12, 0.07, 0.14, 0.16\).

Let \(g(t) = t^2 + 2t\). Determine \(E[g(X)]\) and \(\text{Var}[g(X)]\)

```matlab
c = [-3 -1 2 -3 4 -1 1 2 3 2];         % Original coefficients
pc = 0.01*[8 11 6 13 5 8 12 7 14 16];  % Probabilities for C_j
G = c.^2 + 2*c                         % g(c_j)
EG = G*pc'                             % Direct calculation of E[g(X)]
EG = 6.4200
VG = (G.^2)*pc' - EG^2                 % Direct calculation of Var[g(X)]
VG = 40.8036
[Z,PZ] = csort(G,pc);                  % Distribution for Z = g(X)
EZ = Z*PZ'                             % E[Z]
EZ = 6.4200
VZ = (Z.^2)*PZ' - EZ^2                 % Var[Z]
VZ = 40.8036
```

Example \(\PageIndex{3}\) \(Z = g(X, Y)\) (Example 10 from "Mathematical Expectation: Simple Random Variables")

We use the same joint distribution as for Example 10 from "Mathematical Expectation: Simple Random Variables" and let \(g(t, u) = t^2 + 2tu - 3u\). To set up for calculations, we use jcalc.

```matlab
jdemo1                          % Call for data
jcalc                           % Set up
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u - 3*u;        % Calculation of matrix of g(t_i, u_j)
EG = total(G.*P)                % Direct calculation of E[g(X,Y)]
EG = 3.2529
VG = total(G.^2.*P) - EG^2      % Direct calculation of Var[g(X,Y)]
VG = 80.2133
[Z,PZ] = csort(G,P);            % Determination of distribution for Z
EZ = Z*PZ'                      % E[Z] from distribution
EZ = 3.2529
VZ = (Z.^2)*PZ' - EZ^2          % Var[Z] from distribution
VZ = 80.2133
```

Example \(\PageIndex{4}\) A function with compound definition (Example 12 from "Mathematical Expectation: Simple Random Variables")

Suppose \(X\) ~ exponential (0.3). Let

\(Z = \begin{cases} X^2 & \text{for } X \le 4 \\ 16 & \text{for } X > 4 \end{cases} = I_{[0,4]} (X) X^2 + I_{(4, \infty]} (X) 16\)

Determine \(E[Z]\) and \(Var[Z]\).

**Analytic Solution**

\(E[g(X)] = \int g(t) f_X(t)\ dt = \int_{0}^{\infty} I_{[0, 4]} (t) t^2 0.3 e^{-0.3t}\ dt + 16 E[I_{(4, \infty]} (X)]\)

\(= \int_{0}^{4} t^2 0.3 e^{-0.3t} \ dt + 16 P(X > 4) \approx 7.4972\) (by Maple)

\(Z^2 = I_{[0, 4]} (X) X^4 + I_{(4, \infty]} (X) 256\)

\(E[Z^2] = \int_{0}^{\infty} I_{[0,4]} (t) t^4 0.3 e^{-0.3t}\ dt + 256 E[I_{(4, \infty]} (X)] = \int_{0}^{4} t^4 0.3 e^{-0.3t}\ dt + 256 e^{-1.2} \approx 100.0562\)

\(\text{Var } [Z] = E[Z^2] - E^2[Z] \approx 43.8486\) (by Maple)

APPROXIMATION

To obtain a simple approximation, we must approximate by a bounded random variable. Since \(P(X > 50) = e^{-15} \approx 3 \cdot 10^{-7}\), we may safely truncate \(X\) at 50.

```matlab
tuappr
Enter matrix [a b] of x-range endpoints  [0 50]
Enter number of x approximation points  1000
Enter density as a function of t  0.3*exp(-0.3*t)
Use row matrices X and PX as in the simple case
M = X <= 4;
G = M.*X.^2 + 16*(1 - M);       % g(X)
EG = G*PX'                      % E[g(X)]
EG = 7.4972
VG = (G.^2)*PX' - EG^2          % Var[g(X)]
VG = 43.8472                    % Theoretical = 43.8486
[Z,PZ] = csort(G,PX);           % Distribution for Z = g(X)
EZ = Z*PZ'                      % E[Z] from distribution
EZ = 7.4972
VZ = (Z.^2)*PZ' - EZ^2          % Var[Z]
VZ = 43.8472
```
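A rough Python analogue of the tuappr approximation, using a fine grid on \([0, 50]\) (the grid size here is an arbitrary choice, not tuappr's default):

```python
import numpy as np

# Discrete approximation to X ~ exponential(0.3), truncated at 50
n = 100_000
t = np.linspace(0, 50, n + 1)
dt = t[1] - t[0]
PX = 0.3 * np.exp(-0.3 * t) * dt   # approximate probability mass at each grid point
PX /= PX.sum()                      # renormalize for truncation/discretization error
G = np.where(t <= 4, t**2, 16.0)    # g(X) with the compound definition
EG = G @ PX
VG = G**2 @ PX - EG**2
print(abs(EG - 7.4972) < 0.01, abs(VG - 43.8486) < 0.1)
```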

Example \(\PageIndex{5}\) Stocking for random demand (Example 13 from "Mathematical Expectation: Simple Random Variables")

The manager of a department store is planning for the holiday season. A certain item costs \(c\) dollars per unit and sells for \(p\) dollars per unit. If the demand exceeds the amount \(m\) ordered, additional units can be special ordered for \(s\) dollars per unit (\(s >c\)). If demand is less than the amount ordered, the remaining stock can be returned (or otherwise disposed of) at \(r\) dollars per unit (\(r < c\)). Demand \(D\) for the season is assumed to be a random variable with Poisson (\(\mu\)) distribution. Suppose \(\mu = 50\), \(c = 30\), \(p = 50\), \(s = 40\), \(r = 20\). What amount \(m\) should the manager order to maximize the expected profit?

**Problem Formulation**

Suppose \(D\) is the demand and \(X\) is the profit. Then

For \(D \le m\), \(X = D(p - c) - (m - D)(c - r) = D(p - r) + m(r - c)\)

For \(D > m\), \(X = m(p - c) - (D - m)(p - s) = D(p - s) + m(s - c)\)

It is convenient to write the expression for \(X\) in terms of \(I_M(D)\), where \(M = (-\infty, m]\). Thus

\(X = I_M (D) [D(p - r) + m(r - c)] + [1 - I_M(D)][D(p - s) + m(s - c)]\)

\(= D(p - s) + m(s - c) + I_M (D) [D(p - r) + m(r - c) - D(p - s) - m(s - c)]\)

\(= D(p - s) + m(s - c) + I_M(D) (s - r)[D - m]\)

Then

\(E[X] = (p - s) E[D] + m(s - c) + (s - r) E[I_M(D) D] - (s - r) mE[I_M(D)]\)

We use the discrete approximation.

APPROXIMATION

```matlab
mu = 50;
n = 100;
t = 0:n;
pD = ipoisson(mu,t);            % Approximate distribution for D
c = 30;
p = 50;
s = 40;
r = 20;
m = 45:55;
for i = 1:length(m)             % Step by step calculation for various m
    M = t <= m(i);
    G(i,:) = (p-s)*t + m(i)*(s-c) + (s-r)*M.*(t - m(i));
end
EG = G*pD';
VG = (G.^2)*pD' - EG.^2;
SG = sqrt(VG);
disp([EG';VG';SG']')
   1.0e+04 *
    0.0931    1.1561    0.0108
    0.0936    1.3117    0.0115
    0.0939    1.4869    0.0122
    0.0942    1.6799    0.0130
    0.0943    1.8880    0.0137
    0.0944    2.1075    0.0145
    0.0943    2.3343    0.0153
    0.0941    2.5637    0.0160
    0.0938    2.7908    0.0167
    0.0934    3.0112    0.0174
    0.0929    3.2206    0.0179
```

Example \(\PageIndex{6}\) A jointly distributed pair (Example 14 from "Mathematical Expectation: Simple Random Variables")

Suppose the pair \(\{X, Y\}\) has joint density \(f_{XY} (t, u) = 3u\) on the triangular region bounded by \(u = 0\), \(u = 1 + t\), \(u = 1 - t\). Let \(Z = g(X, Y) = X^2 + 2XY\).

Determine \(E[Z]\) and \(\text{Var }[Z]\).

**Analytic Solution**

\(E[Z] = \int \int (t^2 + 2tu) f_{XY} (t, u) \ dudt = 3\int_{-1}^{0} \int_{0}^{1 + t} u(t^2 + 2tu)\ dudt + 3 \int_{0}^{1} \int_{0}^{1 - t} u(t^2 + 2tu)\ dudt = 1/10\)

\(E[Z^2] = 3\int_{-1}^{0} \int_{0}^{1 + t} u(t^2 + 2tu)^2 \ dudt + 3\int_{0}^{1} \int_{0}^{1 - t} u(t^2 + 2tu)^2 \ dudt = 3/35\)

\(\text{Var } [Z] = E[Z^2] -E^2[Z] = 53/700 \approx 0.0757\)

APPROXIMATION

```matlab
tuappr
Enter matrix [a b] of X-range endpoints  [-1 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  400
Enter number of Y approximation points  200
Enter expression for joint density  3*u.*(u<=min(1+t,1-t))
Use array operations on X, Y, PX, PY, t, u, and P
G = t.^2 + 2*t.*u;              % g(X,Y) = X^2 + 2XY
EG = total(G.*P)                % E[g(X,Y)]
EG = 0.1006                     % Theoretical value = 1/10
VG = total(G.^2.*P) - EG^2
VG = 0.0765                     % Theoretical value 53/700 = 0.0757
[Z,PZ] = csort(G,P);            % Distribution for Z
EZ = Z*PZ'                      % E[Z] from distribution
EZ = 0.1006
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.0765
```

Example \(\PageIndex{7}\) A function with compound definition (Example 15 from "Mathematical Expectation: Simple Random Variables")

The pair \(\{X, Y\}\) has joint density \(f_{XY} (t, u) = 1/2\) on the square region bounded by \(u = 1 + t\), \(u = 1 - t\), \(u = 3 - t\), and \(u = t - 1\).

\(W = \begin{cases} X & \text{for max }\{X, Y\} \le 1 \\ 2Y & \text{for max } \{X, Y\} > 1 \end{cases} = I_Q(X, Y) X + I_{Q^c} (X, Y) 2Y\)

where \(Q = \{(t, u): \text{max } \{t, u\} \le 1 \} = \{(t, u): t \le 1, u \le 1\}\).

Determine \(E[W]\) and \(\text{Var } [W]\).

**Solution**

The intersection of the region \(Q\) and the square is the set for which \(0 \le t \le 1\) and \(1 - t \le u \le 1\). Reference to Figure 11.3.2 shows three regions of integration.

\(E[W] = \dfrac{1}{2} \int_{0}^{1} \int_{1 - t}^{1} t \ dudt + \dfrac{1}{2} \int_{0}^{1} \int_{1}^{1 + t} 2u \ dudt + \dfrac{1}{2} \int_{1}^{2} \int_{t - 1}^{3 - t} 2u\ dudt = 11/6 \approx 1.8333\)

\(E[W^2] = \dfrac{1}{2} \int_{0}^{1} \int_{1 - t}^{1} t^2\ dudt + \dfrac{1}{2} \int_{0}^{1} \int_{1}^{1 + t} 4u^2 \ dudt + \dfrac{1}{2} \int_{1}^{2} \int_{t - 1}^{3 - t} 4u^2 \ dudt = 103/24\)

\(\text{Var } [W] = 103/24 - (11/6)^2 = 67/72 \approx 0.9306\)

```matlab
tuappr
Enter matrix [a b] of X-range endpoints  [0 2]
Enter matrix [c d] of Y-range endpoints  [0 2]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  ((u<=min(t+1,3-t))& ...
    (u>=max(1-t,t-1)))/2
Use array operations on X, Y, PX, PY, t, u, and P
M = max(t,u) <= 1;
G = t.*M + 2*u.*(1 - M);        % W = g(X,Y)
EG = total(G.*P)                % E[g(X,Y)]
EG = 1.8340                     % Theoretical 11/6 = 1.8333
VG = total(G.^2.*P) - EG^2
VG = 0.9368                     % Theoretical 67/72 = 0.9306
[Z,PZ] = csort(G,P);            % Distribution for Z
EZ = Z*PZ'                      % E[Z] from distribution
EZ = 1.8340
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.9368
```

Example \(\PageIndex{8}\) A function with compound definition

\(f_{XY} (t, u) = 3\) on \(0 \le u \le t^2 \le 1\)

\(Z = I_Q (X, Y)X + I_{Q^c} (X, Y)\) for \(Q = \{(t, u): u + t \le 1\}\)

The value \(t_0\) where the line \(u = 1 - t\) and the curve \(u = t^2\) meet satisfies \(t_0^2 = 1 - t_0\).

\(E[Z] = 3 \int_{0}^{t_0} t \int_{0}^{t^2} \ dudt + 3 \int_{t_0}^{1} t \int_{0}^{1 - t} \ dudt + 3 \int_{t_0}^{1} \int_{1 - t}^{t^2} \ dudt = \dfrac{3}{4} (5t_0 - 2)\)

For \(E[Z^2]\) replace \(t\) by \(t^2\) in the integrands to get \(E[Z^2] = (25t_0 - 1)/20\).

Using \(t_0 = (\sqrt{5} - 1)/2 \approx 0.6180\), we get \(\text{Var }[Z] = (2125t_0 - 1309)/80 \approx 0.0540\).

APPROXIMATION

```matlab
% Theoretical values
t0 = (sqrt(5) - 1)/2
t0 = 0.6180
EZ = (3/4)*(5*t0 - 2)
EZ = 0.8176
EZ2 = (25*t0 - 1)/20
EZ2 = 0.7225
VZ = (2125*t0 - 1309)/80
VZ = 0.0540
tuappr
Enter matrix [a b] of X-range endpoints  [0 1]
Enter matrix [c d] of Y-range endpoints  [0 1]
Enter number of X approximation points  200
Enter number of Y approximation points  200
Enter expression for joint density  3*(u <= t.^2)
Use array operations on X, Y, t, u, and P
G = (t+u <= 1).*t + (t+u > 1);
EG = total(G.*P)
EG = 0.8169                     % Theoretical = 0.8176
VG = total(G.^2.*P) - EG^2
VG = 0.0540                     % Theoretical = 0.0540
[Z,PZ] = csort(G,P);
EZ = Z*PZ'
EZ = 0.8169
VZ = (Z.^2)*PZ' - EZ^2
VZ = 0.0540
```

**Standard deviation and the Chebyshev inequality**

In Example 5 from "Functions of a Random Variable," we show that if \(X\) ~ \(N(\mu, \sigma^2)\), then \(Z = \dfrac{X - \mu}{\sigma}\) ~ \(N(0, 1)\). Also, \(E[X] = \mu\) and \(\text{Var } [X] = \sigma^2\). Thus

\(P(\dfrac{|X - \mu|}{\sigma} \le t) = P(|X - \mu| \le t \sigma) = 2 \phi (t) - 1\), where \(\phi\) is the standard normal distribution function.

For the normal distribution, the standard deviation \(\sigma\) seems to be a natural measure of the variation away from the mean.

For a general distribution with mean \(\mu\) and variance \(\sigma^2\), we have the

*Chebyshev inequality*

\(P(\dfrac{|X - \mu|}{\sigma} \ge a) \le \dfrac{1}{a^2}\) or \(P(|X - \mu| \ge a \sigma) \le \dfrac{1}{a^2}\)

In this general case, the standard deviation appears as a measure of the variation from the mean value. This inequality is useful in many theoretical applications as well as some practical ones. However, since it must hold for any distribution which has a variance, the bound is not particularly tight. It may be instructive to compare the bound on the probability given by the Chebyshev inequality with the actual probability for the normal distribution.

```matlab
t = 1:0.5:3;
p = 2*(1 - gaussian(0,1,t));
c = ones(1,length(t))./(t.^2);
r = c./p;
h = ['    t    Chebyshev   Prob      Ratio'];
m = [t;c;p;r]';
disp(h)
    t    Chebyshev   Prob      Ratio
disp(m)
    1.0000    1.0000    0.3173    3.1515
    1.5000    0.4444    0.1336    3.3263
    2.0000    0.2500    0.0455    5.4945
    2.5000    0.1600    0.0124   12.8831
    3.0000    0.1111    0.0027   41.1554
```

— □
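The comparison in the table can be reproduced without the toolbox, using the standard normal distribution function via the error function (Python sketch):

```python
from math import erf, sqrt

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(t / sqrt(2)))

for t in [1.0, 1.5, 2.0, 2.5, 3.0]:
    cheb = 1 / t**2            # Chebyshev bound on P(|X - mu| >= t sigma)
    exact = 2 * (1 - Phi(t))   # actual probability for the normal distribution
    print(f"{t:4.2f}  bound {cheb:.4f}  normal {exact:.4f}  ratio {cheb/exact:.2f}")
```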

DERIVATION OF THE CHEBYSHEV INEQUALITY

Let \(A = \{|X - \mu| \ge a \sigma\} = \{(X - \mu)^2 \ge a^2 \sigma^2\}\). Then \(a^2 \sigma^2 I_A \le (X - \mu)^2\).

Upon taking expectations of both sides and using monotonicity, we have

\(a^2 \sigma^2 P(A) \le E[(X - \mu)^2] = \sigma^2\)

from which the Chebyshev inequality follows immediately.

— □

We consider three concepts which are useful in many situations.

Definition

A random variable \(X\) is *centered* iff \(E[X] = 0\).

\(X' = X - \mu\) is always centered.

Definition

A random variable \(X\) is *standardized* iff \(E[X] = 0\) and \(\text{Var} [X] = 1\).

\(X^* = \dfrac{X - \mu}{\sigma} = \dfrac{X'}{\sigma}\) is standardized.

Definition

A pair \(\{X, Y\}\) of random variables is *uncorrelated* iff

\(E[XY] - E[X]E[Y] = 0\)

It is always possible to derive an uncorrelated pair as a function of a pair \(\{X, Y\}\), both of which have finite variances. Consider

\(U = (X^* + Y^*)\) \(V = (X^* - Y^*)\), where \(X^* = \dfrac{X - \mu_X}{\sigma_X}\), \(Y^* = \dfrac{Y - \mu_Y}{\sigma_Y}\)

Now \(E[U] = E[V] = 0\) and

\(E[UV] = E[(X^* + Y^*) (X^* - Y^*)] = E[(X^*)^2] - E[(Y^*)^2] = 1 - 1 = 0\)

so the pair is uncorrelated.

Example \(\PageIndex{9}\) Determining an uncorrelated pair

We use the distribution for Example 10 from "Mathematical Expectation: Simple Random Variables" and __Example__, for which

\(E[XY] - E[X]E[Y] \ne 0\)

```matlab
jdemo1
jcalc
Enter JOINT PROBABILITIES (as on the plane)  P
Enter row matrix of VALUES of X  X
Enter row matrix of VALUES of Y  Y
Use array operations on matrices X, Y, PX, PY, t, u, and P
EX = total(t.*P)
EX = 0.6420
EY = total(u.*P)
EY = 0.0783
EXY = total(t.*u.*P)
EXY = -0.1130
c = EXY - EX*EY
c = -0.1633                     % {X, Y} not uncorrelated
```

```matlab
VX = total(t.^2.*P) - EX^2
VX = 3.3016
VY = total(u.^2.*P) - EY^2
VY = 3.6566
SX = sqrt(VX)
SX = 1.8170
SY = sqrt(VY)
SY = 1.9122
x = (t - EX)/SX;                % Standardized random variables
y = (u - EY)/SY;
uu = x + y;                     % Uncorrelated random variables
vv = x - y;
EUV = total(uu.*vv.*P)          % Check for uncorrelated condition
EUV = 9.9755e-06                % Differs from zero because of roundoff
```
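For readers without the MATLAB m-files, the construction can also be checked by simulation. The Python sketch below fabricates a correlated pair (the dependence structure is an arbitrary choice), standardizes each variable, and verifies that the sum and difference are uncorrelated up to roundoff:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = rng.normal(size=n)
Y = 0.8 * X + rng.normal(size=n)   # correlated with X by construction

def standardize(W):
    """Subtract the sample mean and divide by the sample standard deviation."""
    return (W - W.mean()) / W.std()

xs, ys = standardize(X), standardize(Y)
U, V = xs + ys, xs - ys
# Sample covariance of (U, V): E[UV] - E[U]E[V], which should be near zero
cov_UV = np.mean(U * V) - U.mean() * V.mean()
print(abs(cov_UV) < 1e-6)  # True, up to roundoff
```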