10.1: Functions of a Random Variable
Introduction
Frequently, we observe a value of some random variable, but are really interested in a value derived from this by a function rule. If \(X\) is a random variable and \(g\) is a reasonable function (technically, a Borel function), then \(Z = g(X)\) is a new random variable which has the value \(g(t)\) for any \(\omega\) such that \(X(\omega) = t\). Thus \(Z(\omega) = g(X(\omega))\).
The problem; an approach
We consider, first, functions of a single random variable. A wide variety of functions are utilized in practice.
Example 10.1.1: A quality control problem
In a quality control check on a production line for ball bearings it may be easier to weigh the balls than measure the diameters. If we can assume true spherical shape and \(w\) is the weight, then diameter is \(kw^{1/3}\), where \(k\) is a factor depending upon the formula for the volume of a sphere, the units of measurement, and the density of the steel. Thus, if \(X\) is the weight of the sampled ball, the desired random variable is \(D = kX^{1/3}\).
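For concreteness, here is a minimal MATLAB sketch of this computation. The steel density (7.85 g/cm³) and the units (grams, centimeters) are illustrative assumptions, not part of the example; the factor \(k\) comes from the sphere-volume relation \(w = \rho (\pi/6) d^3\).

```matlab
% Sketch: diameter from weight for a steel sphere.
% Assumptions: density rho = 7.85 g/cm^3; weight in grams, diameter in cm.
rho = 7.85;
k = (6/(pi*rho))^(1/3);   % from w = rho*(pi/6)*d^3, so d = k*w^(1/3)
D = @(w) k*w.^(1/3);      % diameter as a function of weight
D(8)                      % an 8 g ball has diameter approx 1.25 cm
```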
Example 10.1.2: Price breaks
The cultural committee of a student organization has arranged a special deal for tickets to a concert. The agreement is that the organization will purchase ten tickets at $20 each (regardless of the number of individual buyers). Additional tickets are available according to the following schedule:
- 11-20, $18 each
- 21-30, $16 each
- 31-50, $15 each
- 51-100, $13 each
If the number of purchasers is a random variable \(X\), the total cost (in dollars) is a random quantity \(Z = g(X)\) described by
\(g(X) = 200 + 18 I_{M1} (X) (X - 10) + (16 - 18) I_{M2} (X) (X - 20)\)
\(+ (15 - 16) I_{M3} (X) (X - 30) + (13 - 15) I_{M4} (X) (X - 50)\)
where \(M1 = [10, \infty)\), \(M2 = [20, \infty)\), \(M3 = [30, \infty)\), \(M4 = [50, \infty)\)
The function rule is more complicated than in Example 10.1.1, but the essential problem is the same.
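For a quick numerical check of the rule, the indicator functions translate directly into relational expressions; the following function handle is a minimal sketch of \(g\), not part of the original example.

```matlab
% Sketch: the cost rule g, with indicators I_Mi(X) as relational expressions.
g = @(x) 200 + 18*(x >= 10).*(x - 10) - 2*(x >= 20).*(x - 20) ...
             - 1*(x >= 30).*(x - 30) - 2*(x >= 50).*(x - 50);
g([10 25 40 75])   % returns 200, 460, 690, 1165, matching the price schedule
```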
The problem
If \(X\) is a random variable, then \(Z = g(X)\) is a new random variable. Suppose we have the distribution for \(X\). How can we determine \(P(Z \in M)\), the probability \(Z\) takes a value in the set \(M\)?
An approach to a solution
We consider two equivalent approaches.
To find \(P(X \in M)\).
- Mapping approach. Simply find the amount of probability mass mapped into the set \(M\) by the random variable \(X\).
- In the absolutely continuous case, calculate \(\int_{M} f_X\).
- In the discrete case, identify those values \(t_i\) of \(X\) which are in the set \(M\) and add the associated probabilities.
- Discrete alternative. Consider each value \(t_i\) of \(X\). Select those which meet the defining conditions for \(M\) and add the associated probabilities. This is the approach we use in the MATLAB calculations. Note that it is not necessary to describe geometrically the set \(M\); merely use the defining conditions.
To find \(P(g(X) \in M)\).
- Mapping approach. Determine the set \(N\) of all those t which are mapped into \(M\) by the function \(g\). Now if \(X(\omega) \in N\), then \(g(X(\omega)) \in M\), and if \(g(X(\omega)) \in M\), then \(X(\omega) \in N\). Hence
\(\{\omega: g(X(\omega)) \in M\} = \{\omega: X(\omega) \in N\}\)
Since these are the same event, they must have the same probability. Once \(N\) is identified, determine \(P(X \in N)\) in the usual manner (see part a, above).
- Discrete alternative. For each possible value \(t_i\) of \(X\), determine whether \(g(t_i)\) meets the defining condition for \(M\). Select those \(t_i\) which do and add the associated probabilities.
Remark. The set \(N\) in the mapping approach is called the inverse image, \(N = g^{-1} (M)\).
Example 10.1.3: A discrete example
Suppose \(X\) has values -2, 0, 1, 3, 6, with respective probabilities 0.2, 0.1, 0.2, 0.3, 0.2.
Consider \(Z = g(X) = (X + 1) (X - 4)\). Determine \(P(Z > 0)\).
Solution
First solution. The mapping approach
\(g(t) = (t + 1) (t - 4)\). \(N = \{t: g(t) > 0\}\) is the set of points to the left of –1 or to the right of 4. The \(X\)-values –2 and 6 lie in this set. Hence
\(P(g(X) > 0) = P(X = -2) + P(X = 6) = 0.2 + 0.2 = 0.4\)
Second solution. The discrete alternative
| \(X\) | -2 | 0 | 1 | 3 | 6 |
| :--- | ---: | ---: | ---: | ---: | ---: |
| \(P(X = t_i)\) | 0.2 | 0.1 | 0.2 | 0.3 | 0.2 |
| \(Z = g(X)\) | 6 | -4 | -6 | -4 | 14 |
| \(Z > 0\) | 1 | 0 | 0 | 0 | 1 |
Picking out and adding the indicated probabilities, we have
\(P(Z > 0) = 0.2 + 0.2 = 0.4\)
In this case (and often for “hand calculations”) the mapping approach requires less calculation. However, for MATLAB calculations (as we show below), the discrete alternative is more readily implemented.
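Anticipating the MATLAB treatment below, the discrete alternative for this example takes only a few lines of base MATLAB (a minimal sketch, using only array operations):

```matlab
% Discrete alternative for Example 10.1.3.
X  = [-2 0 1 3 6];            % values of X
PX = [0.2 0.1 0.2 0.3 0.2];   % corresponding probabilities
G  = (X + 1).*(X - 4);        % g evaluated at each value of X
M  = G > 0;                   % ones where g(t_i) > 0
PM = M*PX'                    % P(Z > 0) = 0.4000
```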
Example 10.1.4: An absolutely continuous example
Suppose \(X\) ~ uniform [–3,7]. Then \(f_X(t) = 0.1\), \(-3 \le t \le 7\) (and zero elsewhere). Let
\(Z = g(X) = (X + 1) (X - 4)\)
Determine \(P(Z > 0)\).
Solution
First we determine \(N = \{t: g(t) > 0\}\). As in Example 10.1.3, \(g(t) = (t + 1) (t - 4) > 0\) for \(t < -1\) or \(t > 4\). Because of the uniform distribution, the integral of the density over any subinterval of \([-3, 7]\) is 0.1 times the length of that subinterval. Thus, the desired probability is
\(P(g(X) > 0) = 0.1 [(-1 - (-3)) + (7 - 4)] = 0.5\)
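A crude numerical check of this result uses a Riemann-sum approximation to the integral of the density over \(N\); the grid spacing below is an arbitrary choice.

```matlab
% Riemann-sum check for Example 10.1.4.
dt = 0.001;
t  = -3:dt:7;                 % grid on the support of X
f  = 0.1*ones(size(t));       % uniform density on [-3, 7]
N  = (t < -1) | (t > 4);      % inverse image of {g(X) > 0}
PM = sum(f(N))*dt             % approx 0.5
```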
We consider, next, some important examples.
Example 10.1.5: The normal distribution and standardized normal distribution
To show that if \(X\) ~ \(N(\mu, \sigma^2)\) then
\(Z = g(X) = \dfrac{X - \mu}{\sigma} \sim N(0, 1)\)
VERIFICATION
We wish to show the density function for \(Z\) is
\(\varphi (t) = \dfrac{1}{\sqrt{2\pi}} e^{-t^2/2}\)
Now
\(g(t) = \dfrac{t - \mu} {\sigma} \le v\) iff \(t \le \sigma v + \mu\)
Hence, for given \(M = (-\infty, v]\) the inverse image is \(N = (-\infty, \sigma v + \mu]\), so that
\(F_Z (v) = P(Z \le v) = P(Z \in M) = P(X \in N) = P(X \le \sigma v + \mu) = F_X (\sigma v + \mu)\)
Since the density is the derivative of the distribution function,
\(f_Z(v) = F_{Z}^{'} (v) = F_{X}^{'} (\sigma v + \mu) \cdot \sigma = \sigma f_X (\sigma v + \mu)\)
Thus
\(f_Z (v) = \dfrac{\sigma}{\sigma \sqrt{2\pi}} \text{exp} [-\dfrac{1}{2} (\dfrac{\sigma v + \mu - \mu}{\sigma})^2] = \dfrac{1}{\sqrt{2\pi}} e^{-v^2/2} = \varphi(v)\)
We conclude that \(Z\) ~ \(N(0, 1)\).
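The relation \(F_Z(v) = F_X(\sigma v + \mu)\) is easy to confirm numerically. The sketch below assumes the Statistics Toolbox function normcdf is available; the values of \(\mu\), \(\sigma\), and \(v\) are arbitrary.

```matlab
% Numerical check of F_Z(v) = F_X(sigma*v + mu).
mu = 3; sigma = 2; v = 1.5;
FX = normcdf(sigma*v + mu, mu, sigma)  % F_X(sigma*v + mu)
FZ = normcdf(v, 0, 1)                  % F_Z(v) = Phi(v); the two agree
```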
Example 10.1.6: Affine functions
Suppose \(X\) has distribution function \(F_X\). If it is absolutely continuous, the corresponding density is \(f_X\). Consider \(Z = aX + b\). Here \(g(t) = at + b\), an affine function (linear plus a constant). Determine the distribution function for \(Z\) (and the density in the absolutely continuous case).
Solution
\(F_Z (v) = P(Z \le v) = P(aX + b \le v)\)
There are two cases:
- \(a\) > 0:
\(F_Z (v) = P(X \le \dfrac{v - b}{a}) = F_X (\dfrac{v - b}{a})\)
- \(a\) < 0:
\(F_Z (v) = P(X \ge \dfrac{v - b}{a}) = P(X > \dfrac{v - b}{a}) + P(X = \dfrac{v - b}{a})\)
so that
\(F_Z (v) = 1 - F_X (\dfrac{v - b}{a}) + P(X = \dfrac{v - b}{a})\)
For the absolutely continuous case, \(P(X = \dfrac{v - b}{a}) = 0\), and by differentiation
- for \(a > 0\) \(f_Z (v) = \dfrac{1}{a} f_X (\dfrac{v - b}{a})\)
- for \(a < 0\) \(f_Z (v) = -\dfrac{1}{a} f_X (\dfrac{v - b}{a})\)
Since for \(a < 0\), \(-a = |a|\), the two cases may be combined into one formula.
\(f_Z (v) = \dfrac{1}{|a|} f_X (\dfrac{v-b}{a})\)
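As a sanity check on the combined formula, consider the illustrative choice \(X\) ~ uniform [0, 1] with \(a = -2\), \(b = 1\), so that \(Z = aX + b\) is uniform on [-1, 1] with density 1/2.

```matlab
% Checking f_Z(v) = f_X((v-b)/a)/|a| for X ~ uniform[0,1], a = -2, b = 1.
a = -2; b = 1; v = 0;              % (v - b)/a = 0.5, inside [0, 1]
fX = @(t) double(t >= 0 & t <= 1); % density of X
fZ = fX((v - b)/a)/abs(a)          % = 0.5, the uniform density on [-1, 1]
```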
Example 10.1.7: Completion of normal and standardized normal relationship
Suppose \(Z\) ~ \(N(0, 1)\). Show that \(X = \sigma Z + \mu\) (\(\sigma > 0\)) is \(N(\mu, \sigma^2)\).
VERIFICATION
Use of the result of Example 10.1.6 on affine functions shows that
\(f_{X} (t) = \dfrac{1}{\sigma} \varphi (\dfrac{t - \mu}{\sigma}) = \dfrac{1}{\sigma \sqrt{2\pi}} \text{exp} [-\dfrac{1}{2} (\dfrac{t - \mu}{\sigma})^2]\)
Example 10.1.8: Fractional power of a nonnegative random variable
Suppose \(X \ge 0\) and \(Z = g(X) = X^{1/a}\) for \(a > 1\). Since for \(t \ge 0\), \(t^{1/a}\) is increasing, we have \(0 \le t^{1/a} \le v\) iff \(0 \le t \le v^{a}\). Thus
\(F_Z (v) = P(Z \le v) = P(X \le v^{a}) = F_X (v^{a})\)
In the absolutely continuous case
\(f_Z (v) = F_{Z}^{'} (v) = f_X (v^{a}) a v^{a - 1}\)
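For a quick numeric check, take the illustrative choice \(X\) ~ uniform [0, 1] and \(a = 2\), so \(Z = X^{1/2}\) has distribution function \(v^2\) and density \(2v\) on [0, 1].

```matlab
% Check of f_Z(v) = f_X(v^a)*a*v^(a-1) for X ~ uniform[0,1], a = 2.
a = 2; v = 0.6;
fX = @(t) double(t >= 0 & t <= 1); % density of X
fZ = fX(v^a)*a*v^(a-1)             % = 1.2, matching 2*v for Z = sqrt(X)
```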
Example 10.1.9: Fractional power of an exponentially distributed random variable
Suppose \(X\) ~ exponential (\(\lambda\)). Then \(Z = X^{1/a}\) ~ Weibull \((a, \lambda, 0)\).
According to the result of Example 10.1.8,
\(F_Z(t) = F_X (t^{a}) = 1- e^{-\lambda t^{a}}\)
which is the distribution function for \(Z\) ~ Weibull \((a, \lambda, 0)\).
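A simulation sketch confirms the Weibull form; \(\lambda\), \(a\), the evaluation point, and the sample size are arbitrary choices, and the exponential variates are generated by inversion.

```matlab
% Simulation check that Z = X^(1/a) ~ Weibull(a, lambda, 0).
lambda = 2; a = 3; n = 1e6;
X = -log(rand(n,1))/lambda;            % exponential(lambda) by inversion
Z = X.^(1/a);
t = 0.8;
[mean(Z <= t), 1 - exp(-lambda*t^a)]   % empirical vs. theoretical CDF at t
```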
Example 10.1.10: A simple approximation as a function of X
If \(X\) is a random variable, a simple function approximation may be constructed (see Distribution Approximations). We limit our discussion to the bounded case, in which the range of \(X\) is limited to a bounded interval \(I = [a, b]\). Suppose \(I\) is partitioned into \(n\) subintervals by points \(t_i\), \(1 \le i \le n - 1\), with \(a = t_0\) and \(b = t_n\). Let \(M_i = [t_{i - 1}, t_i)\) be the \(i\)th subinterval, \(1 \le i \le n- 1\) and \(M_n = [t_{n -1}, t_n]\). Let \(E_i = X^{-1} (M_i)\) be the set of points mapped into \(M_i\) by \(X\). Then the \(E_i\) form a partition of the basic space \(\Omega\). For the given subdivision, we form a simple random variable \(X_s\) as follows. In each subinterval, pick a point \(s_i, t_{i - 1} \le s_i < t_i\). The simple random variable
\(X_s = \sum_{i = 1}^{n} s_i I_{E_i}\)
approximates \(X\) to within the length of the largest subinterval \(M_i\). Now \(I_{E_i} = I_{M_i} (X)\), since \(I_{E_i} (\omega) = 1\) iff \(X(\omega) \in M_i\) iff \(I_{M_i} (X(\omega)) = 1\). We may thus write
\(X_s = \sum_{i = 1}^{n} s_i I_{M_i} (X)\), a function of \(X\)
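The construction may be sketched directly in MATLAB; taking the \(s_i\) to be the left endpoints \(t_{i-1}\) is one admissible choice, and the interval, subdivision count, and test point below are illustrative.

```matlab
% Sketch: simple approximation X_s on [a, b] with n equal subintervals,
% using left endpoints as the s_i.
a = 0; b = 1; n = 100;
h = (b - a)/n;                     % common subinterval length
s = a + (0:n-1)*h;                 % s_i = t_{i-1}, the left endpoints
% index of the subinterval containing x, clamped so x = b falls in M_n
idx = @(x) min(floor((x - a)/h) + 1, n);
Xs  = @(x) s(idx(x));              % X_s takes the value s_i on M_i
Xs(0.237)                          % returns 0.23, within h = 0.01 of the argument
```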
Use of MATLAB on simple random variables
For simple random variables, we use the discrete alternative approach, since this may be implemented easily with MATLAB. Suppose the distribution for \(X\) is expressed in the row vectors \(X\) and \(PX\).
- We perform array operations on vector \(X\) to obtain
\(G = [g(t_1)\ g(t_2)\ \cdots\ g(t_n)]\)
- We use relational and logical operations on \(G\) to obtain a matrix \(M\) which has ones for those \(t_i\) (values of \(X\)) such that \(g(t_i)\) satisfies the desired condition (and zeros elsewhere).
- The zero-one matrix \(M\) is used to select the corresponding \(p_i = P(X = t_i)\) and sum them by taking the dot product of \(M\) and \(PX\).
Example 10.1.11: Basic calculations for a function of a simple random variable
```matlab
X = -5:10;                     % Values of X
PX = ibinom(15,0.6,0:15);      % Probabilities for X
G = (X + 6).*(X - 1).*(X - 8); % Array operations on X matrix to get G = g(X)
M = (G > -100)&(G < 130);      % Relational and logical operations on G
PM = M*PX'                     % Sum of probabilities for selected values
PM = 0.4800
disp([X;G;M;PX]')              % Display of various matrices (as columns)
   -5.0000   78.0000    1.0000    0.0000
   -4.0000  120.0000    1.0000    0.0000
   -3.0000  132.0000         0    0.0003
   -2.0000  120.0000    1.0000    0.0016
   -1.0000   90.0000    1.0000    0.0074
         0   48.0000    1.0000    0.0245
    1.0000         0    1.0000    0.0612
    2.0000  -48.0000    1.0000    0.1181
    3.0000  -90.0000    1.0000    0.1771
    4.0000 -120.0000         0    0.2066
    5.0000 -132.0000         0    0.1859
    6.0000 -120.0000         0    0.1268
    7.0000  -78.0000    1.0000    0.0634
    8.0000         0    1.0000    0.0219
    9.0000  120.0000    1.0000    0.0047
   10.0000  288.0000         0    0.0005
[Z,PZ] = csort(G,PX);          % Sorting and consolidating to obtain
disp([Z;PZ]')                  % the distribution for Z = g(X)
 -132.0000    0.1859
 -120.0000    0.3334
  -90.0000    0.1771
  -78.0000    0.0634
  -48.0000    0.1181
         0    0.0832
   48.0000    0.0245
   78.0000    0.0000
   90.0000    0.0074
  120.0000    0.0064
  132.0000    0.0003
  288.0000    0.0005
P1 = (G<-120)*PX'              % Further calculation using G, PX
P1 = 0.1859
p1 = (Z<-120)*PZ'              % Alternate using Z, PZ
p1 = 0.1859
```
Example 10.1.12
\(X = 10 I_A + 18 I_B + 10 I_C\) with \(\{A, B, C\}\) independent and \(P = [0.6\ 0.3\ 0.5]\).
We calculate the distribution for \(X\), then determine the distribution for
\(Z = X^{1/2} - X + 50\)
```matlab
c = [10 18 10 0];
pm = minprob(0.1*[6 3 5]);
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.1400
   10.0000    0.3500
   18.0000    0.0600
   20.0000    0.2100
   28.0000    0.1500
   38.0000    0.0900
G = sqrt(X) - X + 50;          % Formation of G matrix
[Z,PZ] = csort(G,PX);          % Sorts distinct values of g(X),
disp([Z;PZ]')                  % consolidates probabilities
   18.1644    0.0900
   27.2915    0.1500
   34.4721    0.2100
   36.2426    0.0600
   43.1623    0.3500
   50.0000    0.1400
M = (Z < 20)|(Z >= 40)         % Direct use of Z distribution
M  =  1     0     0     0     1     1
PZM = M*PZ'
PZM = 0.5800
```
Remark. Note that with the m-function csort, we may name the output as desired.
Example 10.1.13: Continuation of Example 10.1.12, above
```matlab
H = 2*X.^2 - 3*X + 1;
[W,PW] = csort(H,PX)
W  =      1     171     595     741    1485    2775
PW = 0.1400  0.3500  0.0600  0.2100  0.1500  0.0900
```
Example 10.1.14: A discrete approximation
Suppose \(X\) has density function \(f_X(t) = \dfrac{1}{2} (3t^2 + 2t)\) for \(0 \le t \le 1\). Then \(F_X (t) = \dfrac{1}{2} (t^3 + t^2)\). Let \(Z = X^{1/2}\). We may use the approximation m-procedure tappr to obtain an approximate discrete distribution. Then we work with the approximating random variable as a simple random variable. Suppose we want \(P(Z \le 0.8)\). Now \(Z \le 0.8\) iff \(X \le 0.8^2 = 0.64\). The desired probability may be calculated to be
\(P(Z \le 0.8) = F_X (0.64) = (0.64^3 + 0.64^2)/2 = 0.3359\)
Using the approximation procedure, we have
```matlab
tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  (3*t.^2 + 2*t)/2
Use row matrices X and PX as in the simple case
G = X.^(1/2);
M = G <= 0.8;
PM = M*PX'
PM = 0.3359                    % Agrees quite closely with the theoretical value
```