6.1: Functions of Normal Random Variables
In addition to considering the probability distributions of random variables simultaneously using joint distribution functions, there is also occasion to consider the probability distribution of functions applied to random variables. In this section we consider the special case of applying functions to normally distributed random variables, which will be very important in the following section. We begin by deriving the probability distribution of the square of a standard normal random variable.
Before we jump into the example, note that one approach to finding the probability distribution of a function of a random variable relies on the relationship between the pdf and cdf for a continuous random variable:
$$\frac{d}{dx} [F(x)] = f(x) \qquad\text{``derivative of cdf = pdf''}\notag$$
It is often easier to find the cdf of a function of a continuous random variable, and then use the above relationship to derive the pdf.
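As a quick symbolic illustration of this relationship (a minimal sketch using the sympy library; this aside is not part of the text itself), differentiating the standard normal cdf recovers the standard normal pdf:
```python
import sympy as sp
from sympy.stats import Normal, cdf, density

z = sp.symbols('z', real=True)
Z = Normal('Z', 0, 1)   # standard normal random variable

# Differentiate the cdf symbolically; the result should match the pdf.
pdf_from_cdf = sp.diff(cdf(Z)(z), z)

print(sp.simplify(pdf_from_cdf - density(Z)(z)))  # expected output: 0
```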
Example \(\PageIndex{1}\)
Let \(Z\) be a standard normal random variable, i.e., \(Z\sim N(0,1)\). We find the pdf of \(Y=Z^2\).
Let \(\Phi\) denote the cdf of \(Z\), i.e., \(\Phi(z) = P(Z\leq z) = F_Z(z)\). We first find the cdf of \(Y=Z^2\) in terms of \(\Phi\) (recall that there is no closed-form expression for \(\Phi\)):
\begin{align*}
F_Y(y) = P(Y\leq y) &= P(Z^2\leq y)\\
&= P\left(-\sqrt{y} \leq Z \leq \sqrt{y}\right), \text{ for } y\geq0\\
&= \Phi\left(\sqrt{y}\right) - \Phi\left(-\sqrt{y}\right)
\end{align*}
Note that if \(y<0\), then \(F_Y(y) = 0\), since it is not possible for \(Y=Z^2\) to be negative. In other words, the possible values of \(Y=Z^2\) are \(y\geq 0\).
Next, we take the derivative of the cdf of \(Y\) to find its pdf. Before doing so, we note that if \(\Phi\) is the cdf for \(Z\), then its derivative is the pdf for \(Z\), which is denoted \(\varphi\). Since \(Z\) is a standard normal random variable, we know that
$$\varphi(z) = f_Z(z) = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}, \quad\text{for}\ z\in\mathbb{R}.\notag$$
Using this, we now find the pdf of \(Y\):
\begin{align*}
f_Y(y) = \frac{d}{dy}[F_Y(y)] &= \frac{d}{dy}\left[\Phi\left(\sqrt{y}\right)-\Phi\left(-\sqrt{y}\right)\right]\\
&=\frac{d}{dy}\left[\Phi\left(\sqrt{y}\right)\right] - \frac{d}{dy}\left[\Phi\left(-\sqrt{y}\right)\right]\\
&= \varphi\left(\sqrt{y}\right)\cdot\frac{1}{2\sqrt{y}} + \varphi\left(-\sqrt{y}\right)\cdot\frac{1}{2\sqrt{y}}\\
&= \frac{1}{\sqrt{2\pi}}e^{-(\sqrt{y})^2/2}\cdot\frac{1}{2\sqrt{y}} + \frac{1}{\sqrt{2\pi}}e^{-(-\sqrt{y})^2/2}\cdot\frac{1}{2\sqrt{y}} \\
&= \frac{1}{\sqrt{2\pi}}e^{-y/2} \cdot \frac{1}{\sqrt{y}}.
\end{align*}
In summary, if \(Y= Z^2\), where \(Z\sim N(0,1)\), then the pdf for \(Y\) is given by
$$f_Y(y) = \frac{y^{-1/2}}{\sqrt{2\pi}}e^{-y/2}, \text{ for } y\geq 0.\notag$$
Note that the pdf for \(Y\) is a gamma pdf with \(\alpha = \lambda = \frac{1}{2}\). This is also referred to as the chi-square distribution with one degree of freedom, denoted \(\chi^2_1\).
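We can corroborate this numerically (a sketch assuming numpy and scipy are available; neither is part of the text) by squaring a large sample of standard normal draws and comparing the result to the chi-square distribution with one degree of freedom:
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)
y = z**2   # sample from the distribution of Y = Z^2

# Kolmogorov-Smirnov test against chi-square with 1 degree of freedom;
# a large p-value is consistent with Y ~ chi^2(1).
print(stats.kstest(y, 'chi2', args=(1,)).pvalue)

# Equivalently, chi^2(1) is gamma with shape 1/2 and rate 1/2 (scale 2):
print(stats.chi2(1).pdf(1.0), stats.gamma(a=0.5, scale=2).pdf(1.0))
```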
There is another approach to finding the probability distribution of functions of random variables, one that involves moment-generating functions, which we now define.
Definition \(\PageIndex{1}\)
The moment-generating function (mgf) of a random variable \(X\) is given by
$$M_X(t) = E[e^{tX}], \quad\text{for}\ t\in\mathbb{R}.\notag$$
The mgf of a random variable has many theoretical properties that are very useful in the study of probability theory. One of those properties is the fact that when the derivative of the mgf is evaluated for \(t=0\), the result is equal to the expected value of the random variable:
$$\left.\frac{d}{dt} M_X(t)\right|_{t=0} = E[X]\notag$$
This result can be extended to higher order derivatives producing higher order moments, which we will not go into. Instead, we state the following properties that are useful in the context of determining the distribution of a function of random variables. The first result below indicates that mgf's are unique in the sense that if two random variables have the same mgf, then they necessarily have the same probability distribution. The next two properties provide ways of manipulating mgf's in order to find the mgf of a function of a random variable.
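As an aside, the moment property is easy to verify symbolically (a minimal sympy sketch; the mgf formula used here is the normal mgf quoted in Example 6.1.2 below):
```python
import sympy as sp

t, mu = sp.symbols('t mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# mgf of X ~ N(mu, sigma), quoted in Example 6.1.2 below
M = sp.exp(mu*t + sigma**2 * t**2 / 2)

# First derivative evaluated at t = 0 recovers E[X] = mu.
print(sp.diff(M, t).subs(t, 0))        # prints: mu

# Second derivative at t = 0 gives the second moment E[X^2].
print(sp.diff(M, t, 2).subs(t, 0))     # prints: mu**2 + sigma**2
```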
Theorem \(\PageIndex{1}\)
The mgf \(M_X(t)\) of random variable \(X\) uniquely determines the probability distribution of \(X\). In other words, if random variables \(X\) and \(Y\) have the same mgf, \(M_X(t) = M_Y(t)\), then \(X\) and \(Y\) have the same probability distribution.
Theorem \(\PageIndex{2}\)
Let \(X\) be a random variable with mgf \(M_X(t)\), and let \(a,b\) be constants. If random variable \(Y= aX + b\), then the mgf of \(Y\) is given by
$$M_Y(t) = e^{bt}M_X(at).\notag$$
Theorem \(\PageIndex{3}\)
If \(X_1, \ldots, X_n\) are independent random variables with mgf's \(M_{X_1}(t), \ldots, M_{X_n}(t)\), respectively, then the mgf of random variable \(Y = X_1 + \cdots + X_n\) is given by
$$M_Y(t) = M_{X_1}(t) \cdots M_{X_n}(t).\notag$$
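As a rough numerical sanity check of Theorem 6.1.3 (a sketch assuming numpy; the choice of exponential distributions here is illustrative only), the empirical mgf of a sum of independent draws should approximately factor into the product of the individual empirical mgfs:
```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x1 = rng.exponential(scale=1.0, size=n)   # X1 ~ Exp(1)
x2 = rng.exponential(scale=0.5, size=n)   # X2 ~ Exp(2)

t = 0.3   # any t at which the mgfs exist
lhs = np.mean(np.exp(t * (x1 + x2)))                     # estimate of M_{X1+X2}(t)
rhs = np.mean(np.exp(t * x1)) * np.mean(np.exp(t * x2))  # M_{X1}(t) * M_{X2}(t)
print(lhs, rhs)   # the two estimates should be close
```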
Theorem 6.1.1 states that mgf's are unique, and Theorems 6.1.2 & 6.1.3 combined provide a process for finding the mgf of a linear combination of random variables. Together, the three theorems provide a moment-generating function technique for finding the probability distribution of a function of random variable(s), which we demonstrate with the following examples involving the normal distribution.
Example \(\PageIndex{2}\)
Suppose that \(X\sim N(\mu,\sigma)\). It can be shown that the mgf of \(X\) is given by
$$M_X(t) = e^{\mu t + (\sigma^2 t^2/2)}, \quad\text{for}\ t\in\mathbb{R}.\notag$$
Using this mgf formula, we can show that \(\displaystyle{Z = \frac{X-\mu}{\sigma}}\) has the standard normal distribution.
- Note that if \(Z\sim N(0,1)\), then the mgf is \(M_Z(t) = e^{0t+(1^2t^2/2)} = e^{t^2/2}\).
- Also note that \(\displaystyle{\frac{X-\mu}{\sigma} = \left(\frac{1}{\sigma}\right)X+\left(\frac{-\mu}{\sigma}\right)}\), so by Theorem 6.1.2,
$$M_{\frac{1}{\sigma}X-\frac{\mu}{\sigma}}(t) = e^{-\frac{\mu t}{\sigma}}M_X\left(\frac{t}{\sigma}\right) = e^{-\frac{\mu t}{\sigma}}\,e^{\frac{\mu t}{\sigma}+\frac{\sigma^2(t/\sigma)^2}{2}} = e^{t^2/2}.\notag$$
Thus, we have shown that \(Z\) and \(\displaystyle{\frac{X-\mu}{\sigma}}\) have the same mgf, which by Theorem 6.1.1, says that they have the same distribution.
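The algebra in the last display can also be checked symbolically (a minimal sympy sketch; not part of the original derivation):
```python
import sympy as sp

t, mu = sp.symbols('t mu', real=True)
sigma = sp.symbols('sigma', positive=True)

M_X = lambda s: sp.exp(mu*s + sigma**2 * s**2 / 2)   # mgf of X ~ N(mu, sigma)

# mgf of (1/sigma)X - mu/sigma via Theorem 6.1.2 with a = 1/sigma, b = -mu/sigma
M_Z = sp.exp(-mu*t/sigma) * M_X(t/sigma)

print(sp.simplify(M_Z))   # prints: exp(t**2/2), the standard normal mgf
```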
Now suppose \(X_1, \ldots, X_n\) are mutually independent, normally distributed random variables with means \(\mu_1, \ldots, \mu_n\) and sd's \(\sigma_1, \ldots, \sigma_n\), respectively.
Let's find the probability distribution of the linear combination \(Y = a_1X_1 + \cdots + a_nX_n\) (where \(a_1,\ldots,a_n\) are constants) using the mgf technique:
By Theorem 6.1.2, we have
$$M_{a_iX_i}(t) = M_{X_i}(a_it) = e^{\mu_ia_it+\sigma_i^2a_i^2t^2/2},\quad\text{for}\ i=1, \ldots, n,\notag$$
and then by Theorem 6.1.3 we get the following:
\begin{align*}
M_Y(t) &= M_{a_1X_1}(t)\cdot M_{a_2X_2}(t)\cdots M_{a_nX_n}(t)\\
&= e^{\mu_1a_1t+\sigma_1^2a_1^2t^2/2}e^{\mu_2a_2t+\sigma_2^2a_2^2t^2/2}\cdots e^{\mu_na_nt+\sigma_n^2a_n^2t^2/2}\\
&= e^{(\mu_1a_1+\mu_2a_2+\cdots+\mu_na_n)t+(\sigma_1^2a_1^2+\sigma_2^2a_2^2+\cdots+\sigma_n^2a_n^2)\frac{t^2}{2}}\\
\Rightarrow M_Y(t) &= e^{\mu_Yt+\sigma_Y^2t^2/2},
\end{align*}
where \(\mu_Y = a_1\mu_1+\cdots+a_n\mu_n\) and \(\sigma_Y^2 = a_1^2\sigma_1^2+\cdots+a_n^2\sigma_n^2\). Thus, by Theorem 6.1.1, \(Y\sim N(\mu_Y,\sigma_Y)\).
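A quick simulation is consistent with this conclusion (a sketch assuming numpy and scipy; the particular means, sds, and coefficients below are made up for illustration):
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mus, sds, a = [1.0, -2.0, 0.5], [1.0, 2.0, 3.0], [2.0, 1.0, -1.0]

n = 200_000
xs = [rng.normal(m, s, size=n) for m, s in zip(mus, sds)]
y = sum(ai * xi for ai, xi in zip(a, xs))   # Y = a1*X1 + a2*X2 + a3*X3

mu_y = sum(ai * mi for ai, mi in zip(a, mus))            # theoretical mean
var_y = sum(ai**2 * si**2 for ai, si in zip(a, sds))     # theoretical variance

print(y.mean(), mu_y)   # sample mean vs. mu_Y
print(y.var(), var_y)   # sample variance vs. sigma_Y^2
print(stats.kstest((y - mu_y) / np.sqrt(var_y), 'norm').pvalue)  # normality check
```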
The second part of Example 6.1.2 proved the following result, which we will use in the next section.
Sums of Independent Normal Random Variables
If \(X_1,\ldots,X_n\) are mutually independent normal random variables with means \(\mu_1, \ldots, \mu_n\) and standard deviations \(\sigma_1, \ldots, \sigma_n\), respectively, then the linear combination
$$Y = a_1X_1 + \cdots + a_nX_n = \sum^n_{i=1} a_iX_i,\notag$$
is normally distributed with the following mean and variance:
$$\mu_Y = a_1\mu_1 + \cdots + a_n\mu_n = \sum^n_{i=1}a_i\mu_i \qquad \sigma^2_Y = a_1^2\sigma^2_1 + \cdots + a_n^2\sigma^2_n = \sum^n_{i=1}a^2_i\sigma^2_i\notag$$
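For example (an illustrative instance with made-up numbers): if \(X_1\sim N(1, 2)\) and \(X_2\sim N(3, 4)\) are independent and \(Y = 2X_1 - X_2\), then
$$\mu_Y = 2(1) - 1(3) = -1 \qquad \sigma^2_Y = 2^2(2^2) + (-1)^2(4^2) = 32,\notag$$
so \(Y\sim N(-1, \sqrt{32})\).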