16.3: Problems on Conditional Independence, Given a Random Vector

Paul Pfeiffer
Rice University

Exercise \(\PageIndex{1}\)

The pair \(\{X, Y\}\) ci \(|H\). \(X\) ~ exponential (\(u/3\)), given \(H = u\); \(Y\) ~ exponential \((u/5)\), given \(H = u\); and \(H\) ~ uniform [1, 2]. Determine a general formula for \(P(X > r, Y > s)\), then evaluate for \(r = 3\), \(s = 10\).

Answer

\(P(X > r, Y > s|H = u) = e^{-ur/3} e^{-us/5} = e^{-au}\), \(a = \dfrac{r}{3} + \dfrac{s}{5}\)

\(P(X > r, Y > s) = \int e^{-au} f_H (u)\ du = \int_{1}^{2} e^{-au}\ du = \dfrac{1}{a} [e^{-a} - e^{-2a}]\)

For \(r = 3\), \(s= 10\), \(a = 3\), \(P(X > 3, Y > 10) = \dfrac{1}{3} (e^{-3} - e^{-6}) = 0.0158\)
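
As a rough numerical check of this result, the following sketch compares the closed form with a midpoint-rule integration of \(e^{-au}\) over [1, 2]. The function names and the step count are illustrative assumptions, not part of the original solution.

```python
import math

def joint_tail_closed_form(r, s):
    """Closed form (e^{-a} - e^{-2a}) / a, with a = r/3 + s/5."""
    a = r / 3 + s / 5
    return (math.exp(-a) - math.exp(-2 * a)) / a

def joint_tail_numeric(r, s, steps=100_000):
    """Midpoint-rule integral of e^{-au} over [1, 2], where f_H(u) = 1."""
    a = r / 3 + s / 5
    h = 1.0 / steps
    return h * sum(math.exp(-a * (1 + (i + 0.5) * h)) for i in range(steps))

closed = joint_tail_closed_form(3, 10)
numeric = joint_tail_numeric(3, 10)
print(f"closed form: {closed:.4f}, numeric: {numeric:.4f}")
```

Both values should round to the 0.0158 obtained above.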

Exercise \(\PageIndex{2}\)

A small random sample of size \(n = 12\) is taken to determine the proportion of the student body which favors a proposal to expand the student Honor Council by adding two additional members “at large.” Prior information indicates that this proportion is about 0.6 = 3/5. From a Bayesian point of view, the population proportion is taken to be the value of a random variable \(H\). It seems reasonable to assume a prior distribution \(H\) ~ beta (4,3), giving a maximum of the density at (4 - 1)/(4 + 3 - 2) = 3/5. Seven of the twelve interviewed favor the proposition. What is the best mean-square estimate of the proportion, given this result? What is the conditional distribution of \(H\), given this result?

Answer

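\(H\) ~ beta (\(r, s\)), \(r = 4\), \(s = 3\), \(n = 12\), \(k = 7\), where \(S\) is the number in the sample who favor the proposal.

\(E[H|S = k] = \dfrac{k + r}{n + r + s} = \dfrac{7 + 4}{12 + 4 + 3} = \dfrac{11}{19}\)

Given \(S = k\), \(H\) is conditionally beta \((r + k, s + n - k)\), which here is beta (11, 8).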
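
The update can be sketched in a few lines using the standard conjugacy of the beta prior with binomial sampling: \(k\) successes in \(n\) trials turn beta \((r, s)\) into beta \((r + k, s + n - k)\). The function name `beta_posterior` is a hypothetical helper introduced only for illustration.

```python
from fractions import Fraction

def beta_posterior(r, s, n, k):
    """Posterior beta parameters and posterior mean after k successes in n trials."""
    post_r, post_s = r + k, s + n - k
    mean = Fraction(post_r, post_r + post_s)   # mean of beta(a, b) is a / (a + b)
    return post_r, post_s, mean

post_r, post_s, mean = beta_posterior(r=4, s=3, n=12, k=7)
print(post_r, post_s, mean)   # 11 8 11/19
```

Exact fractions are used so that the mean-square estimate appears as 11/19 rather than a rounded decimal.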

Exercise \(\PageIndex{3}\)

Let \(\{X_i: 1 \le i \le n\}\) be a random sample, given \(H\). Set \(W = (X_1, X_2, \cdot\cdot\cdot, X_n)\). Suppose \(X\) conditionally geometric \((u)\), given \(H = u\); i.e., suppose \(P(X = k|H = u) = u(1 - u)^k\) for all \(k \ge 0\). If \(H\) ~ uniform on [0, 1], determine the best mean square estimator for \(H\), given \(W\).

Answer

\(E[H|W = k] = \dfrac{E[HI_{\{k\}} (W)]}{E[I_{\{k\}} (W)]} = \dfrac{E\{HE[I_{\{k\}} (W)|H]\}}{E\{E[I_{\{k\}} (W)|H]\}}\)

\(= \dfrac{\int u P(W = k|H = u) f_H (u)\ du}{\int P(W = k|H = u) f_H (u)\ du}\), \(k = (k_1, k_2, \cdot\cdot\cdot, k_n)\)

\(P(W = k|H = u) = \prod_{i = 1}^{n} u (1 - u)^{k_i} = u^n (1 - u)^{k^*}\), where \(k^* = \sum_{i = 1}^{n} k_i\)

\(E[H|W = k] = \dfrac{\int_{0}^{1} u^{n + 1} (1 - u)^{k^*}\ du}{\int_{0}^{1} u^{n} (1 - u)^{k^*}\ du} = \dfrac{\Gamma (n + 2) \Gamma (k^* + 1)}{\Gamma (n + k^* + 3)} \cdot \dfrac{\Gamma (n + k^* + 2)}{\Gamma (n + 1) \Gamma (k^* + 1)} = \dfrac{n + 1}{n + k^* + 2}\)
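
The estimator can be checked numerically by evaluating the two integrals directly. The sketch below uses a midpoint rule; the sample `(1, 0, 2)` and the function name are hypothetical choices made only to have concrete numbers.

```python
def geometric_posterior_mean(sample, steps=100_000):
    """Midpoint-rule estimate of E[H | W = sample] for H ~ uniform [0, 1]."""
    n, k_star = len(sample), sum(sample)
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        likelihood = u ** n * (1 - u) ** k_star   # P(W = k | H = u)
        num += u * likelihood
        den += likelihood
    return num / den

sample = (1, 0, 2)                       # hypothetical data: n = 3, k* = 3
n, k_star = len(sample), sum(sample)
closed = (n + 1) / (n + k_star + 2)      # formula derived above
numeric = geometric_posterior_mean(sample)
print(closed, numeric)
```

For this sample the formula gives 4/8 = 0.5, and the numerical ratio should agree to many decimal places.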

Exercise \(\PageIndex{4}\)

Let \(\{X_i: 1 \le i \le n\}\) be a random sample, given \(H\). Set \(W = (X_1, X_2, \cdot\cdot\cdot, X_n)\). Suppose \(X\) conditionally Poisson \((u)\), given \(H = u\); i.e., suppose \(P(X = k|H = u) = e^{-u} u^k/k!\). If \(H\) ~ gamma \((m, \lambda)\), determine the best mean square estimator for \(H\), given \(W\).

Answer

\(E[H|W = k] = \dfrac{\int u P(W = k|H = u) f_H (u)\ du}{\int P(W = k|H = u) f_H (u)\ du}\)

\(P(W = k|H = u) = \prod_{i = 1}^{n} e^{-u} \dfrac{u^{k_i}}{k_i !} = e^{-nu} \dfrac{u^{k^*}}{A}\), where \(k^* = \sum_{i = 1}^{n} k_i\) and \(A = \prod_{i = 1}^{n} k_i !\)

\(f_H(u) = \dfrac{\lambda^m u^{m - 1} e^{-\lambda u}}{\Gamma (m)}\)

\(E[H|W = k] = \dfrac{\int_{0}^{\infty} u^{k^* + m} e^{-(\lambda + n)u}\ du}{\int_{0}^{\infty} u^{k^* + m - 1} e^{-(\lambda + n)u}\ du} = \dfrac{\Gamma (m + k^* + 1)}{(\lambda + n)^{k^* + m + 1}} \cdot \dfrac{(\lambda + n)^{k^* + m}}{\Gamma (m + k^*)} = \dfrac{m + k^*}{\lambda + n}\)
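
A similar numerical check applies here. The sketch below integrates the unnormalized integrand \(u^{k^* + m - 1} e^{-(\lambda + n)u}\) on a truncated range, since the constants \(A\), \(\lambda^m\), and \(\Gamma(m)\) cancel in the ratio. The sample, the parameter values, the truncation point `upper`, and the function name are all assumptions made for illustration.

```python
import math

def poisson_posterior_mean(sample, m, lam, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of E[H | W = sample] for H ~ gamma(m, lam)."""
    n, k_star = len(sample), sum(sample)
    h = upper / steps
    num = den = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        # Likelihood times prior density, with constant factors dropped
        weight = u ** (k_star + m - 1) * math.exp(-(lam + n) * u)
        num += u * weight
        den += weight
    return num / den

sample = (2, 0, 4)                                   # hypothetical data: n = 3, k* = 6
m, lam = 2, 1.0                                      # hypothetical prior parameters
closed = (m + sum(sample)) / (lam + len(sample))     # formula derived above
numeric = poisson_posterior_mean(sample, m, lam)
print(closed, numeric)
```

With these values the formula gives (2 + 6)/(1 + 3) = 2. The integrand is negligible well before the truncation point, so the truncation should not affect the comparison.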

Exercise \(\PageIndex{5}\)

Suppose \(\{N, H\}\) is independent and \(\{N, Y\}\) ci \(|H\). Use properties of conditional expectation and conditional independence to show that

\(E[g(N) h(Y)|H] = E[g(N)] E[h(Y)|H]\) a.s.

Answer

\(E[g(N)h(Y)|H] = E[g(N)|H] E[h(Y)|H]\) a.s. by (CI6) and

\(E[g(N)|H] = E[g(N)]\) a.s. by (CE5).

Exercise \(\PageIndex{6}\)

Consider the composite demand \(D\) introduced in the section on Random Sums in "Random Selection":

\(D = \sum_{n = 0}^{\infty} I_{\{n\}} (N) X_n\) where \(X_n = \sum_{k = 0}^{n} Y_k\), \(Y_0 = 0\)

Suppose \(\{N, H\}\) is independent, \(\{N, Y_i\}\) ci \(|H\) for all \(i\), and \(E[Y_i|H] = e(H)\), invariant with \(i\). Show that \(E[D|H] = E[N]E[Y|H]\) a.s.

Answer

\(E[D|H] = \sum_{n = 1}^{\infty} E[I_{\{n\}} (N) X_n|H]\) a.s.

\(E[I_{\{n\}} (N) X_n |H] = \sum_{k = 1}^{n} E[I_{\{n\}} (N) Y_k|H] = \sum_{k = 1}^{n} P(N = n) E[Y|H] = P(N = n) nE[Y|H]\) a.s.

\(E[D|H] = \sum_{n = 1}^{\infty} n P(N = n) E[Y|H] = E[N] E[Y|H]\) a.s.

Exercise \(\PageIndex{7}\)

The transition matrix \(P\) for a homogeneous Markov chain is as follows (in m-file npr16_07.m):

\(P = \begin{bmatrix} 0.23 & 0.32 & 0.02 & 0.22 & 0.21 \\ 0.29 & 0.41 & 0.10 & 0.08 & 0.12 \\ 0.22 & 0.07 & 0.31 & 0.14 & 0.26 \\ 0.32 & 0.15 & 0.05 & 0.33 & 0.15 \\ 0.08 & 0.23 & 0.31 & 0.09 & 0.29 \end{bmatrix}\)

a. Obtain the absolute values of the eigenvalues, then consider increasing powers of \(P\) to observe the convergence to the long run distribution.

b. Take an arbitrary initial distribution \(p0\) (as a row matrix). The product \(p0 * P^k\) is the distribution for stage \(k\). Note what happens as \(k\) becomes large enough to give convergence to the long run transition matrix. Does the end result change with change of initial distribution \(p0\)?

Answer

ev = abs(eig(P))'
ev = 1.0000    0.0814    0.0814    0.3572    0.2429
a = ev(4).^[2 4 8 16 24]
a = 0.1276    0.0163    0.0003    0.0000    0.0000
% By P^16 the rows agree to four places
p0 = [0.5 0 0 0.3 0.2];     % An arbitrarily chosen p0
p4 = p0*P^4
p4 =    0.2297    0.2622    0.1444    0.1644    0.1992
p8 = p0*P^8
p8 =    0.2290    0.2611    0.1462    0.1638    0.2000
p16 = p0*P^16
p16 =   0.2289    0.2611    0.1462    0.1638    0.2000
p0a = [0 0 0 0 1];          % A second choice of p0
p16a = p0a*P^16
p16a =  0.2289    0.2611    0.1462    0.1638    0.2000
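
For readers without MATLAB, the stage distributions can be reproduced with a short standard-library sketch. It propagates a row vector through the chain one step at a time instead of forming \(P^k\); the helper names `step` and `distribution_after` are illustrative assumptions.

```python
P = [
    [0.23, 0.32, 0.02, 0.22, 0.21],
    [0.29, 0.41, 0.10, 0.08, 0.12],
    [0.22, 0.07, 0.31, 0.14, 0.26],
    [0.32, 0.15, 0.05, 0.33, 0.15],
    [0.08, 0.23, 0.31, 0.09, 0.29],
]

def step(p, P):
    """One transition: row vector p times matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]

def distribution_after(p0, P, k):
    """Distribution at stage k, starting from the initial distribution p0."""
    p = list(p0)
    for _ in range(k):
        p = step(p, P)
    return p

p16 = distribution_after([0.5, 0, 0, 0.3, 0.2], P, 16)
p16a = distribution_after([0, 0, 0, 0, 1], P, 16)
print([round(x, 4) for x in p16])
print([round(x, 4) for x in p16a])
```

The two printed rows should agree with each other and with the MATLAB values above, which illustrates that the long run distribution does not depend on the choice of \(p0\).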

Exercise \(\PageIndex{8}\)

The transition matrix \(P\) for a homogeneous Markov chain is as follows (in m-file npr16_08.m):

\(P = \begin{bmatrix} 0.2 & 0.5 & 0.3 & 0 & 0 & 0 & 0 \\ 0.6 & 0.1 & 0.3 & 0 & 0 & 0 & 0 \\ 0.2 & 0.7 & 0.1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.6 & 0.4 & 0 & 0 \\ 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 \\ 0.1 & 0.3 & 0 & 0.2 & 0.1 & 0.1 & 0.2 \\ 0.1 & 0.2 & 0.1 & 0.2 & 0.2 & 0.2 & 0 \end{bmatrix}\)

a. Note that the chain has two subchains, with states {1, 2, 3} and {4, 5}. Draw a transition diagram to display the two separate chains. Can any state in one subchain be reached from any state in the other?

b. Check the convergence as in part (a) of Exercise 16.3.7. What happens to the state probabilities for states 6 and 7 in the long run? What does that signify for these states? Can these states be reached from any state in either of the subchains? How would you classify these states?

Answer

Increasing powers \(P^n\) show that the probabilities of being in states 6 and 7 go to zero. These states cannot be reached from any of the other states, so they are transient.
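
The behavior of states 6 and 7 can be seen numerically. The following sketch raises \(P\) to the 16th power with plain list arithmetic and reports, for each starting state, the probability of being in state 6 or 7. The helpers `mat_mul` and `mat_pow` and the choice of the power 16 are assumptions made for illustration.

```python
P = [
    [0.2, 0.5, 0.3, 0, 0, 0, 0],
    [0.6, 0.1, 0.3, 0, 0, 0, 0],
    [0.2, 0.7, 0.1, 0, 0, 0, 0],
    [0, 0, 0, 0.6, 0.4, 0, 0],
    [0, 0, 0, 0.5, 0.5, 0, 0],
    [0.1, 0.3, 0, 0.2, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 0.2, 0.2, 0.2, 0],
]

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, k):
    """A raised to the positive integer power k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

P16 = mat_pow(P, 16)
# Probability of being in state 6 or 7 after 16 steps, by starting state
mass_6_7 = [row[5] + row[6] for row in P16]
print(["%.2e" % x for x in mass_6_7])
```

Rows 1 through 5 have zeros in columns 6 and 7 of \(P\) itself, so those entries stay exactly zero in every power; only the rows for states 6 and 7 carry a small, shrinking mass there.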

Exercise \(\PageIndex{9}\)

The transition matrix \(P\) for a homogeneous Markov chain is as follows (in m-file npr16_09.m):

\(P = \begin{bmatrix} 0.1 & 0.2 & 0.1 & 0.3 & 0.2 & 0 & 0.1 \\ 0 & 0.6 & 0 & 0 & 0 & 0 & 0.4 \\ 0 & 0 & 0.2 & 0.5 & 0 & 0.3 & 0 \\ 0 & 0 & 0.6 & 0.1 & 0 & 0.3 & 0 \\ 0.2 & 0.2 & 0.1 & 0.2 & 0 & 0.1 & 0.2 \\ 0 & 0 & 0.2 & 0.7 & 0 & 0.1 & 0 \\ 0 & 0.5 & 0 & 0 & 0 & 0 & 0.5 \end{bmatrix}\)

a. Check the transition matrix \(P\) for convergence, as in part (a) of Exercise 16.3.7. How many steps does it take to reach convergence to four or more decimal places? Does this agree with the theoretical result?

b. Examine the long run transition matrix. Identify transient states.

c. The convergence does not make all rows the same. Note, however, that there are two subgroups of similar rows. Rearrange rows and columns in the long run matrix so that identical rows are grouped. This suggests subchains. Rearrange the rows and columns in the transition matrix \(P\) and see that this gives a pattern similar to that for the matrix in Exercise 16.3.8. Raise the rearranged transition matrix to the power for convergence.

Answer

Examination of \(P^{16}\) suggests that the sets of states {2, 7} and {3, 4, 6} form subchains. Rearrangement of \(P\) may be done as follows:

PA = P([2 7 3 4 6 1 5], [2 7 3 4 6 1 5])
PA =
    0.6000    0.4000         0         0         0         0         0
    0.5000    0.5000         0         0         0         0         0
         0         0    0.2000    0.5000    0.3000         0         0
         0         0    0.6000    0.1000    0.3000         0         0
         0         0    0.2000    0.7000    0.1000         0         0
    0.2000    0.1000    0.1000    0.3000         0    0.1000    0.2000
    0.2000    0.2000    0.1000    0.2000    0.1000    0.2000         0
PA16 = PA^16
PA16 =
    0.5556    0.4444         0         0         0         0         0
    0.5556    0.4444         0         0         0         0         0
         0         0    0.3571    0.3929    0.2500         0         0
         0         0    0.3571    0.3929    0.2500         0         0
         0         0    0.3571    0.3929    0.2500         0         0
    0.2455    0.1964    0.1993    0.2193    0.1395    0.0000    0.0000
    0.2713    0.2171    0.1827    0.2010    0.1279    0.0000    0.0000

It is clear that original states 1 and 5 are transient.
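
The same rearrangement can be sketched without MATLAB indexing. The list `order` holds the one-based state order used above, and the nested comprehension plays the role of `P([2 7 3 4 6 1 5], [2 7 3 4 6 1 5])`; the helper names are illustrative assumptions.

```python
P = [
    [0.1, 0.2, 0.1, 0.3, 0.2, 0, 0.1],
    [0, 0.6, 0, 0, 0, 0, 0.4],
    [0, 0, 0.2, 0.5, 0, 0.3, 0],
    [0, 0, 0.6, 0.1, 0, 0.3, 0],
    [0.2, 0.2, 0.1, 0.2, 0, 0.1, 0.2],
    [0, 0, 0.2, 0.7, 0, 0.1, 0],
    [0, 0.5, 0, 0, 0, 0, 0.5],
]

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, k):
    """A raised to the positive integer power k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

order = [2, 7, 3, 4, 6, 1, 5]    # one-based state labels, subchains first
# Apply the same permutation to rows and columns
PA = [[P[i - 1][j - 1] for j in order] for i in order]
PA16 = mat_pow(PA, 16)
for row in PA16:
    print(" ".join(f"{x:7.4f}" for x in row))
```

The last two columns of `PA16`, which correspond to original states 1 and 5, should be essentially zero, consistent with those states being transient.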

Exercise \(\PageIndex{10}\)

Use the m-procedure inventory1 (in m-file inventory1.m) to obtain the transition matrix for maximum stock \(M = 8\), reorder point \(m = 3\), and demand \(D\) ~ Poisson(4).

a. Suppose initial stock is six. What will the distribution for \(X_n\) be for \(n = 1, 3, 5\) (i.e., the stock at the end of periods 1, 3, 5, before restocking)?

b. What will the long run distribution be?

Answer

inventory1
Enter value M of maximum stock  8
Enter value m of reorder point  3
Enter row vector of demand values  0:20
Enter demand probabilities  ipoisson(4,0:20)
Result is in matrix P
p0 = [0 0 0 0 0 0 1 0 0];
p1 = p0*P
p1 =
  Columns 1 through 7
    0.2149    0.1563    0.1954    0.1954    0.1465    0.0733    0.0183
  Columns 8 through 9
         0         0
p3 = p0*P^3
p3 =
  Columns 1 through 7
    0.2494    0.1115    0.1258    0.1338    0.1331    0.1165    0.0812
  Columns 8 through 9
    0.0391    0.0096
p5 = p0*P^5
p5 =
  Columns 1 through 7
    0.2598    0.1124    0.1246    0.1311    0.1300    0.1142    0.0799
  Columns 8 through 9
    0.0386    0.0095
a = abs(eig(P))'
a =
  Columns 1 through 7
    1.0000    0.4427    0.1979    0.0284    0.0058    0.0005    0.0000
  Columns 8 through 9
    0.0000    0.0000
a(2)^16
ans =
   2.1759e-06       % Convergence to at least five decimals for P^16
pinf = p0*P^16      % Use arbitrary p0,  pinf approx p0*P^16
pinf =
  Columns 1 through 7
    0.2622    0.1132    0.1251    0.1310    0.1292    0.1130    0.0789
  Columns 8 through 9
    0.0380    0.0093
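
The m-procedure inventory1 is not reproduced here, so the following is only a sketch of one way to build the transition matrix. It assumes the usual \((m, M)\) policy: if the stock at the end of a period is below \(m\), it is restocked to \(M\) before the next period; otherwise it is left alone, and demand in excess of stock is lost. The function names are hypothetical, and the demand is truncated at 20 to mirror the session above.

```python
import math

def poisson_pmf(mu, k):
    """P(D = k) for D ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def inventory_matrix(M, m, demand_values, demand_probs):
    """Transition matrix on states 0..M, assuming restock to M when stock < m."""
    P = [[0.0] * (M + 1) for _ in range(M + 1)]
    for s in range(M + 1):
        start = M if s < m else s            # stock available for the next period
        for d, pd in zip(demand_values, demand_probs):
            P[s][max(0, start - d)] += pd    # unmet demand is lost
    return P

def step(p, P):
    """One transition: row vector p times matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]

demand = list(range(21))                     # demand values 0:20, as in the session
probs = [poisson_pmf(4, d) for d in demand]
P = inventory_matrix(8, 3, demand, probs)

p0 = [0, 0, 0, 0, 0, 0, 1, 0, 0]             # initial stock is six
p1 = step(p0, P)
pinf = p0
for _ in range(16):                          # approximates the long run distribution
    pinf = step(pinf, P)
print([round(x, 4) for x in p1])
print([round(x, 4) for x in pinf])
```

Under that assumed policy the printed rows should agree with `p1` and `pinf` in the MATLAB session to about four decimal places.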
