4.3.1: Multinomial Distributions - Optional Material
- Define multinomial random variable
- Develop the multinomial distribution
Counting Outcomes of Random Experiments
Consider the game of chess. A game can end in one of three ways: a win, a loss, or a draw. For a pair of grandmasters, we may have an empirical estimate of the probability of each outcome based on the results of previous games. If we knew they were going to play a set number of games \(n\) soon, we might be interested in the probability that the first player wins \(n_1\) times, the second player wins \(n_2\) times, and they draw \(n_3\) times. Can we develop a random variable to handle such a task? The answer is yes: the multinomial random variable is a generalization of the binomial random variable. With binomial random variables, we counted the number of successful trials, which, given a fixed number of trials, also determined the number of failures. With three possible outcomes, we must keep counts for two of the outcomes; the third count is then determined because the three counts sum to \(n.\) So, our random variable returns an ordered pair of values.
Suppose that Magnus Carlsen and Fabiano Caruana (the two top grandmasters in June \(2024\)) are set to play \(12\) games against each other in a friendly tournament. For each game, we estimate that Magnus has a \(30\%\) chance to win while Fabiano has a \(25\%\) chance to win. This leaves a \(45\%\) chance of a draw. What is the probability that, of the \(12\) games, Magnus wins \(5\) games, Fabiano wins \(3\) games, and they draw \(4\) games? Given the grandmaster status of these players, we assume that the results of previous games do not affect performance in current and future games; that is, we treat the games as independent trials.
We set some notation for the problem: \(n=12\) because \(12\) games are to be played, \(n_1=5\) (number to be won by Magnus), \(n_2=3\) (number to be won by Fabiano), \(n_3=4\) (number of draws), \(p_1=0.30\) (probability that Magnus wins a game), \(p_2=0.25\) (probability that Fabiano wins a game), and \(p_3=0.45\) (probability of a draw). As mentioned above, the multinomial variable \(X\) that counts the number of wins of each player in \(12\) games takes on ordered pairs of values, \((n_1,n_2),\) and we are interested in the probability that \(n_1=5\) and \(n_2=3,\) that is, \(P\left(X=(5,3)\right).\)
With \(12\) games and \(3\) possible outcomes for each game, considering every possible sequence of \(12\) outcomes by hand is out of the question. We would have \(3^{12}=531,441\) sequences to consider. Fortunately, we can build on our understanding of the binomial random variable. Recall that the probability of a particular sequence of outcomes across all the trials depended only on the total numbers of successes and failures; the order in which they occurred did not matter. This probability was \(p^{x_j}q^{n-x_j}.\) We then counted the number of ways that a given number of successes and failures could happen, \(\sideset{_n}{_{x_j}}C,\) which led to our probability computation of \(\sideset{_n}{_{x_j}}C\cdot p^{x_j}q^{n-x_j}.\)
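Although listing \(3^{12}\) sequences by hand is impractical, a computer can enumerate them directly. The following is a minimal sketch, not part of the original text, that tallies how many of the sequences contain exactly \(5\) wins for Magnus, \(3\) wins for Fabiano, and \(4\) draws. The labels `"M"`, `"F"`, `"D"` and the function name `count_sequences` are illustrative assumptions.

```python
from itertools import product

# Illustrative labels (assumed): "M" = Magnus wins, "F" = Fabiano wins, "D" = draw.
OUTCOMES = ("M", "F", "D")


def count_sequences(n, n1, n2):
    """Count length-n outcome sequences with n1 'M', n2 'F', and the rest 'D'."""
    return sum(
        1
        for seq in product(OUTCOMES, repeat=n)
        if seq.count("M") == n1 and seq.count("F") == n2
    )


total = len(OUTCOMES) ** 12           # number of all possible 12-game sequences
matching = count_sequences(12, 5, 3)  # sequences with 5 M, 3 F, and 4 D
print(total, matching)                # prints: 531441 27720
```

Brute force like this only works for small \(n\); the counting argument that follows gives the same tally without enumeration.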
A similar line of reasoning will help us develop a probability distribution function for multinomial variables. Just as with binomial random variables, the probability of a particular sequence of outcomes depends only on the values of \(n_1,\) \(n_2,\) and \(n_3=n-n_1-n_2.\) Each such sequence has probability \(p_1^{n_1}p_2^{n_2}p_3^{n_3}.\) What remains is to count the number of sequences that have the given \(n_1\) and \(n_2\) values. Here, we refer to the optional material in Chapter \(3\) on distinguishable permutations. We have three outcomes that we are assigning to particular trials, and the order in which trials are assigned to one of these outcomes does not matter. We can, therefore, count the number of sequences that have the given \(n_1\) and \(n_2\) values with this computation: \(\frac {n!}{n_1!n_2!n_3!}.\) We conclude that the probability distribution function for a multinomial random variable \(X\) with \(3\) outcomes and \(n\) trials is given below.\[ \begin{align*}P\left(X=(n_1,n_2)\right) &=\dfrac{n!}{n_1!n_2!n_3!} p_1^{n_1}p_2^{n_2}p_3^{n_3}\\&=\dfrac{n!}{n_1!n_2!(n-n_1-n_2)!} p_1^{n_1}p_2^{n_2}p_3^{n-n_1-n_2}\end{align*}\]We can now answer our original question in the context of chess by computing \(P\left(X=(5,3)\right).\)\[ \begin{align*}P\left(X=(5,3)\right) &=\dfrac{12!}{5!3!4!} (0.30)^{5}(0.25)^{3}(0.45)^{4}\\&\approx27,720\cdot0.00243\cdot0.01563\cdot0.04101\\&\approx4.3159\% \end{align*}\]
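The three-outcome formula above translates directly into a few lines of code. This is a minimal sketch under stated assumptions: the function name `trinomial_pmf` and its parameter order are hypothetical, and the third probability is taken to be \(p_3=1-p_1-p_2.\)

```python
from math import factorial


def trinomial_pmf(n, n1, n2, p1, p2):
    """P(X = (n1, n2)) for n trials with three outcomes.

    The third count and probability are derived from the first two:
    n3 = n - n1 - n2 and p3 = 1 - p1 - p2.
    """
    n3 = n - n1 - n2
    p3 = 1.0 - p1 - p2
    if min(n1, n2, n3) < 0:
        return 0.0  # impossible combination of counts
    # Number of distinguishable sequences: n! / (n1! n2! n3!)
    ways = factorial(n) // (factorial(n1) * factorial(n2) * factorial(n3))
    return ways * p1**n1 * p2**n2 * p3**n3


# Chess example: 12 games, Magnus wins 5, Fabiano wins 3, and 4 draws.
prob = trinomial_pmf(12, 5, 3, 0.30, 0.25)
print(f"{prob:.4%}")  # prints: 4.3159%
```

Integer division is used for the counting factor so that it stays an exact integer before being multiplied by the floating-point probabilities.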
Suppose that Magnus and Fabiano decide that \(12\) games are too many and reduce it to just \(4\) games. Produce the probability distribution for the multinomial random variable \(X\) that counts each of their wins.
- Answer
Since we are tracking three outcomes rather than the two of a binomial random variable, there are many more values to consider: \(15\) in fact, one for each pair \((n_1,n_2)\) of nonnegative integers with \(n_1+n_2\leq4.\)
Table \(\PageIndex{1}\): Probability distribution for the random variable \(X\)
| \(X=(n_1,n_2)\) | \(P\left(X=(n_1,n_2)\right)\) | \(X=(n_1,n_2)\) | \(P\left(X=(n_1,n_2)\right)\) |
|---|---|---|---|
| \((0,0)\) | \(\dfrac{4!}{0!0!4!} (0.30)^{0}(0.25)^{0}(0.45)^{4}\approx4.10\%\) | \((1,3)\) | \(\dfrac{4!}{1!3!0!} (0.30)^{1}(0.25)^{3}(0.45)^{0}\approx1.88\%\) |
| \((0,1)\) | \(\dfrac{4!}{0!1!3!} (0.30)^{0}(0.25)^{1}(0.45)^{3}\approx9.11\%\) | \((2,0)\) | \(\dfrac{4!}{2!0!2!} (0.30)^{2}(0.25)^{0}(0.45)^{2}\approx10.94\%\) |
| \((0,2)\) | \(\dfrac{4!}{0!2!2!} (0.30)^{0}(0.25)^{2}(0.45)^{2}\approx7.59\%\) | \((2,1)\) | \(\dfrac{4!}{2!1!1!} (0.30)^{2}(0.25)^{1}(0.45)^{1}\approx12.15\%\) |
| \((0,3)\) | \(\dfrac{4!}{0!3!1!} (0.30)^{0}(0.25)^{3}(0.45)^{1}\approx2.81\%\) | \((2,2)\) | \(\dfrac{4!}{2!2!0!} (0.30)^{2}(0.25)^{2}(0.45)^{0}\approx3.38\%\) |
| \((0,4)\) | \(\dfrac{4!}{0!4!0!} (0.30)^{0}(0.25)^{4}(0.45)^{0}\approx0.39\%\) | \((3,0)\) | \(\dfrac{4!}{3!0!1!} (0.30)^{3}(0.25)^{0}(0.45)^{1}\approx4.86\%\) |
| \((1,0)\) | \(\dfrac{4!}{1!0!3!} (0.30)^{1}(0.25)^{0}(0.45)^{3}\approx10.94\%\) | \((3,1)\) | \(\dfrac{4!}{3!1!0!} (0.30)^{3}(0.25)^{1}(0.45)^{0}\approx2.70\%\) |
| \((1,1)\) | \(\dfrac{4!}{1!1!2!} (0.30)^{1}(0.25)^{1}(0.45)^{2}\approx18.23\%\) | \((4,0)\) | \(\dfrac{4!}{4!0!0!} (0.30)^{4}(0.25)^{0}(0.45)^{0}\approx0.81\%\) |
| \((1,2)\) | \(\dfrac{4!}{1!2!1!} (0.30)^{1}(0.25)^{2}(0.45)^{1}\approx10.13\%\) | | |
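The table for the \(4\)-game match can be regenerated, and sanity-checked, with a short loop. This is a sketch under the same probability estimates used in the text; the constant names and the helper `pmf` are hypothetical. It confirms that there are \(15\) possible values of \(X\) and that their probabilities sum to \(1.\)

```python
from math import factorial

N_GAMES = 4
# Per-game probability estimates taken from the text.
P_MAGNUS, P_FABIANO, P_DRAW = 0.30, 0.25, 0.45


def pmf(n1, n2):
    """P(X = (n1, n2)) for the 4-game match; draws fill the remaining games."""
    n3 = N_GAMES - n1 - n2
    ways = factorial(N_GAMES) // (factorial(n1) * factorial(n2) * factorial(n3))
    return ways * P_MAGNUS**n1 * P_FABIANO**n2 * P_DRAW**n3


# Every pair (n1, n2) of nonnegative integers with n1 + n2 <= 4.
distribution = {
    (n1, n2): pmf(n1, n2)
    for n1 in range(N_GAMES + 1)
    for n2 in range(N_GAMES - n1 + 1)
}

for outcome, prob in sorted(distribution.items()):
    print(outcome, f"{prob:.2%}")

print("number of outcomes:", len(distribution))
print("total probability:", round(sum(distribution.values()), 10))
```

The printed percentages should agree with the table, up to possible rounding differences in the last digit for values that fall exactly halfway between two hundredths of a percent.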
Multinomial random variables can extend to counting many more outcomes. We conclude this section by generalizing the multinomial random variable to the case where we count \(k\) outcomes. The probability distribution function for a multinomial random variable \(X\) with \(k\) outcomes and \(n\) trials is given below.\[ P\left(X=(n_1,n_2,\ldots,n_{k-1})\right) =\dfrac{n!}{n_1!n_2!\cdots n_k!} p_1^{n_1}p_2^{n_2}\cdots p_k^{n_k}\nonumber\]Here, the last count is determined by the others, \(n_k=n-n_1-n_2-\cdots-n_{k-1},\) and the probabilities satisfy \(p_1+p_2+\cdots+p_k=1.\)
Note that the binomial distribution is a special case of the multinomial distribution when \(k = 2.\)
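To illustrate the general formula and the \(k=2\) special case numerically, here is a minimal sketch. The function names `multinomial_pmf` and `binomial_pmf` are hypothetical, and for convenience the code takes all \(k\) counts and all \(k\) probabilities as input rather than only the first \(k-1.\) The two-outcome example values (\(10\) trials, \(3\) successes, \(p=0.2\)) are chosen for illustration only.

```python
from math import comb, factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing counts[i] occurrences of outcome i.

    counts: all k counts (their sum is the number of trials n).
    probs:  all k outcome probabilities (assumed to sum to 1).
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! n_2! ... n_k!), kept as an exact integer.
    ways = factorial(n)
    for c in counts:
        ways //= factorial(c)
    return ways * prod(p**c for p, c in zip(probs, counts))


def binomial_pmf(n, x, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Three outcomes: the 12-game chess example from the text.
chess = multinomial_pmf((5, 3, 4), (0.30, 0.25, 0.45))

# Two outcomes: the multinomial formula should match the binomial formula.
two_outcome = multinomial_pmf((3, 7), (0.2, 0.8))
binom = binomial_pmf(10, 3, 0.2)

print(f"{chess:.4%}")
print(abs(two_outcome - binom) < 1e-12)
```

If SciPy is available, `scipy.stats.multinomial.pmf` offers a library implementation that could be used to cross-check these values.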