4.3: Binomial Distributions
- Define the binomial random variable
- Construct binomial distributions
- Develop and use the probability distribution function for binomial random variables
- Provide and use alternative formulas for expected value and variance of binomial random variables
- Assess the necessity of independent trials
Binomial Random Variable
Suppose we had a weighted coin that came up heads only one-sixth of the time. If we flipped this coin \(10\) times, what is the probability that exactly \(2\) of those flips would be heads? To answer this question, we need to recognize that the coin flips are independent of one another and that the probability of getting heads is the same each time. The answer turns out to be about \(29\%;\) we will demonstrate how to compute this soon.
Now suppose we were rolling a fair die \(10\) times. What is the probability that exactly \(2\) of those rolls would be sixes? The astute reader may notice that the answer is the same: \(29\%.\) Why is that? Well, we either roll a six or we don't. Rolling a six is analogous to flipping heads and not rolling a six is analogous to flipping tails. Since each trial is independent and the probability of obtaining the outcome of interest is \(\frac{1}{6},\) the two scenarios are the same from the perspective of probability.
In fact, we can be much more general. Suppose in a population consisting of millions of people, exactly one in six of them support some political candidate. If we randomly and independently selected \(10\) people, what is the probability that exactly \(2\) of them would be supporters of the candidate? Suppose a factory produces light bulbs and one-sixth of them are dysfunctional. If an inspector were to randomly and independently select \(10\) of them, what is the probability that exactly \(2\) of them would be dysfunctional? Suppose an archer hits the bulls-eye with probability \(\frac{1}{6}\) every time she shoots. If she takes \(10\) shots, what is the probability that she gets exactly \(2\) bulls-eyes? The answers to all of these questions are the same: \(29\%.\) Clearly, some of the details of these situations are irrelevant; all that matters is that a trial is repeated \(10\) times, each trial is independent, and the probability of the outcome of interest occurring is \(\frac{1}{6}\) each time. These sorts of situations are the object of our discussion: binomial random variables.
Binomial distributions are the probability distributions for a particular type of discrete random variable: the binomial random variable. With binomial random variables, we are considering a single random experiment repeated, identically and independently, a fixed number of times. We call each repetition a trial and indicate the number of trials with \(n.\) As the adjective "binomial" indicates, we group the outcomes into two categories: successes and failures. The probability of a success on any given trial is denoted \(p,\) while the probability of a failure on any given trial is denoted \(q.\) Note that since we have only two categories covering the entire sample space, \(q=1-p.\) We define the binomial random variable \(X\) as the number of successes throughout all \(n\) trials. Every trial may fail, in which case, \(X=0.\) On the other hand, every trial may be a success, in which case, \(X=n.\) In most cases, some trials will succeed while others fail. As such, \(X\) takes on any number in the set \(\{0,\)\(1,\)\(2,\)\(3,\)\(\ldots,\)\(n\}.\) In the examples discussed above, \(n=10,\) \(p=\frac{1}{6},\) \(q=\frac{5}{6},\) and we were asking what is \(P(X=2).\)
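Before developing a formula, the setup just described (a fixed number of identical, independent trials with the same probability of success) can be explored by simulation. The following is a minimal sketch, not part of the original text: the function names, the number of repetitions, and the seed are illustrative assumptions. It repeats the \(10\)-trial experiment many times with \(p=\frac{1}{6}\) and reports how often exactly \(2\) successes occur.

```python
import random


def simulate_count_successes(n, p, rng):
    """Run n independent trials, each succeeding with probability p,
    and return the number of successes (one observed value of X)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def estimate_probability(k, n, p, repetitions=100_000, seed=0):
    """Estimate P(X = k) as the fraction of repetitions in which
    exactly k of the n trials succeed."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    hits = sum(
        1
        for _ in range(repetitions)
        if simulate_count_successes(n, p, rng) == k
    )
    return hits / repetitions


# n = 10 trials, p = 1/6, and we ask about exactly 2 successes.
estimate = estimate_probability(k=2, n=10, p=1 / 6)
print(f"Estimated P(X = 2): {estimate:.3f}")
```

Because this is a simulation, the estimate will vary slightly with the seed and the number of repetitions, but it should land close to the \(29\%\) quoted in the examples above.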
Consider an example to help solidify these ideas. In a previous text exercise, we considered tossing a fair die three times and determined the probability of getting one or more throws landing with a one facing up. We can understand this situation as a binomial random variable. Our underlying random experiment is rolling a fair die. We fix the number of trials at \(3.\) The trials are identical because we roll the same die in the same way each time. The trials are independent because the outcomes of previous rolls do not affect current or future rolls. Since we are interested in rolling ones, we define that as a success. Rolling any other value (\(2,\) \(3,\) \(4,\) \(5,\) or \(6\)) constitutes a failure. We can easily compute the probabilities of success and failure on any individual trial: \(p=\frac{1}{6}\) and \(q=\frac{5}{6}.\) We define our binomial random variable \(X\) to be the number of ones rolled in \(3\) tosses of a fair die. The possible values for \(X\) are \(0,\) \(1,\) \(2,\) and \(3.\) Recall that our interest in random variables lies in their probability distributions. We will now address constructing a binomial random variable's probability distribution.
Probability Distributions of Binomial Random Variables
We first build our intuition by constructing the probability distribution of our binomial random variable \(X:\) the number of ones rolled in \(3\) tosses of a fair die. When determining the probability for a particular value of a random variable, we generally considered all of the outcomes in the sample space and proceeded from there. Considering all three trials based on the values landing up would result in \(6^3=216\) different outcomes. We can simplify our analysis by considering all three trials based on successes and failures; that is, rolling a one is considered a success and rolling anything other than a one is considered a failure. In this case, we only have \(2^3=8\) considerations. We shall use \(\small\color{blue}\text{S}\) to indicate a trial with success and \(\small\color{red}\text{F}\) to indicate a trial with a failure, and represent the possibility of a successful trial followed by failures on the second and third trials as \(\small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\). This procedure is illustrated for several, but not all, possible outcomes in the figure below.
Figure \(\PageIndex{1}\): Outcomes of three rolls in succession understood in terms of successes and failures
The following array groups the \(8\) possibilities by value of \(X\) and helps us build the probability distribution.
\begin{array}{|c|lll|} \hline X=0&\small\color{red}\text{F}\color{red}\text{F}\color{red}\text{F}& & \\X=1&\small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}&\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F} &\small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S} \\X=2&\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}&\small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S} &\small \color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S} \\X=3&\small \color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}& & \\ \hline \nonumber \end{array}
Table \(\PageIndex{1}\): Initial probability distribution for the random variable \(X\)
| \(X=x_j\) | \(P(X=x_j)\) |
|---|---|
| \(0\) | \(P\left(\small\color{red}\text{F}\color{red}\text{F}\color{red}\text{F} \normalsize \right)\) |
| \(1\) | \(P\left( \small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\normalsize\color{black}\text{ or }\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F} \normalsize\color{black}\text{ or } \small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S} \normalsize \right)\) |
| \(2\) | \(P\left(\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or } \small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or } \small \color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\normalsize \right)\) |
| \(3\) | \(P\left(\small \color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}\normalsize \right)\) |
Each trial occurs in sequence and is identical and independent; we can use both our addition and multiplication rules for probabilities to determine our probabilities. Remember that \(P\left( \small \color{blue}\text{S}\normalsize\right)\) \(=p\) \(=\frac{1}{6}\) and \(P\left( \small \color{red}\text{F}\normalsize\right)\) \(=q\) \(=\frac{5}{6}.\)
Table \(\PageIndex{2}\): Probability distribution for the random variable \(X\)
| \(X=x_j\) | \(P(X=x_j)\) | Value |
|---|---|---|
| \(0\) | \(\begin{align*}P\left(\small\color{red}\text{F}\color{red}\text{F}\color{red}\text{F} \normalsize \right)&=P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)\\&=P\left(\small\color{red}\text{F}\normalsize\right)^3\\&=q^3\end{align*}\) | \(\left(\dfrac{5}{6}\right)^3=\dfrac{125}{216}\approx57.8704\%\) |
| \(1\) | \(\begin{align*}P\left( \small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\normalsize\color{black}\text{ or }\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F} \normalsize\color{black}\text{ or } \small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S} \normalsize \right)&=P\left( \small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\normalsize\right)+P\left(\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F} \normalsize\right)+P\left(\small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S} \normalsize \right)\\&=P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)+P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)+P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\\&=3P\left(\small\color{blue}\text{S}\normalsize\right)\color{black} \cdot P\left(\small\color{red}\text{F}\normalsize\right)^2\\&=3pq^2\end{align*}\) | \(3\cdot\dfrac{1}{6}\cdot\left(\dfrac{5}{6}\right)^2=\dfrac{75}{216}\approx34.7222\%\) |
| \(2\) | \(\begin{align*}P\left(\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or } \small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or } \small \color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\normalsize \right)&=P\left(\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}\normalsize\right)+P\left(\small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\right)+P\left(\small \color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\normalsize \right) \\&=P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)+P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)+P\left(\small\color{red}\text{F}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\\&=3P\left(\small\color{blue}\text{S}\normalsize\right)^2\color{black} \cdot P\left(\small\color{red}\text{F}\normalsize\right)\\&=3p^2q\end{align*}\) | \(3\cdot\left(\dfrac{1}{6}\right)^2\cdot\dfrac{5}{6}=\dfrac{15}{216}\approx6.9444\%\) |
| \(3\) | \(\begin{align*}P\left(\small \color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}\normalsize \right)&=P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\color{black}\cdot P\left(\small\color{blue}\text{S}\normalsize\right)\\&=P\left(\small\color{blue}\text{S}\normalsize\right)^3\\&=p^3 \end{align*}\) | \(\left(\dfrac{1}{6}\right)^3=\dfrac{1}{216}\approx0.4630\%\) |
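The construction above (list every success/failure sequence, multiply along each sequence, then add within each value of \(X\)) can be mirrored in a few lines of code. This is a minimal sketch with illustrative variable names; it uses exact fractions so the results can be compared directly with the table.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 6)  # probability of success (rolling a one)
q = 1 - p           # probability of failure
n = 3               # number of trials

# Start every value of X at probability 0, then accumulate.
distribution = {x: Fraction(0) for x in range(n + 1)}

for sequence in product("SF", repeat=n):  # the 2^3 = 8 sequences such as SFF
    successes = sequence.count("S")
    # Multiplication rule: independent trials, so multiply along the sequence.
    probability = p**successes * q ** (n - successes)
    # Addition rule: distinct sequences are mutually exclusive, so add.
    distribution[successes] += probability

for x, probability in distribution.items():
    # Fractions print in lowest terms, e.g. 75/216 appears as 25/72.
    print(x, probability, f"{float(probability):.4%}")
```

The four probabilities sum to \(1,\) which is a quick sanity check that no sequence was missed or double counted.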
Consider the random variable \(Y\) that counts the number of heads in \(4\) flips of a fair coin. Verify that the random variable \(Y\) is a binomial random variable and construct its probability distribution.
- Answer
The underlying random experiment is flipping the fair coin, which is repeated a fixed number of times: \(n=4.\) We flip the same coin in the same way each time, so our trials are identical. The trials are independent because the outcome of one trial does not affect any of the other trials. We are counting the number of heads, so a trial is a success when the coin lands heads and a failure when it lands tails. Since we are using a fair coin, we have \(P\left( \small \color{blue}\text{S}\normalsize\right)=P\left( \small \color{red}\text{F}\normalsize\right)=\frac{1}{2},\) or \(p=q=\frac{1}{2}.\) This confirms that \(Y\) is a binomial random variable. Since we have \(4\) trials, the set of possible values for \(Y\) is \(\{0,\)\(1,\)\(2,\)\(3,\)\(4\}.\) To construct the probability distribution, we consider the possible outcomes of all \(4\) trials in terms of successes and failures.
\begin{array}{|c|llllll|} \hline Y=0&\small\color{red}\text{F}\color{red}\text{F}\color{red}\text{F}\color{red}\text{F}&&&&\\Y=1&\small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\color{red}\text{F}&\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\color{red}\text{F} &\small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F} & \small\color{red}\text{F}\color{red}\text{F}\color{red}\text{F}\color{blue}\text{S} &&\\Y=2&\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}&\small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}&\small \color{blue}\text{S}\color{red}\text{F}\color{red}\text{F}\color{blue}\text{S}&\small \color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}&\small \color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}&\small \color{red}\text{F}\color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\\Y=3&\small \color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}&\small \color{blue}\text{S}\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S} &\small \color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S} & \small\color{red}\text{F}\color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}\\Y=4&\small \color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}\color{blue}\text{S}& &&&& \\ \hline \nonumber \end{array}
Table \(\PageIndex{3}\): Initial probability distribution for the random variable \(Y\)
| \(Y=y_j\) | \(P(Y=y_j)\) |
|---|---|
| \(0\) | \(P\left(\small\color{red}\text{FFFF}\normalsize\color{black}\right)\) |
| \(1\) | \(P\left(\small\color{blue}\text{S}\color{red}\text{FFF}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{FF}\normalsize\color{black}\text{ or }\small\color{red}\text{FF}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{red}\text{FFF}\color{blue}\text{S}\normalsize\color{black}\right)\) |
| \(2\) | \(P\left(\small\color{blue}\text{SS}\color{red}\text{FF}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{FF}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{SS}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{red}\text{FF}\color{blue}\text{SS}\normalsize\color{black}\right)\) |
| \(3\) | \(P\left(\small\color{blue}\text{SSS}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{blue}\text{SS}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{SS}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{SSS}\normalsize\color{black}\right)\) |
| \(4\) | \(P\left(\small\color{blue}\text{SSSS}\normalsize\color{black}\right)\) |
Again, each trial occurs in sequence and is identical and independent; we use both our addition and multiplication rules for probabilities to determine our probabilities.
Table \(\PageIndex{4}\): Probability distribution for the random variable \(Y\)
| \(Y=y_j\) | \(P(Y=y_j)\) | Value |
|---|---|---|
| \(0\) | \(\begin{align*}P\left(\small\color{red}\text{FFFF}\normalsize\color{black}\right)&=P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\\&=P\left(\small\color{red}\text{F}\normalsize\color{black}\right)^4\\&=q^4\end{align*}\) | \(\left(\dfrac{1}{2}\right)^4=\dfrac{1}{16}=6.25\%\) |
| \(1\) | \(\begin{align*}P\left(\small\color{blue}\text{S}\color{red}\text{FFF}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{FF}\normalsize\color{black}\text{ or }\small\color{red}\text{FF}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{red}\text{FFF}\color{blue}\text{S}\normalsize\color{black}\right)&=P\left(\small\color{blue}\text{S}\color{red}\text{FFF}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{FF}\normalsize\color{black}\right)+P\left(\small\color{red}\text{FF}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{red}\text{FFF}\color{blue}\text{S}\normalsize\color{black}\right)\\[5pt]&=P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\\&+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\\[5pt]&=4P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)^3\\&=4pq^3\end{align*}\) | \(4\cdot \left(\dfrac{1}{2}\right) \cdot \left(\dfrac{1}{2}\right)^3=\dfrac{4}{16}=25\%\) |
| \(2\) | \(\begin{align*}P\left(\small\color{blue}\text{SS}\color{red}\text{FF}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{FF}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{SS}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{red}\text{FF}\color{blue}\text{SS}\normalsize\color{black}\right)&=P\left(\small\color{blue}\text{SS}\color{red}\text{FF}\normalsize\color{black}\right)+P\left(\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\normalsize\color{black}\right)\\&+P\left(\small\color{blue}\text{S}\color{red}\text{FF}\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\color{blue}\text{SS}\color{red}\text{F}\normalsize\color{black}\right)\\&+P\left(\small\color{red}\text{F}\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{red}\text{FF}\color{blue}\text{SS}\normalsize\color{black}\right)\\[5pt]&=P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\\&+P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\\&+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\\[5pt]&=6P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)^2\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)^2\\&=6p^2q^2\end{align*}\) | \(6\cdot \left(\dfrac{1}{2}\right)^2 \cdot \left(\dfrac{1}{2}\right)^2=\dfrac{6}{16}=37.5\%\) |
| \(3\) | \(\begin{align*}P\left(\small\color{blue}\text{SSS}\color{red}\text{F}\normalsize\color{black}\text{ or }\small\color{blue}\text{SS}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\text{ or }\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{SS}\normalsize\color{black}\text{ or }\small\color{red}\text{F}\color{blue}\text{SSS}\normalsize\color{black}\right)&=P\left(\small\color{blue}\text{SSS}\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{blue}\text{SS}\color{red}\text{F}\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{blue}\text{S}\color{red}\text{F}\color{blue}\text{SS}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\color{blue}\text{SSS}\normalsize\color{black}\right)\\[5pt]&=P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)+P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\\&+P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)+P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\\[5pt]&=4P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)^3\cdot P\left(\small\color{red}\text{F}\normalsize\color{black}\right)\\&=4p^3q\end{align*}\) | \(4\cdot \left(\dfrac{1}{2}\right)^3 \cdot \left(\dfrac{1}{2}\right)=\dfrac{4}{16}=25\%\) |
| \(4\) | \(\begin{align*}P\left(\small\color{blue}\text{SSSS}\normalsize\color{black}\right)&=P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\cdot P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)\\&=P\left(\small\color{blue}\text{S}\normalsize\color{black}\right)^4\\&=p^4\end{align*}\) | \(\left(\dfrac{1}{2}\right)^4=\dfrac{1}{16}=6.25\%\) |
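Because the coin is fair, all \(2^4=16\) head/tail sequences are equally likely, so the distribution of \(Y\) can also be found by simply counting sequences. The sketch below takes that shortcut; the variable names are illustrative, and the counting approach is valid only because \(p=q\) here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

flips = 4

# All 2^4 = 16 sequences of heads (H) and tails (T); each is equally
# likely because the coin is fair.
outcomes = list(product("HT", repeat=flips))

# Count how many sequences contain each possible number of heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Equally likely outcomes: probability = favorable count / total count.
distribution_y = {
    y: Fraction(heads_counts[y], len(outcomes)) for y in range(flips + 1)
}

for y in range(flips + 1):
    print(y, heads_counts[y], distribution_y[y])
```

The counts of sequences for \(Y=0\) through \(Y=4\) match the number of entries in each row of the grouping above.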
Hopefully, we have noticed some patterns after building probability distributions for two binomial random variables. Let us formulate the patterns in the context of a general binomial random variable \(X\) with \(n\) trials and probability of success on any individual trial \(p.\) Recall that \(q,\) the probability of failure, is \(1-p.\)
When we consider the possible outcomes of all trials in terms of successes and failures, the probabilities depend on the number of successes and failures, not on the order in which those successes and failures appear. Each event in \(X=x_j\) has the same probability. If there are \(x_j\) successes, meaning we have \(n-x_j\) failures, the probability of each event in \(X=x_j\) is \(p^{x_j}q^{n-x_j}.\) All that is left to do is count the number of such events for any particular value \(x_j.\)
To count the number of ways that \(x_j\) successes can be assigned to the \(n\) trials, we can use combinations: \(\sideset{_{n}}{_{x_j}}C.\) There are \(\sideset{_{n}}{_{x_j}}C\) events in \(X=x_j,\) each with probability \(p^{x_j}q^{n-x_j}.\) Putting this all together, we arrive at a function that returns the probability of each value of our binomial random variable. We call this the probability distribution function for a binomial random variable \(X.\) \[P(X=x_j)=\sideset{_{n}}{_{x_j}}C p^{x_j}q^{n-x_j}\nonumber\]Check that the formula works by applying it to the preceding example.
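The probability distribution function translates almost directly into code. The following is a minimal sketch: the function name `binomial_pmf` is an illustrative choice, and Python's standard-library `math.comb` is used for the number of combinations. It reproduces the coin-flip distribution and evaluates the opening example with \(n=10,\) \(p=\frac{1}{6},\) and \(2\) successes.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """Return P(X = x) = nCx * p^x * q^(n - x), where q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


# Number of heads in 4 flips of a fair coin, using exact fractions.
half = Fraction(1, 2)
coin_distribution = [binomial_pmf(y, 4, half) for y in range(5)]
print(coin_distribution)  # fractions are shown in lowest terms

# Opening example: exactly 2 successes in 10 trials with p = 1/6.
opening_example = binomial_pmf(2, 10, 1 / 6)
print(f"{opening_example:.4f}")
```

The second value comes out at roughly \(0.29,\) in agreement with the \(29\%\) quoted at the start of the section.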
- A virtual education company produces short multiple-choice quizzes for each content module. Each quiz currently has \(5\) questions with \(2\) options for each question. One school that uses this product worries about students passing these quizzes without learning the content. Determine the probability of a student obtaining an A or a B (at least \(80\%\)) on such a quiz by guessing randomly on every question.
- Answer
We can understand this situation as a binomial random variable. We have a random experiment of a student randomly guessing on a multiple-choice question. The experiment is repeated \(5\) times because there are \(5\) questions. Since the student is randomly guessing on each question, the trials are identical and independent, with a probability of success of \(50\%.\) Let \(X\) be the binomial random variable that counts the number of correct guesses on a \(5\)-question multiple-choice quiz. A student needs \(4\) or \(5\) correct answers to obtain an A or a B, so we need to find \(P(X=4\text{ or }5).\) We can use the probability distribution function to find the answer. Recall that \(n=5\) and \(p=0.5.\) Thus,\[P(X=4)=\sideset{_{5}}{_{4}}C \cdot \left(\dfrac{1}{2}\right)^{4}\cdot \left(\dfrac{1}{2}\right)^{5-4}=5\cdot \left(\dfrac{1}{16}\right)\cdot \left(\dfrac{1}{2}\right)=\dfrac{5}{32} \\ \\ P(X=5)=\sideset{_{5}}{_{5}}C \cdot \left(\dfrac{1}{2}\right)^{5}\cdot \left(\dfrac{1}{2}\right)^{5-5}=1\cdot \left(\dfrac{1}{32}\right)\cdot \left(\dfrac{1}{2}\right)^0=\dfrac{1}{32}\nonumber\]Since the events \(X=4\) and \(X=5\) are mutually exclusive, \(P(X=4\text{ or }5)\) \(=\dfrac{5}{32}\)\(+\dfrac{1}{32}\) \(=\dfrac{6}{32}\) \(=18.75\%.\)
- Given the analysis in the first part of this text exercise, the virtual education company has decided to increase the number of options on each question while keeping the number of questions fixed at \(5.\) They are considering using \(3\) or \(4\) options. Determine the probability that a student randomly guessing on a quiz will obtain an A or a B under both options.
- Answer
Changing the number of options does not change the number of correct answers needed for an A or a B, but it does change the probability of success on any given question. Let \(Y\) be the binomial random variable counting the number of correct guesses when there are \(3\) options on each question, and let \(Z\) be the corresponding random variable for \(4\) options. We are interested in \(P(Y=4\text{ or }5)\) and \(P(Z=4\text{ or }5).\) When there are \(3\) options, the probability of success, \(p_Y,\) is only \(\frac{1}{3}.\) Similarly, when there are \(4\) options, the probability of success, \(p_Z,\) is only \(\frac{1}{4}.\)
\[P(Y=4)=\sideset{_{5}}{_{4}}C \cdot \left(\dfrac{1}{3}\right)^{4}\cdot \left(\dfrac{2}{3}\right)^{5-4}=5\cdot \left(\dfrac{1}{81}\right)\cdot \left(\dfrac{2}{3}\right)=\dfrac{10}{243} \\ \\ P(Y=5)=\sideset{_{5}}{_{5}}C \cdot \left(\dfrac{1}{3}\right)^{5}\cdot \left(\dfrac{2}{3}\right)^{5-5}=1\cdot \left(\dfrac{1}{243}\right)\cdot \left(\dfrac{2}{3}\right)^0=\dfrac{1}{243} \\ \\ P(Z=4)=\sideset{_{5}}{_{4}}C \cdot \left(\dfrac{1}{4}\right)^{4}\cdot \left(\dfrac{3}{4}\right)^{5-4}=5\cdot \left(\dfrac{1}{256}\right)\cdot \left(\dfrac{3}{4}\right)=\dfrac{15}{1024} \\ \\ P(Z=5)=\sideset{_{5}}{_{5}}C \cdot \left(\dfrac{1}{4}\right)^{5}\cdot \left(\dfrac{3}{4}\right)^{5-5}=1\cdot \left(\dfrac{1}{1024}\right)\cdot \left(\dfrac{3}{4}\right)^0=\dfrac{1}{1024}\nonumber\]Thus, \(P(Y=4\text{ or }5)\) \(=\dfrac{10}{243}\)\(+\dfrac{1}{243}\) \(=\dfrac{11}{243}\) \(\approx4.5267\%\) and \(P(Z=4\text{ or }5)\) \(=\dfrac{15}{1024}\)\(+\dfrac{1}{1024}\) \(=\dfrac{16}{1024}\) \(=1.5625\%.\) Increasing the number of options significantly reduces the chances of a student obtaining an A or a B on a quiz by randomly selecting answers. We go from nearly \(19\%\) with \(2\) options to just below \(5\%\) with \(3\) options to just over \(1.5\%\) with \(4\) options.
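The three quiz designs differ only in the probability of a correct guess, so the same computation can be repeated for each one. The sketch below assumes a helper named `prob_at_least` (an illustrative name, not from the text) that adds the binomial probabilities for \(4\) and \(5\) correct answers.

```python
from fractions import Fraction
from math import comb


def prob_at_least(k_min, n, p):
    """Return P(X >= k_min) for a binomial random variable with n trials
    and success probability p, by adding P(X = k) for k = k_min, ..., n."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q ** (n - k) for k in range(k_min, n + 1))


# 5 questions; an A or a B requires at least 4 correct guesses.
# With m options per question, a random guess is correct with probability 1/m.
results = {
    options: prob_at_least(4, 5, Fraction(1, options)) for options in (2, 3, 4)
}

for options, probability in results.items():
    print(f"{options} options: {probability} = {float(probability):.4%}")
```

Exact fractions are used so that the results can be compared directly with \(\frac{6}{32},\) \(\frac{11}{243},\) and \(\frac{16}{1024}\) (shown by Python in lowest terms).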
Expected Value, Variance, and Standard Deviation of Binomial Random Variables
Remember that binomial random variables are just a particular type of discrete random variable. That means everything we know about discrete random variables applies to binomial random variables. Binomial random variables have some very nice properties that make the calculations of expected value and variance much easier. Note that the formulas we develop here in this section only apply to binomial random variables and not all discrete random variables.
Using the definitions of expected value, variance, and standard deviation provided in the section on discrete random variables, determine these measures of centrality and dispersion for two binomial random variables: \(X,\) the number of ones rolled in \(3\) tosses of a fair die, and \(Y,\) the number of heads in \(4\) flips of a fair coin.
- Answer
These are the same random variables that we have been using throughout this section. We can utilize the probability distributions that we have already created.
Table \(\PageIndex{5}\): Table of computation for the random variable \(X\)

| \(X=x_j\) | \(P(X=x_j)\) | \(x_j \cdot P(X=x_j)\) | \(\left(x_j - \mu\right)^2 \cdot P(X=x_j)\) |
|---|---|---|---|
| \(0\) | \(\dfrac{125}{216}\) | \(0\cdot \dfrac{125}{216}=0\) | \(\left(0-\dfrac{1}{2}\right)^2\cdot \dfrac{125}{216}=\dfrac{1}{4}\cdot \dfrac{125}{216}=\dfrac{125}{864}\) |
| \(1\) | \(\dfrac{75}{216}\) | \(1\cdot \dfrac{75}{216}=\dfrac{75}{216}\) | \(\left(1-\dfrac{1}{2}\right)^2\cdot \dfrac{75}{216}=\dfrac{1}{4}\cdot \dfrac{75}{216}=\dfrac{75}{864}\) |
| \(2\) | \(\dfrac{15}{216}\) | \(2\cdot \dfrac{15}{216}=\dfrac{30}{216}\) | \(\left(2-\dfrac{1}{2}\right)^2\cdot \dfrac{15}{216}=\dfrac{9}{4}\cdot \dfrac{15}{216}=\dfrac{135}{864}\) |
| \(3\) | \(\dfrac{1}{216}\) | \(3\cdot \dfrac{1}{216}=\dfrac{3}{216}\) | \(\left(3-\dfrac{1}{2}\right)^2\cdot \dfrac{1}{216}=\dfrac{25}{4}\cdot \dfrac{1}{216}=\dfrac{25}{864}\) |

\(\mu=\text{E}(X)=0+\dfrac{75}{216}+\dfrac{30}{216}+\dfrac{3}{216}=\dfrac{108}{216}=\dfrac{1}{2}\)
\(\sigma^2=\text{Var}(X)=\dfrac{125}{864}+\dfrac{75}{864}+\dfrac{135}{864}+\dfrac{25}{864}=\dfrac{360}{864}=\dfrac{5}{12}\approx0.4167\)
\(\sigma=\sqrt{\text{Var}(X)}=\sqrt{\dfrac{5}{12}}\approx0.6455\)

Table \(\PageIndex{6}\): Table of computation for the random variable \(Y\)

| \(Y=y_j\) | \(P(Y=y_j)\) | \(y_j \cdot P(Y=y_j)\) | \(\left(y_j - \mu\right)^2 \cdot P(Y=y_j)\) |
|---|---|---|---|
| \(0\) | \(\dfrac{1}{16}\) | \(0\cdot \dfrac{1}{16}=0\) | \(\left(0-2\right)^2\cdot \dfrac{1}{16}=4\cdot \dfrac{1}{16}=\dfrac{1}{4}\) |
| \(1\) | \(\dfrac{1}{4}\) | \(1\cdot \dfrac{1}{4}=\dfrac{1}{4}\) | \(\left(1-2\right)^2\cdot \dfrac{1}{4}=1\cdot \dfrac{1}{4}=\dfrac{1}{4}\) |
| \(2\) | \(\dfrac{3}{8}\) | \(2\cdot \dfrac{3}{8}=\dfrac{3}{4}\) | \(\left(2-2\right)^2\cdot \dfrac{3}{8}=0\cdot \dfrac{3}{8}=0\) |
| \(3\) | \(\dfrac{1}{4}\) | \(3\cdot \dfrac{1}{4}=\dfrac{3}{4}\) | \(\left(3-2\right)^2\cdot \dfrac{1}{4}=1\cdot \dfrac{1}{4}=\dfrac{1}{4}\) |
| \(4\) | \(\dfrac{1}{16}\) | \(4\cdot \dfrac{1}{16}=\dfrac{1}{4}\) | \(\left(4-2\right)^2\cdot \dfrac{1}{16}=4\cdot \dfrac{1}{16}=\dfrac{1}{4}\) |

\(\mu=\text{E}(Y)=0+\dfrac{1}{4}+\dfrac{3}{4}+\dfrac{3}{4}+\dfrac{1}{4}=\dfrac{8}{4}=2\)
\(\sigma^2=\text{Var}(Y)=\dfrac{1}{4}+\dfrac{1}{4}+0+\dfrac{1}{4}+\dfrac{1}{4}=\dfrac{4}{4}=1\)
\(\sigma=\sqrt{\text{Var}(Y)}=\sqrt{1}=1\)
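The table computations above can be double-checked with a short script. This is only a sketch: the helper name `summarize` is our own, and exact fractions are used so that the results can be compared directly with the hand computations.

```python
from fractions import Fraction
from math import sqrt


def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete distribution.

    `dist` maps each value x_j to its probability P(X = x_j).
    """
    # Expected value: sum of x_j * P(X = x_j)
    mu = sum(x * p for x, p in dist.items())
    # Variance: sum of (x_j - mu)^2 * P(X = x_j)
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var, sqrt(var)


# X: number of ones rolled in 3 tosses of a fair die
dist_x = {0: Fraction(125, 216), 1: Fraction(75, 216),
          2: Fraction(15, 216), 3: Fraction(1, 216)}

# Y: number of heads in 4 flips of a fair coin
dist_y = {0: Fraction(1, 16), 1: Fraction(1, 4), 2: Fraction(3, 8),
          3: Fraction(1, 4), 4: Fraction(1, 16)}

mu_x, var_x, sd_x = summarize(dist_x)
mu_y, var_y, sd_y = summarize(dist_y)

print("X:", mu_x, var_x, round(sd_x, 4))
print("Y:", mu_y, var_y, round(sd_y, 4))
```

The means and variances come out as exact fractions, while the standard deviations are floating-point approximations.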
Having computed the expected value, variance, and standard deviation for two binomial random variables using the definitions, we now present quicker and easier methods for computing the expected value and variance. Just as with the alternative formula for the variance of a discrete random variable, these formulas are derived from our original definitions through mathematical simplification and produce the same values as the original definitions. We will not show that simplification, but we offer a little intuition before stating the formulas.
For example, if \(p=0.5\) and \(n=10,\) then we are repeating a trial \(10\) times with probability of success \(0.5\) each time. We should expect, then, to succeed on half of the trials. This means \(\text{E}(X)\) \(=0.5\cdot 10\) \(= 5.\) Similarly, if \(p=0.9\) and \(n=100,\) we should expect to see success on \(90\%\) of the trials, so \(\text{E}(X)\) \(=0.9\cdot 100\) \(= 90.\) In general, \(\text{E}(X)=np\) for binomial distributions.
For a binomial random variable \(X\) with \(n\) trials, probability of success on any individual trial \(p,\) and probability of failure on any individual trial \(q=1-p,\) we can compute the expected value and variance using the following formulas.\[\begin{aligned}\mu&=\text{E}(X)=np \\ \sigma^2&=\text{Var}(X)=npq \end{aligned}\nonumber\]
Using the above formulas, compute the expected value and variance for the random variables: \(X\) being the number of ones rolled in \(3\) tosses of a fair die and \(Y\) being the number of heads in \(4\) flips of a fair coin. Verify that the values match what was computed in the previous text exercise.
- Answer
When considering the random variable \(X,\) we have that \(n=3,\) \(p=\frac{1}{6},\) and \(q=\frac{5}{6}.\) We thus compute \(\mu\) \(=\text{E}(X)\) \(=3\cdot\frac{1}{6}\) \(=\frac{1}{2}\) and \(\sigma^2\) \(=\text{Var}(X)\) \(=3\cdot\frac{1}{6}\cdot\frac{5}{6}\) \(=\frac{5}{12}.\) These values match what was computed in the previous exercise.
When considering the random variable \(Y,\) we have that \(n=4,\) \(p=\frac{1}{2},\) and \(q=\frac{1}{2}.\) We thus compute \(\mu\) \(=\text{E}(Y)\) \(=4\cdot\frac{1}{2}\) \(=2\) and \(\sigma^2\) \(=\text{Var}(Y)\) \(=4\cdot\frac{1}{2}\cdot\frac{1}{2}\) \(=1.\) These values again match what was computed in the previous exercise.
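The agreement between the shortcut formulas and the definitions can also be checked numerically. The sketch below assumes the standard binomial probability formula \(P(X=k)=\binom{n}{k}p^kq^{n-k}\); the function names are ours and chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, k):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def moments_from_definition(n, p):
    """Expected value and variance computed directly from the definitions."""
    mu = sum(k * binomial_pmf(n, p, k) for k in range(n + 1))
    var = sum((k - mu) ** 2 * binomial_pmf(n, p, k) for k in range(n + 1))
    return mu, var


def moments_from_formulas(n, p):
    """Expected value and variance from the shortcut formulas np and npq."""
    q = 1 - p
    return n * p, n * p * q


# The two exercise variables plus the two intuition examples from the text.
cases = [(3, Fraction(1, 6)), (4, Fraction(1, 2)),
         (10, Fraction(1, 2)), (100, Fraction(9, 10))]

for n, p in cases:
    by_definition = moments_from_definition(n, p)
    by_formula = moments_from_formulas(n, p)
    print(n, p, by_formula, by_definition == by_formula)
```

Because `Fraction` arithmetic is exact, the equality check is a genuine comparison rather than a floating-point approximation.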
Necessity of Independent Trials
Binomial distributions underlie important calculations in inferential statistics, such as computing the probability of obtaining a sample with a particular proportion. Recall our discussion regarding obtaining a random sample from a large population and having \(80\%\) of them be women. The probability of this happening was significantly less with a sample size of \(20\) as opposed to \(10\) \((0.4621\%\text{ vs. }4.3945\%).\) These probabilities were computed using the binomial distribution. Here, we treated our random experiment as selecting an individual from a sufficiently large population composed of equal numbers of men and women. We considered selecting a woman a success and treated \(p\) \(=q\) \(=\frac{1}{2}.\) In the case of a sample of size \(10,\) we noted that \(80\%\) of \(10\) is \(8.\) And in the case of a sample of size \(20,\) we need \(16\) women to get \(80\%.\) However, this fails to satisfy our definition of a binomial random variable because we do not have the same probabilities of success and failure for each trial. When one person is chosen, that person is no longer eligible to be chosen for subsequent trials. We have fewer people to choose from and no longer have equal numbers of men and women. Our trials are not independent.
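The two percentages quoted in this paragraph can be reproduced under the independence assumption. The following is a minimal sketch, assuming the figures refer to obtaining exactly \(8\) women out of \(10\) and exactly \(16\) women out of \(20\) with \(p=\frac{1}{2};\) the function name is ours.

```python
from math import comb


def binomial_pmf(n, p, k):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 80% women in a sample of 10 (8 women) and in a sample of 20 (16 women)
p10 = binomial_pmf(10, 0.5, 8)
p20 = binomial_pmf(20, 0.5, 16)

print(f"n = 10: {p10:.4%}")
print(f"n = 20: {p20:.4%}")
```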
We have run into this issue previously in a text exercise. When populations are huge (when the difference between an event's probability and a conditional probability related to that event is relatively small), treating the events as if they were independent will result in a value which is approximately, not exactly, correct. Since it is often much easier to compute assuming independence, this is common practice when the error would be negligible. It is difficult to define exactly how large a population must be, in general, for the assumption of independence to be reasonable.
For example, if there are \(1{,}000{,}000\) people, exactly half of whom are women, and we randomly select \(2\) individuals from this group, the probability that they are both women would be \(\frac{500{,}000}{1{,}000{,}000} \cdot \frac{499{,}999}{999{,}999}\) \(\approx 0.24999975.\) If we had assumed independence, that is, that each time we selected a person, there was a \(50\%\) chance it was a woman, we would have obtained \(\frac{1}{2} \cdot \frac{1}{2}\) \(= 0.25.\) Notice the error we get from assuming independence is quite small.
On the other hand, if the population size were \(6\) and \(3\) of them were women, the assumption of independence is much less reasonable. If we randomly select \(2\) people from this group of \(6,\) the probability that they are both women is \(\frac{3}{6} \cdot \frac{2}{5}\) \(= \frac{1}{5}\) \(=0.2.\) Simply saying there's a \(50\%\) chance each time yields an estimate of \(0.25.\) Notice the error is much larger than before. If we take our population size to be even smaller, the error gets larger. In summary, if the sample we are selecting is a tiny proportion of the population, then assuming independence introduces little error; however, if we assume independence when the sample is a significant proportion of the population, then we will have large errors in our estimates.
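The two comparisons above can be sketched in a few lines of code. The function names are ours, and the population is assumed to contain a known number of women, as in the examples.

```python
from fractions import Fraction


def prob_two_women_exact(population, women):
    """Probability that 2 people drawn without replacement are both women."""
    return Fraction(women, population) * Fraction(women - 1, population - 1)


def prob_two_women_independent(population, women):
    """Same probability if the two selections are treated as independent."""
    p = Fraction(women, population)
    return p * p


for population, women in [(1_000_000, 500_000), (6, 3)]:
    exact = prob_two_women_exact(population, women)
    approx = prob_two_women_independent(population, women)
    error = approx - exact
    print(f"N = {population}: exact = {float(exact):.8f}, "
          f"independent = {float(approx):.2f}, error = {float(error):.8f}")
```

The error shrinks from \(0.05\) for the population of \(6\) to about \(0.00000025\) for the population of one million.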
The following exercise illustrates in more detail how much error there is for different population sizes.
- Consider sampling \(10\) people from a population composed of an equal number of men and women. We denote the outcome of such a sampling as a sequence of \(\small\color{blue}\text{W}\) and \(\small\color{red}\text{M}\). Determine \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\) for each of the following population sizes.
- \(N=50\)
- \(N=100\)
- \(N=200\)
- \(N=1000\)
- Answer
- \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\)\(=\frac{25}{50}\cdot\)\(\frac{24}{49}\cdot\)\(\frac{23}{48}\cdot\)\(\frac{22}{47}\cdot\)\(\frac{21}{46}\cdot\)\(\frac{20}{45}\cdot\)\(\frac{19}{44}\cdot\)\(\frac{18}{43}\cdot\)\(\frac{25}{42}\cdot\)\(\frac{24}{41}\) \(\approx0.0702\%\)
- \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\)\(=\frac{50}{100}\cdot\)\(\frac{49}{99}\cdot\)\(\frac{48}{98}\cdot\)\(\frac{47}{97}\cdot\)\(\frac{46}{96}\cdot\)\(\frac{45}{95}\cdot\)\(\frac{44}{94}\cdot\)\(\frac{43}{93}\cdot\)\(\frac{50}{92}\cdot\)\(\frac{49}{91}\) \(\approx0.08443\%\)
- \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\)\(=\frac{100}{200}\cdot\)\(\frac{99}{199}\cdot\)\(\frac{98}{198}\cdot\)\(\frac{97}{197}\cdot\)\(\frac{96}{196}\cdot\)\(\frac{95}{195}\cdot\)\(\frac{94}{194}\cdot\)\(\frac{93}{193}\cdot\)\(\frac{100}{192}\cdot\)\(\frac{99}{191}\) \(\approx0.09117\%\)
- \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\)\(=\frac{500}{1000}\cdot\)\(\frac{499}{999}\cdot\)\(\frac{498}{998}\cdot\)\(\frac{497}{997}\cdot\)\(\frac{496}{996}\cdot\)\(\frac{495}{995}\cdot\)\(\frac{494}{994}\cdot\)\(\frac{493}{993}\cdot\)\(\frac{500}{992}\cdot\)\(\frac{499}{991}\) \(\approx0.09638\%\)
- Determine \(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\) as if each selection were independent with \(P(\small\color{blue}\text{W}\color{black})\) \(=\frac{1}{2}\) and \(P(\small\color{red}\text{M}\color{black})\) \(=\frac{1}{2}\).
- Answer
\(P(\small\color{blue}\text{WWWWWWWW}\color{red}\text{MM}\color{black})\) \(=\left(\frac{1}{2}\right)^{10}\) \(\approx0.09766\%\)
- Compare each value computed in part \(1\) with the value computed in part \(2\) of this text exercise.
- Answer
- The difference is \(0.02746\%.\)
- The difference is \(0.01323\%.\)
- The difference is \(0.0065\%.\)
- The difference is \(0.0013\%.\)
The differences between these values are in the hundredths and thousandths of a percent and decrease as the population size increases. We only dealt with population sizes up to \(1000.\) In general, our populations of interest will be much larger than that, so we would expect even smaller differences. What really matters is the size of the sample relative to the size of the population. Without going into the details, we share a fairly common recommendation: if the sample size is more than \(5\%\) of the population, we do not assume independence.
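The computations in this exercise can be reproduced with a short script that also reports what fraction of the population the sample represents. This is a sketch under the exercise's assumptions (a population split evenly between women and men, sampled without replacement); the function name `sequence_probability` is ours.

```python
from fractions import Fraction


def sequence_probability(population, sequence):
    """Probability of drawing `sequence` in order, without replacement.

    `sequence` is a string of 'W' and 'M'; the population is assumed to
    start with equal numbers of women and men.
    """
    remaining = {"W": population // 2, "M": population // 2}
    total = population
    prob = Fraction(1)
    for person in sequence:
        # Conditional probability of this selection given the earlier ones
        prob *= Fraction(remaining[person], total)
        remaining[person] -= 1
        total -= 1
    return prob


sequence = "W" * 8 + "M" * 2
# Value obtained by treating each selection as independent with probability 1/2
independent = Fraction(1, 2) ** len(sequence)

for N in (50, 100, 200, 1000):
    exact = sequence_probability(N, sequence)
    difference = independent - exact
    print(f"N = {N:4d}  sample/N = {len(sequence) / N:.1%}  "
          f"exact = {float(exact):.5%}  difference = {float(difference):.5%}")
```

In this run the sample makes up \(20\%,\) \(10\%,\) \(5\%,\) and \(1\%\) of the four populations, so only the last case falls clearly below the \(5\%\) guideline.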