3.1: Introduction to Probability
Learning Objectives
- Learn key initial terminology about probability
- Determine the sample space of a given situation
- Recognize and use the three basic methods of determining probability measures
- Explain the importance of the Law of Large Numbers
- Describe the complement of an event
- Use the Complement Probability Rule for determining the probability of an event
Review and Preview
Inferential statistics seeks to make educated guesses about populations using statistics from randomly chosen samples. The usefulness of a sample statistic depends on the sample from which it was taken. We cannot guarantee that samples are representative of the population, but we can ensure that any bias present is due to random chance. What is the likelihood that our randomly chosen sample is representative? In other words, what is the probability that our sample statistic is close to the population parameter? These are fundamental questions to examine through the science of probability.
We hear probability claims daily. The weather forecast states a \(30\%\) chance of rain. The probability of a faulty product coming off a manufacturing line is \(6.5\%\). One has a \(20\%\) probability of purely guessing an answer correctly on a single multiple-choice question. A cancer research group believes that \(40\%\) of women and \(45\%\) of men will have a diagnosis of some type of cancer during their lifetimes. (Note: This means that if we randomly select a man, we have a \(45\%\) chance of choosing someone who has had or will have a cancer diagnosis in his life. This does not mean that any specific man has a \(45\%\) chance of being diagnosed with cancer in his lifetime.)
Each of the above scenarios involves a situation in which something will happen, and an outcome will occur, but we are uncertain which outcome it will be. Will it rain, or will it not rain? Is the next product produced of high quality or not? In statistics, we refer to such situations as random experiments. We have a clear context and an idea of possible outcomes, one of which will happen, and then use probabilities to measure the likelihood of any particular outcome.
As we move through the course, we will develop an understanding of random sampling to build probabilities. We will develop the ability to measure how unusual a sample statistic is within a given situation. If the probability of a calculated sample statistic is "small," then we conclude that the outcome is unusual. The use of probability goes well beyond this application; therefore, we develop probability within a larger context.
Basic Concept of Probability
We begin by examining the meaning of the term probability. Generally, probability is a numerical value between \(0\) and \(1,\) inclusive, measuring the likelihood that a specific event will occur within a given situation. The representation of that numerical value might be in decimal form, fractional form, or percentage form, such as \(0.75\) \(=\frac{3}{4}\) \(=75\%.\) An outcome of a random experiment is a potential result of the experiment. An event is a set of outcomes of a given random experiment. For example, if one rolls a pair of standard game dice as shown in Figure \(\PageIndex{1}\), a possible event could be both dice landing with a one facing up.
Figure \(\PageIndex{1}\): Game Dice
Before measuring probability accurately, we must clearly describe the given situation (such as rolling a pair of standard game dice) and the event of interest (such as both dice landing with one facing up). For a given clearly described situation, the collection of all possible outcomes is called the sample space or event space. For reasonably simple situations, one can fully describe the sample space.
What is the sample space if one rolls a pair of standard game dice, as shown in Figure \( \PageIndex{1} \)?
- Answer
-
The graphic below shows the sample space, where one die is shown in red and the other in white. Notice there are thirty-six possible outcomes. We should also notice that, in general, we must be concerned with the order of the two dice. For example, we consider rolling a three on the red die and a one on the white die a different outcome than rolling a one on the red die and a three on the white die. Why is this? Because these are two outcomes that we can distinguish. Since probability measures the degree of uncertainty, one must consider all available information. We produce a complete sample space by carefully reflecting on all possible outcomes.
Figure \(\PageIndex{2}\): Sample space of rolling a pair of standard game dice
When rolling a pair of dice, the event of rolling two ones consists of a single outcome. However, the event in which the two dice show a sum of five consists of several outcomes in the sample space: \(1\) and \(4;\) \(2\) and \(3;\) \(3\) and \(2;\) and \(4\) and \(1.\)
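The sample space described above is small enough to enumerate by machine. The following Python sketch is only an illustration (the variable names are ours, not part of the text): it lists every ordered pair for a red die and a white die, then picks out the outcomes that make up the event "sum of five."

```python
from itertools import product

# Each outcome is an ordered pair: (red die, white die).
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

# An event is a set of outcomes; here, the two dice sum to five.
sum_of_five = [outcome for outcome in sample_space if sum(outcome) == 5]

print(len(sample_space))  # number of outcomes in the sample space
print(sum_of_five)        # outcomes belonging to the event
```

Because the pairs are ordered, \((1, 4)\) and \((4, 1)\) appear as separate outcomes, matching the distinction between the red and white dice made above.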
Once we understand the sample space, we can examine the probability measures of various events within that situation. A probability measure near \(1\) indicates that the specific event is more likely to occur, and a probability measure near \(0\) indicates that the specific event is less likely to occur. We use the notation \(P \left( A \right) \) to indicate the probability of a specific event \( A.\) For example, \(P \left( \text{rain} \right)\) \(=0.30\) \(= 30\% \) indicates "the probability of rain is \(30\%\)." If for some event \(A,\) we find that \(P \left( A \right)\) \(=1.00\) \(= 100\%, \) then we say this event is a certain event. Similarly, if for some event \(A,\) we find that \(P \left( A \right)\) \(=0.00\) \(= 0\%,\) then we say this event is an impossible event. For finite sample spaces, impossible events cannot occur, and certain events must occur. For infinite sample spaces, the situation is more complicated; we will discuss this further in chapter \(4.\) For now, we are usually interested in events that are possible but not likely to occur.
In inferential statistics, we are often interested in outcomes that are not likely to happen; that is, outcomes that are improbable or unusual. When an improbable outcome occurs, either something rare has happened, or we have reason to think our understanding of the situation needs to be updated. But what is the cutoff between improbable and probable, between unusual and usual? There is no fixed, agreed-upon value that will work for every situation. For this reason, we will often need to set a probability measure below which we consider an event improbable. Different individuals may judge the same event as usual or unusual, depending on their personal life experiences. For example, if the probability of rain is \(15\%,\) one person may consider rain unlikely, and another may think it reasonably likely. A commonly chosen boundary measure in many areas of statistical analysis is \(5\%.\) We will initially require a probability measure of \(5\%\) or less to label an event as unusual. Later, we will relax this requirement, even leaving it to the consumer of our statistical measures to apply their own chosen cutoff for unusual.
- Explain why \(150\%\) cannot be the probability of some event.
- Answer
-
The value \(150\%\) is larger than \(100\%.\) Any valid probability value will be between \(0.00\) \(= 0\%\) and \(1.00\) \(= 100\%,\) inclusively. Knowing this will help us recognize improper probabilities.
- Explain why \(1.21\) cannot be the probability of some event.
- Answer
-
The value \(1.21\) is larger than \(1.00.\) Again, any valid probability value will be between \(0.00\) \(= 0\%\) and \(1.00\) \(= 100\%,\) inclusively.
- Explain why \(-0.22\) cannot be the probability of some event.
- Answer
-
The value \(-0.22\) is negative. Any valid probability value will be non-negative since the value must be between \(0.00\) \(= 0\%\) and \(1.00\) \(= 100\%.\)
- Give an example of a situation and an impossible event for that situation.
- Answer
-
Answers to this can vary greatly. An example tied to the situation of rolling a pair of dice would be the event of rolling two standard game dice and getting a sum of \(13.\) Since the largest value each die can show is \(6,\) the sum cannot exceed \(12;\) the event is an impossible event. That is, \(P \left( \text{rolling a sum of }13 \right)\) \(= 0\%.\)
There is one final fundamental property of probability for the events and sample space within any given situation. The sum of the probabilities of all possible outcomes within the sample space of a given situation must always total \( 1.00\) \(= 100\%.\)
Basic Methods for Computing Simple Probabilities
With our basic terminology established, we focus on how to compute a given situation's probability. We initially examine three basic and commonly used methods. Our first method is called the subjective or intuitive method, where we produce a numerical estimate of the probability based on personal judgment, past experiences, or even personal opinion. For example, having purchased a few non-winning lottery tickets in the past, yet hearing of a winner on the news, we might estimate the probability of winning as a low value of \(1\%.\) We do not depend on subjectively determined probability values in quality statistical work. Instead, we form more robust and accurate methods for determining these values.
This leads us to our second method, commonly called the classical method. In this method, if a given situation has \(n\) different equally likely outcomes in the sample space and if event \(A \) can occur in \(x\) ways, then \[ \nonumber P\left( A \right) = \frac{\text{number of ways event }A \text{ can occur}}{\text{number of outcomes in the sample space}} = \frac{x}{n}. \] As an example, if a standard die is fair (so each face of the die has an equal chance of landing up when the die is rolled), then \[ P\left( \small{\text{ROLLING A THREE}}\normalsize \right) = \frac{\text{number of ways rolling a three can occur}}{\text{number of outcomes in the sample space of rolling a die}} = \frac{1}{6} \approx 0.1667 = 16.67\% \nonumber\] Notice that this outcome would not be considered unusual since the probability measure is over \(5\%.\)
Consider the situation in which a pair of fair dice are rolled. Find the probability of each event given. Write results in probability notation and determine if the outcome is considered unusual.
- \(\small{\text{ROLLING }}\)\(\small{\text{A }}\)\(\small{\text{THREE }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{FIRST }}\)\(\small{\text{DIE }}\)\(\small{\text{AND }}\)\(\small{\text{A }}\)\(\small{\text{FOUR }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{SECOND }}\)\(\small{\text{DIE}}\).
- Answer
-
Figure \(\PageIndex{3}\): Sample space of rolling a pair of standard game dice
Since the dice are said to be fair dice, each of the thirty-six outcomes shown in our sample space above is equally likely. We will use the classical approach to determining the probability, using \( A \) to represent our event of \(\small{\text{ROLLING }}\)\(\small{\text{A }}\)\(\small{\text{THREE }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{FIRST }}\)\(\small{\text{DIE }}\)\(\small{\text{AND }}\)\(\small{\text{A }}\)\(\small{\text{FOUR }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{SECOND }}\)\(\small{\text{DIE}}\). We notice only one outcome in the sample space matches the event description. Therefore, \( P\left( A \right)\) \(= \frac{1}{36}\) \(\approx 0.0278\) \(= 2.78\%.\) Since this probability measure is less than \(5\%,\) then for our course, we will consider this outcome as unusual.
- \(\small{\text{ROLLING }}\)\(\small{\text{A }}\)\(\small{\text{THREE }}\)\(\small{\text{ON }}\)\(\small{\text{ONE }}\)\(\small{\text{DIE }}\)\(\small{\text{AND }}\)\(\small{\text{A }}\)\(\small{\text{FOUR }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{OTHER }}\)\(\small{\text{DIE}}\).
- Answer
-
Figure \(\PageIndex{4}\): Sample space of rolling a pair of standard game dice
Using similar reasoning as above and using \( B \) to represent our event of \(\small{\text{ROLLING }}\)\(\small{\text{A }}\)\(\small{\text{THREE }}\)\(\small{\text{ON }}\)\(\small{\text{ONE }}\)\(\small{\text{DIE }}\)\(\small{\text{AND }}\)\(\small{\text{A }}\)\(\small{\text{FOUR }}\)\(\small{\text{ON }}\)\(\small{\text{THE }}\)\(\small{\text{OTHER }}\)\(\small{\text{DIE}}\), notice that there are two outcomes in the sample space that match the event description. Therefore, \( P\left( B \right)\) \(= \frac{2}{36}\) \(=\frac{1}{18}\) \(\approx 0.0556\) \(= 5.56\%.\) Since this probability measure is more than \(5\%,\) then for our course, we will not consider this outcome as unusual.
- \(\small{\text{ROLLING }}\)\(\small{\text{TWO }}\)\(\small{\text{DICE }}\)\(\small{\text{IN }}\)\(\small{\text{WHICH }}\)\(\small{\text{THE }}\)\(\small{\text{SUM }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{NUMBER }}\)\(\small{\text{IS }}\)\(\small{\text{SIX}}\).
- Answer
-
Figure \(\PageIndex{5}\): Sample space of rolling a pair of standard game dice
Letting \( C \) represent our event of \(\small{\text{ROLLING }}\)\(\small{\text{TWO }}\)\(\small{\text{DICE }}\)\(\small{\text{IN }}\)\(\small{\text{WHICH }}\)\(\small{\text{THE }}\)\(\small{\text{SUM }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{NUMBER }}\)\(\small{\text{IS }}\)\(\small{\text{SIX}}\), we notice that there are five outcomes in the sample space that match the event description. Can you find all of them? Therefore, \( P\left( C \right)\) \(= \frac{5}{36}\) \(\approx 0.1389\) \(= 13.89\%.\) Since this probability measure is more than \(5\%,\) for our course, we will not consider this outcome as unusual.
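The three classical-method computations above can be checked by counting outcomes programmatically. The sketch below is a minimal illustration: the helper name `classical_probability` and the use of predicates to describe events are our own choices, and the \(5\%\) cutoff is the course convention stated earlier.

```python
from fractions import Fraction
from itertools import product

# Ordered outcomes: (first die, second die), all equally likely for fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def classical_probability(event):
    """Return x/n, where x counts the outcomes satisfying the predicate `event`."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

p_a = classical_probability(lambda o: o == (3, 4))          # three first, four second
p_b = classical_probability(lambda o: sorted(o) == [3, 4])  # a three and a four, any order
p_c = classical_probability(lambda o: sum(o) == 6)          # sum of six

for name, p in [("A", p_a), ("B", p_b), ("C", p_c)]:
    unusual = p <= Fraction(5, 100)  # course convention: 5% or less is unusual
    print(f"P({name}) = {p} (about {float(p):.4f}), unusual: {unusual}")
```

Using `Fraction` keeps the results exact, so the reduced forms \(\frac{1}{36},\) \(\frac{1}{18},\) and \(\frac{5}{36}\) can be compared directly with the answers above.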
Our third method of computing probabilities is the empirical, experimental, or relative frequency method. In this method, we repeatedly conduct an experiment, noting the outcomes of the trials, to establish an estimate of probability measures. If we repeat a given experiment \(n\) times and event \(A\) occurs \(f\) times, then \[ \nonumber P\left( A \right) = \frac{\text{number of times event }A \text{ occurred}}{\text{number of times situation/experiment was repeated}} = \frac{f}{n}. \] Notice how this method relates to our previous work producing relative frequency distributions when summarizing data sets. In a sense, we were producing probability measures with our relative frequency values.
The quality of the probability estimate is dependent on the number of repeated trials used. For example, a researcher finds that \(25\) of \(150\) randomly selected Kansas teens texted while driving during the last week, empirically indicating that the proportion of Kansas teens that drove while texting last week is \( \frac{25}{150}\) \(\approx 16.67\%.\) Equivalently, there is approximately a \(16.67\%\) probability that a randomly selected Kansas teen drove while texting last week. This measure based on only \(150\) teens does not give us high confidence in this estimated probability value; if the researcher could collect data from \(1,500\) Kansas teens, we could be more confident in the estimation.
In general, we require the number of carefully designed repeated trials to be as large as possible to produce an estimate of a probability value. Two underlying assumptions must be used with the experimental frequency approach. First, if an event occurred with a certain probability in past trials, this same event will occur about the same percentage of times in future trials. Second, the relative frequency probability of an event will tend to approach the true probability value as more and more trials are measured (this is commonly referred to as the Law of Large Numbers).
There are times when simulations (especially computer simulations) produce a large number of trials and a reasonably accurate measure of the true probability of specific events, especially when the outcomes in a situation are not equally likely. Much weather forecasting is based on computer simulation of outcomes based on regional weather conditions. Similarly, the spread of infectious diseases is modeled by computer simulations to predict outcomes while avoiding extensive medical research costs or without impacting living organisms. For example, suppose we wonder if our game die is fair--that each face has an equal likelihood of occurring when the die is rolled. We begin a simulation with the assumption that the die is fair, but upon rolling the die five times, we roll a one three of the five times--indicating an empirical probability of \( P (\small{\text{ROLL }}\)\(\small{\text{OF }}\)\(\small{\text{ONE}}\normalsize )\) \(= \frac{3}{5}\) \(=60\%.\) Of course, only rolling five times is insufficient for us to have high confidence in the accuracy of the empirical probability measure. We continue rolling the die three hundred times, in which we roll a one \(188\) times. This new empirical probability measure of \( P ( \small{\text{ROLL }}\)\(\small{\text{OF }}\)\(\small{\text{ONE}}\normalsize)\) \(= \frac{188}{300}\) \(\approx 62.67\% \) indicates that our die is not likely to be fair; the experimental probability for the one is not the expected fair value of \( P ( \small{\text{ROLL }}\)\(\small{\text{OF }}\)\(\small{\text{ONE}}\normalsize)\) \(= \frac{1}{6}\) \(\approx 16.67\%.\)
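To see how the number of trials affects an empirical probability, we can simulate rolling a die. The sketch below assumes a fair simulated die (Python's `random` module) rather than the suspicious die in the example; the function name and the seed value are arbitrary choices made for this illustration.

```python
import random

def empirical_probability_of_one(num_rolls, rng):
    """Simulate num_rolls rolls of a fair die; return f/n for rolling a one."""
    ones = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 1)
    return ones / num_rolls

rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable
for n in (5, 300, 100_000):
    print(n, empirical_probability_of_one(n, rng))
```

With only a handful of rolls the estimate can land far from \(\frac{1}{6};\) as the number of rolls grows, it tends to settle near \(\frac{1}{6}\) \(\approx 16.67\%,\) which is the behavior the Law of Large Numbers describes.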
In \(1856,\) Gregor Mendel began to study different inherited features, such as the color of pea plants. According to one source, in a second-generation group of pea plants, \(6002\) peas produced by the plants were yellow, and \(2001\) were green in color. What was the empirical probability that a randomly selected second-generation pea would be green? Is this close to the hypothesized value that Mendel claimed of \( 25\%?\)
- Answer
-
We compute that \( P (\small{\text{GREEN }}\)\(\small{\text{COLORED }}\)\(\small{\text{PEA}}\normalsize)\) \(= \frac{2001}{6002 + 2001}\) \(\approx 25.0031\%.\) Mendel's experimental probability is extremely close to his hypothesized probability claim of \(25\%.\)
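The same relative frequency computation can be written as a short Python sketch. The counts are those quoted in the exercise; the tolerance used for "close" is an arbitrary choice for illustration only.

```python
from fractions import Fraction

yellow, green = 6002, 2001                 # counts quoted in the exercise
p_green = Fraction(green, yellow + green)  # empirical probability f / n

print(f"{float(p_green):.6f}")             # decimal form of the estimate
print(abs(float(p_green) - 0.25) < 0.001)  # within 0.1 percentage point of 25%?
```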
Complement and Complement Probabilities
There are times we will be interested in finding the probability that an event \(A\) does not occur. The collection of all outcomes in a sample space in which a given event \(A\) does not occur is called the complement of event \(A\) and is denoted by \(\bar{A}.\) Other sources may use different notations to denote the complement; common ones include \(A^c\) or \(\sim{A}.\) The idea of complementary events allows us to divide the sample space into two groups that are mutually exclusive (no outcome can be found in both of the two groups \(A\) and \(\bar{A}\)) and exhaustive (every outcome of the sample space must be included in one of our two groups). For example, suppose \(A \) is the event of getting different numbers on the two dice when two dice are rolled; there are \(30\) outcomes in our sample space that meet this event description. Then \(\bar{A} \) is the event of getting the same number on both dice when two dice are rolled; there are \(6\) such outcomes in our sample space. Between \(A\) and \(\bar{A}\), all \(36\) outcomes appear exactly once.
Figure \(\PageIndex{6}\): Events \(A\) (blue background) and \(\bar{A}\) (grey background)
Since all outcomes of a situation must be either in the event or the complement of the event, we have the following three key consequences:
- \(P\left( A \right) + P\left( \bar{A} \right)\) \(= 1\) \(= 100\% \)
- \(1 - P\left( A \right)\) \(= P\left( \bar{A} \right) \)
- \(1- P\left( \bar{A} \right) = P\left( A \right)\)
Notice that the first consequence leads naturally to the other two with basic algebra. Suppose we wish to determine the probability of event \(A\) from above \((P ( \small{\text{GETTING }}\)\(\small{\text{TWO }}\)\(\small{\text{DIFFERENT }}\)\(\small{\text{NUMBERS }}\)\(\small{\text{OF }}\)\(\small{\text{PIPS }}\)\(\small{\text{ON }}\)\(\small{\text{EACH }}\)\(\small{\text{DIE }}\)\(\small{\text{WHEN }}\)\(\small{\text{TWO }}\)\(\small{\text{DICE }}\)\(\small{\text{ARE }}\)\(\small{\text{ROLLED}}\normalsize ))\). We can go back to our sample space to count all such outcomes or use the complement concept to produce our results quickly. We have\[ \begin{align*} P \left( A \right) &=1 - P \left( \bar{A} \right) \\[6pt] &= 1 - \frac{6}{36} \\[6pt] &= \frac{30}{36} = \frac{5}{6} \approx 0.8333 \text{ or } 83.33\% \end{align*}\]This example illustrates that it is sometimes easier to determine the probability of a complement event instead of the given event. Once we know the probability of the complement event, we can easily determine the probability of the event.
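The complement computation for the two-dice example can be verified by counting both ways. This is a minimal sketch with illustrative variable names; it compares the complement rule against a direct count of event \(A.\)

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))

# Complement event: both dice show the same number.
same = [o for o in sample_space if o[0] == o[1]]
p_same = Fraction(len(same), len(sample_space))

# Event A (different numbers), found two ways.
p_different_by_rule = 1 - p_same
p_different_direct = Fraction(
    sum(1 for o in sample_space if o[0] != o[1]), len(sample_space)
)

print(p_different_by_rule, p_different_direct)
```

Both approaches give the same value; the complement route only requires counting the \(6\) matching pairs instead of the \(30\) non-matching ones.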
Use the concepts of complement events to answer the questions below:
- Suppose there are sixty tiles in a bag of which \(13\) are green, \(6\) are yellow, \(4\) are pink, \(9\) are red, \(8\) are purple, and \(20\) are black. The tiles are well-mixed. We will randomly draw one tile from the bag without looking into the bag. We want to determine the probability that we draw a tile that is not a primary color; that is, we are to find \( P ( \small{\text{NOT }}\)\(\small{\text{A }}\)\(\small{\text{PRIMARY }}\)\(\small{\text{COLORED }}\)\(\small{\text{TILE}}\normalsize ).\)
- Answer
-
As a reminder, there are three primary colors: red, yellow, and blue. Although the complement is not necessary to answer this question, the use of the complement makes the problem easier to compute. Since "a primary colored tile" is the complement event of "not a primary colored tile" in this situation, we notice that \( P ( \small{\text{NOT }}\)\(\small{\text{A }}\)\(\small{\text{PRIMARY }}\)\(\small{\text{COLORED }}\)\(\small{\text{TILE}}\normalsize )\) \(= 1 - P( \small{\text{A }}\)\(\small{\text{PRIMARY }}\)\(\small{\text{COLORED }}\)\(\small{\text{TILE}}\normalsize )\) \(= 1-\frac{6+9}{60}\) \(= \frac{45}{60}\) \(=75\%.\)
- A number is chosen randomly from the set of integers between \(1\) and \(99,\) inclusively. What is the probability of randomly selecting a number that is not a perfect square?
- Answer
-
We notice that there are only a few integers inclusively between \(1\) and \(99\) that are perfect squares and many that are not. Specifically the perfect square integers in this set are \( \{1,\) \(4,\) \(9,\) \(16,\) \(25,\) \(36,\) \(49,\) \(64,\) \(81\}.\) Thus \(P ( \small{\text{NOT }}\)\(\small{\text{A }}\)\(\small{\text{PERFECT }}\)\(\small{\text{SQUARE}}\normalsize)\) \(= 1 - P ( \small{\text{PERFECT }}\)\(\small{\text{SQUARE}}\normalsize)\) \(= 1 - \frac{9}{99}\) \(\approx 0.909\) \(= 90.9\%.\)
- Suppose we want to know the probability of randomly selecting a group of \(35\) people in which at least two people will have the same birth day in a year (such as September \(1^{st}\) or May \(28^{th}\) -- we will ignore leap years for simplicity). Suppose we also know the probability of randomly selecting a group of \(35\) people, in which no two people have the same birth day in the year, which is about \(18.56\%.\) (We will discuss how this value of \(18.56\%\) can be found later in the chapter). From this information, can we determine the probability of randomly selecting a group of \(35\) people in which at least two will have the same birth day in a year?
- Answer
-
We see that the complement of "at least two people will have the same birth day in a year" is "no two people will have the same birth day in a year." We can use our complement rule to note that \( P ( \small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{TWO }}\)\(\small{\text{WILL }}\)\(\small{\text{HAVE }}\)\(\small{\text{THE }}\)\(\small{\text{SAME }}\)\(\small{\text{BIRTHDAY}}\normalsize )\) \(= 1 - P( \small{\text{NO }}\)\(\small{\text{TWO }}\)\(\small{\text{WILL }}\)\(\small{\text{HAVE }}\)\(\small{\text{THE }}\)\(\small{\text{SAME }}\)\(\small{\text{BIRTHDAY}}\normalsize )\) \(= 1-0.1856\) \(= 0.8144\) \(= 81.44\%.\)
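The three complement exercises above can be reproduced with a short Python sketch. The variable names are illustrative. For the birthday question, the sketch assumes \(365\) equally likely birthdays that are independent from person to person; this is a common modeling assumption made here for illustration, not necessarily the derivation used later in the chapter.

```python
from fractions import Fraction
from math import isqrt, prod

# Tiles: counts from the exercise; only yellow and red tiles are primary colors here.
tiles = {"green": 13, "yellow": 6, "pink": 4, "red": 9, "purple": 8, "black": 20}
primary = {"red", "yellow", "blue"}
p_primary = Fraction(
    sum(n for color, n in tiles.items() if color in primary), sum(tiles.values())
)
p_not_primary = 1 - p_primary

# Perfect squares among the integers 1 through 99.
numbers = range(1, 100)
squares = [k for k in numbers if isqrt(k) ** 2 == k]
p_not_square = 1 - Fraction(len(squares), len(numbers))

# Birthdays (assumption: 365 equally likely, independent birthdays).
group = 35
p_no_match = prod(Fraction(365 - i, 365) for i in range(group))
p_match = 1 - p_no_match

print(p_not_primary, p_not_square, round(float(p_match), 4))
```

In each case the complement is the easier quantity to count or compute, and a single subtraction from \(1\) gives the probability of interest.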
There are times when clearly describing the complement of an event can be simple. For example, in the roll of a single standard game die, it is common to quickly describe the complement of \(\small{\text{AN }}\)\(\small{\text{EVEN }}\)\(\small{\text{NUMBER }}\)\(\small{\text{ON }}\)\(\small{\text{A }}\)\(\small{\text{ROLL}}\normalsize\) to be \(\small{\text{AN }}\)\(\small{\text{ODD }}\)\(\small{\text{NUMBER }}\)\(\small{\text{ON }}\)\(\small{\text{A }}\)\(\small{\text{ROLL}}\normalsize\). There are also times when clearly describing the complement of an event can be challenging. We must ensure the complement description covers all possible outcomes for a situation. For a different example, in random selection of a number from the real number line, it is common to quickly describe the complement of \(\small{\text{SELECTION }}\)\(\small{\text{OF }}\)\(\small{\text{A }}\)\(\small{\text{NEGATIVE }}\)\(\small{\text{NUMBER}}\normalsize\) as \(\small{\text{SELECTION }}\)\(\small{\text{OF }}\)\(\small{\text{A }}\)\(\small{\text{POSITIVE }}\)\(\small{\text{NUMBER}}\normalsize.\) However, there is the number, \(0,\) that is neither positive nor negative which means we have an incorrect complement description. The true complement to \(\small{\text{SELECTION }}\)\(\small{\text{OF }}\)\(\small{\text{A }}\)\(\small{\text{NEGATIVE }}\)\(\small{\text{NUMBER}}\normalsize\) is \(\small{\text{SELECTION }}\)\(\small{\text{OF }}\)\(\small{\text{A }}\)\(\small{\text{NON-NEGATIVE }}\)\(\small{\text{NUMBER}}\normalsize.\) Now all possible outcomes have been accounted for in the descriptions.
We must think carefully about our sample space and correctly identify all outcomes which are not in our event. As another example, we might have the situation of randomly selecting an FHSU student with a focus on the outcome of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{THREE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\). The complement is the outcome of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{TWO }}\)\(\small{\text{OR }}\)\(\small{\text{FEWER }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\) or, equivalently, \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{FEWER }}\)\(\small{\text{THAN }}\)\(\small{\text{THREE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\). Notice how the use of mathematical notation can help us here. We might represent the outcome of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{THREE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\) more briefly in notation as \(x \ge 3.\) Then, in the context of counting number of shots, the complement is \(x < 3,\) which leads to a proper complement description of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{FEWER }}\)\(\small{\text{THAN }}\)\(\small{\text{THREE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize.\)
This leads to special cases that commonly cause problems in complement descriptions; specifically, events involving "at least," "at most," "all," or "none." The complement of "all are" is not "none are," but is instead "at least one is not." For example, the complement of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{ALL }}\)\(\small{\text{AVAILABLE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\) would be \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{MISSED }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{ONE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOT}}\normalsize\). The complement of \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{NONE }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{AVAILABLE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\) would be \(\small{\text{STUDENT }}\)\(\small{\text{HAS }}\)\(\small{\text{HAD }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{ONE }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{AVAILABLE }}\)\(\small{\text{COVID }}\)\(\small{\text{VACCINE }}\)\(\small{\text{SHOTS}}\normalsize\). Note that the complement of "none are" is "at least one is." We must think carefully when dealing with complements of event claims involving such keywords.
Give a written description of the complement of each event described below.
- The event: \(\small{\text{IT }}\)\(\small{\text{SNOWS }}\)\(\small{\text{ON }}\)\(\small{\text{CHRISTMAS }}\)\(\small{\text{DAY}}\normalsize\)
- Answer
-
The complement event description would be \(\small{\text{IT }}\)\(\small{\text{DOES }}\)\(\small{\text{NOT }}\)\(\small{\text{SNOW }}\)\(\small{\text{ON }}\)\(\small{\text{CHRISTMAS }}\)\(\small{\text{DAY}}\normalsize\). We note that \(\small{\text{IT }}\)\(\small{\text{RAINS }}\)\(\small{\text{ON }}\)\(\small{\text{CHRISTMAS }}\)\(\small{\text{DAY}}\normalsize\) is an outcome that fits in the complement description, but is itself not the actual full complement since other outcomes exist in the complement besides rain.
- The event: \(\small{\text{NATASHA }}\)\(\small{\text{IS }}\)\(\small{\text{LESS }}\)\(\small{\text{THAN }}\normalsize 10 \small{\text{ MINUTES }}\)\(\small{\text{LATE}}\normalsize\)
- Answer
-
Notice the given event description of less than \(10\) minutes can be represented mathematically as \(x < 10,\) informing us that the complement must be related to \(x \ge 10.\) Therefore, the complement event description would be \(\small{\text{NATASHA }}\)\(\small{\text{IS }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\normalsize 10 \small{\text{ MINUTES }}\)\(\small{\text{LATE}}\normalsize,\) or equivalently, \(\small{\text{NATASHA }}\)\(\small{\text{IS }}\)\(\small{\text{LATE }}\)\(\small{\text{BY }}\)\(\small{\text{AT }}\)\(\small{\text{LEAST }}\normalsize 10 \small{\text{ MINUTES}}\normalsize.\) The phrasing we might give can vary, but the meaning of the phrasing must be tied to the inequality \( x \ge 10.\)
- The event: \(\small{\text{ALL }}\)\(\small{\text{CARDS }}\)\(\small{\text{IN }}\)\(\small{\text{A }}\)\(\small{\text{POKER }}\)\(\small{\text{HAND }}\)\(\small{\text{ARE }}\)\(\small{\text{FACE }}\)\(\small{\text{CARDS}}\normalsize\)
- Answer
-
The given event description of "all cards" is complemented by "at least one is not." So the complement event description would be \(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{ONE }}\)\(\small{\text{CARD }}\)\(\small{\text{IS }}\)\(\small{\text{NOT }}\)\(\small{\text{A }}\)\(\small{\text{FACE }}\)\(\small{\text{CARD}}\normalsize.\) We must think clearly about this complement and also note why the description \(\small{\text{NONE }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{CARDS }}\)\(\small{\text{ARE }}\)\(\small{\text{FACE }}\)\(\small{\text{CARDS}}\normalsize\) is not the complement description. Assuming a five-card hand, the complement contains every hand with one, two, three, four, or five cards that are not face cards, whereas "none are face cards" describes only the last of these cases.
- The event: \(\small{\text{NONE }}\)\(\small{\text{OF }}\)\(\small{\text{THE }}\)\(\small{\text{STUDENTS }}\)\(\small{\text{FAILED }}\)\(\small{\text{THE }}\)\(\small{\text{LAST }}\)\(\small{\text{EXAM}}\normalsize\)
- Answer
-
The given event description of "none failed" is complemented by "at least one did fail." So the complement event description would be \(\small{\text{AT }}\)\(\small{\text{LEAST }}\)\(\small{\text{ONE }}\)\(\small{\text{STUDENT }}\)\(\small{\text{FAILED }}\)\(\small{\text{THE }}\)\(\small{\text{LAST }}\)\(\small{\text{EXAM}}\normalsize.\)
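The face-card exercise can also be checked by counting hands. The sketch below assumes a five-card hand drawn from a standard 52-card deck with 12 face cards (a jack, queen, and king in each of four suits); the constant and variable names are illustrative only.

```python
from math import comb

DECK_SIZE = 52   # standard deck (assumed)
FACE_CARDS = 12  # jack, queen, king in each of 4 suits
HAND_SIZE = 5    # five-card poker hand (assumed)

# Number of possible hands, and number of hands made only of face cards.
total_hands = comb(DECK_SIZE, HAND_SIZE)
all_face = comb(FACE_CARDS, HAND_SIZE)

# Complement of "all are face cards": at least one card is not a face card.
at_least_one_non_face = total_hands - all_face

# "None are face cards" counts only hands with five non-face cards.
no_face = comb(DECK_SIZE - FACE_CARDS, HAND_SIZE)

print(all_face, at_least_one_non_face, no_face)

# "None" is strictly smaller than the true complement.
print(no_face < at_least_one_non_face)
```

Because the count for "none are face cards" is smaller than the count for "at least one card is not a face card," the first description leaves out part of the complement.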
Summary
To review this section, we list several important facts to remember when working with probabilities:
- An outcome of a random experiment is any potential result of the experiment. An event is a set of outcomes one might expect from a given random experiment. The sample space is the collection of all possible outcomes in a given situation.
- The probability of an event \(A\) is denoted by \(P \left( A \right)\) with the condition that \(0 \le P \left( A \right) \le 1.\) If \(P \left( A \right) \le 5\%,\) we will, for now, consider the event unusual.
- The sum of the probabilities for all the possible outcomes in the sample space always totals \(1 = 100\%.\)
- Multiple methods may be used for computing probability values. We have discussed the methods of "subjective/intuition," "classical for equally likely simple events," and "empirical/experimental/relative frequency." Some methods produce only estimates of the actual probability value.
- The complement of an event \(A\) is denoted by \(\bar{A}\) and must contain all outcomes of the sample space that are not part of the given event. As such, we have the relationship \( P \left( A \right) + P \left( \bar{A} \right) =1, \) which can be used to find the probability of an event if we know the probability of its complement.
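As an illustration of the complement relationship, the following minimal sketch uses a hypothetical experiment that is not from the text: three flips of a fair coin, with all eight sequences assumed equally likely. It computes the probability of at least one head in two ways, directly and through the complement event of no heads.

```python
from fractions import Fraction
from itertools import product

# Sample space: all equally likely sequences of 3 fair coin flips.
sample_space = list(product("HT", repeat=3))


def probability(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(sample_space))


no_heads = [s for s in sample_space if "H" not in s]
at_least_one_head = [s for s in sample_space if "H" in s]

# Complement rule: P(A) = 1 - P(complement of A).
p_via_complement = 1 - probability(no_heads)

print(p_via_complement)                # 7/8
print(probability(at_least_one_head))  # 7/8
```

Both computations agree, and the complement route needs only the single outcome with no heads, which is often the easier count for "at least one" events.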