# 5: Probability


This chapter introduces the probability concepts that underlie inferential statistics. It opens with the debate over how probability should be interpreted and with the basic concepts of probability, then turns to conditional probability, the gambler's fallacy, and counting with permutations and combinations. The middle sections cover the binomial, Poisson, multinomial, and hypergeometric distributions. The chapter closes with base rates, Bayes' Theorem, and the Monty Hall problem.

- 5.1: The Concept of "Probability"
- Inferential statistics is built on the foundation of probability theory, and has been remarkably successful in guiding opinion about the conclusions to be drawn from data. Yet (paradoxically) the very idea of probability has been plagued by controversy from the beginning of the subject to the present day. In this section we provide a glimpse of the debate about the interpretation of the probability concept.

- 5.2: Basic Concepts of Probability
- Probability is an important and complex field of study. Fortunately, only a few basic issues in probability theory are essential for understanding statistics at the level covered in this book. These basic issues are covered in this chapter.

- 5.3: Conditional Probability Demonstration
- The simulation demonstrates how different conditional probabilities would be calculated from the provided data.

- 5.4: Gambler's Fallacy
- The gambler's fallacy involves beliefs about sequences of independent events. By definition, if two events are independent, the occurrence of one event does not affect the occurrence of the second. The gambler's fallacy is mistakenly believing that certain events are dependent.
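
The independence claim can be checked with a small simulation. The sketch below is only an illustration: it assumes a fair coin, and the function name `heads_rate_after_streak`, the streak length, and the seed are all hypothetical choices. It estimates how often heads comes up immediately after three heads in a row.

```python
import random

def heads_rate_after_streak(n_flips=200_000, streak=3, seed=0):
    """Estimate P(heads) on flips that immediately follow `streak` heads in a row."""
    rng = random.Random(seed)
    # True means heads; each flip is independent with probability 0.5 (assumed fair coin).
    flips = [rng.random() < 0.5 for _ in range(n_flips)]
    # Keep only the flips whose preceding `streak` flips were all heads.
    followers = [
        flips[i]
        for i in range(streak, n_flips)
        if all(flips[i - streak:i])
    ]
    return sum(followers) / len(followers)

print(round(heads_rate_after_streak(), 3))
```

If the flips are independent, the estimate should stay close to 0.5: a run of heads does not make tails "due."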

- 5.5: Permutations and Combinations
- This section covers basic formulas for determining the number of various possible types of outcomes. The topics covered are: (1) counting the number of possible orders, (2) counting using the multiplication rule, (3) counting the number of permutations, and (4) counting the number of combinations.
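
The four kinds of counts listed above can be sketched with Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The specific numbers are illustrative assumptions, not examples taken from the section itself.

```python
import math

# (1) Number of possible orders of 3 distinct items: 3! = 6.
n_orders = math.factorial(3)

# (2) Multiplication rule: 3 choices for one decision and 2 for another
#     give 3 * 2 = 6 combined outcomes (hypothetical example).
n_outcomes = 3 * 2

# (3) Permutations: ordered selections of 2 items from 4, 4! / (4 - 2)! = 12.
n_permutations = math.perm(4, 2)

# (4) Combinations: unordered selections of 2 items from 4, 4! / (2! * 2!) = 6.
n_combinations = math.comb(4, 2)

print(n_orders, n_outcomes, n_permutations, n_combinations)  # 6 6 12 6
```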

- 5.6: Birthday Demo
- Most people's intuition about the probability of two people in a group sharing a birthday is way off. This simulation allows you to approach this problem concretely.
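
For readers who prefer an exact calculation to a simulation, here is a minimal sketch. It assumes 365 equally likely birthdays and independent people (no leap years, no twins); the function name `p_shared_birthday` is hypothetical.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

for n in (10, 23, 50):
    print(n, round(p_shared_birthday(n), 3))
```

Under these assumptions, a group of only 23 people already has a better than even chance of containing a shared birthday.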

- 5.7: Binomial Distribution
- In the present section, we consider probability distributions for which there are just two possible outcomes with fixed probabilities summing to one. These distributions are called binomial distributions.
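
As a rough sketch, the binomial probability of exactly k successes in n independent trials with success probability p is C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-flip example are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative case: number of heads in 2 flips of a fair coin.
for k in range(3):
    print(k, binomial_pmf(k, 2, 0.5))  # probabilities 0.25, 0.5, 0.25
```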

- 5.8: Binomial Demonstration
- This demonstration allows you to explore the binomial distribution.

- 5.9: Poisson Distribution
- The Poisson distribution can be used to calculate the probabilities of various numbers of "successes" based on the mean number of successes.
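
A minimal sketch of the calculation, using the standard Poisson formula P(k) = e^(-mean) mean^k / k!. The function name `poisson_pmf` and the mean of 2 are hypothetical choices for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(exactly k successes) when the mean number of successes is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Illustrative case: probabilities of 0 through 4 successes when the mean is 2.
for k in range(5):
    print(k, round(poisson_pmf(k, 2.0), 4))
```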

- 5.10: Multinomial Distribution
- The multinomial distribution can be used to compute the probabilities in situations in which there are more than two possible outcomes.
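
The multinomial probability of observing counts n_1, ..., n_k in n trials is n! / (n_1! ... n_k!) times the product of p_i^(n_i). The sketch below implements that formula; the function name `multinomial_pmf` and the three-outcome example with probabilities 0.40, 0.35, and 0.25 are assumptions made for illustration.

```python
import math

def multinomial_pmf(counts, probs):
    """P(observing exactly `counts` outcomes per category) given category probabilities."""
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! * n_2! * ... * n_k!), kept as an exact integer.
    coefficient = math.factorial(n)
    for c in counts:
        coefficient //= math.factorial(c)
    probability = float(coefficient)
    for c, p in zip(counts, probs):
        probability *= p**c
    return probability

# Hypothetical case: 12 trials with three possible outcomes.
print(multinomial_pmf([7, 2, 3], [0.40, 0.35, 0.25]))
```

With only two categories the formula reduces to the binomial distribution.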

- 5.11: Hypergeometric Distribution
- The hypergeometric distribution is used to calculate probabilities when sampling without replacement.
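
A minimal sketch of the hypergeometric formula, P(k) = C(K, k) C(N - K, n - k) / C(N, n), where N is the population size, K the number of "success" items in it, and n the number drawn. The function and parameter names and the card example are hypothetical.

```python
import math

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes in `draws` items sampled without replacement)."""
    return (
        math.comb(successes, k)
        * math.comb(population - successes, draws - k)
        / math.comb(population, draws)
    )

# Illustrative case: number of aces in a 5-card hand from a standard 52-card deck.
for k in range(5):
    print(k, round(hypergeom_pmf(k, population=52, successes=4, draws=5), 4))
```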

- 5.12: Base Rates
- Compute the probability of a condition from hits, false alarms, and base rates, first using a tree diagram and then using Bayes' Theorem.

- 5.13: Bayes Demo
- This demonstration lets you examine the effects of base rate, true positive rate, and false positive rate on the probability that a person diagnosed with disease X actually has the disease. The base rate is the proportion of people who have the disease. The true positive rate is the probability that a person with the disease will test positive. The false positive rate is the probability that someone who does not have the disease will test positive.
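
The three rates defined above combine through Bayes' Theorem as follows. This is a sketch, not the demonstration's own code: the function name `posterior_probability` and the numeric rates are illustrative assumptions.

```python
def posterior_probability(base_rate, true_positive_rate, false_positive_rate):
    """P(has the disease | tested positive), computed with Bayes' Theorem."""
    # People who have the disease and test positive.
    true_positives = base_rate * true_positive_rate
    # People who do not have the disease but test positive anyway.
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical rates: 1% base rate, 90% true positive rate, 10% false positive rate.
print(round(posterior_probability(0.01, 0.90, 0.10), 3))  # 0.083
```

With a low base rate, most positive results come from the much larger group of people without the disease, so the posterior probability stays small even for a fairly accurate test.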

- 5.14: Monty Hall Problem
- In the Monty Hall game, a contestant is shown three doors. Two of the doors have goats behind them and one has a car. The contestant chooses a door. Before opening the chosen door, Monty Hall opens a door that has a goat behind it. The contestant can then switch to the other unopened door, or stay with the original choice.
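
The two strategies can be compared with a short simulation. The sketch assumes that Monty always opens a goat door other than the contestant's choice and picks at random when two such doors are available; the function name `monty_hall_win_rate` and the seed are hypothetical.

```python
import random

def monty_hall_win_rate(switch, n_games=100_000, seed=0):
    """Estimate the probability of winning the car when switching or staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        car = rng.randrange(3)
        choice = rng.randrange(3)
        # Monty opens a goat door that is not the contestant's current choice.
        opened = rng.choice([d for d in range(3) if d != choice and d != car])
        if switch:
            # Move to the one door that is neither chosen nor opened.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == car)
    return wins / n_games

print("stay:  ", round(monty_hall_win_rate(switch=False), 3))
print("switch:", round(monty_hall_win_rate(switch=True), 3))
```

Under these assumptions staying wins about one third of the time and switching about two thirds, because switching wins whenever the original choice was a goat.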