4: Probability Theory



We want to imagine doing an experiment in which there is no way to predict what the outcome will be. Of course, if we stopped our imagination there, there would be nothing we could say and no point in attempting any further analysis: the outcome would just be whatever it happened to be, with no pattern.

So let us add the assumption that, while we cannot predict what will happen any particular time we do the experiment, we can predict general trends in the long run if we repeat the experiment many times. To be more precise, we assume that for any collection \(E\) of possible outcomes of the experiment there is a number \(p(E)\) such that, no matter who does the experiment and no matter when they do it, if they repeat the experiment many times, the fraction of repetitions in which the outcome is one of those in \(E\) will be close to that number \(p(E)\).
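To see what this assumption says in practice, here is a minimal simulation sketch. It assumes a particular experiment that we chose purely for illustration (one roll of a fair six-sided die) and a particular collection \(E\) (the even faces); the function name `relative_frequency` is also our own. Two simulated "experimenters," represented by different random seeds, repeat the experiment and record the fraction of repetitions whose outcome lies in \(E\).

```python
import random

def relative_frequency(event, outcomes, trials, rng):
    """Fraction of `trials` repetitions whose outcome lands in `event`."""
    hits = sum(1 for _ in range(trials) if rng.choice(outcomes) in event)
    return hits / trials

# Assumed experiment (illustration only): one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
# Assumed collection of outcomes E: the roll is even, so p(E) = 1/2 here.
E = {2, 4, 6}

# Two different "experimenters" (different seeds) each repeat the experiment.
for seed in (1, 2):
    rng = random.Random(seed)
    for trials in (10, 1_000, 100_000):
        freq = relative_frequency(E, outcomes, trials, rng)
        print(f"experimenter {seed}, {trials:>7} trials: {freq:.4f}")
```

With only 10 repetitions the two experimenters may well disagree, but as the number of repetitions grows, both fractions should settle near \(1/2\). That long-run agreement is exactly what the number \(p(E)\) is meant to capture.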

This is called the frequentist approach to the idea of probability. While it is not universally accepted – the Bayesian alternative does in fact have many adherents – it has the virtue of being the most internally consistent way of building a foundation for probability. For that reason, we will follow the frequentist description of probability in this text.

Before we jump into the mathematical formalities, we should motivate two pieces of what we just said. First, why talk about sets of outcomes of the experiment instead of talking about individual outcomes? The answer is that we are often interested in sets of outcomes, as we shall see later in this book, so it is nice to set up the machinery from the very start to work with such sets. To give a concrete example, suppose you were playing a game of cards and could see your hand but not the other players’ hands. You might be very interested in how likely it is that your hand is a winning hand, i.e., what is the likelihood of the set of all possible configurations of the rest of the cards, in the deck and in your opponents’ hands, for which what you have will be the winning hand? It is situations like this which motivate an approach based on sets of outcomes of the random experiment.

Another question we might ask is: where does our uncertainty about the experimental results come from? From the beginnings of the scientific method through the turn of the \(20^{th}\) century, it was thought that this uncertainty came from our incomplete knowledge of the system on which we were experimenting. So if the experiment was, say, flipping a coin, then the precise amount of force used to propel the coin up into the air, the precise angular motion imparted to the coin by its position just so on the thumbnail of the person doing the flipping, and the precise drag that the coin felt as it tumbled through the air, caused in part by eddies in the air currents coming from the flap of a butterfly’s wings in the Amazon rainforest, could all significantly contribute to changing whether the coin would eventually come up heads or tails. Unless the coin-flipper were a robot operating in a vacuum, then, there would just be no way to know all of these physical details with enough accuracy to predict the toss.

After the turn of the \(20^{th}\) century, matters got even worse (at least for physical determinists): a new theory of physics, called Quantum Mechanics, came along, according to which true randomness is built into the laws of the universe. For example, suppose you have a very dim light source that produces the absolutely smallest possible “chunks” of light (called photons), and you shine it first through one polarizing filter and then toward a second filter at a \(45^\circ\) angle to the first. Half of the photons that pass the first filter will get through the second, but there is absolutely no way ever to predict whether any particular photon will get through or not. Quantum mechanics is full of very weird, non-intuitive ideas, but it is one of the best-tested theories in the history of science, and it has passed every test.
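The one-half figure in the photon example can be checked with a short calculation. This is only a sketch, and it assumes a standard result from optics that the text above does not state (Malus's law, applied to single photons): a photon already polarized along the first filter's axis passes a second filter rotated by an angle \(\theta\) with probability \(\cos^2\theta\). With \(\theta = 45^\circ\),

\[ P(\text{photon passes}) = \cos^2(45^\circ) = \left(\frac{\sqrt{2}}{2}\right)^2 = \frac{1}{2}. \]

In the frequentist language introduced earlier, this says that over many photons the fraction getting through the second filter will be close to \(1/2\), while saying nothing about what any individual photon will do.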