10.1: What Is Probability?
Informally, we usually think of probability as a number that describes the likelihood of some event occurring, which ranges from zero (impossibility) to one (certainty). Sometimes probabilities will instead be expressed in percentages, which range from zero to one hundred, as when the weather forecast predicts a twenty percent chance of rain today. In each case, these numbers are expressing how likely that particular event is, ranging from absolutely impossible to absolutely certain.
To formalize probability theory, we first need to define a few terms:
 An experiment is any activity that produces or observes an outcome. Examples are flipping a coin, rolling a six-sided die, or trying a new route to work to see if it’s faster than the old route.
 The sample space is the set of possible outcomes for an experiment. We represent these by listing them within a set of squiggly brackets. For a coin flip, the sample space is {heads, tails}. For a six-sided die, the sample space is each of the possible numbers that can appear: {1,2,3,4,5,6}. For the amount of time it takes to get to work, the sample space is all possible real numbers greater than zero (since it can’t take a negative amount of time to get somewhere, at least not yet). We won’t bother trying to write out all of those numbers within the brackets.
 An event is a subset of the sample space. In principle it could consist of one or more of the possible outcomes in the sample space, but here we will focus primarily on elementary events, which consist of exactly one possible outcome. For example, this could be obtaining heads in a single coin flip, rolling a 4 on a throw of the die, or taking 21 minutes to get home by the new route.
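The definitions above can be sketched in Python using built-in sets. This is only an illustration: the variable names and the helper functions `is_event` and `is_elementary` are hypothetical, chosen for this example, and the "roll an even number" event is an added example of a non-elementary event.

```python
# Sample spaces for the two finite experiments described above.
coin_sample_space = {"heads", "tails"}
die_sample_space = {1, 2, 3, 4, 5, 6}

# An elementary event contains exactly one possible outcome.
roll_a_four = {4}

# An event with more than one outcome (illustrative assumption):
# rolling an even number.
roll_even = {2, 4, 6}


def is_event(event, sample_space):
    """Return True if `event` is a subset of `sample_space`."""
    return event <= sample_space


def is_elementary(event, sample_space):
    """Return True if `event` is a valid event with exactly one outcome."""
    return is_event(event, sample_space) and len(event) == 1


print(is_event(roll_a_four, die_sample_space))
print(is_elementary(roll_a_four, die_sample_space))
print(is_elementary(roll_even, die_sample_space))
```

The travel-time experiment is left out because its sample space (all positive real numbers) cannot be listed as a finite set.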
Now that we have those definitions, we can outline the formal features of a probability, which were first defined by the Russian mathematician Andrei Kolmogorov. These are the features that a value has to have if it is going to be a probability. If $P(X_i)$ is the probability of event $X_i$:
 Probability cannot be negative: $P(X_i) \ge 0$

The total probability of all outcomes in the sample space is 1; that is, if we take the probability of each element in the sample space and add them up, they must sum to 1. We can express this using the summation symbol ∑:
\(\sum_{i=1}^{N} P\left(X_{i}\right)=P\left(X_{1}\right)+P\left(X_{2}\right)+\ldots+P\left(X_{N}\right)=1\)
This is interpreted as saying “Take all of the N elementary events, which we have labeled from 1 to N, and add up their probabilities. These must sum to one.”
 The probability of any individual event cannot be greater than one: $P(X_i) \le 1$. This is implied by the previous two points: since the probabilities must sum to one and none of them can be negative, any particular probability must be less than or equal to one.
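As a rough check of the three features above, the following Python sketch tests whether a set of values assigned to elementary events could be probabilities. The function name `satisfies_probability_features` and the example distributions are assumptions made for illustration; the fair die assigns 1/6 to each face, and a small floating-point tolerance is used when testing the sum.

```python
import math
from fractions import Fraction


def satisfies_probability_features(probabilities):
    """Check the three features for a dict mapping each elementary
    event to its candidate probability (hypothetical helper)."""
    values = list(probabilities.values())
    non_negative = all(p >= 0 for p in values)            # P(X_i) >= 0
    sums_to_one = math.isclose(float(sum(values)), 1.0)   # sum of P(X_i) = 1
    at_most_one = all(p <= 1 for p in values)             # P(X_i) <= 1
    return non_negative and sums_to_one and at_most_one


# A fair six-sided die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Values that are individually plausible but sum to more than one.
too_much = {"heads": 0.7, "tails": 0.7}

# Values that sum to one but include a negative number.
negative = {"heads": 1.5, "tails": -0.5}

print(satisfies_probability_features(fair_die))
print(satisfies_probability_features(too_much))
print(satisfies_probability_features(negative))
```

The last example shows why the third feature depends on the first two together: once a negative value is allowed, another value can exceed one even though the total is still one.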