# 10.9: Independence

The term “independent” has a very specific meaning in statistics, which is somewhat different from the common usage of the term. Statistical independence between two variables means that knowing the value of one variable doesn’t tell us anything about the value of the other. This can be expressed as:

$P(A|B) = P(A)$

That is, the probability of A given some value of B is just the same as the overall probability of A. Looking at it this way, we see that many cases of what we would call “independence” in the world are not actually statistically independent. For example, there is currently a move by a small group of California citizens to declare a new independent state called Jefferson, which would comprise a number of counties in northern California and Oregon. If this were to happen, then the probability that a current California resident would now live in the state of Jefferson would be $P(\text{Jefferson})=0.014$, whereas the proability that they would remain a California resident would be $P(\text{California})=0.986$. The new states might be politically independent, but they would not be statistically independent, because $P(\text{California|Jefferson}) = 0$! That is, while independence in common language often refers to sets that are exclusive, statistical independence refers to the case where one cannot predict anything about one variable from the value of another variable. For example, knowing a person’s hair color is unlikely to tell you whether they prefer chocolate or strawberry ice cream.

The overall probability of bad mental health $P(\text{bad mental health})$ is 0.16 while the conditional probability $P(\text{bad mental health|physically active})$ is 0.13. Thus, it seems that the conditional probability is somewhat smaller than the overall probability, suggesting that they are not independent, though we can’t know for sure just by looking at the numbers, since these numbers might be different due to sampling variability. Later in the course we will encounter tools that will let us more directly test whether two variables are independent.