# 1.3: Probability

The basic idea of a statistical test is to identify a null hypothesis, collect some data, then estimate the probability of getting the observed data if the null hypothesis were true. If the probability of getting a result like the observed one is low under the null hypothesis, you conclude that the null hypothesis is probably not true. It is therefore useful to know a little about probability.

One way to think about probability is as the proportion of individuals in a population that have a particular characteristic. The probability of sampling a particular kind of individual is equal to the proportion of that kind of individual in the population. For example, in fall 2013 there were $$22,166$$ students at the University of Delaware, and $$3,679$$ of them were graduate students. If you sampled a single student at random, the probability that they would be a grad student would be $$\frac{3,679}{22,166}$$ or $$0.166$$. In other words, $$16.6\%$$ of students were grad students, so if you'd picked one student at random, the probability that they were a grad student would have been $$16.6\%$$.
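The proportion above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative, and the counts are the ones quoted in the text.

```python
# Enrollment figures quoted in the text for fall 2013.
total_students = 22166
grad_students = 3679

# The probability of drawing a grad student at random equals
# the proportion of grad students in the population.
p_grad = grad_students / total_students

print(round(p_grad, 3))  # 0.166
```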

When dealing with probabilities in biology, you are often working with theoretical expectations, not population samples. For example, in a genetic cross of two individual Drosophila melanogaster that are heterozygous at the vestigial locus, Mendel's theory predicts that the probability of an offspring individual being a recessive homozygote (having teeny-tiny wings) is one-fourth, or $$0.25$$. This is equivalent to saying that one-fourth of a population of offspring will have tiny wings.
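One way to see where the one-fourth comes from is to list every equally likely combination of parental alleles, as in a Punnett square. The sketch below assumes simple Mendelian inheritance; the allele labels `V` (normal) and `v` (vestigial) are illustrative choices, not standard Drosophila notation.

```python
from fractions import Fraction
from itertools import product

# Each heterozygous parent passes on one of its two alleles with equal chance.
# "V" and "v" are hypothetical labels for the normal and vestigial alleles.
parent_1 = ["V", "v"]
parent_2 = ["V", "v"]

# All four equally likely offspring genotypes (sorted so "vV" becomes "Vv").
offspring = ["".join(sorted(pair)) for pair in product(parent_1, parent_2)]

# Proportion of offspring that are recessive homozygotes.
p_recessive = Fraction(offspring.count("vv"), len(offspring))

print(offspring)    # ['VV', 'Vv', 'Vv', 'vv']
print(p_recessive)  # 1/4
```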

## Multiplying probabilities

You could take a semester-long course on mathematical probability, but most biologists just need to know a few basic principles. You calculate the probability that an individual has one value of a nominal variable AND another value of a second nominal variable by multiplying the probabilities of each value together.

For example, if the probability that a Drosophila in a cross has vestigial wings is one-fourth, AND the probability that it has legs where its antennae should be is three-fourths, the probability that it has vestigial wings AND leg-antennae is one-fourth times three-fourths, or $$0.25\times 0.75$$, or $$0.1875$$. This estimate assumes that the two values are independent, meaning that the probability of one value is not affected by the other value. In this case, independence would require that the two genetic loci were on different chromosomes, among other things.
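As a quick check of the arithmetic, the multiplication can be written out as below. This is a sketch that takes independence of the two traits as given; the variable names are illustrative.

```python
# Probabilities from the example in the text.
p_vestigial = 0.25      # vestigial wings
p_leg_antennae = 0.75   # legs where the antennae should be

# Assuming the two traits are independent, the probability of having
# both is the product of the individual probabilities.
p_both = p_vestigial * p_leg_antennae

print(p_both)  # 0.1875
```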

## Adding probabilities

The probability that an individual has one value OR another, MUTUALLY EXCLUSIVE, value is found by adding the probabilities of each value together. "Mutually exclusive" means that one individual could not have both values. For example, if the probability that a flower in a genetic cross is red is one-fourth, the probability that it is pink is one-half, and the probability that it is white is one-fourth, then the probability that it is red OR pink is one-fourth plus one-half, or three-fourths.
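The flower example can be sketched with exact fractions. The dictionary name `p_color` is illustrative, and the check that the three probabilities sum to one is an added sanity test rather than something stated in the text.

```python
from fractions import Fraction

# Probabilities of the three mutually exclusive flower colors in the cross.
p_color = {
    "red": Fraction(1, 4),
    "pink": Fraction(1, 2),
    "white": Fraction(1, 4),
}

# Sanity check: the colors cover every possible outcome.
total = sum(p_color.values())

# Because a flower cannot be both red and pink, the probabilities add.
p_red_or_pink = p_color["red"] + p_color["pink"]

print(p_red_or_pink)  # 3/4
```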

## More complicated situations

When calculating the probability that an individual has one value OR another, and the two values are NOT MUTUALLY EXCLUSIVE, it is important to break things down into combinations that are mutually exclusive. For example, let's say you wanted to estimate the probability that a fly from the cross above had vestigial wings OR leg-antennae. You could calculate the probability for each of the four kinds of flies: normal wings/normal antennae ($$0.75\times 0.25=0.1875$$), normal wings/leg-antennae ($$0.75\times 0.75=0.5625$$), vestigial wings/normal antennae ($$0.25\times 0.25=0.0625$$), and vestigial wings/leg-antennae ($$0.25\times 0.75=0.1875$$). Then, since the last three kinds of flies are the ones with vestigial wings or leg-antennae, you'd add those probabilities up ($$0.5625+0.0625+0.1875=0.8125$$).
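The breakdown into four mutually exclusive kinds of flies can be sketched as follows, again assuming the two traits are independent. The dictionary names are illustrative.

```python
from itertools import product

# Probabilities for each trait, taken from the cross in the text.
p_wings = {"normal wings": 0.75, "vestigial wings": 0.25}
p_antennae = {"normal antennae": 0.25, "leg-antennae": 0.75}

# Joint probability of each of the four mutually exclusive kinds of fly,
# assuming the wing and antenna traits are independent.
joint = {
    (w, a): p_wings[w] * p_antennae[a]
    for w, a in product(p_wings, p_antennae)
}

# Add up the kinds that have vestigial wings OR leg-antennae (or both).
p_either = sum(
    p for (w, a), p in joint.items()
    if w == "vestigial wings" or a == "leg-antennae"
)

print(p_either)  # 0.8125
```

As a cross-check that is not part of the original text, the same number is one minus the probability of the only excluded kind, normal wings with normal antennae: $$1-0.1875=0.8125$$.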

## When to calculate probabilities

While some kind of probability calculation underlies every statistical test, it is rare that you'll have to use the rules listed above. About the only time you'll actually calculate probabilities by adding and multiplying is when figuring out the expected values for a goodness-of-fit test.
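To illustrate what working out expected values might look like, the sketch below reuses the flower-color probabilities from the cross described earlier. The sample size of 200 offspring is a hypothetical number chosen for the example, not a figure from the text.

```python
# Expected proportions of each flower color from the cross in the text.
p_color = {"red": 0.25, "pink": 0.5, "white": 0.25}

# Hypothetical sample size, assumed purely for illustration.
n_offspring = 200

# Expected count for each color = probability x sample size.
expected = {color: p * n_offspring for color, p in p_color.items()}

print(expected)  # {'red': 50.0, 'pink': 100.0, 'white': 50.0}
```

These expected counts are what the observed counts would be compared against in a goodness-of-fit test.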

## Contributor

• John H. McDonald (University of Delaware)