
3.3: Spread and Variability

  • Linda R. Cote, Rupa G. Gordon, Chrislyn E. Randell, Judy Schmitt, and Helena Marvin
    • University of Missouri System


    Variability refers to how “spread out” a group of scores is. To see what we mean by spread out, consider the graphs in Figure \(\PageIndex{1}\). These graphs represent the scores on two quizzes. The mean score for each quiz is 7.0. Despite the equality of means, you can see that the distributions are quite different. Specifically, the scores on Quiz 1 are more densely packed, and those on Quiz 2 are more spread out. The differences among students were much greater on Quiz 2 than on Quiz 1.

    Two bar charts compare the frequencies of scores on Quiz 1 and Quiz 2. Quiz 1 scores peak at 7, while Quiz 2 scores are more evenly distributed.
    Figure \(\PageIndex{1}\): Bar charts of Quizzes 1 and 2. (“Quiz Score Bar Charts” by Judy Schmitt is licensed under CC BY-NC-SA 4.0.)

    The terms variability, spread, and dispersion are synonyms and refer to how spread out a distribution is. Just as in the section on central tendency, where we discussed measures of the center of a distribution of scores, in this section, we will discuss measures of the variability of a distribution. There are three frequently used measures of variability: range, variance, and standard deviation. In the next few paragraphs, we will look at each of these measures of variability in more detail.

    Range

    The range is the simplest measure of variability to calculate, and one you have probably encountered many times in your life. The range is simply the highest score minus the lowest score. Let’s take a few examples. What is the range of the following group of numbers: 10, 2, 5, 6, 7, 3, 4? Well, the highest number is 10, and the lowest number is 2, so 10 − 2 = 8. The range is 8. Let’s take another example. Here’s a dataset with 10 numbers: 99, 45, 23, 67, 45, 91, 82, 78, 62, 51. What is the range? The highest number is 99 and the lowest number is 23, so 99 − 23 = 76; the range is 76. Now, consider the two quizzes shown in Figure \(\PageIndex{1}\). On Quiz 1, the lowest score is 5 and the highest score is 9. Therefore, the range is 4. The range on Quiz 2 was larger: the lowest score was 4, and the highest score was 10. Therefore, the range is 6.

    The problem with using range is that it is extremely sensitive to outliers, and one number far away from the rest of the data will greatly alter the value of the range. For example, in the set of numbers 1, 3, 4, 4, 5, 8, and 9, the range is 8 (9 − 1). However, if we add a single person whose score is nowhere close to the rest of the scores, say, 20, the range more than doubles from 8 to 19.
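    The two calculations above can be sketched in a few lines of Python (standard library only; the variable names are ours, not the text's):

    ```python
    # Range = highest score minus lowest score.
    data = [1, 3, 4, 4, 5, 8, 9]
    range_ = max(data) - min(data)        # 9 - 1 = 8

    # One extreme score (an outlier) more than doubles the range.
    with_outlier = data + [20]
    range_outlier = max(with_outlier) - min(with_outlier)   # 20 - 1 = 19
    print(range_, range_outlier)          # 8 19
    ```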

    Interquartile Range

    The interquartile range (IQR) is the range of the middle 50% of the scores in a distribution and is sometimes used to communicate where the bulk of the data in the distribution are located. It is computed as follows:

    \[
    \Large
    \text{IQR}=75^{\text{th}} \text{ percentile} - 25^{\text{th}} \text{ percentile}
    \nonumber
    \]

    For Quiz 1, the 75th percentile is 8 and the 25th percentile is 6. The interquartile range is therefore 2. For Quiz 2, which has a greater spread, the 75th percentile is 9, the 25th percentile is 5, and the interquartile range is 4. Recall that in the discussion of box plots, the 75th percentile was called the upper hinge, and the 25th percentile was called the lower hinge. Using this terminology, the interquartile range is referred to as the H-spread.
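    Percentile conventions differ slightly across software, so a library's quartiles will not always match hand-computed hinges; for the Quiz 1 scores, though, Python's `statistics.quantiles` with `method="inclusive"` reproduces the values above (an illustrative sketch, not part of the original text):

    ```python
    import statistics

    quiz1 = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]

    # quantiles(n=4) returns the 25th, 50th, and 75th percentiles.
    q1, median, q3 = statistics.quantiles(quiz1, n=4, method="inclusive")
    iqr = q3 - q1
    print(q1, q3, iqr)   # 6.0 8.0 2.0
    ```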

    Sum of Squares

    Variability can also be defined in terms of how close the scores in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, we can see how far, on average, each data point is from the center. The data from Quiz 1 are shown in Table \(\PageIndex{1}\).

    There are a few things to note about how Table \(\PageIndex{1}\) is formatted. The raw data scores (X) are always placed in the left-most column. This column is then summed at the bottom (ΣX) to facilitate calculating the mean by dividing the sum of X values by the number of scores in the table (N). The mean score is 7.0 (ΣX/N = 140/20 = 7.0). Once you have the mean, you can easily work your way down the second column calculating the deviation scores (X – M), representing how far each score deviates from the mean, here calculated as the score (X value) minus 7. This column is also summed and has a very important property: it will always sum to 0, or close to zero if you have rounding error due to many decimal places (Σ(X – M) = 0). This step is used as a check on your math to make sure you haven’t made a mistake. If this column sums to 0, you can move on to filling in the third column, which is composed of the squared deviation scores. The deviation scores are squared to remove negative values and appear in the third column \((X-M)^2\). When these values are summed, you have the sum of the squared deviations, or the sum of squares (SS), calculated with the formula \(SS=\sum{(X-M)^2}\).

    Table \(\PageIndex{1}\): Calculation of variance for Quiz 1 scores.
    X   (X − M)   (X − M)²   X²
    9 2 4 81
    9 2 4 81
    9 2 4 81
    8 1 1 64
    8 1 1 64
    8 1 1 64
    8 1 1 64
    7 0 0 49
    7 0 0 49
    7 0 0 49
    7 0 0 49
    7 0 0 49
    6 −1 1 36
    6 −1 1 36
    6 −1 1 36
    6 −1 1 36
    6 −1 1 36
    6 −1 1 36
    5 −2 4 25
    5 −2 4 25

    \(\sum{X}=140\)
    \((\sum{X})^2=19,600\)

    \(\sum{(X-M)}=0\)

    \(\sum{(X-M)^2}=30\)

    \(\sum{X^2}=1,010\)

    \[SS=4+4+4+1+1+1+1+0+0+0\\
    +0+0+1+1+1+1+1+1+4+4\\
    =30 \nonumber \]

    The preceding formula is called the definitional formula, as it shows the logic behind the sum of squared deviations calculation. As mentioned earlier, there can be rounding errors in calculating the deviation scores. Also, when the set of scores is large, calculating the deviation scores, squaring the scores, and then summing those values can be tedious. To simplify the sum of squares calculation, the computational formula is used instead. The computational formula is as follows:

    \[
    \Large
    SS=\sum{X^2}-\frac{(\sum{X})^2}{N}
    \nonumber \]

    The last column in Table \(\PageIndex{1}\) represents the X values squared and then summed—\(\sum{X^2}\). At the bottom of the first column, the \(\sum{X}\) value is squared—\((\sum{X})^2\). These are the values used in the computational formula for the sum of squares. As you can see in the calculation below, the SS value is the same for both the definitional formula and the computational formula:

    \[
    \Large
    \begin{aligned}
    SS=&\sum{X^2}-\frac{(\sum{X})^2}{N}\\
    =&1,010-\frac{19,600}{20}\\
    =&30
    \end{aligned}
    \nonumber \]

    As we will see, the sum of squares appears again and again in different formulas—it is a very important value, and using the X and \(X^2\) columns in this table makes it simple to calculate the SS without error.
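    Both formulas can be verified with a short Python sketch over the Quiz 1 scores (the variable names are ours):

    ```python
    quiz1 = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
    n = len(quiz1)                          # N = 20
    mean = sum(quiz1) / n                   # M = 140/20 = 7.0

    # Definitional formula: SS = sum of squared deviations from the mean.
    deviations = [x - mean for x in quiz1]
    assert abs(sum(deviations)) < 1e-9      # the deviation column sums to 0
    ss_def = sum(d ** 2 for d in deviations)

    # Computational formula: SS = sum(X^2) - (sum(X))^2 / N
    ss_comp = sum(x ** 2 for x in quiz1) - sum(quiz1) ** 2 / n

    print(ss_def, ss_comp)                  # 30.0 30.0
    ```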

    Variance

    Now that we have the sum of squares calculated, we can use it to compute our formal measure of average distance from the mean—the variance. The variance is defined as the average squared difference of the scores from the mean. We square the deviation scores because, as we saw in the second column of Table \(\PageIndex{1}\), the sum of raw deviations is always 0; squaring each deviation before averaging removes the sign so that the values no longer cancel out.

    The population parameter for variance is \(\sigma^2\) (“sigma-squared”) and is calculated as:

    \[
    \Large
    \sigma^2=\frac{SS}{N}
    \nonumber \]

    We can use the value we previously calculated for SS in the numerator, then simply divide that value by N to get the variance. If we assume that the values in Table \(\PageIndex{1}\) represent the full population, then we can take our value of the sum of squares and divide it by N to get our population variance:

    \[
    \Large
    \sigma^2=\frac{30}{20}=1.5
    \nonumber \]

    So, on average, scores in this population are 1.5 squared units away from the mean. This measure of spread exhibits much more robustness (a term used by statisticians to mean resilience or resistance to outliers) than the range, so it is a much more useful value to compute. Additionally, as we will see in future chapters, variance plays a central role in inferential statistics.

    The sample statistic used to estimate the variance is \(s^2\) (“s-squared”):

    \[
    \Large
    s^2=\frac{SS}{N-1}=\frac{SS}{df}
    \nonumber \]

    This formula is very similar to the formula for the population variance with one change: we now divide by N − 1 instead of N. The value N − 1 has a special name: the degrees of freedom (abbreviated as df). You don’t need to understand in depth what degrees of freedom are (essentially, they account for the fact that we have to use a sample statistic to estimate the mean [M] before we estimate the variance) in order to calculate variance, but knowing that the denominator is called df provides a nice shorthand for the variance formula:

    \[
    \Large
    \frac{SS}{df}
    \nonumber \]

    Going back to the values in Table \(\PageIndex{1}\) and treating those scores as a sample, we can estimate the sample variance as:

    \[
    \Large
    s^2=\frac{30}{20-1}=1.58
    \nonumber \]

    Notice that this value is slightly larger than the one we calculated when we assumed these scores were the full population. This is because our value in the denominator is slightly smaller, making the final value larger. In general, as your sample size N gets bigger, the effect of subtracting 1 becomes less and less. Comparing a sample size of 10 to a sample size of 1000, 10 − 1 = 9, or 90% of the original value, whereas 1000 − 1 = 999, or 99.9% of the original value. Thus, larger sample sizes will bring the estimate of the sample variance closer to that of the population variance. This is a key idea and principle in statistics that we will see over and over again: larger sample sizes better reflect the population.
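    As a check on the arithmetic, both denominators can be computed directly in Python (a sketch using the standard library; `statistics.pvariance` and `statistics.variance` implement the N and N − 1 versions, respectively):

    ```python
    import statistics

    quiz1 = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]
    n = len(quiz1)
    mean = sum(quiz1) / n
    ss = sum((x - mean) ** 2 for x in quiz1)    # SS = 30

    pop_var = ss / n          # population variance: sigma^2 = 30/20 = 1.5
    samp_var = ss / (n - 1)   # sample variance: s^2 = 30/19 ≈ 1.58

    # The standard library uses the same two denominators:
    assert abs(statistics.pvariance(quiz1) - pop_var) < 1e-9
    assert abs(statistics.variance(quiz1) - samp_var) < 1e-9
    print(pop_var, round(samp_var, 2))          # 1.5 1.58
    ```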

    Standard Deviation

    The standard deviation is simply the square root of the variance. This is a useful and interpretable statistic because taking the square root of the variance (recalling that variance is the average squared difference) puts the standard deviation back into the original units of the measure we used. Thus, when reporting descriptive statistics in a study, scientists virtually always report mean and standard deviation. Standard deviation is therefore the most commonly used measure of spread for our purposes, representing the average distance of the scores from the mean.

    The population parameter for standard deviation is \(\sigma\) (“sigma”), which, intuitively, is the square root of the variance parameter \(\sigma^2\) (occasionally, the symbols work out nicely that way). The formula is simply the formula for variance under a square root sign:

    \[
    \Large
    \sigma=\sqrt{\frac{SS}{N}}=\sqrt{\frac{30}{20}}=\sqrt{1.5}=1.22
    \nonumber \]

    The sample statistic follows the same conventions and is given as s in mathematical formulas. (Note that in American Psychological Association [APA] format for reporting results, sample standard deviation is reported using the abbreviation SD.)

    \[
    \Large
    s=\sqrt{\frac{SS}{df}}=\sqrt{\frac{30}{20-1}}=\sqrt{1.58}=1.26
    \nonumber \]
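    The two standard deviations can likewise be checked in Python (`statistics.pstdev` divides by N, `statistics.stdev` by N − 1; an illustrative sketch, not the text's own code):

    ```python
    import math
    import statistics

    quiz1 = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]

    sigma = math.sqrt(30 / 20)   # population SD: sqrt(SS/N) ≈ 1.22
    s = math.sqrt(30 / 19)       # sample SD: sqrt(SS/df) ≈ 1.26

    assert abs(statistics.pstdev(quiz1) - sigma) < 1e-9
    assert abs(statistics.stdev(quiz1) - s) < 1e-9
    print(round(sigma, 2), round(s, 2))   # 1.22 1.26
    ```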

    The standard deviation is an especially useful measure of variability when the distribution is normal or approximately normal because the proportion of the distribution within a given number of standard deviations from the mean can be calculated. For example, 68% of the distribution is within one standard deviation (above and below) of the mean, and approximately 95% of the distribution is within two standard deviations of the mean, as shown in Figure \(\PageIndex{2}\). Therefore, if you had a normal distribution with a mean of 50 and a standard deviation of 10, then 68% of the distribution would be between 50 − 10 = 40 and 50 + 10 = 60. Similarly, about 95% of the distribution would be between 50 − 2 × 10 = 30 and 50 + 2 × 10 = 70.
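    The 68% and 95% figures can be illustrated by simulation (a hedged sketch: `random.gauss` draws from a normal distribution, so the proportions below are approximate, not exact):

    ```python
    import random

    random.seed(42)   # fixed seed so the sketch is reproducible
    draws = [random.gauss(50, 10) for _ in range(100_000)]  # mean 50, SD 10

    # Proportion of draws within one and two SDs of the mean.
    within_1sd = sum(40 <= x <= 60 for x in draws) / len(draws)
    within_2sd = sum(30 <= x <= 70 for x in draws) / len(draws)
    print(within_1sd, within_2sd)   # close to 0.68 and 0.95
    ```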

    A bell curve shows shaded regions for 68%, 95%, and 99.7% within one, two, and three standard deviations from the mean.
    Figure \(\PageIndex{2}\): Percentages of the normal distribution. (“Normal Distribution Percentages” by Judy Schmitt is licensed under CC BY-NC-SA 4.0.)

    Figure \(\PageIndex{3}\) shows two normal distributions. The red (left-most) distribution has a mean of 40 and a standard deviation of 5; the blue (right-most) distribution has a mean of 60 and a standard deviation of 10. For the red distribution, 68% of the distribution is between 35 and 45; for the blue distribution, 68% is between 50 and 70. Notice that as the standard deviation gets smaller, the distribution becomes much narrower, regardless of where the center of the distribution (mean) is. Figure \(\PageIndex{4}\) presents several more examples of this effect.

    Two bell-shaped curves overlap; the red curve is taller and centered around 40, while the blue dashed curve is shorter and centered around 55 on the x-axis.
    Figure \(\PageIndex{3}\): Normal distributions with standard deviations of 5 and 10. (“Normal Distributions with Standard Deviations” by Judy Schmitt is licensed under CC BY-NC-SA 4.0.)
    Four graphs show student frequency distributions: (A) two identical sets; (B) locations differ; (C) variabilities differ; (D) locations and variabilities differ.
    Figure \(\PageIndex{4}\): Differences between two datasets. (“Location and Variability Differences” by Judy Schmitt is licensed under CC BY-NC-SA 4.0.)
    Video: Measures of Spread

    Measures of Spread on YouTube.



    This page titled 3.3: Spread and Variability is shared under a not declared license and was authored, remixed, and/or curated by Linda R. Cote, Rupa G. Gordon, Chrislyn E. Randell, Judy Schmitt, and Helena Marvin via source content that was edited to the style and standards of the LibreTexts platform.