2.2: Measures of Spread - Range, Variance, and Standard Deviation
Measures of Spread: Range and Standard Deviation
In the previous section, we learned how to describe what's typical in a dataset using measures of center. But that doesn’t tell us everything. Two datasets can have the same average while behaving very differently overall.
Measures of spread help us describe how much the values in a dataset vary or deviate from the center. They show how consistent your data is, how wide it is spread out, and how dependable a summary like the mean or median will be in representing the whole.
In this section, you’ll learn two common ways to measure spread:
- Range – the simple difference between the highest and lowest values
- Standard Deviation – roughly, the typical distance of a value from the mean
Framing Example: Pizza Delivery Times
Imagine you're comparing two local pizza restaurants that advertise “fast delivery.” You decide to test them and record the delivery times (in minutes) for 10 orders from each shop:
Shop A: 27, 28, 28, 29, 29, 30, 30, 30, 31, 31
Shop B: 20, 22, 25, 28, 30, 32, 35, 38, 40, 45
Both shops might have similar averages, but one shows much more consistent delivery times. After reviewing the two cards below, you’ll come back to this example and see how measures of spread help you choose the more reliable option.
📏 Range
Description: The range is the difference between the largest and smallest values in a dataset. It gives a quick sense of how wide the data is spread.
Formula:
\[ \text{Range} = \text{Maximum} - \text{Minimum} \]
Notes:
- Very sensitive to outliers
- Use for a quick check, but not reliable for distributions with extreme values
Example: Dataset: \(\{2, 4, 5, 6, 10\}\) → Range = \(10 - 2 = 8\)
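As a minimal sketch, the range calculation can be written in Python. The function name `value_range` and the extra value 100 are illustrative assumptions, added only to show how a single outlier affects the result.

```python
def value_range(values):
    """Return the range: the largest value minus the smallest."""
    return max(values) - min(values)

data = [2, 4, 5, 6, 10]
print(value_range(data))  # 8

# Hypothetical outlier (not part of the original dataset):
# one extreme value changes the range dramatically.
with_outlier = data + [100]
print(value_range(with_outlier))  # 98
```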
📉 Standard Deviation (SD)
Description: The standard deviation measures how spread out values are in a dataset. Specifically, it tells us how far each value typically is from the mean.
Formula (sample data):
\[ s = \sqrt{ \frac{\sum (x_i - \bar{x})^2}{n - 1} } \]
Symbols:
- \( x_i \): Individual data value
- \( \bar{x} \): Mean of the sample
- \( n \): Number of values in the sample
Notes:
- Very sensitive to outliers
- Gives a complete view of spread for symmetric, bell-shaped data
How to Calculate Standard Deviation (by hand)
To help you understand the formula, let’s break it down into six clear, repeatable steps:
- Calculate the mean \( \bar{x} \)
- Find the difference between each data point and the mean (\( x_i - \bar{x} \))
- Square each difference: \( (x_i - \bar{x})^2 \)
- Add up all the squared differences: \( \sum (x_i - \bar{x})^2 \)
- Divide by \( n-1 \) (sample) or \( n \) (population) → this gives you the variance
- Take the square root of the variance → this gives you the standard deviation
Example: Small Dataset
Let’s follow the steps for the dataset: {4, 6, 8, 10}
- Mean: \[ \bar{x} = \frac{4 + 6 + 8 + 10}{4} = \frac{28}{4} = 7 \]
- Differences from the mean: \[ (4 - 7) = -3,\quad (6 - 7) = -1,\quad (8 - 7) = 1,\quad (10 - 7) = 3 \]
- Squares of differences: \[ 9,\ 1,\ 1,\ 9 \]
- Sum of squares: \[ 9 + 1 + 1 + 9 = 20 \]
- Divide by \( n - 1 = 3 \) (sample): \[ \frac{20}{3} \approx 6.67 \] This is the variance.
- Take the square root: \[ s = \sqrt{6.67} \approx 2.58 \]
Interpretation: The values in this dataset typically lie about 2.58 units from the mean.
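The six steps can also be traced in code. The following is a sketch in Python, assuming sample data (division by \( n - 1 \)); the function name `sample_std_by_steps` is hypothetical and chosen only for this illustration.

```python
import math

def sample_std_by_steps(values):
    """Follow the six hand-calculation steps for a sample."""
    n = len(values)
    mean = sum(values) / n                      # Step 1: mean
    deviations = [x - mean for x in values]     # Step 2: differences from the mean
    squared = [d ** 2 for d in deviations]      # Step 3: square each difference
    sum_of_squares = sum(squared)               # Step 4: add the squares
    variance = sum_of_squares / (n - 1)         # Step 5: divide by n - 1 (variance)
    return mean, variance, math.sqrt(variance)  # Step 6: square root (standard deviation)

mean, variance, sd = sample_std_by_steps([4, 6, 8, 10])
print(mean, round(variance, 2), round(sd, 2))  # 7.0 6.67 2.58
```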
Reminder:
Dividing by \( n - 1 \) instead of \( n \) is known as Bessel’s correction. It makes the sample standard deviation a better estimate of the population value.
In this course, we’ll primarily use the sample standard deviation, because we usually work with data collected from a sample, not an entire population.
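To see the effect of the two divisors, here is a short sketch using Python's standard `statistics` module, where `stdev` divides by \( n - 1 \) and `pstdev` divides by \( n \). It reuses the small dataset {4, 6, 8, 10} from the example above.

```python
import statistics

data = [4, 6, 8, 10]

sample_sd = statistics.stdev(data)       # divides by n - 1 (sample)
population_sd = statistics.pstdev(data)  # divides by n (population)

print(round(sample_sd, 2), round(population_sd, 2))  # 2.58 2.24
```

The sample version is always at least as large as the population version, and the gap shrinks as the number of values grows.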
Analyzing the Pizza Delivery Example
Now let’s return to our earlier question: Which pizza shop is more consistent with delivery times?
At first glance, both shops might appear to deliver in roughly the same amount of time. But measures of spread can reveal important differences in reliability that basic averages might miss.
Here are the two sets of delivery times again:
Shop A: 27, 28, 28, 29, 29, 30, 30, 30, 31, 31
Shop B: 20, 22, 25, 28, 30, 32, 35, 38, 40, 45
Let’s compute the summary statistics for each shop. These include:
- Measures of Center — Mean, Median, Mode
- Measures of Spread — Range, Standard Deviation, Interquartile Range (IQR, discussed in the next section)
- Additional values — Minimum and Maximum
Summary Statistics Table
| Statistic | Shop A | Shop B |
|---|---|---|
| Mean | 29.3 | 31.5 |
| Median | 29.5 | 31.0 |
| Mode | 30 | None |
| Range | 31 − 27 = 4 | 45 − 20 = 25 |
| Standard Deviation | ~1.3 | ~8.1 |
| Minimum | 27 | 20 |
| Maximum | 31 | 45 |
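These summary values can be checked with a few lines of Python. This is a sketch that assumes the sample standard deviation (divisor \( n - 1 \)) rounded to one decimal place; the helper name `summarize` is hypothetical.

```python
import statistics

shop_a = [27, 28, 28, 29, 29, 30, 30, 30, 31, 31]
shop_b = [20, 22, 25, 28, 30, 32, 35, 38, 40, 45]

def summarize(times):
    """Return center and spread statistics for a list of delivery times."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "min": min(times),
        "max": max(times),
        "range": max(times) - min(times),
        "sd": round(statistics.stdev(times), 1),  # sample SD, divisor n - 1
    }

summary_a = summarize(shop_a)
summary_b = summarize(shop_b)
print(summary_a["mean"], summary_a["median"], summary_a["range"], summary_a["sd"])  # 29.3 29.5 4 1.3
print(summary_b["mean"], summary_b["median"], summary_b["range"], summary_b["sd"])  # 31.5 31.0 25 8.1
```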
Interpretation
The results confirm what we sensed visually:
- Shop A's delivery times are clustered tightly around 29–30 minutes. It is highly consistent, with low variation (small range, small standard deviation).
- Shop B, while having a similar median and mean, shows a much wider spread of delivery times, from 20 to 45 minutes. It has a larger range and a much higher standard deviation, indicating less reliability.
This helps us answer our original question: Shop A is the more consistent choice. Even though both shops may advertise “fast” delivery, Shop A's low variability makes it more predictable, an important factor for customers deciding where to order from.
This example shows how measures of spread work alongside measures of center to tell the full story. In real-world data, it's rarely enough to just look at one number like the mean without understanding context, consistency, and variation.
Z-Scores: Measuring Distance from the Mean
Now that you understand standard deviation, we can use it to introduce one of the most useful ideas in statistics: the z-score.
A z-score tells you how many standard deviations a value is from the mean. It helps compare values from different datasets or distributions and identifies whether a specific value is typical or unusual.
Formula for Z-Score:
\[ z = \frac{x - \bar{x}}{s} \]
- \( x \): the individual data value
- \( \bar{x} \): the sample mean
- \( s \): the sample standard deviation
What Z-Scores Tell Us:
- A z-score of 0 means the value is exactly at the mean.
- A positive z-score means the value is above the mean.
- A negative z-score means the value is below the mean.
Because z-scores are measured in standard deviations rather than in the original units, they're a useful way to compare apples to oranges, or pizza deliveries to grocery bills. For example, a delivery time with a z-score of 1.5 is 1.5 standard deviations above the mean delivery time.
We'll use z-scores often in future chapters when working with normal distributions, confidence intervals, and hypothesis tests. For now, just remember that z-scores use the mean and standard deviation to put values into context.
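As a brief illustration, the z-score formula can be applied to Shop B's delivery times from the pizza example. This is a sketch assuming the sample standard deviation; the function name `z_score` is hypothetical.

```python
import statistics

def z_score(x, values):
    """Return how many sample standard deviations x lies from the mean of values."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divisor n - 1
    return (x - mean) / sd

shop_b = [20, 22, 25, 28, 30, 32, 35, 38, 40, 45]

# Shop B's slowest delivery (45 minutes) compared with its own mean of 31.5.
z = z_score(45, shop_b)
print(round(z, 2))  # 1.67
```

A z-score of about 1.67 says the 45-minute delivery was roughly 1.67 standard deviations slower than Shop B's mean.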
Calculating Summary Statistics in Excel
Once you understand how to compute each of these values by hand, Excel can make these calculations much faster and easier, especially when working with larger datasets, like your semester-long project. Below is a list of spreadsheet functions you can use to compute measures of spread (and center) directly within Excel or Google Sheets.
Excel Formulas for Summary Statistics
| Statistic | Excel Formula | How It Works |
|---|---|---|
| Mean | =AVERAGE(A2:A11) | Averages the values in cells A2 through A11 |
| Median | =MEDIAN(A2:A11) | Returns the middle value of the dataset |
| Mode | =MODE.SNGL(A2:A11) | Returns the most frequent number (use =MODE.MULT() for multiple modes) |
| Minimum | =MIN(A2:A11) | Returns the smallest value |
| Maximum | =MAX(A2:A11) | Returns the largest value |
| Range | =MAX(A2:A11)-MIN(A2:A11) | Calculates the spread from smallest to largest |
| Standard Deviation (sample) | =STDEV.S(A2:A11) | Estimates SD for a sample using Bessel’s correction |
| Standard Deviation (population) | =STDEV.P(A2:A11) | Standard deviation if you have the full population |
Example Setup
To get started:
- Enter your dataset (e.g., delivery times, prices, square footage) into a single column, say from A2 to A11.
- Select blank cells next to your data and enter the formulas above.
- Let Excel do the calculations for you! You can even use conditional formatting to highlight outliers and trends.
Notes:
- All formulas can be used directly in Google Sheets as well.
- Make sure not to include column headers (like "Price") inside function ranges.
- You can also use =VAR.S() and =VAR.P() to compute the variance directly.
We’ll continue using spreadsheets throughout the semester in our project analysis and examples, so this is a great time to get comfortable with these tools!


