3.3: Measures of Variation Version 2
- Explain measures of variation to assess how spread-out data values are.
- Calculate the range, variance, and standard deviation for grouped and ungrouped data.
- Understand the importance of variation in evaluating consistency and comparing data sets.
Range
Variability is an important idea in statistics. If you were to measure the height of everyone in your classroom, every observation gives you a different value. That means not every student has the same height. Thus, there is variability in people’s heights. If you were to take a sample of the income level of people in a town, every sample would give you different information. There is variability between samples, too. Variability describes how the data are spread out. If the data are very close to each other, then there is low variability. If the data are very spread out, then there is high variability. How do you measure variability? It would be good to have a number that measures it. This section will describe some of the different measures of variability, also known as variation.
In Example \(\PageIndex{1}\), the average weight of a cat was calculated to be 8.02 pounds. How much does this tell you about the weight of all cats? Can you tell if most of the weights were close to 8.02, or were the weights spread out? What are the highest and the lowest weights? All you know is that the center of the weights is 8.02 pounds. You need more information.
The range (R) of a set of data is the difference between the highest and the lowest data values.
\[\begin{align*} \text{R} &= \text{highest value} - \text{lowest value} \\[4pt] \end{align*}\]
Look at the following three sets of data. Find the range of each of these.
- \(10, 20, 30, 40, 50\)
- \(10, 29, 30, 31, 50\)
- \(28, 29, 30, 31, 32\)
Solution
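a. Mean = 30, median = 30, range = \(50 - 10 = 40\)
b. Mean = 30, median = 30, range = \(50 - 10 = 40\)
c. Mean = 30, median = 30, range = \(32 - 28 = 4\)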
Based on the mean, median, and range in Example \(\PageIndex{1}\), the first two distributions are the same, but you can see from the graphs that they are different. In Example \(\PageIndex{1}\)a the data are spread out equally. In Example \(\PageIndex{1}\)b, the data has a clump in the middle and a single value at each end. The mean and median are the same for Example \(\PageIndex{1}\)c, but the range is very different. All the data is clumped together in the middle.
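The comparison above can be checked with a short Python sketch. It uses only the standard library; the helper name `data_range` and the dictionary labels are illustrative choices, not part of the original text.

```python
from statistics import mean, median

# The three data sets from the example above.
data_sets = {
    "a": [10, 20, 30, 40, 50],
    "b": [10, 29, 30, 31, 50],
    "c": [28, 29, 30, 31, 32],
}


def data_range(values):
    """Range R = highest value - lowest value."""
    return max(values) - min(values)


# Store (mean, median, range) for each data set.
summaries = {
    label: (mean(values), median(values), data_range(values))
    for label, values in data_sets.items()
}

for label, (m, med, r) in summaries.items():
    print(f"{label}: mean = {m}, median = {med}, range = {r}")
```

Sets a and b share the same mean, median, and range even though their values are arranged differently, which is why the range alone is a limited measure of spread.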
The range doesn’t provide a very accurate picture of the variability, because it uses only the two most extreme values. A better way to describe how the data is spread out is needed. Instead of looking at the distance between the highest and the lowest values, how about looking at the distance each value is from the mean? This distance is called the deviation.
Variance and Standard Deviation of Raw Data
Suppose a vet wants to analyze the weights of cats. The weights (in pounds) of five cats are 6.8, 8.2, 7.5, 9.4, and 8.2. Find the deviation from the mean for each of the data values.
Solution
Variable: \(x=\) weight of a cat
The mean for this data set is \(\overline{x}=8.02\) pounds.
| \(x\) | \(x-\overline{x}\) |
|---|---|
| 6.8 | 6.8-8.02 = -1.22 |
| 8.2 | 8.2-8.02=0.18 |
| 7.5 | 7.5-8.02=-0.52 |
| 9.4 | 9.4-8.02=1.38 |
| 8.2 | 8.2-8.02=0.18 |
Now you might want to average the deviation, so you need to add the deviations together.
| \(x\) | \(x-\overline{x}\) |
|---|---|
| 6.8 | 6.8-8.02 = -1.22 |
| 8.2 | 8.2-8.02=0.18 |
| 7.5 | 7.5-8.02=-0.52 |
| 9.4 | 9.4-8.02=1.38 |
| 8.2 | 8.2-8.02=0.18 |
| Total | 0 |
This can’t be right. The average distance from the mean cannot be 0. The reason it adds to 0 is that there are some positive and negative values. You need to get rid of the negative signs. How can you do that? You could square each deviation.
| \(x\) | \(x-\overline{x}\) | \((x-\overline{x})^{2}\) |
|---|---|---|
| 6.8 | 6.8-8.02 = -1.22 | 1.4884 |
| 8.2 | 8.2-8.02=0.18 | 0.0324 |
| 7.5 | 7.5-8.02=-0.52 | 0.2704 |
| 9.4 | 9.4-8.02=1.38 | 1.9044 |
| 8.2 | 8.2-8.02=0.18 | 0.0324 |
| Total | 0 | 3.728 |
Now, average the total of the squared deviations. The only thing is that in statistics, there is a strange average here. Instead of dividing by the number of data values, you divide by the number of data values minus 1. The reason is that once the sample mean is known, all the values in the data are free to be any value except for one; the last value has to be a specific number for the data to produce that sample mean. The number of values that are free to vary is called the degrees of freedom. Since the sample mean is part of the variance formula, the denominator will be the total number of data values minus one. In this case, you would have
\(s^{2}=\dfrac{3.728}{5-1}=\dfrac{3.728}{4}=0.932 \text { pounds }^{2}\)
Notice that this is denoted as \(s^{2}\). This is called the variance, and it is a measure of the average squared distance from the mean. If you now take the square root, you will get, roughly, the average distance from the mean. This is called the standard deviation, and is denoted with the letter \(s\).
\(s=\sqrt{0.932} \approx 0.965\) pounds
The standard deviation is roughly the average (mean) distance from a data point to the mean. It can be thought of as how much a typical data point differs from the mean.
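The steps of the cat-weight example can be retraced in a few lines of Python. This is a minimal sketch of the definition approach used above (deviations, squared deviations, divide by \(n-1\), square root); the variable names are illustrative.

```python
import math

weights = [6.8, 8.2, 7.5, 9.4, 8.2]  # cat weights in pounds

n = len(weights)
x_bar = sum(weights) / n                   # sample mean

deviations = [x - x_bar for x in weights]  # x - x_bar for each value
squared = [d ** 2 for d in deviations]     # (x - x_bar)^2 for each value

sample_variance = sum(squared) / (n - 1)   # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)

print(f"mean               = {x_bar:.2f} pounds")
print(f"sum of deviations  = {sum(deviations):.4f}")
print(f"sum of squares     = {sum(squared):.3f}")
print(f"variance s^2       = {sample_variance:.3f} pounds^2")
print(f"std. deviation s   = {sample_sd:.3f} pounds")
```

Because of floating-point rounding, the sum of the deviations may print as a tiny number very close to 0 instead of exactly 0.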
The sample variance formula:
\(s^{2}=\dfrac{\sum(x-\overline{x})^{2}}{n-1}\)
where \(\overline{x}\) is the sample mean, \(n\) is the sample size, and \(\sum\) means to find the sum.
The sample standard deviation formula:
\(s=\sqrt{s^{2}}=\sqrt{\dfrac{\sum(x-\overline{x})^{2}}{n-1}}\)
The \(n-1\) on the bottom has to do with a concept called degrees of freedom. It makes the sample standard deviation a better approximation of the population standard deviation.
The population variance formula:
\(\sigma^{2}=\dfrac{\sum(x-\mu)^{2}}{N}\)
where \(\sigma\) is the Greek letter sigma and \(\sigma^{2}\) represents the population variance, \(\mu\) is the population mean, and \(N\) is the size of the population.
The population standard deviation formula:
\(\sigma=\sqrt{\sigma^{2}}=\sqrt{\dfrac{\sum(x-\mu)^{2}}{N}}\)
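The only computational difference between the sample and population formulas is the denominator. The sketch below writes both as small Python functions (the function names are illustrative) and compares them with `statistics.variance` and `statistics.pvariance` from the standard library. The five cat weights are treated as a whole population in the second call purely for illustration.

```python
import statistics


def sample_variance(data):
    """s^2: squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def population_variance(data):
    """sigma^2: squared deviations from the population mean, divided by N."""
    size = len(data)
    mu = sum(data) / size
    return sum((x - mu) ** 2 for x in data) / size


weights = [6.8, 8.2, 7.5, 9.4, 8.2]

s2 = sample_variance(weights)
sigma2 = population_variance(weights)  # illustration only: data treated as a population

print(f"divide by n - 1: {s2:.4f}")
print(f"divide by N:     {sigma2:.4f}")
print(f"library check:   {statistics.variance(weights):.4f}, "
      f"{statistics.pvariance(weights):.4f}")
```

For the same data, dividing by \(n-1\) always gives a slightly larger value than dividing by \(N\), and the gap shrinks as the number of data values grows.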
The sum of the deviations should always be 0. If it isn’t, then it is because you rounded, you used the median instead of the mean, or you made an error. Try not to round too much in the calculations for standard deviation, since each rounding causes a slight error.
Suppose that a manager wants to test two new training programs. He had 5 people for each training type and measured the time it took to complete a task after the training. Both training groups include all the recruits. These two groups represent two different populations. The times for both trainings are in Example \(\PageIndex{4}\). Which training method is better?
| Training 1 | 56 | 75 | 48 | 63 | 59 |
|---|---|---|---|---|---|
| Training 2 | 60 | 58 | 66 | 59 | 58 |
Solution
It is important that you define what each variable is, since there are two of them.
Variable 1: \(X_{1}=\) time (in minutes) to complete the task after training 1
Variable 2: \(X_{2}=\) time (in minutes) to complete the task after training 2
To answer which training method is better, first you need some descriptive statistics. Start with the mean for each population.
\(\mu_{1}=\dfrac{56+75+48+63+59}{5}=60.2\) minutes
\(\mu_{2}=\dfrac{60+58+66+59+58}{5}=60.2\) minutes
Since both means are the same, the mean alone cannot tell you which training is better. Now, calculate the standard deviation for each population.
| \(x_{1}\) | \(x_{1}-\mu_{1}\) | \(\left(x_{1}-\mu_{1}\right)^{2}\) |
|---|---|---|
| 56 | -4.2 | 17.64 |
| 75 | 14.8 | 219.04 |
| 48 | -12.2 | 148.84 |
| 63 | 2.8 | 7.84 |
| 59 | -1.2 | 1.44 |
| Total | 0 | 394.8 |
| \(x_{2}\) | \(x_{2}-\mu_{2}\) | \(\left(x_{2}-\mu_{2}\right)^{2}\) |
|---|---|---|
| 60 | -0.2 | 0.04 |
| 58 | -2.2 | 4.84 |
| 66 | 5.8 | 33.64 |
| 59 | -1.2 | 1.44 |
| 58 | -2.2 | 4.84 |
| Total | 0 | 44.8 |
The variance for each population is the sum of the squared deviations divided by the population size, \(N=5\):
\(\sigma_{1}^{2}=\dfrac{394.8}{5}=78.96 \text { minutes }^{2}\)
\(\sigma_{2}^{2}=\dfrac{44.8}{5}=8.96 \text { minutes }^{2}\)
The standard deviations are:
\(\sigma_{1}=\sqrt{78.96} \approx 8.89\) minutes
\(\sigma_{2}=\sqrt{8.96} \approx 2.99\) minutes
From the standard deviations, the second training seemed to be the better training since the data is less spread out. This means it is more consistent. It would be better for the managers in this case to have a training program that produces more consistent results so they know what to expect for the time it takes to complete the task.
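As a rough check of this comparison, the sketch below computes the mean and the spread of each training group with Python's `statistics` module. It prints the standard deviation under both conventions (dividing by \(N\) with `pstdev`, and by \(n-1\) with `stdev`), so you can see that the conclusion does not depend on which one is used. The variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

# Times (in minutes) to complete the task after each training.
training_1 = [56, 75, 48, 63, 59]
training_2 = [60, 58, 66, 59, 58]

mean_1, mean_2 = mean(training_1), mean(training_2)

# pstdev divides by N (population); stdev divides by n - 1 (sample).
pop_sd_1, pop_sd_2 = pstdev(training_1), pstdev(training_2)
samp_sd_1, samp_sd_2 = stdev(training_1), stdev(training_2)

print(f"Training 1: mean = {mean_1}, "
      f"pstdev = {pop_sd_1:.2f}, stdev = {samp_sd_1:.2f}")
print(f"Training 2: mean = {mean_2}, "
      f"pstdev = {pop_sd_2:.2f}, stdev = {samp_sd_2:.2f}")

# The group with the smaller standard deviation is the more consistent one.
more_consistent = "Training 1" if pop_sd_1 < pop_sd_2 else "Training 2"
print(f"More consistent: {more_consistent}")
```

Under either convention, training 2 has the smaller standard deviation, so its completion times are more consistent.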
You can do the calculations for the descriptive statistics using technology. The procedure for calculating the sample mean (\(\overline{x}\)) and the sample standard deviation (\(s_{x}\)) for \(X_{2}\) in Example \(\PageIndex{3}\) on the TI-83/84 is in Figures 3.2.1 through 3.2.4 (the procedure is the same for \(X_{1}\)). Note that the calculator also gives you the population standard deviation (\(\sigma_{x}\)) because it doesn’t know whether the data you input is a population or a sample. You need to decide which value to use, based on whether you have a population or a sample. In almost all cases, you have a sample and will be using \(s_{x}\). Also, the calculator uses the notation \(s_{x}\) instead of just \(s\). It is just a way for it to denote the information. First, you need to go into the STAT menu and then Edit. This will allow you to type in your data (see Figure \(\PageIndex{1}\)).
Once you have the data in the calculator, go back to the STAT menu, move over to CALC, and then choose 1-Var Stats (see Figure \(\PageIndex{2}\)). The calculator will now put 1-Var Stats on the main screen. Now type in L2 (2nd button and 2) and then press ENTER. (Note: if you have the newer operating system on the TI-84, then the procedure is slightly different.) The results from the calculator are in Figure \(\PageIndex{4}\).
In general, a “small” standard deviation means the data is close together (more consistent), and a “large” standard deviation means the data is spread out (less consistent). Sometimes you want consistent data, and sometimes you don’t. As an example, if you are making bolts, you want the lengths to be very consistent, so you want a small standard deviation. If you are administering a test to see who can be a pilot, you want a large standard deviation so you can tell who the good pilots are and who the bad ones are.
What do “small” and “large” mean? To a bicyclist whose average speed is 20 mph, s = 20 mph is huge. To an airplane whose average speed is 500 mph, s = 20 mph is nothing. The “size” of the variation depends on the size of the numbers in the problem and the mean. Another situation where you can determine whether a standard deviation is small or large is when you are comparing two different samples, such as in example #3.2.3. A sample with a smaller standard deviation is more consistent than a sample with a larger standard deviation.
Many other books and authors stress that there is a computational formula for calculating the standard deviation. However, this formula doesn’t give you an idea of what standard deviation is and what you are doing. It is only good for doing the calculations quickly. It goes back to the days when standard deviations were calculated by hand, and the person needed a quick way to calculate the standard deviation. It is an archaic formula that this author is trying to eradicate. It is not necessary anymore, since most calculators and computers will do the calculations for you with as much meaning as this formula gives. It is suggested that you never use it. If you want to understand what the standard deviation is doing, then you should use the definition formula. If you want an answer quickly, use a computer or calculator.
Use of Standard Deviation
One of the uses of the standard deviation is to describe how a population is distributed by using Chebyshev’s Theorem. This theorem works for any distribution, whether it is skewed, symmetric, bimodal, or any other shape. It gives you an idea of how much data is a certain distance on either side of the mean.
For any set of data:
- At least 75% of the data fall in the interval from \(\mu-2 \sigma \text { to } \mu+2 \sigma\).
- At least 88.9% of the data fall in the interval from \(\mu-3 \sigma \text { to } \mu+3 \sigma\).
- At least 93.8% of the data fall in the interval from \(\mu-4 \sigma \text { to } \mu+4 \sigma\).
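The three percentages above come from the general statement of Chebyshev’s Theorem: for any \(k>1\), at least \(1-\dfrac{1}{k^{2}}\) of the data fall within \(k\) standard deviations of the mean. That general formula is not written out in this section, so the sketch below states it as an added fact; the function name is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

The bound is a guaranteed minimum, not an estimate, so the actual percentage in a given data set is often much higher.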
The U.S. Weather Bureau has provided the information in Example \(\PageIndex{7}\) about the total annual number of reported strong to violent (F3+) tornadoes in the United States for the years 1954 to 2012. ("U.S. tornado climatology," 17).
| 46 | 47 | 31 | 41 | 24 | 56 | 56 | 23 | 31 | 59 |
|---|---|---|---|---|---|---|---|---|---|
| 39 | 70 | 73 | 85 | 33 | 38 | 45 | 39 | 35 | 22 |
| 51 | 39 | 51 | 131 | 37 | 24 | 57 | 42 | 28 | 45 |
| 98 | 35 | 54 | 45 | 30 | 15 | 35 | 64 | 21 | 84 |
| 40 | 51 | 44 | 62 | 65 | 27 | 34 | 23 | 32 | 28 |
| 41 | 98 | 82 | 47 | 62 | 21 | 31 | 29 | 32 |
- Use Chebyshev’s theorem to find an interval centered on the mean annual number of strong to violent (F3+) tornadoes in which you would expect at least 75% of the years to fall.
- Use Chebyshev’s theorem to find an interval centered on the mean annual number of strong to violent (F3+) tornadoes in which you would expect at least 88.9% of the years to fall.
Solution
a. Variable: \(x =\) number of strong or violent (F3+) tornadoes. Chebyshev’s theorem says that at least 75% of the data will fall in the interval from \(\mu-2 \sigma\) to \(\mu+2 \sigma\).
You do not have the population, so you need to estimate the population mean and standard deviation using the sample mean and standard deviation. You can find the sample mean and standard deviation using technology:
\(\overline{x} \approx 46.24, s \approx 22.18\)
So,
\(\mu \approx 46.24, \sigma \approx 22.18\)
\(\mu-2 \sigma \text { to } \mu+2 \sigma\)
\(46.24-2(22.18) \text { to } 46.24+2(22.18)\)
\(46.24-44.36 \text { to } 46.24+44.36\)
\(1.88 \text { to } 90.60\)
Since you can’t have a fractional number of tornadoes, round to the nearest whole number.
At least 75% of the years have between 2 and 91 strong to violent (F3+) tornadoes. (Actually, all but three years’ values fall in this interval, which means that \(\dfrac{56}{59} \approx 94.9 \%\) fall in the interval.)
b. Variable: \(x =\) number of strong or violent (F3+) tornadoes. Chebyshev’s theorem says that at least 88.9% of the data will fall in the interval from \(\mu-3 \sigma\) to \(\mu+3 \sigma\).
\(\mu-3 \sigma \text { to } \mu+3 \sigma\)
\(46.24-3(22.18) \text { to } 46.24+3(22.18)\)
\(46.24-66.54 \text { to } 46.24+66.54\)
\(-20.30 \text { to } 112.78\)
Since you can’t have a negative number of tornadoes, the lower limit is 0. Since you can’t have a fractional number of tornadoes, round to the nearest whole number.
At least 88.9% of the years have between 0 and 113 strong to violent (F3+) tornadoes.
(Actually, all but one year fall in this interval, which means that \(\dfrac{58}{59} \approx 98.3 \%\) fall in the interval.)
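The intervals in this example, and the number of years that actually fall inside them, can be reproduced with the sketch below. It uses the 59 values from the table and estimates \(\mu\) and \(\sigma\) with the sample mean and sample standard deviation, as the solution does. The helper `chebyshev_interval` and its `lower_limit` parameter are illustrative names.

```python
from statistics import mean, stdev

# Annual counts of strong to violent (F3+) tornadoes, 1954 to 2012.
tornadoes = [
    46, 47, 31, 41, 24, 56, 56, 23, 31, 59,
    39, 70, 73, 85, 33, 38, 45, 39, 35, 22,
    51, 39, 51, 131, 37, 24, 57, 42, 28, 45,
    98, 35, 54, 45, 30, 15, 35, 64, 21, 84,
    40, 51, 44, 62, 65, 27, 34, 23, 32, 28,
    41, 98, 82, 47, 62, 21, 31, 29, 32,
]


def chebyshev_interval(center, spread, k, lower_limit=0):
    """Interval center +/- k*spread, cut off at lower_limit (counts can't be negative)."""
    low = max(center - k * spread, lower_limit)
    high = center + k * spread
    return low, high


x_bar = mean(tornadoes)  # estimate of mu
s = stdev(tornadoes)     # estimate of sigma

results = {}
for k in (2, 3):
    low, high = chebyshev_interval(x_bar, s, k)
    inside = sum(low <= x <= high for x in tornadoes)
    results[k] = (low, high, inside)
    print(f"k = {k}: {low:.2f} to {high:.2f}, "
          f"{inside} of {len(tornadoes)} years inside "
          f"({inside / len(tornadoes):.1%})")
```

The counts inside each interval are well above the minimums that Chebyshev’s Theorem guarantees, which matches the remarks in the solution.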
Chebyshev’s Theorem says that at least 75% of the data is within two standard deviations of the mean. That percentage is fairly high. There isn’t much data outside two standard deviations. A rule that can be followed is that if a data value is within two standard deviations, then that value is a common data value. If the data value is outside two standard deviations of the mean, either above or below, then the number is uncommon. It could even be called unusual. An easy calculation that you can do to figure it out is to find the difference between the data point and the mean, and then divide that answer by the standard deviation. As a formula, this would be
\(\dfrac{x-\mu}{\sigma}\).
If you don’t know the population mean, \(\mu\), and the population standard deviation, \(\sigma\), then use the sample mean, \(\overline{x}\), and the sample standard deviation, \(s\), to estimate the population parameter values. However, realize that using the sample standard deviation may not be very accurate.
- In 1974, there were 131 strong or violent (F3+) tornadoes in the United States. Is this value unusual? Why or why not?
- In 1987, there were 15 strong or violent (F3+) tornadoes in the United States. Is this value unusual? Why or why not?
Solution
a. Variable: \(x =\) number of strong or violent (F3+) tornadoes
To answer this question, first find how many standard deviations 131 is from the mean. From Example \(\PageIndex{4}\), we know \(\mu \approx 46.24\) and \(\sigma \approx 22.18\). For \(x = 131\),
\(\dfrac{x-\mu}{\sigma}=\dfrac{131-46.24}{22.18} \approx 3.82\)
Since this value is more than 2, it is unusual to have 131 strong or violent (F3+) tornadoes in a year.
b. Variable: \(x =\) number of strong or violent (F3+) tornadoes. For this question, \(x = 15\),
\(\dfrac{x-\mu}{\sigma}=\dfrac{15-46.24}{22.18} \approx-1.41\)
Since this value is between -2 and 2, it is not unusual to have only 15 strong or violent (F3+) tornadoes in a year.
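The "more than two standard deviations from the mean" rule used in this example can be written as a small Python helper. This is only a sketch; the function names and the `cutoff` parameter are illustrative, and the mean and standard deviation are the rounded estimates from the example.

```python
def standard_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def is_unusual(x, mean, sd, cutoff=2):
    """A value is called unusual if it is more than `cutoff` SDs from the mean."""
    return abs(standard_score(x, mean, sd)) > cutoff


mu_estimate = 46.24     # sample mean used as an estimate of mu
sigma_estimate = 22.18  # sample standard deviation used as an estimate of sigma

for count in (131, 15):
    score = standard_score(count, mu_estimate, sigma_estimate)
    label = "unusual" if is_unusual(count, mu_estimate, sigma_estimate) else "not unusual"
    print(f"{count} tornadoes: {score:.2f} standard deviations from the mean ({label})")
```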
Variance and Standard Deviation for Grouped Data
The steps to find the sample variance (\(s^2\)) and standard deviation (\(s\)) for grouped data are provided below.
Step 1: Make a table as shown and find the midpoint of each class.
| Class | Frequency: \(f\) | Midpoint: \(X_M\) | \(f \cdot X_M\) | \(f \cdot (X_M)^2\) |
|---|---|---|---|---|
Table \(\PageIndex{8}\): Headers of Grouped Frequency Distribution
Step 2: Find the sums of the frequency column and of the last two columns. Substitute the sums into the formula below, where \(n=\sum f\) is the total frequency, and solve to get the variance.
\(s^2=\dfrac{n\left(\sum f\cdot X_M^2\right)-\left(\sum f\cdot X_M\right)^2}{n(n-1)}\)
Step 3: Take the square root to get the standard deviation.
\(s=\sqrt{\text{variance}} \)
Find the variance and standard deviation for the frequency distribution of the data.
| Class Boundaries | Frequency |
|---|---|
| 5.5 - 10.5 | 1 |
| 10.5 - 15.5 | 2 |
| 15.5 - 20.5 | 3 |
| 20.5 - 25.5 | 5 |
| 25.5 - 30.5 | 4 |
| 30.5 - 35.5 | 3 |
| 35.5 - 40.5 | 2 |
Table \(\PageIndex{9}\): Grouped Frequency Distribution
Solution
| Class Boundaries | Frequency: \(f\) | \(X_M\) | \(f \cdot X_M\) | \(f \cdot (X_M)^2\) |
|---|---|---|---|---|
| 5.5 - 10.5 | 1 | 8 | 8 | 64 |
| 10.5 - 15.5 | 2 | 13 | 26 | 338 |
| 15.5 - 20.5 | 3 | 18 | 54 | 972 |
| 20.5 - 25.5 | 5 | 23 | 115 | 2645 |
| 25.5 - 30.5 | 4 | 28 | 112 | 3136 |
| 30.5 - 35.5 | 3 | 33 | 99 | 3267 |
| 35.5 - 40.5 | 2 | 38 | 76 | 2888 |
| Total | 20 | | 490 | 13310 |
Table \(\PageIndex{10}\):Grouped Frequency Distribution with Added Columns
Compute the grouped variance (\(s^2\)) and standard deviation (s):
\(s^2=\dfrac{n\left(\sum f\cdot X_M^2\right)-\left(\sum f\cdot X_M\right)^2}{n(n-1)}\)
\(s^2=\dfrac{20\left(13310\right)-\left(490\right)^2}{20\left(19\right)}\)
\(s^2=\dfrac{26100}{380}\approx68.68421053\)
\(s\approx\sqrt{68.68421053}\)
\(s\approx8.287593772\)
Rounding the sample variance (\(s^2\)) and standard deviation (\(s\)) for the grouped data to two decimal places gives:
\(s^2\approx68.68\)
\(s\approx8.29\)
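The grouped-data steps above can be sketched in Python as follows. Each class is stored as its boundaries and frequency, the midpoint is taken as the average of the two boundaries, and the sums are substituted into the same formula. The variable names are illustrative.

```python
import math

# (lower boundary, upper boundary), frequency
classes = [
    ((5.5, 10.5), 1),
    ((10.5, 15.5), 2),
    ((15.5, 20.5), 3),
    ((20.5, 25.5), 5),
    ((25.5, 30.5), 4),
    ((30.5, 35.5), 3),
    ((35.5, 40.5), 2),
]

# Step 1: midpoint of each class.
midpoints = [(low + high) / 2 for (low, high), _ in classes]
freqs = [f for _, f in classes]

# Step 2: column sums, then the grouped variance formula.
n = sum(freqs)                                              # sum of f
sum_fx = sum(f * m for f, m in zip(freqs, midpoints))       # sum of f * X_M
sum_fx2 = sum(f * m ** 2 for f, m in zip(freqs, midpoints)) # sum of f * X_M^2

variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

# Step 3: square root of the variance.
sd = math.sqrt(variance)

print(f"n = {n}, sum f*X_M = {sum_fx}, sum f*X_M^2 = {sum_fx2}")
print(f"s^2 = {variance:.2f}, s = {sd:.2f}")
```

Because every value in a class is replaced by the class midpoint, the result is an approximation of the variance of the underlying raw data.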
Authors
"3.3: Measures of Variation Version 2" by Toros Berberyan, Tracy Nguyen, and Alfie Swan is licensed under CC BY-SA 4.0
Attributions
"3.2: Measures of Spread" by Kathryn Kozak is licensed CC BY-SA 4.0
Exercises
- Cholesterol levels for all patients two days after they had a heart attack (Ryan, Joiner & Ryan, Jr, 1985) are in Example \(\PageIndex{8}\). Find the range, variance, and standard deviation. Round answers to whole numbers.
178, 194, 209, 184, 219, 189, 204, 199
Scan the QR code or click on it to open the MyOpenMath version of the above question with step-by-step guidance.
MyOpenMath is a free online learning platform designed to support math instruction through automated homework, quizzes, and assessments. You must register for MyOpenMath and sign in to view the question.
- The dataset below shows the grades of every student in the statistics class. Find the range, variance, and standard deviation. Write the exact value for the range, round the variance to three decimal places, and round the standard deviation to two decimal places.
68, 91, 75, 88, 81, 89
- A manufacturing firm has recorded the ages of its employees and grouped them into the following table. Find the variance and standard deviation. Round the answer to one decimal place.
| Ages | Frequency |
|---|---|
| 17 - 23 | 5 |
| 24 - 30 | 7 |
| 31 - 37 | 9 |
| 38 - 44 | 10 |
| 45 - 51 | 4 |
| 52 - 58 | 7 |
- Find the variance and standard deviation for the grouped data provided below. Round the answers to two decimal places.
| Class Limits | Frequency |
|---|---|
| 1 - 10 | 7 |
| 11 - 20 | 14 |
| 21 - 30 | 8 |
| 31 - 40 | 11 |
| 41 - 50 | 9 |
| 51 - 60 | 4 |
If you are an instructor and want the solutions to all the exercise questions for each section, please email Toros Berberyan.






