10.3: Difference between Two Variances - the F Distributions
Here we have to assume that the two populations (as opposed to sample mean distributions) have a distribution that is almost normal, as shown in Figure 10.2.
Figure 10.2: Two normal populations lead to two \(\chi^{2}\) distributions that represent distributions of sample variances. The \(F\) distribution results when you build up a distribution of the ratio of the two \(\chi^{2}\) sample values.
The ratio \(\frac{s_{1}^{2}}{s_{2}^{2}}\) follows an \(F\)-distribution if \(\sigma_{1} = \sigma_{2}\). That \(F\) distribution has two degrees of freedom: one for the numerator (d.f.N. or \(\nu_{1}\)) and one for the denominator (d.f.D. or \(\nu_{2}\)). So we denote the distribution more specifically as \(F_{\nu_{1}, \nu_{2}}\). For the case of Figure 10.2, \(\nu_{1} = n_{1} - 1\) and \(\nu_{2} = n_{2} - 1\). The \(F\) ratio, in general, is the result of the following stochastic process. Let \(X_{1}\) be a random variable produced by a stochastic process with a \(\chi^{2}_{\nu_{1}}\) distribution and let \(X_{2}\) be a random variable produced by a stochastic process with a \(\chi^{2}_{\nu_{2}}\) distribution. Then the random variable \[F = \frac{X_{1}/\nu_{1}}{X_{2}/\nu_{2}}\] will, by definition, have an \(F_{\nu_{1}, \nu_{2}}\) distribution.
The exact shape of the \(F_{\nu_1, \nu_2}\) distribution depends on the choice of \(\nu_1\) and \(\nu_2\), but it roughly looks like a \(\chi^2\) distribution, as shown in Figure 10.3.
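The stochastic process described above can be simulated to check that the ratio of sample variances really does follow \(F_{n_1-1,\, n_2-1}\) when \(\sigma_1 = \sigma_2\). This is a sketch, not part of the text; it assumes `numpy` and `scipy` are available, and the sample sizes are borrowed from Example 10.3 below.

```python
import numpy as np
from scipy.stats import f

# Simulate the process of Figure 10.2 with sigma_1 = sigma_2: draw many
# pairs of normal samples and form the ratio of their sample variances.
rng = np.random.default_rng(0)
n1, n2, draws = 26, 18, 10_000

s2_1 = rng.normal(0.0, 1.0, (draws, n1)).var(axis=1, ddof=1)
s2_2 = rng.normal(0.0, 1.0, (draws, n2)).var(axis=1, ddof=1)
ratios = s2_1 / s2_2

# Fraction of ratios beyond the theoretical 95th percentile of F(25, 17);
# if the ratio really is F-distributed this should be close to 0.05.
crit = f.ppf(0.95, n1 - 1, n2 - 1)
empirical_tail = (ratios > crit).mean()
```

With 10,000 simulated ratios, the empirical right-tail fraction lands close to the nominal 0.05.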

\(F\) and \(t\) are related:
\[F_{1,\nu} = t^{2}_{\nu}\] so the \(t\) statistic can be viewed as a special case of the \(F\) statistic.
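The identity \(F_{1,\nu} = t^{2}_{\nu}\) can be verified numerically: a right-tail \(F\) critical value with one numerator degree of freedom equals the square of the corresponding two-tailed \(t\) critical value. A minimal check, assuming `scipy` is available (the choice \(\nu = 17\), \(\alpha = 0.05\) is ours):

```python
from scipy.stats import f, t

nu, alpha = 17, 0.05
f_crit = f.ppf(1 - alpha, 1, nu)      # right-tail area alpha for F(1, nu)
t_crit = t.ppf(1 - alpha / 2, nu)     # alpha split over the two t tails
# f_crit and t_crit**2 agree to floating-point precision
```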
For comparing variances, we are interested in the following hypothesis pairs:
| Right-tailed | Left-tailed | Two-tailed |
|---|---|---|
| \(H_0: \sigma^2_1 \leq \sigma^2_2\) | \(H_0: \sigma^2_1 \geq \sigma^2_2\) | \(H_0: \sigma^2_1 = \sigma^2_2\) |
| \(H_1: \sigma^2_1 > \sigma^2_2\) | \(H_1: \sigma^2_1 < \sigma^2_2\) | \(H_1: \sigma^2_1 \neq \sigma^2_2\) |
We’ll always compare variances (\(\sigma^2\)) and not standard deviations (\(\sigma\)) to keep life simple.
The test statistic is
\[ F_{\rm test} = F_{\nu_1, \nu_2} = \frac{s^{2}_{1}}{s^{2}_{2}} \]where (for finding the critical statistic) \(\nu_{1} = n_{1} - 1\) and \(\nu_{2} = n_{2} - 1\).
Note that \(F_{\nu_1, \nu_2} = 1\) when \(s_{1}^{2}=s_{2}^{2}\), a fact you can use to get a feel for the meaning of this test statistic.
Values for the various \(F\) critical values are given in the F Distribution Table in the Appendix. We will denote a critical value of \(F\) with the notation :
\[F_{\rm crit} = F_{\alpha, \hspace{.1in}\nu_{1}, \hspace{.1in} \nu_2}\]where:
\(\alpha\) = Type I error rate
\(\nu_{1}\) = d.f.N.
\(\nu_{2}\) = d.f.D.
The F Distribution Table gives critical values for small right-tail areas only. This means it cannot be used directly for a left-tailed test. But that does not mean we cannot do a left-tailed test: a left-tailed test is easily converted into a right-tailed test by switching the assignments of populations 1 and 2. To get the assignments correct in the first place, always define populations 1 and 2 so that \(s^{2}_{1} > s^{2}_{2}\); that is, assign population 1 to the population with the larger sample variance. Do this even for a two-tailed test, because we have no table value for \(F_{\rm crit}\) on the left side of the distribution.
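The assignment convention above can be sketched as a small helper. The function name and signature are ours, not from the text: it simply swaps the two samples so the larger sample variance sits in the numerator, which guarantees \(F \geq 1\) and a right-tailed lookup.

```python
def assign_populations(var_a, n_a, var_b, n_b):
    """Return (F, d.f.N., d.f.D.) with the larger sample variance on top."""
    if var_a < var_b:
        # swap so population 1 has the larger sample variance
        var_a, n_a, var_b, n_b = var_b, n_b, var_a, n_a
    return var_a / var_b, n_a - 1, n_b - 1
```

For instance, feeding in the Example 10.3 numbers in either order gives the same result: `assign_populations(10, 18, 36, 26)` and `assign_populations(36, 26, 10, 18)` both return `(3.6, 25, 17)`.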
Example 10.3 : Given the following data for smokers and non-smokers (maybe it's about some sort of disease occurrence; who cares, let's focus on dealing with the numbers), test if the population variances are equal or not at \(\alpha = 0.05\).
| Smokers | Nonsmokers |
|---|---|
| \(n_{1} = 26\) | \(n_{2} = 18\) |
| \(s_{1}^{2}=36\) | \(s_{2}^{2}=10\) |
Note that \(s_{1}^{2} > s_{2}^{2}\) so we’re good to go.
Solution :
1. Hypothesis.
\[\begin{align*} H_0&: \sigma^2_1 = \sigma^2_2 \\ H_1&: \sigma^2_1 \neq \sigma^2_2 \end{align*}\]2. Critical statistic.
Use the F Distribution Table; it is a collection of tables labeled by "\(\alpha\)", which we will designate as \(\alpha_{T}\), the table value that signifies a right-tail area. Since this is a two-tailed test, we need \(\alpha_{T} = \alpha/2 = 0.025\). Next we need the degrees of freedom: \(\nu_{1} = n_{1} - 1 = 25\) and \(\nu_{2} = n_{2} - 1 = 17\).
So the critical statistic is
\[F_{\rm crit} = F_{0.025, \hspace{.1in} 25, \hspace{.1in} 17} \approx 2.56\](the table has no entry for \(\nu_{1} = 25\), so we use the closest available value, \(\nu_{1} = 24\)).
3. Test statistic.
\[F_{\nu_1, \nu_2} = \frac{s^2_1}{s^2_2}\] \[F_{\rm test} = F_{25, 17} = \frac{36}{10} = 3.6\]With this test statistic, we can estimate the \(p\)-value using the F Distribution Table. To find \(p\), look up all the values with d.f.N. = 25 and d.f.D. = 17 (24 \(\&\) 17 are the closest in the tables, so use those) across the F Distribution Table and form your own table. For each column in your table record \(\alpha_{T}\) and the \(F\) value corresponding to the degrees of freedom of interest. Again, \(\alpha_{T}\) corresponds to \(p/2\) for a two-tailed test, so make a row above the \(\alpha_{T}\) row with \(p = 2 \alpha_{T}\). (For a one-tailed test, we would put \(p=\alpha_{T}\).)
| \(p\) | 0.20 | 0.10 | 0.05 | 0.02 | 0.01 |
|---|---|---|---|---|---|
| \(\alpha_{T}\) | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
| \(F\) | 1.84 | 2.19 | 2.56 | 3.08 | 3.51 |

\(F_{\rm test} = 3.6\) is beyond the last column, so \(p < 0.01\).
Notice how we put an upper limit on \(p\) because \(F_{\rm test}\) was larger than all the \(F\) values in our little table.
Let's take a graphical look at why we use \(p = 2\alpha_{T}\) in the little table and for finding \(F_{\rm crit}\) for two-tailed tests:
But in a two-tailed test we want \(\alpha\) split on both sides:
4. Decision.
Reject \(H_{0}\), since \(F_{\rm test} = 3.6 > F_{\rm crit} = 2.56\). The \(p\)-value estimate supports this:
\[ ( p < 0.01) < (\alpha = 0.05) \]5. Interpretation.
There is enough evidence to conclude, at \(\alpha = 0.05\) with an \(F\)-test, that the variance of the smoker population is different from the variance of the non-smoker population.
▢
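Example 10.3 can also be worked with software instead of the printed table. The sketch below, assuming `scipy` is available, computes the test statistic, the exact two-tailed critical value (no rounding of \(\nu_1\) down to 24), and an exact \(p\)-value by doubling the right-tail area:

```python
from scipy.stats import f

# Data from Example 10.3 (larger sample variance in the numerator)
n1, s2_smokers = 26, 36.0
n2, s2_nonsmokers = 18, 10.0
alpha = 0.05

nu1, nu2 = n1 - 1, n2 - 1                  # 25 and 17
F_test = s2_smokers / s2_nonsmokers        # 3.6
F_crit = f.ppf(1 - alpha / 2, nu1, nu2)    # two-tailed: alpha/2 in the right tail
p_value = 2 * f.sf(F_test, nu1, nu2)       # double the right-tail area

reject_H0 = F_test > F_crit
```

The exact \(F_{\rm crit}\) comes out near the tabulated 2.56, and `reject_H0` is `True`, agreeing with the decision above.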