10.2: Goodness-of-Fit
In the last section, we were introduced to the chi-square distribution and its relationship to the normal distribution. We will now apply \(\chi^2\) tests to multinomial experiments (experiments with more than two possible outcomes per observation). These tests are often referred to as goodness-of-fit tests.
Step 1: Determine the Hypotheses
The goodness of fit test makes claims about the proportions or probabilities for each outcome of a multinomial experiment. If there are \(k\) outcomes per trial, then the null hypothesis would be
\[H_0: p_1=\text { value}_1,\ p_2=\text { value}_2, \ldots, p_k=\text { value}_k\nonumber\]
The hypothesized proportions in the null hypothesis must sum to 1.
If the null hypothesis is false, then one or more of the proportions is incorrect. Thus, the alternative hypothesis is stated as a sentence and takes the form
\[H_a: \text { At least one of these proportions is incorrect. }\nonumber\]
Step 2: Collect the Data
Using a sample of \(n\) independent trials, each having \(k\) possible outcomes, the multinomial experiment is conducted by collecting categorical (qualitative) data from a random sample. For each outcome, the observed frequency, \(O_i\), is the number of trials that resulted in that outcome. The sum of these observed frequencies is always the sample size, \[n=\sum_i O_i\nonumber\] so the last observed frequency does not vary freely. Therefore, there are \(k-1\) degrees of freedom among the observed frequencies.
To compute the expected frequencies, we take the sample size and multiply by each proportion assumed in the null hypothesis, \(E_i=n p_i\).
For approximate normality in a binomial experiment, we require at least 10 expected successes and 10 expected failures in the sample. In a multinomial experiment, the criterion is relaxed because requiring 10 in every category can demand unreasonably large sample sizes. We say the test statistic is approximately distributed according to the chi-square distribution if each expected frequency, \(E_i\), is at least 5.
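For readers who want to check these calculations with software, here is a minimal Python sketch of computing expected frequencies and verifying the condition \(E_i \geq 5\). The sample size and proportions below are made up for illustration only.

```python
# Illustrative values only: a hypothetical sample of n = 120 trials and
# hypothesized proportions p_1 = 0.25, p_2 = 0.45, p_3 = 0.30.
n = 120
p = [0.25, 0.45, 0.30]

# Expected frequencies: E_i = n * p_i
expected = [n * p_i for p_i in p]
print(expected)                          # [30.0, 54.0, 36.0]

# Chi-square condition: every expected frequency must be at least 5.
print(all(E >= 5 for E in expected))     # True
```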
Step 3: Assess the Evidence
For goodness of fit tests, the test statistic is
\[\chi^2=\sum_i \frac{\left(O_i-E_i\right)^2}{E_i}\nonumber\]
The shape of the distribution depends on the degrees of freedom. In every case, the distribution starts at 0 and is skewed to the right. When there are at least two degrees of freedom, the mode (peak) occurs at two less than the number of degrees of freedom.
A goodness of fit test is always right-tailed because the test statistic is a sum of squared differences, which can never be negative. If the null hypothesis is false, we expect the test statistic to be large.
We use this desmos graph (https://www.desmos.com/calculator/viuelise2r) to compute the P-value. The P-value is the area of the right tail (starting from the test statistic) under the chi-square distribution with \(k-1\) degrees of freedom.
Images are created with the graphing calculator, used with permission from Desmos Studio PBC.
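The test statistic and P-value can also be verified with software. The sketch below assumes SciPy is available and reuses the made-up counts from the previous sketch; it is an illustration, not part of the desmos workflow described above.

```python
from scipy.stats import chi2

# Hypothetical observed frequencies (made up for illustration) and the
# expected frequencies computed in the previous sketch.
observed = [36, 48, 36]
expected = [30.0, 54.0, 36.0]

# Test statistic: chi^2 = sum over i of (O_i - E_i)^2 / E_i
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# P-value: right-tail area under the chi-square curve with k - 1 degrees of freedom.
df = len(observed) - 1
p_value = chi2.sf(statistic, df)
print(round(statistic, 4), round(p_value, 4))
```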
Step 4: Make a Decision and State a Conclusion
Compare the P-value to the level of significance \((\alpha)\). If the P-value is less than or equal to \(\alpha\), we reject the null hypothesis in support of the alternative hypothesis. If the P-value is greater than \(\alpha\), we fail to reject the null hypothesis, and we cannot support the alternative hypothesis.
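Continuing the same illustrative sketch, the decision step is a simple comparison of the P-value with \(\alpha\):

```python
# Compare the P-value from the previous sketch to the significance level.
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis; the data support the alternative.")
else:
    print("Fail to reject the null hypothesis; the data do not support the alternative.")
```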
You try!
Below is the distribution of total household income in 2020 according to the United States Census Bureau:
| Income per household | Proportion |
|---|---|
| Under $35,000 | 26.2% |
| Between $35,000 and $100,000 | 40.3% |
| Over $100,000 | 33.5% |
You want to know if these proportions are different for black households.
Step 1
- How many possible outcomes \((k)\) are there for each black household? \(k=\) _______________
- State the null hypothesis:
\(H_0: p_1=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ },\ p_2=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ },\ p_3=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\)
Step 2
You randomly survey black households for their household income. The observed frequencies are below.
- Calculate the sample size and expected frequencies for each category in the table below.
| Income per black household | Observed frequency | Expected frequency |
|---|---|---|
| Under $35,000 | \(O_1=40\) | \(E_1=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\) |
| Between $35,000 and $100,000 | \(O_2=41\) | \(E_2=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\) |
| Over $100,000 | \(O_3=19\) | \(E_3=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\) |
| Total | \(n=\Sigma O_i=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\) | |
- Do the expected frequencies satisfy the conditions of an approximate chi-square distribution? Explain.
- What are the degrees of freedom for this test? \(d f=k-1=\underline{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\)
Step 3
- Fill in the blanks with the appropriate values and compute the \(\chi^2\) test statistic rounded to four decimal places:
\(\chi^2=\displaystyle\sum_{i=1}^{k=3} \frac{\left(O_i-E_i\right)^2}{E_i}=\frac{\left(O_1-E_1\right)^2}{E_1}+\frac{\left(O_2-E_2\right)^2}{E_2}+\frac{\left(O_3-E_3\right)^2}{E_3}=\frac{(\underline{\ \ \ \ \ }-\underline{\ \ \ \ \ })^2}{\underline{\ \ \ \ \ }}+\frac{(\underline{\ \ \ \ \ }-\underline{\ \ \ \ \ })^2}{\underline{\ \ \ \ \ }}+\frac{(\underline{\ \ \ \ \ }-\underline{\ \ \ \ \ })^2}{\underline{\ \ \ \ \ }} \approx\underline{\ \ \ \ \ \ \ }\)
- Use this desmos graph, https://www.desmos.com/calculator/bjohldwaym, to compute the P-value rounded to four decimal places. Plot the \(\chi^2\) statistic and shade the area to the right.
P-value = __________________
Step 4
- Using a 1% level of significance, make a decision about the null and alternative hypotheses.
- State the conclusion in context.
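After you have worked the exercise by hand, one way to check your answers (assuming SciPy is available) is a short sketch like the one below, which uses scipy.stats.chisquare to compute both the test statistic and the right-tail P-value.

```python
from scipy.stats import chisquare

# Observed frequencies from the table above.
observed = [40, 41, 19]
n = sum(observed)

# Proportions from the Census Bureau distribution stated in the null hypothesis.
proportions = [0.262, 0.403, 0.335]
expected = [n * p for p in proportions]    # E_i = n * p_i

# chisquare returns the test statistic and the right-tail P-value
# using k - 1 degrees of freedom.
statistic, p_value = chisquare(f_obs=observed, f_exp=expected)
print(round(statistic, 4), round(p_value, 4))   # compare with your hand calculations
```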