11.1: Goodness of Fit Test
Learning Objectives
- Understand how to perform a chi-square goodness of fit test.
 - Compare observed frequencies to expected frequencies based on a theoretical distribution.
 - Determine whether a sample distribution matches an expected distribution.
 - Use the test to assess how well data fits a claimed or known distribution.
 - Interpret the results to evaluate the fit between observed data and expected outcomes.
 
The \(\chi^{2}\) goodness-of-fit test is used to test the distribution of three or more proportions within a single population; it determines whether observed categorical data match a theoretical or expected distribution based on known or claimed proportions.
The test can be used when the data are obtained from a random sample and when the expected frequency (\(E\)) for each category is 5 or more.
The formula for the \(\chi^{2}\)-test statistic is:
\[\chi^{2}=\sum \frac{(O-E)^{2}}{E}\]
Use a right-tailed \(\chi^{2}\)-distribution with \(\text{df} = k-1\), where \(k\) is the number of categories, and where
- \(O\) = the observed frequency (what was observed in the sample) and
- \(E\) = the expected frequency (based on \(H_{0}\) and the sample size).

The hypotheses for the goodness-of-fit test are:
- \(H_{0}:\) The observed proportions match the expected proportions.
- \(H_{1}:\) The observed proportions do not match the expected proportions.
 
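For readers who want to verify hand calculations with software, here is a minimal Python sketch that implements the formula directly. The function name `gof_statistic` and the small data set are hypothetical and used only for illustration.

```python
def gof_statistic(observed, expected):
    """Chi-square goodness-of-fit statistic: the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: n = 100 observations in 3 categories with
# claimed proportions 0.5, 0.3, 0.2 under H0.
observed = [45, 35, 20]
expected = [p * 100 for p in (0.5, 0.3, 0.2)]  # 50, 30, 20
print(gof_statistic(observed, expected))       # 1.333..., compared to a chi-square with df = 3 - 1 = 2
```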
Example \(\PageIndex{1}\)
An instructor claims that their students’ grade distribution is different than the department’s grade distribution. In the department’s introductory statistics courses, the proportion of students who get A’s is 35%, B’s is 23%, C’s is 25%, D’s is 10%, and F’s is 7%. For a sample of 250 introductory statistics students with this instructor, there were 80 A’s, 50 B’s, 58 C’s, 38 D’s, and 24 F’s. Test the instructor’s claim at the 5% level of significance.
Solution
- State the claim and determine the null and alternative hypotheses.
 
This is a test for three or more proportions within a single population, so use the goodness-of-fit test. We will always use a right-tailed \(\chi^{2}\) test. The hypotheses for this example would be:
\(H_{0}:\) The observed proportions match the expected proportions
\(H_{1}:\) The observed proportions do not match the expected proportions (claim)
Even though \(H_{1}\) states that the proportions are not equal, the goodness-of-fit test is always right-tailed. This is because we are testing whether there is a large discrepancy between the observed and expected values; if that discrepancy is large, then the proportions differ.
- Determine the critical value using the \(\chi^{2}\) table. The degrees of freedom are k – 1 (where k is the number of categories). Thus, d.f. = 5 – 1 = 4 and \( \alpha = 0.05\). The critical value is 9.488.
 
There are \(k = 5\) categories that we are comparing: A’s, B’s, C’s, D’s, and F’s.
The observed counts are the actual number of A’s, B’s, C’s, D’s, and F’s from the sample.
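If statistical software is available, the critical value can also be found from the chi-square inverse CDF rather than a printed table. A minimal sketch, assuming SciPy is installed:

```python
from scipy.stats import chi2

alpha = 0.05
df = 5 - 1                                # k - 1, with k = 5 grade categories
critical_value = chi2.ppf(1 - alpha, df)  # right-tailed critical value
print(round(critical_value, 3))           # 9.488
```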
- Compute the expected count for each of the five categories. Find the expected counts by multiplying the expected proportion of A’s, B’s, C’s, D’s, and F’s by the sample size (which is 250). Round the final answer to three decimal places.
 
It will be helpful to organize the work in a table, as shown below.
| Grade | A's | B's | C's | D's | F's | 
|---|---|---|---|---|---|
| Observed (O) | 80 | 50 | 58 | 38 | 24 | 
| Expected (E) | 0.35 \(\cdot\) 250 = 87.5 | 0.23 \(\cdot\) 250 = 57.5 | 0.25 \(\cdot\) 250 = 62.5 | 0.10 \(\cdot\) 250 = 25 | 0.07 \(\cdot\) 250 = 17.5 | 
| \(\dfrac{(O - E)^2}{E}\) | \(\dfrac{(80-87.5)^2}{87.5}\) = 0.6429 | \(\dfrac{(50-57.5)^2}{57.5}\) = 0.9783 | \(\dfrac{(58-62.5)^2}{62.5}\) = 0.324 | \(\dfrac{(38-25)^2}{25}\) = 6.76 | \(\dfrac{(24-17.5)^2}{17.5}\) = 2.4143 | 
Table \(\PageIndex{1}\): Calculation of the Chi-Square Test Statistic
- The first row is grades.
 - The second row is the observed values.
 - The third row is the expected value for each grade.
 - In the last row, for each cell, subtract the expected value from the observed value. Then, square each difference and divide each square by the expected value.
 
The test statistic is the sum of this last row: \(\chi^{2}=\sum \frac{(O-E)^{2}}{E}=0.6429+0.9783+0.324+6.76+2.4143=11.120\)
- Decide whether to reject or not reject the null hypothesis.
 
Figure \(\PageIndex{1}\): Reject the Null Hypothesis as the Test Statistic Falls in the Critical Region
The test statistic of \(\chi^{2} = 11.12 > 9.488\) falls in the rejection region, so the decision is to reject \(H_{0}\).
- Summarize the results.
 
There is sufficient evidence to support the claim that the proportion of students who get A’s, B’s, C’s, D’s and F’s in introductory statistics courses for this instructor is different than the department’s proportions of 35%, 23%, 25%, 10% and 7% respectively.
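As a software check on the hand calculation above, a short Python sketch (assuming SciPy is installed) reproduces the test statistic with `scipy.stats.chisquare`:

```python
from scipy.stats import chisquare

observed = [80, 50, 58, 38, 24]               # A's, B's, C's, D's, F's for this instructor
proportions = [0.35, 0.23, 0.25, 0.10, 0.07]  # department's claimed proportions
expected = [p * 250 for p in proportions]     # 87.5, 57.5, 62.5, 25, 17.5

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(round(stat, 3))     # 11.119 (the hand calculation gives 11.120 due to rounding)
print(p_value < 0.05)     # True, consistent with rejecting H0 at the 5% level
```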
TI-84+: The test statistic can be computed using a built-in function on the calculator.
- Enter the data by pressing [STAT] and then choosing 1:Edit.
 - In L1, enter the observed values, and in L2, enter the expected values.
 
Figure \(\PageIndex{2}\): Enter the Observed Values and Expected Values
- Press [STAT], then scroll right to TESTS. Select D:χ²GOF-Test (scroll down if needed).
 

Figure \(\PageIndex{3}\): Select the \(\chi^{2}\) Goodness of Fit Test
- In the prompt, make sure Observed: L1, Expected: L2, and df: 4 (the degrees of freedom). Then press [Calculate].
 

Figure \(\PageIndex{4}\): Make Sure Proper Lists are Selected Along With Degrees of Freedom
- The test statistic is the value \(\chi^{2}\) = 11.11940373, which rounds to 11.12.

Figure \(\PageIndex{5}\): The Output Displays the \(\chi^{2}\) Test Statistic
Example \(\PageIndex{2}\)
Suppose you have a die and want to know whether it is fair. If it is fair, then the proportion of rolls landing on each value should be the same. To find the observed frequencies, you roll the die 500 times and count how often each side comes up. Do the data show that the die is fair? Test at the 10% level.
| Die values | 1 | 2 | 3 | 4 | 5 | 6 | Total | 
|---|---|---|---|---|---|---|---|
| Observed Frequency | 78 | 87 | 87 | 76 | 85 | 87 | 500 | 
Solution
1. State the null and alternative hypotheses and the level of significance.
\(H_{0}\): The observed frequencies are consistent with the distribution for a fair die (the die is fair)
\(H_{1}\): The observed frequencies are not consistent with the distribution for a fair die (the die is not fair)
\(\alpha\) = 0.10
2. Determine the critical value using the \(\chi^{2}\) table. The degrees of freedom are k – 1 (where k is the number of categories). Thus, d.f. = 6 – 1 = 5 and \( \alpha = 0.10\). The critical value is 9.236.
3. Find the test statistic.
First, you need to find the probability of rolling each side of the die. The sample space for rolling a die is {1, 2, 3, 4, 5, 6}. Since you are assuming that the die is fair, then \(P(1)=P(2)=P(3)=P(4)=P(5)=P(6)=\dfrac{1}{6}\).
Now you can find the expected frequency for each side of the die. Since all the probabilities are the same, each expected frequency is the same.
\(\text{Expected Frequency} =E=n \cdot p=500 \cdot \dfrac{1}{6} \approx 83.33\)
Test Statistic:
It is easier to calculate the test statistic using a table.
| O | E | O-E | \((O-E)^{2}\) | \(\dfrac{(O-E)^{2}}{E}\) | 
|---|---|---|---|---|
| 78 | 83.33 | -5.33 | 28.4089 | 0.340920437 | 
| 87 | 83.33 | 3.67 | 13.4689 | 0.161633265 | 
| 87 | 83.33 | 3.67 | 13.4689 | 0.161633265 | 
| 76 | 83.33 | -7.33 | 53.7289 | 0.644772591 | 
| 85 | 83.33 | 1.67 | 2.7889 | 0.033468139 | 
| 87 | 83.33 | 3.67 | 13.4689 | 0.161633265 | 
| Total |  | 0.02 |  | \(\chi^{2} \approx 1.504060962\) | 
Table \(\PageIndex{2}\): Calculation of the Chi-Square Test Statistic
The test statistic is rounded to \(\chi^{2} \approx 1.504\)
4. Decide whether to reject or not reject the null hypothesis.
Since the test statistic \(\chi^{2} \approx 1.504\) is less than the critical value of 9.236, it does not fall in the rejection region, so the decision is to not reject \(H_{0}\).
5. Summarize the results.
There is not enough evidence to show that the observed frequencies are inconsistent with the distribution for a fair die; in other words, there is not enough evidence to show that the die is not fair.
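A similar software check for the fair-die example, again assuming SciPy is installed:

```python
from scipy.stats import chisquare, chi2

observed = [78, 87, 87, 76, 85, 87]    # 500 rolls in total
expected = [500 / 6] * 6               # fair die: each face expected 500/6, about 83.33

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
critical_value = chi2.ppf(0.90, df=5)  # right-tailed critical value at alpha = 0.10, about 9.236

print(round(stat, 3))                  # 1.504
print(stat > critical_value)           # False, so do not reject H0: no evidence the die is unfair
```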
Authors
"11.1: Goodness of Fit Test" by Toros Berberyan, Tracy Nguyen, and Alfie Swan is licensed under CC BY-SA 4.0
Attributions
"11.2: Chi-Square Goodness of Fit" by Kathryn Kozak is licensed under CC BY-SA 4.0
"10.2: Goodness of Fit Test" by Rachel Webb is licensed under CC BY-SA 4.0



