16.4: Practice Goodness of Fit - Pineapple on Pizza
There is a very passionate and ongoing debate on whether or not pineapple should go on pizza. Being the objective, rational data analysts that we are, we will collect empirical data to see if we can settle this debate once and for all. We gather data from a group of adults asking for a simple Like/Dislike answer.
Step 1: State the Hypotheses
We start, as always, with our hypotheses. Chi-Square focuses on patterns of relationship, so that's what the hypotheses in words should talk about. Let's go through the research hypothesis to see how this all works out.
Example \(\PageIndex{1}\)
What is the research hypothesis in words for this scenario? Make sure to list which group you think will have a higher frequency.
Solution
 Research hypothesis in words: There will be a pattern of difference such that there will be more people who dislike pineapple on their pizza than people who like pineapple on their pizza.
The hypotheses in symbols focus on probabilities, but because of how Chi-Square works, we can only say that the probabilities will not be equal.
 Research hypothesis in symbols: \(P_{Like}\neq 0.50\), \(P_{Dislike} \neq 0.50\), or \(P \neq (0.50, 0.50)\)
The probability of 0.50 (which means a 50% chance) was found by knowing that we only have two options: Like or Dislike. All probabilities add up to 100% chance, so with only two options, we find \(\dfrac{100}{2} = 50 \) which means that the P (probability) is 0.50.
If this research hypothesis in symbols doesn't make sense, it might be easier to start with a null hypothesis in words and symbols, then figure out how that works out for the research hypothesis.
Example \(\PageIndex{2}\)
What is the null hypothesis in words and symbols for this scenario?
Solution
 Null hypothesis in words: There is no pattern of difference based on liking pineapple on pizza.
 Null hypothesis in symbols: \(P_{Like} = 0.50\), \(P_{Dislike} = 0.50\), or \(P = (0.50, 0.50)\)
Let's move on to an easier step!
Step 2: Find the Critical Value
Per usual, we will leave \(α\) at its typical level of 0.05. You can find the Critical Values of Chi-Square Table earlier in this chapter, or look for the link in the Common Critical Values page at the end of this book.
Exercise \(\PageIndex{1}\)
What is the critical value for this scenario?
 Answer

We have two options in our data (Like or Dislike), which will give us two categories (\(k=2\)). The Degrees of Freedom is found through \(k-1\), so we will have 1 df (\(k-1=2-1=1\)). From our \(\chi^{2}\) table of critical values, we find a critical value of 3.841 for our \(\alpha\) of p=0.05.
See, that was easy! Now, the slightly-less-easy-but-not-that-hard step of calculating the Chi-Square test statistic.
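As a cross-check on the table lookup, here is a minimal Python sketch of the same step. It leans on the fact that a \(\chi^{2}\) variable with 1 degree of freedom is a squared standard normal score, so this shortcut only applies when df = 1; the variable names are illustrative and not part of the original text.

```python
from statistics import NormalDist

alpha = 0.05   # significance level used in this chapter
k = 2          # number of categories: Like, Dislike
df = k - 1     # degrees of freedom for Goodness of Fit

# Shortcut that is valid for df = 1 only: chi-square(1) is a squared
# standard normal, so the critical value is the squared two-tailed z cutoff.
if df != 1:
    raise ValueError("this shortcut only works for 1 degree of freedom")

z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)
critical_value = z_cutoff ** 2

print(f"df = {df}, critical value = {critical_value:.3f}")
```

With more than two categories the shortcut no longer holds, and the table of critical values is the place to look.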
Step 3: Calculate the Test Statistic
The results of the data collection are presented in Table \(\PageIndex{1}\).
|  | Like | Dislike | Total |
|---|---|---|---|
| Observed | 19 | 26 | 19+26=45 |
First, let's find the Expected values, then we'll have what we need to complete the full calculation.
Example \(\PageIndex{3}\)
With two categories and 45 scores, what is the Expected frequency?
Solution
\[E = \dfrac{45}{2} = 22.50 \nonumber \]
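The same arithmetic can be sketched in Python, assuming (as the null hypothesis does) that every category is equally likely. The dictionary and variable names are illustrative only.

```python
# Observed frequencies from the data collection
observed = {"Like": 19, "Dislike": 26}

n_total = sum(observed.values())   # total number of responses
k = len(observed)                  # number of categories

null_probability = 1 / k           # probability of each category under the null
expected = n_total * null_probability   # same as n_total / k

print(f"N = {n_total}, k = {k}, P = {null_probability:.2f}, E = {expected:.2f}")
```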
We can use the Observed and the Expected frequencies to calculate our \(\chi^{2}\) statistic either through a table or individually in the equation:
\[\chi^{2}=\sum_{Each}\left(\dfrac{\left(E-O\right)^{2}}{E} \right)\nonumber \]
The first example will use the table.
Example \(\PageIndex{4}\)
Complete the labeled calculations to fill in Table \(\PageIndex{2}\).
|  | Like | Dislike | Total |
|---|---|---|---|
| Observed | 19 | 26 | 45.00 |
| Expected | 22.50 | 22.50 | 22.50+22.50=45.00 |
| Difference Score (E Minus O) |  |  |  |
| Difference Score Squared |  |  |  |
| \(Diff^{2}\) divided by Expected |  |  |  |
Solution
|  | Like | Dislike | Total |
|---|---|---|---|
| Observed | 19 | 26 | 45.00 |
| Expected | 22.50 | 22.50 | 45.00 |
| Difference Score (E Minus O) | 22.50-19=3.50 | 22.50-26=-3.50 | N/A |
| Difference Score Squared | \(3.50^2 = 12.25 \) | \((-3.50)^2 = 12.25 \) | N/A |
| \(Diff^{2}\) divided by Expected | \(\dfrac{12.25}{22.50} = 0.54 \) | \(\dfrac{12.25}{22.50} = 0.54 \) | 0.54+0.54=1.08 |
You might have noticed that there are still two empty cells. You can add up the Difference Scores (they equal zero in this example) and the squared Difference Scores (they equal 24.50), but we don't use them for the \(\chi^2\) formula, so you can save some time and not calculate them.
Also, if you used a spreadsheet, the final sum of \(\dfrac{Diff^2}{E} = 1.09\); those darn rounding differences!
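To see where the rounding difference comes from, here is a small Python sketch that rebuilds the table row by row. It computes the sum twice: once at full precision (as a spreadsheet would) and once after rounding each cell to two decimals (as in the hand calculation). The variable names are illustrative.

```python
observed = {"Like": 19, "Dislike": 26}
expected = sum(observed.values()) / len(observed)   # 22.5 per category

rows = []
for category, o in observed.items():
    diff = expected - o                 # Difference Score (E minus O)
    diff_sq = diff ** 2                 # Difference Score Squared
    contribution = diff_sq / expected   # Diff^2 divided by Expected
    rows.append((category, o, expected, diff, diff_sq, contribution))

# Spreadsheet-style: add the unrounded cells, round only at the end
chi_square = sum(row[5] for row in rows)
# Hand-style: round each cell to two decimals first, then add
chi_square_hand = sum(round(row[5], 2) for row in rows)

for category, o, e, diff, diff_sq, contribution in rows:
    print(f"{category:8} O={o}  E={e:.2f}  E-O={diff:+.2f}  "
          f"sq={diff_sq:.2f}  sq/E={contribution:.2f}")
print(f"full precision: {chi_square:.2f}")
print(f"rounded cells:  {chi_square_hand:.2f}")
```

Either value leads to the same decision here; the gap is only a matter of when the rounding happens.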
What would this look like in the Chi-Square formula?
Example \(\PageIndex{5}\)
Use the \(\chi^2\) formula with the Observed frequencies and Expected frequencies to calculate the test statistic for \(\chi^2\):
\[\chi^{2}=\sum_{Each}\left(\dfrac{\left(E-O\right)^{2}}{E} \right)\nonumber \]
Solution
\[\chi^{2}= \dfrac{(22.50-19)^{2}}{22.50} + \dfrac{(22.50-26)^{2}}{22.50} = 0.54 + 0.54 = 1.08 \nonumber \]
Using the table and using the formula to calculate the \(\chi^2\) gave the same result (because you are doing the same things mathematically). It's your choice which option you prefer. The formula seems easier when there are so few categories (k), but the table seems easier when there are more categories. The table is also easier if you're using a spreadsheet.
Now that we have the calculated \(\chi^2\), we can make the decision!
Step 4: Make the Decision
Our observed test statistic had a value of 1.08 and our critical value was 3.84. What do we do if this is still true?
Note
Slightly modified from earlier versions to fit the hypotheses, but the idea is the same:
Critical \(<\) Calculated \(=\) Reject null \(=\) There is a pattern of relationship. \(= p<.05\)
Critical \(>\) Calculated \(=\) Retain null \(=\) There is no pattern of relationship. \(= p>.05\)
Based on this note...
Example \(\PageIndex{6}\)
Do we retain or reject the null hypothesis?
Solution
Because our critical value is larger than our calculated value, we retain the null hypothesis.
The debate rages on.
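The decision rule from the note can be sketched in Python as a simple comparison. The p-value line is an addition that is not in the original text: it uses a closed form that holds only for 1 degree of freedom, and it is included just to illustrate why the result is reported as p > .05.

```python
import math

chi_square = 1.08        # calculated test statistic from Step 3
critical_value = 3.841   # from the table for df = 1 and alpha = 0.05
alpha = 0.05

# Critical < Calculated means reject the null; otherwise retain it
decision = "reject" if critical_value < chi_square else "retain"

# Valid for df = 1 only: upper-tail probability of a chi-square(1) variable
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"Decision: {decision} the null hypothesis")
print(f"p is {'less' if p_value < alpha else 'greater'} than {alpha}")
```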
Exercise \(\PageIndex{2}\)
What would our results look like in the statistical sentence?
 Answer

\(\chi^2\)(1)=1.08, p>.05
Write-Up
How might we write this up? We can't quite fulfill the four requirements for reporting results because there are no means to include. Instead, let's include the Observed frequencies.
Example \(\PageIndex{7}\)
Report the results in a concluding paragraph that includes the four requirements (but use Observed frequencies instead of descriptive statistics).
Solution
The research hypothesis was that there would be a pattern of difference such that more people would dislike pineapples on pizza than like pineapples on pizza. This research hypothesis was not supported (\(\chi^2\)(1)=1.08, p>.05). There does not seem to be a pattern of difference; of our 45 participants, 19 people like pineapple on pizza, and 26 people dislike pineapple on pizza.
That's it! If you want more practice, check out this blog post about the frequency of the different colors of M&Ms.
We now move on to the other kind of Chi-Square analysis, the Test of Independence.
Contributors and Attributions
Foster et al. (University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus)
