11.E: Chi-Square Tests and F-Tests (Exercises)

These are homework exercises to accompany the Textmap created for "Introductory Statistics" by Shafer and Zhang.

11.1: Chi-Square Tests for Independence

Basic

Q11.1.1

Find $\chi _{0.01}^{2}$ for each of the following numbers of degrees of freedom.

$df=5$
$df=11$
$df=25$

Q11.1.2

Find $\chi _{0.05}^{2}$ for each of the following numbers of degrees of freedom.

$df=6$
$df=12$
$df=30$

Q11.1.3

Find $\chi _{0.10}^{2}$ for each of the following numbers of degrees of freedom.

$df=6$
$df=12$
$df=30$

Q11.1.4

Find $\chi _{0.01}^{2}$ for each of the following numbers of degrees of freedom.

$df=7$
$df=10$
$df=20$

Q11.1.5

For $df=7$ and $\alpha =0.05$, find:

$\chi _{\alpha }^{2}$
$\chi _{\frac{\alpha }{2}}^{2}$

Q11.1.6

For $df=17$ and $\alpha =0.01$, find:

$\chi _{\alpha }^{2}$
$\chi _{\frac{\alpha }{2}}^{2}$
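The critical values asked for in the exercises above are normally read from a chi-square table. As a rough cross-check, they can also be computed numerically. The sketch below is only an illustration: the function names `chi2_cdf` and `chi2_critical` are hypothetical, and the approach (a series for the regularized incomplete gamma function plus bisection) is one choice among several, not a method prescribed by the text.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF, computed from the series for the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # P(a, z) = z^a e^(-z) / Gamma(a + 1) * sum_{k>=0} z^k / ((a+1)...(a+k))
    term = 1.0
    total = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total

def chi2_critical(alpha, df):
    """Value that cuts off a right tail of area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:  # widen the bracket until it contains the answer
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Example: the two values requested for df = 7 and alpha = 0.05.
alpha, df = 0.05, 7
print(round(chi2_critical(alpha, df), 2), round(chi2_critical(alpha / 2, df), 2))
```

Values obtained this way should agree with a printed table to the table's rounding; if they do not, trust the table and treat the code as the suspect.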

Q11.1.7

A data sample is sorted into a $2 \times 2$ contingency table based on two factors, each of which has two levels.

		Factor 1
		Level 1	Level 2	Row Total
Factor 2	Level 1	20	10	R
	Level 2	15	5	R
Column Total		C	C	n

Find the column totals, the row totals, and the grand total, $n$, of the table.
Find the expected number $E$ of observations for each cell based on the assumption that the two factors are independent (that is, just use the formula $E=(R\times C)/n$).
Find the value of the chi-square test statistic $\chi ^2$.
Find the number of degrees of freedom of the chi-square test statistic.
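The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function name `chi_square_independence` is hypothetical, the statistic is the usual sum of $(O-E)^2/E$ over all cells, and the degrees of freedom are taken as (rows $-1$) $\times$ (columns $-1$).

```python
def chi_square_independence(observed):
    """Totals, expected counts, chi-square statistic and df for a
    contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: E = (R x C) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return row_totals, col_totals, n, expected, chi2, df

# The 2 x 2 table from this exercise (rows are the levels of Factor 2).
table = [[20, 10], [15, 5]]
rows, cols, n, expected, chi2, df = chi_square_independence(table)
print("row totals:", rows, "column totals:", cols, "n:", n)
print("expected:", expected)
print("chi-square:", round(chi2, 4), "df:", df)
```

The same function accepts tables of any size, so it can be reused for the $3 \times 2$ table in the next exercise by adding a third row.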

Q11.1.8

A data sample is sorted into a $3 \times 2$ contingency table based on two factors, one of which has three levels and the other of which has two levels.

		Factor 1
		Level 1	Level 2	Row Total
Factor 2	Level 1	20	10	R
	Level 2	15	5	R
	Level 3	10	20	R
Column Total		C	C	n

Find the column totals, the row totals, and the grand total, $n$, of the table.
Find the expected number $E$ of observations for each cell based on the assumption that the two factors are independent (that is, just use the formula $E=(R\times C)/n$).
Find the value of the chi-square test statistic $\chi ^2$.
Find the number of degrees of freedom of the chi-square test statistic.

Applications

Q11.1.9

A child psychologist believes that children perform better on tests when they are given perceived freedom of choice. To test this belief, the psychologist carried out an experiment in which $200$ third graders were randomly assigned to two groups, $A$ and $B$. Each child was given the same simple logic test. However, in group $B$, each child was given the freedom to choose a test booklet from many with various drawings on the covers. The performance of each child was rated as Very Good, Good, or Fair. The results are summarized in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to support the psychologist’s belief.

		Group
		A	B
Performance	Very Good	32	29
	Good	55	61
	Fair	10	13
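For application problems like this one, the work ends with a decision: compare the computed statistic with the critical value and reject $H_0$ (independence) only if the statistic falls in the right-tail rejection region. The sketch below illustrates that comparison for the table above. The function name is hypothetical, and the critical value $5.991$ for $df=2$ at the $5\%$ level is hard-coded as an assumed table lookup rather than computed.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)  # (O - E)^2 / E with E = R*C/n
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )

# Rows: Very Good, Good, Fair; columns: Group A, Group B.
performance = [[32, 29], [55, 61], [10, 13]]
chi2 = chi_square_statistic(performance)
df = (3 - 1) * (2 - 1)
critical = 5.991  # assumed table value of chi-square_{0.05} for df = 2
reject = chi2 >= critical  # rejection region is [critical, infinity)
print(f"chi2 = {chi2:.4f}, df = {df}, reject H0: {reject}")
```

Hard-coding the critical value keeps the example short; for a different table size or significance level, the value has to be looked up again for the new degrees of freedom.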

Q11.1.10

In regard to wine tasting competitions, many experts claim that the first glass of wine served sets a reference taste and that a different reference wine may alter the relative ranking of the other wines in competition. To test this claim, three wines, $A$, $B$ and $C$, were served at a wine tasting event. Each person was served a single glass of each wine, but in different orders for different guests. At the close, each person was asked to name the best of the three. One hundred seventy-two people were at the event and their top picks are given in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to support the claim that wine experts’ preference is dependent on the first served wine.

		Top Pick
		A	B	C
First Glass	A	12	31	27
	B	15	40	21
	C	10	9	7

Q11.1.11

Is being left-handed hereditary? To answer this question, $250$ adults are randomly selected and their handedness and their parents’ handedness are noted. The results are summarized in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to conclude that there is a hereditary element in handedness.

		Number of Parents Left-Handed
		0	1	2
Handedness	Left	8	10	12
Handedness	Right	178	21	21

Q11.1.12

Some geneticists claim that the genes that determine left-handedness also govern development of the language centers of the brain. If this claim is true, then it would be reasonable to expect that left-handed people tend to have stronger language abilities. A study designed to test this claim randomly selected $807$ students who took the Graduate Record Examination (GRE). Their scores on the language portion of the examination were classified into three categories: low, average, and high, and their handedness was also noted. The results are given in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that left-handed people tend to have stronger language abilities.

		GRE English Scores
		Low	Average	High
Handedness	Left	18	40	22
Handedness	Right	201	360	166

Q11.1.13

It is generally believed that children brought up in stable families tend to do well in school. To verify such a belief, a social scientist examined $290$ randomly selected students’ records in a public high school and noted each student’s family structure and academic status four years after entering high school. The data were then sorted into a $2 \times 3$ contingency table with two factors. $\text{Factor 1}$ has two levels: graduated and did not graduate. $\text{Factor 2}$ has three levels: no parent, one parent, and two parents. The results are given in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to conclude that family structure matters in school performance of the students.

		Academic Status
		Graduated	Did Not Graduate
Family	No parent	18	31
	One parent	101	44
	Two parents	70	26

Q11.1.14

The administrator of a large middle school wishes to use celebrity influence to encourage students to make healthier choices in the school cafeteria. The cafeteria is situated at the center of an open space. Every day at lunchtime, students get their lunch and a drink in three separate lines leading to three separate serving stations. As an experiment, the school administrator displayed a poster of a popular teen pop star drinking milk at each of the three areas where drinks are provided, except that the milk in the poster is different at each location: one shows white milk, one shows strawberry-flavored pink milk, and one shows chocolate milk. After the first day of the experiment the administrator noted the students’ milk choices separately for the three lines. The data are given in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to conclude that the posters had some impact on the students’ drink choices.

		Student Choice
		Regular	Strawberry	Chocolate
Poster Choice	Regular	38	28	40
	Strawberry	18	51	24
	Chocolate	32	32	53

Large Data Set Exercise

Large Data Sets not available

Q11.1.15

Large $\text{Data Set 8}$ records the result of a survey of $300$ randomly selected adults who go to movie theaters regularly. For each person the gender and preferred type of movie were recorded. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that the factors “gender” and “preferred type of movie” are dependent.

Answers

S11.1.1

$15.09$
$24.72$
$44.31$

S11.1.3

$10.64$
$18.55$
$40.26$

S11.1.5

$14.07$
$16.01$

S11.1.7

$C_1=35,\; C_2=15,\; R_1=30,\; R_2=20,\; n=50$
$E_{11}=21,\; E_{12}=9,\; E_{21}=14,\; E_{22}=6$
$\chi ^2=0.3968$
$df=1$

S11.1.9

$\chi ^2=0.6698,\; \chi _{0.05}^{2}=5.99$, do not reject $H_0$

S11.1.11

$\chi ^2=72.35,\; \chi _{0.01}^{2}=9.21$, reject $H_0$

S11.1.13

$\chi ^2=21.2784,\; \chi _{0.01}^{2}=9.21$, reject $H_0$

S11.1.15

$\chi ^2=28.4539$, $df=3$, Rejection Region: $[7.815,\infty )$, Decision: reject $H_0$ of independence

11.2: Chi-Square One-Sample Goodness-of-Fit Tests

Basic

Q11.2.1

A data sample is sorted into four categories with an assumed probability distribution.

Factor Levels	Assumed Distribution	Observed Frequency
1	$p_1=0.1$	10
2	$p_2=0.4$	35
3	$p_3=0.4$	45
4	$p_4=0.1$	10

Find the size $n$ of the sample.
Find the expected number $E$ of observations for each level, if the sampled population has a probability distribution as assumed (that is, just use the formula $E_i=n\times p_i$).
Find the chi-square test statistic $\chi ^2$.
Find the number of degrees of freedom of the chi-square test statistic.
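The steps above translate directly into a short calculation. The sketch below is an illustration under stated assumptions: the function name `goodness_of_fit` is hypothetical, the statistic is the usual sum of $(O_i-E_i)^2/E_i$, and the degrees of freedom are taken as the number of categories minus one.

```python
def goodness_of_fit(observed, probs):
    """Sample size, expected counts, chi-square statistic and df for a
    one-sample goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probs]  # E_i = n x p_i
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return n, expected, chi2, df

# Observed frequencies and assumed distribution from the table above.
observed = [10, 35, 45, 10]
probs = [0.1, 0.4, 0.4, 0.1]
n, expected, chi2, df = goodness_of_fit(observed, probs)
print("n:", n)
print("expected:", [round(e, 2) for e in expected])
print("chi-square:", round(chi2, 4), "df:", df)
```

Only the second and third categories contribute to the statistic here, because the observed and expected counts coincide in the first and fourth.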

Q11.2.2

A data sample is sorted into five categories with an assumed probability distribution.

Factor Levels	Assumed Distribution	Observed Frequency
1	$p_1=0.3$	23
2	$p_2=0.3$	30
3	$p_3=0.2$	19
4	$p_4=0.1$	8
5	$p_5=0.1$	10

Find the size $n$ of the sample.
Find the expected number $E$ of observations for each level, if the sampled population has a probability distribution as assumed (that is, just use the formula $E_i=n\times p_i$).
Find the chi-square test statistic $\chi ^2$.
Find the number of degrees of freedom of the chi-square test statistic.

Applications

Q11.2.3

Retailers of collectible postage stamps often buy their stamps in large quantities by weight at auctions. The prices the retailers are willing to pay depend on how old the postage stamps are. Many collectible postage stamps at auctions are described by the proportions of stamps issued at various periods in the past. Generally the older the stamps the higher the value. At one particular auction, a lot of collectible stamps is advertised to have the age distribution given in the table provided. A retail buyer took a sample of $73$ stamps from the lot and sorted them by age. The results are given in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that the age distribution of the lot is different from what was claimed by the seller.

Year	Claimed Distribution	Observed Frequency
Before 1940	0.10	6
1940 to 1959	0.25	15
1960 to 1979	0.45	30
After 1979	0.20	22

Q11.2.4

The litter size of Bengal tigers is typically two or three cubs, but it can vary between one and four. Based on long-term observations, the litter size of Bengal tigers in the wild has the distribution given in the table provided. A zoologist believes that Bengal tigers in captivity tend to have different (possibly smaller) litter sizes from those in the wild. To verify this belief, the zoologist searched all data sources and found $316$ litter size records of Bengal tigers in captivity. The results are given in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that the distribution of litter sizes in captivity differs from that in the wild.

Litter Size	Wild Litter Distribution	Observed Frequency
1	0.11	41
2	0.69	243
3	0.18	27
4	0.02	5

Q11.2.5

An online shoe retailer sells men’s shoes in sizes $8$ to $13$. In the past, orders for the different shoe sizes have followed the distribution given in the table provided. The management believes that recent marketing efforts may have expanded their customer base and, as a result, there may be a shift in the size distribution for future orders. To have a better understanding of its future sales, the shoe seller examined $1,040$ sales records of recent orders and noted the sizes of the shoes ordered. The results are given in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to conclude that the shoe size distribution of future sales will differ from the historic one.

Shoe Size	Past Size Distribution	Recent Size Frequency
8.0	0.03	25
8.5	0.06	43
9.0	0.09	88
9.5	0.19	221
10.0	0.23	272
10.5	0.14	150
11.0	0.10	107
11.5	0.06	51
12.0	0.05	37
12.5	0.03	35
13.0	0.02	11

Q11.2.6

An online shoe retailer sells women’s shoes in sizes $5$ to $10$. In the past, orders for the different shoe sizes have followed the distribution given in the table provided. The management believes that recent marketing efforts may have expanded their customer base and, as a result, there may be a shift in the size distribution for future orders. To have a better understanding of its future sales, the shoe seller examined $1,174$ sales records of recent orders and noted the sizes of the shoes ordered. The results are given in the table provided. Test, at the $1\%$ level of significance, whether there is sufficient evidence in the data to conclude that the shoe size distribution of future sales will differ from the historic one.

Shoe Size	Past Size Distribution	Recent Size Frequency
5.0	0.02	20
5.5	0.03	23
6.0	0.07	88
6.5	0.08	90
7.0	0.20	222
7.5	0.20	258
8.0	0.15	177
8.5	0.11	121
9.0	0.08	91
9.5	0.04	53
10.0	0.02	31

Q11.2.7

A chess opening is a sequence of moves at the beginning of a chess game. There are many well-studied named openings in chess literature. French Defense is one of the most popular openings for black, although it is considered a relatively weak opening since it gives black probability $0.344$ of winning, probability $0.405$ of losing, and probability $0.251$ of drawing. A chess master believes that he has discovered a new variation of French Defense that may alter the probability distribution of the outcome of the game. In his many Internet chess games in the last two years, he was able to apply the new variation in $77$ games. The wins, losses, and draws in the $77$ games are given in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that the newly discovered variation of French Defense alters the probability distribution of the result of the game.

Result for Black	Probability Distribution	New Variation Games
Win	0.344	31
Loss	0.405	25
Draw	0.251	21

Q11.2.8

The Department of Parks and Wildlife stocks a large lake with fish every six years. It is determined that a healthy diversity of fish in the lake should consist of $10\%$ largemouth bass, $15\%$ smallmouth bass, $10\%$ striped bass, $10\%$ trout, and $20\%$ catfish, with the remaining $35\%$ made up of other fish. Therefore each time the lake is stocked, the fish population in the lake is restored to maintain that particular distribution. Every three years, the department conducts a study to see whether the distribution of the fish in the lake has shifted away from the target proportions. In one particular year, a research group from the department observed a sample of $292$ fish from the lake with the results given in the table provided. Test, at the $5\%$ level of significance, whether there is sufficient evidence in the data to conclude that the fish population distribution has shifted since the last stocking.

Fish	Target Distribution	Fish in Sample
Largemouth Bass	0.10	14
Smallmouth Bass	0.15	49
Striped Bass	0.10	21
Trout	0.10	22
Catfish	0.20	75
Other	0.35	111

Large Data Set Exercise

Large Data Sets not available

Q11.2.9

Large $\text{Data Set 4}$ records the result of $500$ tosses of a six-sided die. Test, at the $10\%$ level of significance, whether there is sufficient evidence in the data to conclude that the die is not “fair” (or “balanced”), that is, that the probability distribution differs from probability $1/6$ for each of the six faces on the die.

Answers

S11.2.1

$n=100$
$E_1=10,\; E_2=40,\; E_3=40,\; E_4=10$
$\chi^2=1.25$
$df=3$

S11.2.3

$\chi ^2=4.8082,\; \chi _{0.05}^{2}=7.81,\; \text{do not reject } H_0$

S11.2.5

$\chi ^2=26.5765,\; \chi _{0.01}^{2}=23.21,\; \text{reject } H_0$

S11.2.7

$\chi ^2=2.1401,\; \chi _{0.05}^{2}=5.99,\; \text{do not reject } H_0$

S11.2.9

$\chi ^2=2.944,\; df=5,\; \text{Rejection Region: }[9.236,\infty ),\; \text{Decision: Fail to reject }H_0\; \text{of balance}$