9.3: z-Test for the Difference Between Two Proportions
- Understand how to perform hypothesis testing for the difference between two proportions using z-values.
- Recognize that this method is appropriate for comparing proportions from two independent groups with large sample sizes.
- Determine whether the difference between the two population proportions is statistically significant.
- Apply the normal approximation for large samples.
- Use the standard normal distribution (z-distribution) to make decisions based on the test statistic.
This section examines how to analyze the difference between the proportions of two independent samples. As with all other hypothesis tests and confidence intervals, the testing process is the same, although the formulas and assumptions differ. Here we compare two population proportions, labeled \(p_1\) and \(p_2\). The two samples must be random and independent. Also, for the normal approximation to be reasonable, the following inequalities must hold.
- \(n_1 \cdot p_1 \geq 5\)
- \(n_1 \cdot (1 - p_1) \geq 5\)
- \(n_2 \cdot p_2 \geq 5\)
- \(n_2 \cdot (1 - p_2) \geq 5\)
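The four inequalities above can be checked quickly in code. The sketch below is only an illustration: because \(p_1\) and \(p_2\) are unknown in practice, it assumes the sample proportions are substituted for them, which reduces each inequality to a count of successes or failures. The function name and the example counts are hypothetical.

```python
def normal_approximation_ok(x1, n1, x2, n2, minimum=5):
    """Return True if each sample has at least `minimum` successes and failures.

    Assumption: the unknown p1 and p2 are replaced by the sample proportions,
    so n * p becomes the success count and n * (1 - p) the failure count.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= minimum for count in counts)


# Hypothetical counts chosen only for illustration:
# the second sample has just 4 successes, so the check fails.
print(normal_approximation_ok(30, 80, 4, 60))  # False
```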
There are three types of hypothesis tests for comparing the difference between two population proportions, \(p_1 - p_2\); see Figure 9-7.
Hypothesis Test for the Difference of Two Proportions
- State the claim using the hypotheses presented in Figure 9-7.
- Draw the probability distribution and shade in the tail area using the proper number of tails. Use Figure 9-1 as a reference.
- Look up the critical values using the table or other technology.
- To compute the test statistic (or test point) for a hypothesis test involving two proportions, you can use the formula listed in the formula box below or technology.
- Make a decision about the null hypothesis (\(H_0\)), either reject or do not reject, and summarize the results.
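For the step that looks up critical values with technology, one possible approach is Python's standard-library `statistics.NormalDist`. This is a minimal sketch, not a prescribed method; the function name `critical_values` and the `tail` labels are assumptions made for illustration.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the critical z-value(s) for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print([round(z, 3) for z in critical_values(0.05, "left")])  # [-1.645]
print([round(z, 3) for z in critical_values(0.05, "two")])   # [-1.96, 1.96]
```

Printed tables often round the one-tailed value for \(\alpha = 0.05\) to \(\pm 1.65\), so a table lookup and software can differ slightly in the third decimal place.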
\(z=\dfrac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\left(\bar{p} \cdot \bar{q}\left(\dfrac{1}{n_{1}}+\dfrac{1}{n_{2}}\right)\right)}}\)
Where,
- \(\hat{p}_1 =\) sample proportion for sample 1, calculated as \(\hat{p}_1 = \dfrac{X_1}{n_1}\)
- \(\hat{p}_2 =\) sample proportion for sample 2, calculated as \(\hat{p}_2 = \dfrac{X_2}{n_2}\)
- \(p_1\) = the proportion of population 1.
- \(p_2\) = the proportion of population 2.
- Note that for our purposes, \(p_1 - p_2 = 0\), because the null hypothesis assumes the two population proportions are equal.
- \(\bar{p}\) = pooled proportion, computed as \(\bar{p} = \dfrac{X_1+X_2}{n_1+n_2}\)
- \(X_1\) = number of successes in sample 1.
- \(X_2\) = number of successes in sample 2.
- \(n_1\) = size of sample 1 (drawn from population 1).
- \(n_2\) = size of sample 2 (drawn from population 2).
- \(\bar{q} = 1 - \bar{p}\)
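The formula and definitions above translate directly into a few lines of code. The following is a minimal sketch that assumes \(p_1 - p_2 = 0\) under the null hypothesis; the function name `two_proportion_z` and the example counts are hypothetical and used only for illustration.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (so p1 - p2 = 0 in the numerator)."""
    p_hat1 = x1 / n1                   # sample proportion 1
    p_hat2 = x2 / n2                   # sample proportion 2
    p_bar = (x1 + x2) / (n1 + n2)      # pooled proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / standard_error


# Hypothetical counts chosen only for illustration.
z = two_proportion_z(40, 100, 30, 100)
print(round(z, 2))
```

Because the code carries full precision for \(\bar{p}\) and \(\bar{q}\), its result can differ in the last digit from a hand calculation that rounds intermediate values.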
Examples
At a university study lounge, a rumor spreads that social science majors spend less time in group study sessions than STEM majors. To investigate, a curious researcher surveys students across campus.
Out of 120 surveyed social science majors, 54 report attending a weekly study group.
Out of 100 surveyed STEM majors, 55 report the same.
Using \(\alpha = 0.05\), test the claim that the proportion of social science majors attending weekly study groups is less than that of STEM majors.
Solution
- The claim and hypotheses are written as follows.
\(H_0: p_1 = p_2\)
\(H_1: p_1 < p_2\) Claim
This is a one-tailed (left-tailed) test.
- The critical value is found using the table and \(\alpha = 0.05\). It is \(-1.65\).
- Compute the test point.
\(\hat{p}_{1}=\dfrac{X_{1}}{n_{1}}=\dfrac{54}{120}=0.45\)
\(\hat{p}_{2}=\dfrac{X_{2}}{n_{2}}=\dfrac{55}{100}=0.55\)
\(\bar{p}=\dfrac{X_{1}+X_{2}}{n_{1}+n_{2}}=\dfrac{54+55}{120+100}\approx 0.4955\)
\(\bar{q}=1-\bar{p}=1-0.4955=0.5045\)
The test statistic is \(z=\dfrac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\bar{p} \cdot \bar{q}\left(\dfrac{1}{n_{1}}+\dfrac{1}{n_{2}}\right)}}=\dfrac{(0.45-0.55)-0}{\sqrt{0.4955 \cdot 0.5045\left(\dfrac{1}{120}+\dfrac{1}{100}\right)}}\approx -1.48\).
- The result is to "not reject \(H_0\)" because the test point does not fall in the critical region.
- The summary is "there is not enough evidence to support the claim that the proportion of social science majors attending weekly study groups is less than that of STEM majors."
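As a cross-check of the solution above, the same numbers can be run through a short script. This is a sketch under the assumption that the table value \(-1.65\) is used as the critical value; the variable names are illustrative.

```python
from math import sqrt

x1, n1 = 54, 120        # social science majors
x2, n2 = 55, 100        # STEM majors
critical_value = -1.65  # table value for a left-tailed test at alpha = 0.05

p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion
z = (x1 / n1 - x2 / n2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))

# Left-tailed test: reject H0 only if z falls below the critical value.
reject_h0 = z < critical_value
print(round(z, 2), reject_h0)  # -1.48 False
```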
A marketing researcher is testing the claim that there is a significant difference between the proportions of burger lovers and pizza lovers who attend a local community college. To test the claim, the researcher uses \(\alpha = 0.10\). Two random and independent samples were collected, and the results are presented in the table below.
| Burger Lovers | Pizza Lovers |
|---|---|
| \(X_1=90\) | \(X_2=100\) |
| \(n_1=150\) | \(n_2=200\) |
Table 1: Data for Burger Lovers and Pizza Lovers
Solution
- Since the word "difference" translates to "not equal", the claim and hypotheses are written as follows.
\(H_0: p_1 = p_2\)
\(H_1: p_1 \neq p_2\) Claim
This is a two-tailed test.
- The critical values are found using the table and the level of significance (0.10). They are \(\pm 1.65\).
- Compute the test point.
\(\hat{p}_{1}=\dfrac{X_{1}}{n_{1}}=\dfrac{90}{150}=0.6\)
\(\hat{p}_{2}=\dfrac{X_{2}}{n_{2}}=\dfrac{100}{200}=0.5\)
\(\bar{p}=\dfrac{X_{1}+X_{2}}{n_{1}+n_{2}}=\dfrac{90+100}{150+200}\approx 0.543\)
\(\bar{q}=1-\bar{p}=1-0.543=0.457\)
The test statistic is \(z=\dfrac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\bar{p} \cdot \bar{q}\left(\dfrac{1}{n_{1}}+\dfrac{1}{n_{2}}\right)}}=\dfrac{(0.6-0.5)-0}{\sqrt{0.543 \cdot 0.457\left(\dfrac{1}{150}+\dfrac{1}{200}\right)}}\approx 1.86\).
- The result is to "reject \(H_0\)" as the test point falls in the critical region.
- The summary is "there is enough evidence to support the claim."
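The same decision can also be reached with a p-value instead of critical values. The sketch below assumes the standard two-tailed p-value (the area in both tails beyond \(|z|\)) and uses Python's standard-library `NormalDist`; it is offered as a cross-check, not as the method used in the solution above.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 90, 150    # burger lovers
x2, n2 = 100, 200   # pizza lovers
alpha = 0.10

p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion
z = (x1 / n1 - x2 / n2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), p_value < alpha)  # 1.86 True
```

A p-value below \(\alpha\) leads to the same "reject \(H_0\)" decision as a test point that falls in the critical region.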
A vice principal wants to see whether the proportion of students who are late differs between the first class of the day and the class right after lunch. To test this claim, the vice principal randomly selects 200 students from the first class and records whether each is late, then randomly selects 200 students from the class after lunch and does the same. The number of students who were late to the first class was 13, and the number of students who were late to class after lunch was 16. At \(\alpha = 0.05\), can a difference be concluded?
Solution
- Since the word "difference" translates to "not equal", the claim and hypotheses are written as follows.
\(H_0: p_1 = p_2\)
\(H_1: p_1 \neq p_2\) Claim
This is a two-tailed test.
- The critical values are found using the table and the level of significance (0.05). They are \(\pm 1.96\).
- Compute the test point.
\(\hat{p}_{1}=\dfrac{X_{1}}{n_{1}}=\dfrac{13}{200}=0.065\)
\(\hat{p}_{2}=\dfrac{X_{2}}{n_{2}}=\dfrac{16}{200}=0.08\)
\(\bar{p}=\dfrac{X_{1}+X_{2}}{n_{1}+n_{2}}=\dfrac{13+16}{200+200}=0.0725\)
\(\bar{q}=1-\bar{p}=1-0.0725=0.9275\)
The test statistic is \(z=\dfrac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\bar{p} \cdot \bar{q}\left(\dfrac{1}{n_{1}}+\dfrac{1}{n_{2}}\right)}}=\dfrac{(0.065-0.08)-0}{\sqrt{0.0725 \cdot 0.9275\left(\dfrac{1}{200}+\dfrac{1}{200}\right)}}\approx -0.5784 \approx -0.58\).
TI-84: Press the [STAT] key, arrow over to the [TESTS] menu, arrow down to the option [6:2-PropZTest], and press the [ENTER] key. Type in x1, n1, x2, and n2, arrow over to the \(\neq\), <, or > sign that matches the alternative hypothesis in the problem, then press the [ENTER] key. Arrow down to [Calculate] and press the [ENTER] key. The calculator returns the z-test statistic and the p-value.
- The result is to "not reject \(H_0\)" because the test point falls in the non-critical region.
- The summary is "there is not enough evidence to support the claim."
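For readers without a TI-84, a rough Python counterpart that returns both the z statistic and the p-value can be sketched as follows. The function name `two_prop_z_test` and the `alternative` labels are hypothetical choices for this illustration; the sketch assumes the pooled formula from this section and does not claim to reproduce the calculator's internal method.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="!="):
    """Return (z, p_value) for H1: p1 != p2, p1 < p2, or p1 > p2."""
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion
    z = (x1 / n1 - x2 / n2) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    lower_tail = NormalDist().cdf(z)  # area to the left of z
    if alternative == "<":
        p_value = lower_tail
    elif alternative == ">":
        p_value = 1 - lower_tail
    elif alternative == "!=":
        p_value = 2 * min(lower_tail, 1 - lower_tail)
    else:
        raise ValueError("alternative must be '!=', '<', or '>'")
    return z, p_value


# Late-student data from the example above, two-tailed at alpha = 0.05.
z, p_value = two_prop_z_test(13, 200, 16, 200)
print(round(z, 2), p_value > 0.05)  # -0.58 True
```

A p-value larger than \(\alpha\) corresponds to a test point in the non-critical region, so the decision is again to not reject \(H_0\).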
Authors
"9.3: z-Test for the Difference Between Two Proportions" by Toros Berberyan, Tracy Nguyen, and Alfie Swan is licensed under CC BY-SA 4.0
Attributions
"9.1: Two Proportions" by Kathryn Kozak is licensed under CC BY-SA 4.0
"9.3: Two Proportion Z-Test and Confidence Interval" by Rachel Webb is licensed CC BY-SA 4.0
Exercises
- A health research team is investigating whether smoking is associated with high blood pressure, but they want to approach it with data rather than assumptions. They randomly surveyed two groups of adults—smokers and non-smokers—to see if there's a significant difference in the proportion of people with high blood pressure. Out of 150 smokers surveyed, 63 were found to have high blood pressure. Of the 180 non-smokers surveyed, 60 had high blood pressure. Using \( \alpha = 0.01 \), test whether there is a difference in the proportion of smokers and non-smokers who have high blood pressure. Use the traditional method.
Scan the QR code or click on it to open the MyOpenMath version of the above question with step-by-step guidance.
MyOpenMath is a free online learning platform designed to support math instruction through automated homework, quizzes, and assessments. You must register for MyOpenMath and sign in to view the question.
- A health research team is investigating whether smoking is associated with high blood pressure, but they want to approach it with data rather than assumptions. They randomly surveyed two groups of adults—smokers and non-smokers—to see if there's a significant difference in the proportion of people with high blood pressure. Out of 150 smokers surveyed, 63 were found to have high blood pressure. Of the 180 non-smokers surveyed, 60 had high blood pressure. Using \( \alpha = 0.01 \), test whether there is a difference in the proportion of smokers and non-smokers who have high blood pressure. Use the p-value method.
- At a local community college, the campus gym staff is curious about workout habits across genders. A recent wellness challenge sparked debate, with some saying that females are more consistent about daily exercise than males. A health and fitness researcher decides to test this claim with data. In a survey of 130 male students, 58 reported exercising daily. In a survey of 140 female students, 75 reported daily exercise habits. Using \(\alpha = 0.01\), test the claim that the proportion of males who exercise daily is lower when compared to females. Use the traditional method.
- At a local community college, the campus gym staff is curious about workout habits across genders. A recent wellness challenge sparked debate, with some saying that females are more consistent about daily exercise than males. A health and fitness researcher decides to test this claim with data. In a survey of 130 male students, 58 reported exercising daily. In a survey of 140 female students, 75 reported daily exercise habits. Using \(\alpha = 0.01\), test the claim that the proportion of males who exercise daily is lower when compared to females. Use the p-value method.
- At ShopSmart Research, analysts are exploring whether the digital age is changing customer satisfaction. With more people turning to online shopping, some claim it's not just convenient—it leaves people more satisfied. Curious to test this claim, a marketing team surveys recent customers. Out of 180 online shoppers, 144 report being satisfied with their purchase. Out of 160 in-store shoppers, 116 report satisfaction. Using a significance level of \(\alpha = 0.025\), test the claim that a greater proportion of online shoppers are satisfied with their purchase compared to in-store shoppers. Use the traditional method.
- At ShopSmart Research, analysts are exploring whether the digital age is changing customer satisfaction. With more people turning to online shopping, some claim it's not just convenient—it leaves people more satisfied. Curious to test this claim, a marketing team surveys recent customers. Out of 180 online shoppers, 144 report being satisfied with their purchase. Out of 160 in-store shoppers, 116 report satisfaction. Using a significance level of \(\alpha = 0.025\), test the claim that a greater proportion of online shoppers are satisfied with their purchase compared to in-store shoppers. Use the p-value method.
If you are an instructor and want the solutions to all the exercise questions for each section, please email Toros Berberyan.








