8.1: Two Proportion Z-Test and Confidence Interval
This section will look at how to analyze a difference in the proportions for two independent samples. As with all other hypothesis tests and confidence intervals, the process of testing is the same, though the formulas and assumptions are different.
Three Types of Hypothesis Testing
There are three types of hypothesis tests for comparing the difference in two population proportions, p1 – p2: two-tailed, right-tailed, and left-tailed (see Figure 9-7).

Note that for our purposes, the null hypothesis always states that p1 – p2 = 0. A variant of this model can test whether p1 – p2 equals some specific nonzero value, but we will not cover that scenario.
Notation:
Population 1 | Population 2 |
---|---|
p1 - population proportion | p2 - population proportion |
n1 - sample size | n2 - sample size |
x1 - number of successes | x2 - number of successes |
\(\hat{p}_{1}\) - sample proportion = \(\frac{x_{1}}{n_{1}}\) | \(\hat{p}_{2}\) - sample proportion = \(\frac{x_{2}}{n_{2}}\) |
\(\hat{q}_{1}\) - complement of \(\hat{p}_{1}\) | \(\hat{q}_{2}\) - complement of \(\hat{p}_{2}\) |
Two Proportions Z-Test
The z-test is a statistical test for comparing the proportions from two populations. It can be used when the samples are independent, \(n_{1} \hat{p}_{1}\) ≥ 10, \(n_{1} \hat{q}_{1}\) ≥ 10, \(n_{2} \hat{p}_{2}\) ≥ 10, and \(n_{2} \hat{q}_{2}\) ≥ 10.
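The sample-size conditions above can be checked mechanically. The following is a minimal sketch, assuming the counts of successes and the sample sizes are known; the function name `large_sample_ok` and the counts in the usage lines are hypothetical and chosen only for illustration. It relies on the fact that \(n \hat{p}\) is the number of successes and \(n \hat{q}\) is the number of failures.

```python
def large_sample_ok(x1, n1, x2, n2, minimum=10):
    """Return True when every success and failure count is at least `minimum`.

    n * p_hat equals the number of successes x, and
    n * q_hat equals the number of failures n - x.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= minimum for count in counts)


# Hypothetical counts: 30 successes out of 120, and 45 out of 150.
print(large_sample_ok(30, 120, 45, 150))  # True
# Only 4 successes in the first sample, so the condition fails.
print(large_sample_ok(4, 120, 45, 150))   # False
```

Independence of the samples cannot be checked this way; it has to come from how the data were collected.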
The formula for the z-test statistic is:
\(z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\left(\hat{p} \cdot \hat{q}\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)\right)}}\)
Where \(\hat{p}=\frac{\left(x_{1}+x_{2}\right)}{\left(n_{1}+n_{2}\right)}=\frac{\left(\hat{p}_{1} \cdot n_{1}+\hat{p}_{2} \cdot n_{2}\right)}{\left(n_{1}+n_{2}\right)}, \quad \hat{q}=1-\hat{p}, \quad \hat{p}_{1}=\frac{x_{1}}{n_{1}}, \hat{p}_{2}=\frac{x_{2}}{n_{2}}\).
The pooled proportion \(\hat{p}\) is a weighted mean of the two sample proportions, and \(\hat{q}\) is the complement of \(\hat{p}\). Some texts or software use different notation for the pooled proportion, such as \(\bar{p}\); here \(\hat{p}=\bar{p}\).
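The test statistic formula can be sketched in a few lines of Python. This is an illustrative sketch, not a replacement for a statistics package: the function name `two_prop_z` is hypothetical, and it assumes the null hypothesis p1 – p2 = 0, so the hypothesized difference drops out of the numerator.

```python
from math import sqrt


def two_prop_z(x1, n1, x2, n2):
    """Two-proportion z statistic under H0: p1 - p2 = 0 (pooled standard error)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion p-hat
    q_pool = 1 - p_pool              # complement of the pooled proportion
    standard_error = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


# Hypothetical counts: 30 successes out of 120, and 45 out of 150.
z = two_prop_z(30, 120, 45, 150)
print(round(z, 4))
```

A negative value means the first sample proportion is smaller than the second; the sign matters for one-tailed tests.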
A vice principal wants to see if there is a difference between the proportion of students who are late to the first class of the day and the proportion who are late to the class right after lunch. To test this claim, the vice principal randomly selects 200 students from the first class and records whether they are late, then randomly selects 200 students from the class after lunch and records whether they are late. At the 0.05 level of significance, can a difference be concluded?
Population | First Class | After Lunch Class |
---|---|---|
Sample size | 200 | 200 |
Number of late students | 13 | 16 |
Solution
Assumptions: We are comparing the proportions of late students in the first class and the after-lunch class. The number of “successes” and “failures” in each sample must be at least 10 (\(n_{1} \hat{p}_{1}\) = 13 ≥ 10, \(n_{1} \hat{q}_{1}\) = 187 ≥ 10, \(n_{2} \hat{p}_{2}\) = 16 ≥ 10, and \(n_{2} \hat{q}_{2}\) = 184 ≥ 10). We must assume that the samples were independent.
Using the Traditional Method: The claim is that there is a difference between the proportions of late students. Let population 1 be the first class, and population 2 be the class after lunch. Our claim would then be p1 ≠ p2.
The correct hypotheses are:
H0: p1 = p2
H1: p1 ≠ p2.
Compute the \(z_{\alpha / 2}\) critical values. Draw and label the sampling distribution.
Use the inverse normal function invNorm(0.025,0,1) to get \(z_{\alpha / 2}\) = ±1.96. See Figure 9-8.
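For readers without a calculator, the same critical values can be obtained from the standard normal distribution in Python. This sketch uses `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # significance level for a two-tailed test

# Lower critical value: area alpha/2 to its left, like invNorm(0.025,0,1).
z_lower = NormalDist().inv_cdf(alpha / 2)
# Upper critical value: area alpha/2 to its right.
z_upper = NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_lower, 2), round(z_upper, 2))  # -1.96 1.96
```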

In order to compute the test statistic, we must first compute the following proportions:
\(\begin{array}{ll}
\hat{p}=\frac{\left(x_{1}+x_{2}\right)}{\left(n_{1}+n_{2}\right)}=\frac{(13+16)}{(200+200)}=0.0725 & \hat{q}=1-\hat{p}=1-0.0725=0.9275 \\
\hat{p}_{1}=\frac{x_{1}}{n_{1}}=\frac{13}{200}=0.065 & \hat{p}_{2}=\frac{x_{2}}{n_{2}}=\frac{16}{200}=0.08
\end{array}\)
The test statistic is \(z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\left(\hat{p} \cdot \hat{q}\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)\right)}}=\frac{(0.065-0.08)-0}{\sqrt{\left(0.0725 \cdot 0.9275\left(\frac{1}{200}+\frac{1}{200}\right)\right)}}=-0.5784\).
Decision: Because the test statistic z = –0.5784 is between the critical values –1.96 and 1.96, we do not reject H0.
Summary: There is not enough evidence to support any difference in the proportion of students that are late for their first class compared to the class after lunch.
TI-84: Press the [STAT] key, arrow over to the [TESTS] menu, arrow down to the option [6:2-PropZTest] and press the [ENTER] key. Type in x1, n1, x2, and n2, then arrow over to the \(\neq\), <, or > sign that matches the problem’s alternative hypothesis statement and press the [ENTER] key. Arrow down to [Calculate] and press the [ENTER] key. The calculator returns the z-test statistic and the p-value.
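The calculator output can be cross-checked with a short script. This is a sketch under the same assumptions as the example (two-tailed test, H0: p1 – p2 = 0), using the late-student counts from the table; the variable names are illustrative. The two-tailed p-value is twice the area in the tail beyond the observed z.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the example: late students out of 200 in each class.
x1, n1 = 13, 200   # first class
x2, n2 = 16, 200   # class after lunch

p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion
standard_error = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / standard_error              # test statistic

# Two-tailed p-value: double the area in the tail beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print(round(z, 4), round(p_value, 4))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Because the p-value is larger than 0.05, the p-value method reaches the same decision as the critical value method: do not reject H0.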
Two Proportions Z-Interval
A 100(1 – \(\alpha\))% confidence interval for the difference between two population proportions p1 – p2:
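\(\left(\hat{p}_{1}-\hat{p}_{2}\right)-z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}<p_{1}-p_{2}<\left(\hat{p}_{1}-\hat{p}_{2}\right)+z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)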
Or more compactly as \(\left(\hat{p}_{1}-\hat{p}_{2}\right) \pm z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)
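The requirements are identical to those of the two-proportion hypothesis test. Note that the standard error here uses the two separate sample proportions rather than the pooled proportion, because the interval does not rely on a hypothesized value of p1 – p2. The interval and the test can therefore occasionally disagree, so do not use a confidence interval to make decisions based on a hypothesis statement.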
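The interval formula translates directly into code. The following is a minimal sketch, assuming the sample-size requirements are already met; the function name `two_prop_z_interval` and the counts in the usage lines are hypothetical. Note the unpooled standard error, which is the main difference from the test statistic.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1
                          + p2_hat * (1 - p2_hat) / n2)
    margin = z_crit * standard_error                   # margin of error
    difference = p1_hat - p2_hat
    return difference - margin, difference + margin


# Hypothetical counts: 30 successes out of 120, and 45 out of 150.
lower, upper = two_prop_z_interval(30, 120, 45, 150)
print(round(lower, 4), round(upper, 4))
```

The function uses the unrounded critical value, so its limits can differ from hand calculations with 1.96 in the fourth decimal place.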
Find the 95% confidence interval for the difference in the proportion of late students in their first class and the proportion who are late to their class after lunch.
Population | First Class | After Lunch Class |
---|---|---|
Sample size | 200 | 200 |
Number of late students | 13 | 16 |
Solution
First, compute the following:
\(\hat{p}_{1}=\frac{x_{1}}{n_{1}}=\frac{13}{200}=0.065 \quad \hat{q}_{1}=1-\hat{p}_{1}=1-0.065=0.935\)
\(\hat{p}_{2}=\frac{x_{2}}{n_{2}}=\frac{16}{200}=0.08 \quad \hat{q}_{2}=1-\hat{p}_{2}=1-0.08=0.92\)
Find the \(z_{\alpha / 2}\) critical value. Use the inverse normal to get \(z_{\alpha / 2}\) = 1.96.
Now substitute the numbers into the interval estimate: \(\left(\hat{p}_{1}-\hat{p}_{2}\right) \pm z_{\frac{\alpha}{2}} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)
\(\begin{aligned}
&\Rightarrow(0.065-0.08) \pm 1.96 \sqrt{\left(\frac{0.065 \cdot 0.935}{200}+\frac{0.08 \cdot 0.92}{200}\right)} \\
&\Rightarrow \quad-0.015 \pm 0.0508 \\
&\Rightarrow \quad(-0.0658,0.0358) .
\end{aligned}\)
Use interval notation (–0.0658, 0.0358) or standard notation –0.0658 < p1 – p2 < 0.0358. Note that we can have negative numbers here since we are taking the difference of two proportions.
Because p1 – p2 = 0 is in the interval, no difference is a plausible value: at the 95% confidence level, the data do not show a difference between the proportion of students who are late to their first class and the proportion who are late to their class after lunch.
TI-84: Press the [STAT] key, arrow over to the [TESTS] menu, arrow down to the option [B:2-PropZInt] and press the [ENTER] key. Type in x1, n1, x2, n2, and the confidence level, then press the [ENTER] key, arrow down to [Calculate] and press the [ENTER] key. The calculator returns the confidence interval.