Statistics LibreTexts

8.1: Two Proportion Z-Test and Confidence Interval


    This section will look at how to analyze a difference in the proportions for two independent samples. As with all other hypothesis tests and confidence intervals, the process of testing is the same, though the formulas and assumptions are different.

    Three Types of Hypothesis Testing

    There are three types of hypothesis tests for comparing the difference in two population proportions, p1 – p2: left-tailed, right-tailed, and two-tailed (see Figure 9-7).

    Left, right and two tailed normal graphs to show direction.
    Figure 9-7

    Note that for our purposes, the hypothesized difference is always p1 – p2 = 0. A variant of this model can test a hypothesized difference other than zero (p1 – p2 ≠ 0), but we will not cover that scenario.

    Notation:

     

    Notation for Proportions
    Population 1 | Population 2
    p1 - population proportion | p2 - population proportion
    n1 - sample size | n2 - sample size
    x1 - number of successes | x2 - number of successes
    \(\hat{p}_{1}\) - sample proportion = \(\frac{x_{1}}{n_{1}}\) | \(\hat{p}_{2}\) - sample proportion = \(\frac{x_{2}}{n_{2}}\)
    \(\hat{q}_{1}\) - complement of \(\hat{p}_{1}\) = \(1-\hat{p}_{1}\) | \(\hat{q}_{2}\) - complement of \(\hat{p}_{2}\) = \(1-\hat{p}_{2}\)

     

    Two Proportions Z-Test

    Definition: z-Test

    The z-test is a statistical test for comparing the proportions from two populations. It can be used when the samples are independent, \(n_{1} \hat{p}_{1}\) ≥ 10, \(n_{1} \hat{q}_{1}\) ≥ 10, \(n_{2} \hat{p}_{2}\) ≥ 10, and \(n_{2} \hat{q}_{2}\) ≥ 10.

    The formula for the z-test statistic is:

    \(z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\left(\hat{p} \cdot \hat{q}\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)\right)}}\)

    Where \(\hat{p}=\frac{\left(x_{1}+x_{2}\right)}{\left(n_{1}+n_{2}\right)}=\frac{\left(\hat{p}_{1} \cdot n_{1}+\hat{p}_{2} \cdot n_{2}\right)}{\left(n_{1}+n_{2}\right)}, \quad \hat{q}=1-\hat{p}, \quad \hat{p}_{1}=\frac{x_{1}}{n_{1}}, \hat{p}_{2}=\frac{x_{2}}{n_{2}}\).

    The pooled proportion \(\hat{p}\) is a weighted mean of the two sample proportions, and \(\hat{q}\) is the complement of \(\hat{p}\). Some texts or software use different notation for the pooled proportion, such as \(\bar{p}\); here \(\hat{p}=\bar{p}\).
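The formula above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name `two_proportion_z_test` and its `alternative` parameter are assumptions made for this sketch, and the counts in the usage line are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z-test of H0: p1 = p2.

    `alternative` (a name chosen for this sketch) selects the tail:
    "two-sided" (p1 != p2), "less" (p1 < p2), or "greater" (p1 > p2).
    Returns the z statistic and the p-value.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion p-hat
    q_pool = 1 - p_pool              # its complement q-hat
    standard_error = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error   # hypothesized p1 - p2 is 0

    std_normal = NormalDist()        # mean 0, standard deviation 1
    if alternative == "two-sided":
        p_value = 2 * std_normal.cdf(-abs(z))
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    return z, p_value


# Made-up counts for illustration only (not taken from the text).
z, p_value = two_proportion_z_test(45, 150, 30, 150)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

The sketch does not check the sample-size requirements, and it will fail with a division by zero if the pooled proportion is exactly 0 or 1.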

    Example \(\PageIndex{1}\)

    A vice principal wants to see if there is a difference between the proportion of students who are late to the first class of the day and the proportion who are late to the class right after lunch. To test this claim, the vice principal randomly selects 200 students from their first class and records whether they are late, then randomly selects 200 students from their class after lunch and records whether they are late. At the 0.05 level of significance, can a difference be concluded?

    Late Students
    Population | First Class | After Lunch Class
    Sample size | 200 | 200
    Number of late students | 13 | 16
    Solution

    Assumptions: We are comparing the proportions of late students in first and after-lunch classes. The number of “successes” and “failures” in each sample must be at least 10: \(n_{1} \hat{p}_{1}\) = 13 ≥ 10, \(n_{1} \hat{q}_{1}\) = 187 ≥ 10, \(n_{2} \hat{p}_{2}\) = 16 ≥ 10, and \(n_{2} \hat{q}_{2}\) = 184 ≥ 10. We must also assume that the samples are independent.
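The success and failure counts can be checked mechanically. The sketch below uses the example's data; the helper name `counts_large_enough` is hypothetical, and independence of the samples still has to be judged from the study design, not from code.

```python
def counts_large_enough(x, n, minimum=10):
    """Return True when successes and failures are both at least `minimum`."""
    successes = x        # equals n * p_hat
    failures = n - x     # equals n * q_hat
    return successes >= minimum and failures >= minimum


# (number late, sample size) for each sample in the example
samples = {"first class": (13, 200), "after lunch class": (16, 200)}
for name, (x, n) in samples.items():
    print(f"{name}: successes = {x}, failures = {n - x}, "
          f"requirement met = {counts_large_enough(x, n)}")
```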

    Using the Traditional Method: The claim is that there is a difference between the proportions of late students. Let population 1 be the first class, and population 2 be the class after lunch. Our claim would then be p1 ≠ p2.

    The correct hypotheses are:

    H0: p1 = p2

    H1: p1 ≠ p2.

    Compute the \(z_{\alpha / 2}\) critical values. Draw and label the sampling distribution.

    Use the inverse normal function invNorm(0.025,0,1), which returns –1.96; by symmetry, the critical values are \(z_{\alpha / 2}\) = ±1.96. See Figure 9-8.
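For readers without a TI calculator, the same critical values can be obtained in Python. This sketch assumes that `NormalDist.inv_cdf` from the standard library plays the role of invNorm, with the area to the left as its argument.

```python
from statistics import NormalDist

alpha = 0.05
# Area alpha/2 in the left tail of the standard normal, like invNorm(0.025, 0, 1).
lower_critical = NormalDist(mu=0, sigma=1).inv_cdf(alpha / 2)
upper_critical = -lower_critical   # symmetry of the normal curve

print(round(lower_critical, 2), round(upper_critical, 2))   # -1.96 1.96
```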

    Figure 9-8

    In order to compute the test statistic, we must first compute the following proportions:

    \(\begin{array}{ll}
    \hat{p}=\frac{\left(x_{1}+x_{2}\right)}{\left(n_{1}+n_{2}\right)}=\frac{(13+16)}{(200+200)}=0.0725 & \hat{q}=1-\hat{p}=1-0.0725=0.9275 \\
    \hat{p}_{1}=\frac{x_{1}}{n_{1}}=\frac{13}{200}=0.065 & \hat{p}_{2}=\frac{x_{2}}{n_{2}}=\frac{16}{200}=0.08
    \end{array}\)

    The test statistic is, \(z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)-\left(p_{1}-p_{2}\right)}{\sqrt{\left(\hat{p} \cdot \hat{q}\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)\right)}}=\frac{(0.065-0.08)}{\sqrt{\left(0.0725 \cdot 0.9275\left(\frac{1}{200}+\frac{1}{200}\right)\right)}}=-0.5784\).
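The arithmetic can be reproduced step by step. The sketch below follows the same order as the hand calculation; the variable names are chosen for this illustration.

```python
from math import sqrt

x1, n1 = 13, 200   # late students, first class
x2, n2 = 16, 200   # late students, class after lunch

p1_hat = x1 / n1                    # 13/200
p2_hat = x2 / n2                    # 16/200
p_pool = (x1 + x2) / (n1 + n2)      # pooled proportion, 29/400
q_pool = 1 - p_pool

standard_error = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / standard_error   # hypothesized difference is 0

print(f"pooled p = {p_pool:.4f}, q = {q_pool:.4f}")
print(f"standard error = {standard_error:.4f}, z = {z:.4f}")
```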

    Decision: Because the test statistic is between the critical values, we do not reject H0.

    Summary: There is not enough evidence to support any difference in the proportion of students that are late for their first class compared to the class after lunch.

    TI-84: Press the [STAT] key, arrow over to the [TESTS] menu, arrow down to the option [6:2-PropZTest] and press the [ENTER] key. Type in x1, n1, x2, and n2, then arrow over to the \(\neq\), <, or > sign that matches the problem’s alternative hypothesis and press the [ENTER] key. Arrow down to [Calculate] and press the [ENTER] key. The calculator returns the z-test statistic and the p-value.
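The p-value that the calculator reports can also be approximated in Python. This is a sketch under the assumption of a two-tailed test (H1: p1 ≠ p2), where the p-value is twice the tail area beyond |z|; the p-value method leads to the same decision as comparing z to the critical values.

```python
from math import sqrt
from statistics import NormalDist

x1, n1, x2, n2 = 13, 200, 16, 200
alpha = 0.05

p_pool = (x1 + x2) / (n1 + n2)
standard_error = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / standard_error

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```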


    Two Proportions Z-Interval

    Definition: z-Interval

    A 100(1 – \(\alpha\))% confidence interval for the difference between two population proportions p1 – p2:

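    \(\left(\hat{p}_{1}-\hat{p}_{2}\right)-z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}<p_{1}-p_{2}<\left(\hat{p}_{1}-\hat{p}_{2}\right)+z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)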

    Or more compactly as \(\left(\hat{p}_{1}-\hat{p}_{2}\right) \pm z_{\alpha / 2} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)

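    The requirements are identical to those of the two-proportion hypothesis test. Note that the standard error of the interval uses the separate sample proportions \(\hat{p}_{1}\) and \(\hat{p}_{2}\) rather than a hypothesized (pooled) proportion, so it differs from the standard error of the test statistic. For that reason, do not use a confidence interval to make decisions based on a hypothesis statement.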
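The interval formula can be sketched as a small Python function. The name `two_proportion_z_interval` and the `confidence` parameter are assumptions made for this illustration, and the counts in the usage line are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat

    alpha = 1 - confidence
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z_critical * sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)

    difference = p1_hat - p2_hat
    return difference - margin, difference + margin


# Made-up counts for illustration only (not taken from the text).
lower, upper = two_proportion_z_interval(45, 150, 30, 150, confidence=0.90)
print(f"{lower:.4f} < p1 - p2 < {upper:.4f}")
```

Unlike the test statistic, nothing here is pooled: each sample contributes its own \(\hat{p}\hat{q}/n\) term to the standard error.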

     

    Example \(\PageIndex{2}\)

    Find the 95% confidence interval for the difference in the proportion of late students in their first class and the proportion who are late to their class after lunch.

     

    Population | First Class | After Lunch Class
    Sample size | 200 | 200
    Number of late students | 13 | 16
    Solution

    First, compute the following:

    \(\hat{p}_{1}=\frac{x_{1}}{n_{1}}=\frac{13}{200}=0.065 \quad \hat{q}_{1}=1-\hat{p}_{1}=1-0.065=0.935\)

    \(\hat{p}_{2}=\frac{x_{2}}{n_{2}}=\frac{16}{200}=0.08 \quad \hat{q}_{2}=1-\hat{p}_{2}=1-0.08=0.92\)

    Find the \(z_{\alpha / 2}\) critical value. Use the inverse normal to get \(z_{\alpha / 2}\) = 1.96.

    Now substitute the numbers into the interval estimate: \(\left(\hat{p}_{1}-\hat{p}_{2}\right) \pm z_{\frac{\alpha}{2}} \sqrt{\left(\frac{\hat{p}_{1} \hat{q}_{1}}{n_{1}}+\frac{\hat{p}_{2} \hat{q}_{2}}{n_{2}}\right)}\)

    \(\begin{aligned}
    &\Rightarrow(0.065-0.08) \pm 1.96 \sqrt{\left(\frac{0.065 \cdot 0.935}{200}+\frac{0.08 \cdot 0.92}{200}\right)} \\
    &\Rightarrow \quad-0.015 \pm 0.0508 \\
    &\Rightarrow \quad(-0.0658,0.0358) .
    \end{aligned}\)

    Use interval notation (–0.0658, 0.0358) or standard notation –0.0658 < p1 – p2 < 0.0358. Note that we can have negative numbers here since we are taking the difference of two proportions.

    Since p1 – p2 = 0 is in the interval, we cannot conclude, at the 95% confidence level, that the proportion of students who are late to their first class differs from the proportion who are late to their class after lunch.
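As a check on the hand calculation, the margin of error and both endpoints can be computed directly. The lower endpoint is the point estimate minus the margin, –0.015 – 0.0508, and the upper endpoint is –0.015 + 0.0508. The variable names below are chosen for this sketch.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 13, 200   # late students, first class
x2, n2 = 16, 200   # late students, class after lunch
confidence = 0.95

p1_hat, p2_hat = x1 / n1, x2 / n2
q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat

z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_critical * sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)

point_estimate = p1_hat - p2_hat
lower = point_estimate - margin
upper = point_estimate + margin
contains_zero = lower < 0 < upper

print(f"margin of error = {margin:.4f}")
print(f"interval = ({lower:.4f}, {upper:.4f}), contains 0: {contains_zero}")
```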

    TI-84: Press the [STAT] key, arrow over to the [TESTS] menu, arrow down to the option [2-PropZInterval] and press the [ENTER] key. Type in the x1, n1, x2, n2, the confidence level, then press the [ENTER] key, arrow down to [Calculate] and press the [ENTER] key. The calculator returns the confidence interval.



    This page titled 8.1: Two Proportion Z-Test and Confidence Interval is shared under a CC BY-SA 1.0 license and was authored, remixed, and/or curated by Rachel Webb via source content that was edited to the style and standards of the LibreTexts platform.