9.4: Confidence Intervals for the Difference Between Two Means and Proportions

    Learning Objectives
    • Understand how to construct confidence intervals for the difference between two means or two proportions.
    • Use confidence intervals to estimate the range where the true difference between two populations likely lies.
    • Apply this method when comparing two independent groups.
    • Assess whether a meaningful difference exists without performing a formal hypothesis test.
    • Interpret the confidence interval to evaluate the significance of the difference between groups.

    In statistical analysis, confidence intervals are crucial in estimating unknown population parameters based on sample data. A confidence interval provides a range of plausible values within which the true parameter value is expected to lie, given a specified confidence level, such as 90%, 95%, or 99%. These intervals not only account for variability due to sampling but also reflect the uncertainty inherent in any estimation process.

    This section presents confidence intervals for estimating the difference between two population parameters. The first two intervals estimate the difference between two population means: one uses Z-values from the standard normal distribution, and the other uses t-values from the t-distribution. The third interval estimates the difference between two population proportions. The examples highlight the methodology, assumptions, and interpretation of each interval and show how it applies to real-world problems. In every case, the two samples are assumed to be randomly selected from their populations and independent of each other.

    By constructing these confidence intervals, we gain a robust framework for statistical inference that allows researchers and decision-makers to make data-driven conclusions with a quantifiable level of confidence.

    Confidence Interval for the Difference of Two Means using Z-values

    This formula is used when both population standard deviations \(\sigma_1\) and \(\sigma_2\) are given or both sample sizes \(n_1\) and \(n_2\) are greater than or equal to 30.

    Definition: Confidence Interval Formula for the Difference of Two Means using Z-values

    \((\bar{X}_1-\bar{X}_2)-Z_C\cdot\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}\)\(<\mu_1 - \mu_2<\)\((\bar{X}_1-\bar{X}_2)+Z_C\cdot\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}\)

    Where,

    • \(\bar{X}_1\) = sample mean of the first sample.
    • \(\bar{X}_2\) = sample mean of the second sample.
    • \(\sigma_1\) = population standard deviation of the first population.
    • \(\sigma_2\) = population standard deviation of the second population.
    • \(n_1\) = sample size of the first sample.
    • \(n_2\) = sample size of the second sample.
    • \(Z_C\) = critical value from the standard normal distribution for a given confidence level.
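
    The critical value \(Z_C\) in the list above is usually read from a table or a calculator. As a minimal sketch, it can also be computed from the standard normal quantile function in Python's standard library. The helper name `z_critical` is an assumption made for this illustration and is not part of the original text.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value Z_C for a confidence level such as 0.95."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Split the leftover probability (1 - confidence) equally between the two tails.
    upper_tail_point = 1.0 - (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(upper_tail_point)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z_C = {z_critical(level):.3f}")
```

    Rounded to three decimal places, these values are 1.645, 1.960, and 2.576. Printed tables often round further, for example to 1.96.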
    Example \(\PageIndex{1}\)

    An educational researcher is investigating whether there is a significant difference in the academic performance of male and female students at a university. To measure academic performance, the researcher compares the grade point averages (GPAs) of female and male students. The data collected is provided in the table below. Also, the researcher uses a confidence level of 95%. Compute the confidence interval for the difference between the two means. Round all calculations to three decimal places.

    Statistic                        Females                  Males
    Sample mean                      \(\bar{X}_1 = 3.15\)     \(\bar{X}_2 = 2.78\)
    Population standard deviation    \(\sigma_1 = 0.52\)      \(\sigma_2 = 0.48\)
    Sample size                      \(n_1 = 120\)            \(n_2 = 100\)
    Solution

    Step 1) Look up the critical value in a table or using a calculator. The value is \(Z_C\) = 1.96.

    Step 2) Plug information into the formula and calculate using the order of operations.

    \((\bar{X}_1-\bar{X}_2)-Z_C\cdot\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}\)\(<\mu_1 - \mu_2<\)\((\bar{X}_1-\bar{X}_2)+Z_C\cdot\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}\)

    \((3.15-2.78)-1.96\cdot\sqrt{\dfrac{0.52^2}{120} + \dfrac{0.48^2}{100}}\)\(<\mu_1 - \mu_2<\)\((3.15-2.78)+1.96\cdot\sqrt{\dfrac{0.52^2}{120} + \dfrac{0.48^2}{100}}\)

    \(0.37-0.132\)\(<\mu_1 - \mu_2<\)\(0.37+0.132\)

    \(0.238\)\(<\mu_1 - \mu_2<\)\(0.502\)

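    Since the interval does not contain zero, the difference is statistically significant at the 95% confidence level. Because both endpoints are positive, the average GPA of female students is significantly higher than that of male students.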
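
    The same calculation can be sketched in Python as a check on the arithmetic. The function name `two_mean_z_interval` and its parameter names are assumptions chosen for this illustration. The critical value is passed in as the table value 1.96.

```python
from math import sqrt


def two_mean_z_interval(mean1, sd1, n1, mean2, sd2, n2, z_crit):
    """Confidence interval for mu_1 - mu_2 when population SDs are known."""
    difference = mean1 - mean2
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = z_crit * standard_error
    return difference - margin, difference + margin


# Values from the GPA example: females are sample 1, males are sample 2.
lower, upper = two_mean_z_interval(3.15, 0.52, 120, 2.78, 0.48, 100, z_crit=1.96)
print(f"{lower:.3f} < mu_1 - mu_2 < {upper:.3f}")
print("interval contains zero:", lower <= 0 <= upper)
```

    With these inputs the endpoints round to 0.238 and 0.502, and the interval does not contain zero.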

    The TI-84+ can be used to compute the confidence interval as well.

    Step 1) Press the "stat" button and use the right arrow to select "TESTS." Select option 9 using the down arrow or by typing 9. This option has the title "2-SampZInt."

    Step 2) Under "2-SampZInt," select "Stats." Type in the given data and select "Calculate."

    Step 3) The output is at the top of the screen. The two endpoints are written in parentheses and separated by a comma. After rounding both to three decimal places, the confidence interval is (0.238, 0.502).

    The image below captures all three steps.

    Figure \(\PageIndex{1}\): TI-84+ screens for the 2-SampZInt steps described above.

    Confidence Interval for the Difference of Two Means using t-values

    This formula is used when the sample standard deviations \(s_1\) and \(s_2\) are given and both sample sizes \(n_1\) and \(n_2\) are less than 30.

    Definition: Confidence Interval for the Difference of Two Means using t-values

    \((\bar{X}_1-\bar{X}_2)-t_C\cdot\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)\(<\mu_1 - \mu_2<\)\((\bar{X}_1-\bar{X}_2)+t_C\cdot\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)

    Where,

    • \(\bar{X}_1\) = sample mean of the first sample.
    • \(\bar{X}_2\) = sample mean of the second sample.
    • \(s_1\) = sample standard deviation of the first sample.
    • \(s_2\) = sample standard deviation of the second sample.
    • \(n_1\) = sample size of the first sample.
    • \(n_2\) = sample size of the second sample.
    • \(t_C\) = critical value from the t-distribution for a given confidence level and minimum degrees of freedom.
    • The degrees of freedom equal the smaller of \(n_1-1\) and \(n_2-1\).
    Example \(\PageIndex{2}\)

    A STEM education researcher is interested in understanding the differences in weekly study habits between math majors and chemistry majors at a local university. She randomly selects two independent groups of students to participate in her study and collects the data presented in the table below. The units are in hours per week. Use a confidence level of 99% to construct a confidence interval for the difference between the two means. Round the calculation results to three decimal places.

    Statistic                        Math Majors              Chemistry Majors
    Sample mean (hours per week)     \(\bar{X}_1 = 10.5\)     \(\bar{X}_2 = 12.9\)
    Sample standard deviation        \(s_1 = 3.6\)            \(s_2 = 2.68\)
    Sample size                      \(n_1 = 10\)             \(n_2 = 13\)
    Solution

    Step 1) This confidence interval requires t-values since the sample standard deviations are given and both sample sizes are less than 30.

    Step 2) Compute the degrees of freedom. It is the minimum value between \(n_1-1 = 10-1=9\) and \(n_2-1=13-1=12\). Thus, the degrees of freedom is 9.

    Step 3) Look up the \(t_C\) value in the table using the confidence level and degrees of freedom. It is \(t_C=3.25\).

    Step 4) Plug information into the formula and calculate using the order of operations.

    \((\bar{X}_1-\bar{X}_2)-t_C\cdot\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)\(<\mu_1 - \mu_2<\)\((\bar{X}_1-\bar{X}_2)+t_C\cdot\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}\)

    \((10.5-12.9)-3.25\cdot\sqrt{\dfrac{3.6^2}{10} + \dfrac{2.68^2}{13}}\)\(<\mu_1 - \mu_2<\)\((10.5-12.9)+3.25\cdot\sqrt{\dfrac{3.6^2}{10} + \dfrac{2.68^2}{13}}\)

    \(-2.4-4.419\)\(<\mu_1 - \mu_2<\)\(-2.4+4.419\)

    \(-6.819\)\(<\mu_1 - \mu_2<\)\(2.019\)

    Since the interval includes zero, there is no statistically significant evidence of a difference in mean weekly study hours between the two groups at the 99% confidence level.
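
    As a rough check on the arithmetic, the t-based interval can be sketched in Python using the table values \(n_1 = 10\) and \(n_2 = 13\). The function name `two_mean_t_interval` is an assumption made for this illustration. The critical value 3.25 is passed in from the t-table because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt


def two_mean_t_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mu_1 - mu_2 using sample SDs and a t critical value."""
    difference = mean1 - mean2
    # Standard error of the difference, with each variance divided by its own n.
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_crit * standard_error
    return difference - margin, difference + margin


n1, n2 = 10, 13
# Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1.
degrees_of_freedom = min(n1 - 1, n2 - 1)

# t_crit = 3.25 is the table value for 99% confidence and 9 degrees of freedom.
lower, upper = two_mean_t_interval(10.5, 3.6, n1, 12.9, 2.68, n2, t_crit=3.25)
print(f"df = {degrees_of_freedom}")
print(f"{lower:.3f} < mu_1 - mu_2 < {upper:.3f}")
print("interval contains zero:", lower <= 0 <= upper)
```

    With these inputs the endpoints round to -6.819 and 2.019, so the interval contains zero.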

    Confidence Interval for the Difference Between Two Proportions

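    The formula is presented in the box below. It relies on the normal approximation, which is reasonable when each sample has at least five expected successes and five expected failures: \(n_1\cdot p_1 \geq 5\), \(n_1\cdot q_1 \geq 5\), \(n_2\cdot p_2 \geq 5\), and \(n_2\cdot q_2 \geq 5\), where \(q_1 = 1 - p_1\) and \(q_2 = 1 - p_2\).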
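
    The normality condition can be checked with a few lines of Python. This is only a sketch under two assumptions: the population proportions are unknown, so the sample counts stand in for \(n \cdot p\), and the companion check on failures (\(n \cdot q \geq 5\)) is included as the usual counterpart of the check on successes. The function name `normal_approximation_ok` is hypothetical.

```python
def normal_approximation_ok(successes: int, n: int, threshold: int = 5) -> bool:
    """Check that one sample has enough successes and failures."""
    failures = n - successes
    # n * p_hat equals the success count and n * q_hat equals the failure count.
    return successes >= threshold and failures >= threshold


# Counts from the community college example later in this section.
print(normal_approximation_ok(98, 300))
print(normal_approximation_ok(178, 400))
```

    Both samples in that example pass the check, so the Z-based interval is appropriate there.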

    Definition: Confidence Interval for the Difference Between Two Proportions

    \(( \hat{p}_1 - \hat{p}_2) - Z_C \cdot \sqrt{\dfrac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \)\(<p_1-p_2<\)\(( \hat{p}_1 - \hat{p}_2) + Z_C \cdot \sqrt{\dfrac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \)

    Where,

    • \(\hat{p}_1 =\) sample proportion for sample 1, calculated as \(\hat{p}_1 = \dfrac{X_1}{n_1}\)
    • \(\hat{p}_2 =\) sample proportion for sample 2, calculated as \(\hat{p}_2 = \dfrac{X_2}{n_2}\)
    • \(\hat{q}_1 =\) complement of sample proportion for sample 1, calculated as \(\hat{q}_1 =1 -\hat{p}_1 \)
    • \(\hat{q}_2 =\) complement of sample proportion for sample 2, calculated as \(\hat{q}_2 =1 -\hat{p}_2 \)
    • \(p_1\) = the proportion of population 1.
    • \(p_2\) = the proportion of population 2.
    • \(X_1\) = number of successes in sample 1.
    • \(X_2\) = number of successes in sample 2.
    • \(n_1\) = size of sample 1.
    • \(n_2\) = size of sample 2.
    • \(Z_C\) = critical value from the standard normal distribution for a given confidence level.
    Example \(\PageIndex{3}\)

    A community college is interested in comparing the proportions of students who take classes during the winter intersession versus the summer intersession. A survey of students yielded the results listed in the table below. Construct a confidence interval for the difference of the two proportions using a confidence level of 98%. Round all calculations to three decimal places.

    Statistic                  Winter Intersession    Summer Intersession
    Students taking classes    \(X_1 = 98\)           \(X_2 = 178\)
    Students surveyed          \(n_1 = 300\)          \(n_2 = 400\)
    Solution

    Step 1) Compute the sample proportions and their complements.

    \(\hat{p}_1 = \dfrac{X_1}{n_1} = \dfrac{98}{300} = 0.327\)

    \(\hat{q}_1 =1 - 0.327 = 0.673 \)

    \(\hat{p}_2 = \dfrac{X_2}{n_2} = \dfrac{178}{400} = 0.445\)

    \(\hat{q}_2 =1 - 0.445 = 0.555\)

    Step 2) Look up the critical value in the table or a graphing calculator. The value is \(Z_C = 2.33\)

    Step 3) Plug information into the formula and calculate using the order of operations.

    \(( \hat{p}_1 - \hat{p}_2) - Z_C \cdot \sqrt{\dfrac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \)\(<p_1-p_2<\)\(( \hat{p}_1 - \hat{p}_2) + Z_C \cdot \sqrt{\dfrac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \)

    \(( 0.327 - 0.445) - 2.33 \cdot \sqrt{\dfrac{0.327 \cdot 0.673}{300} + \dfrac{0.445 \cdot 0.555}{400}} \)\(<p_1-p_2<\)\(( 0.327 - 0.445) + 2.33 \cdot \sqrt{\dfrac{0.327 \cdot 0.673}{300} + \dfrac{0.445 \cdot 0.555}{400}} \)

    \(-0.118-0.086\) \(<p_1-p_2<\) \(-0.118+0.086\)

    \(-0.204\) \(<p_1-p_2<\) \(-0.032\)

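    Since the interval does not include 0, there is a statistically significant difference between the two proportions at the 98% confidence level. Because both endpoints are negative, the proportion of students taking classes during the winter intersession is lower than the proportion during the summer intersession.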
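
    The interval for two proportions can also be sketched in Python. The function name `two_proportion_z_interval` is an assumption made for this illustration. The counts come from the table above, and the critical value 2.33 is passed in as the table value.

```python
from math import sqrt


def two_proportion_z_interval(x1, n1, x2, n2, z_crit):
    """Confidence interval for p_1 - p_2 from success counts and sample sizes."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    difference = p1_hat - p2_hat
    # Standard error uses each sample proportion and its complement (q_hat = 1 - p_hat).
    standard_error = sqrt(
        p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    )
    margin = z_crit * standard_error
    return difference - margin, difference + margin


# Winter is sample 1 and summer is sample 2.
lower, upper = two_proportion_z_interval(98, 300, 178, 400, z_crit=2.33)
print(f"{lower:.3f} < p_1 - p_2 < {upper:.3f}")
print("interval contains zero:", lower <= 0 <= upper)
```

    Because this sketch does not round the sample proportions before using them, an endpoint can differ in the third decimal place from a hand calculation that rounds at each step. Both endpoints are negative, so the interval does not contain zero.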

    The TI-84+ can be used to compute the confidence interval as well.

    Step 1) Press the "stat" button and use the right arrow to select "TESTS." Use the down arrow to select option B. This option has the title "2-PropZInt."

    Step 2) Under "2-PropZInt," select "Stats." Type in the given data and select "Calculate."

    Step 3) The output is at the top of the screen. The two endpoints are written in parentheses and separated by a comma. After rounding both to three decimal places, the confidence interval is (–0.204, –0.033). The right endpoint differs slightly from the answer computed with the formula because the calculator does not round intermediate values, whereas the hand calculation rounds at each step.

    The image below captures all three steps.

    Figure \(\PageIndex{2}\): TI-84+ screens for the 2-PropZInt steps described above.

    This page titled 9.4: Confidence Intervals for the Difference Between Two Means and Proportions is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Toros Berberyan, Tracy Nguyen, and Alfie Swan.
