11.7: Effect Sizes and Confidence Intervals

Foster et al.
University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus via University of Missouri’s Affordable and Open Access Educational Resources Initiative

We have seen in previous chapters that even a statistically significant effect needs to be interpreted along with an effect size to see if it is practically meaningful. We have also seen that a sample mean, as a point estimate, is not perfect and is better represented by a range of values that we call a confidence interval. Both ideas apply equally to our independent samples \(t\)-tests.

Our effect size for the independent samples \(t\)-test is still Cohen’s \(d\), and it is still just our observed effect divided by the standard deviation. Remember that standard deviation is just the square root of the variance, and because we work with pooled variance in our test statistic, we will use the square root of the pooled variance as our denominator in the formula for Cohen’s \(d\). This gives us:

\[d=\dfrac{M_{1}-M_{2}}{\sqrt{s_{p}^{2}}} \]

For our example above, we can calculate the effect size to be:

\[d=\dfrac{24.00-16.50}{\sqrt{144.48}}=\dfrac{7.50}{12.02}=0.62 \nonumber \]

We interpret this using the same guidelines as before, so we would consider this a moderate or moderately large effect.
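As a quick check on the arithmetic above, the calculation can be sketched in Python. The function name `cohens_d` and its parameter names are illustrative choices, not part of the text; the inputs are the two means and the pooled variance from the example.

```python
import math

def cohens_d(mean_1, mean_2, pooled_variance):
    """Cohen's d for independent samples: mean difference over the pooled SD."""
    pooled_sd = math.sqrt(pooled_variance)  # SD is the square root of the variance
    return (mean_1 - mean_2) / pooled_sd

# Values from the worked example: M1 = 24.00, M2 = 16.50, pooled variance = 144.48
d = cohens_d(24.00, 16.50, 144.48)
print(round(d, 2))  # 0.62
```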

Our confidence intervals also take on the same form and interpretation as they have in the past. The value we are interested in is the difference between the two means, so our point estimate is the value of one mean minus the other, or \(M_{1}-M_{2}\). Just like before, this is our observed effect and is the same value as the one we place in the numerator of our test statistic. We calculate this value and then place the margin of error (still our critical value times our standard error) above and below it. That is:

\[\text { Confidence Interval }=(M_{1}-M_{2}) \pm t^{*}\left(s_{M_{1}-M_{2}}\right) \]

Because our hypothesis testing example used a one-tailed test, it would be inappropriate to calculate a confidence interval on those data (remember that we can only calculate a confidence interval for a two-tailed test because the interval extends in both directions). Let’s say we find summary statistics on the average life satisfaction of people from two different towns and want to create a confidence interval to see if the difference between the two might actually be zero.

Our sample data are \(M_{1}=28.65\), \(s_{1}=12.40\), \(n_{1}=40\) and \(M_{2}=25.40\), \(s_{2}=15.68\), \(n_{2}=42\). At face value, it looks like the people from the first town have higher life satisfaction (28.65 vs. 25.40), but it will take a confidence interval (or complete hypothesis testing process) to see if that is true or just due to random chance. First, we want to calculate the difference between our sample means, which is 28.65 – 25.40 = 3.25. Next, we need a critical value from our \(t\)-table. If we want to use the conventional 95% level of confidence, then our sample sizes will yield degrees of freedom equal to 40 + 42 – 2 = 80. From our table, that gives us a critical value of \(t^{*} = 1.990\). Finally, we need our standard error. Recall that our standard error for an independent samples \(t\)-test uses pooled variance, which requires the Sum of Squares and degrees of freedom. Up to this point, we have calculated the Sum of Squares using raw data, but in this situation, we do not have access to the raw data. So, what are we to do?

If we have summary data like standard deviation and sample size, it is very easy to calculate the pooled variance, and the key lies in rearranging the formulas to work backwards through them. We need the Sum of Squares and degrees of freedom to calculate our pooled variance. Degrees of freedom is very simple: we just take the sample size minus 1 for each group. Getting the Sum of Squares is also easy: remember that variance is standard deviation squared and is the Sum of Squares divided by the degrees of freedom. That is:

\[s^{2}=(s)^{2}=\dfrac{S S}{d f} \]

To get the Sum of Squares, we just multiply both sides of the above equation by the degrees of freedom to get:

\[s^{2} * d f=S S \]

In words, the squared standard deviation multiplied by the degrees of freedom (\(n-1\)) equals the Sum of Squares.

Using our example data:

\[\begin{array}{c}{\left(s_{1}\right)^{2} * d f_{1}=S S_{1}} \\ {(12.40)^{2} *(40-1)=5996.64}\end{array} \nonumber \]

\[\begin{array}{c}{\left(s_{2}\right)^{2} * d f_{2}=S S_{2}} \\ {(15.68)^{2} *(42-1)=10080.36}\end{array} \nonumber \]
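The same back-calculation can be sketched in a few lines of Python. The helper name `sum_of_squares` is an illustrative choice; it assumes only a sample standard deviation and a sample size are available.

```python
def sum_of_squares(sd, n):
    """Recover the Sum of Squares from summary data: SS = s^2 * (n - 1)."""
    df = n - 1            # degrees of freedom for one group
    return sd ** 2 * df   # variance times degrees of freedom

# Summary statistics from the two towns in the example
ss_1 = sum_of_squares(12.40, 40)
ss_2 = sum_of_squares(15.68, 42)
print(round(ss_1, 2), round(ss_2, 2))  # 5996.64 10080.36
```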

And thus our pooled variance equals:

\[s_{p}^{2}=\dfrac{S S_{1}+S S_{2}}{d f_{1}+d f_{2}}=\dfrac{5996.64+10080.36}{39+41}=\dfrac{16077}{80}=200.96 \nonumber \]
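A minimal Python sketch of the pooled variance step, working directly from the standard deviations and sample sizes. The function name `pooled_variance` and its parameter names are assumptions made for illustration.

```python
def pooled_variance(sd_1, n_1, sd_2, n_2):
    """Pooled variance from summary data: (SS1 + SS2) / (df1 + df2)."""
    df_1, df_2 = n_1 - 1, n_2 - 1
    ss_1 = sd_1 ** 2 * df_1   # Sum of Squares for group 1
    ss_2 = sd_2 ** 2 * df_2   # Sum of Squares for group 2
    return (ss_1 + ss_2) / (df_1 + df_2)

sp2 = pooled_variance(12.40, 40, 15.68, 42)
print(round(sp2, 2))  # 200.96
```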

And our standard error equals:

\[s_{M_{1}-M_{2}}=\sqrt{\frac{s_{p}^{2}}{n_{1}}+\frac{s_{p}^{2}}{n_{2}}}=\sqrt{\frac{200.96}{40}+\frac{200.96}{42}}=\sqrt{5.02+4.78}=\sqrt{9.80}=3.13 \nonumber \]

All of these steps are just slightly different ways of using the same formulae, numbers, and ideas we have worked with up to this point. Once we get our standard error, it’s time to build our confidence interval.

\[95 \% C I=3.25 \pm 1.990(3.13) \nonumber \]

\[\begin{aligned} \text {Upper Bound} &=3.25+1.990(3.13) \\ U B &=3.25+6.23 \\ U B &=9.48 \end{aligned} \nonumber \]

\[\begin{aligned} \text {Lower Bound} &=3.25-1.990(3.13) \\ L B &=3.25-6.23 \\ L B &=-2.98 \end{aligned} \nonumber \]

\[95 \% C I=(-2.98,9.48) \nonumber \]

Our confidence interval, as always, represents a range of values that would be considered reasonable or plausible based on our observed data. In this instance, our interval (-2.98, 9.48) does contain zero. Thus, even though the means look a little bit different, it may very well be the case that the life satisfaction in both of these towns is the same. Showing otherwise would require more data.
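The whole procedure, from summary statistics to interval, can be sketched in Python as follows. The function name `ci_from_summary` and its parameters are hypothetical, and the critical value is passed in by hand (here 1.990, read from the \(t\)-table for 80 degrees of freedom) rather than computed.

```python
import math

def ci_from_summary(m_1, sd_1, n_1, m_2, sd_2, n_2, t_crit):
    """Confidence interval for M1 - M2 using pooled variance from summary data."""
    df_1, df_2 = n_1 - 1, n_2 - 1
    # Pooled variance: (SS1 + SS2) / (df1 + df2), with SS = s^2 * df
    pooled_var = (sd_1 ** 2 * df_1 + sd_2 ** 2 * df_2) / (df_1 + df_2)
    # Standard error of the difference between means
    se = math.sqrt(pooled_var / n_1 + pooled_var / n_2)
    diff = m_1 - m_2          # point estimate
    margin = t_crit * se      # margin of error
    return diff - margin, diff + margin

lower, upper = ci_from_summary(28.65, 12.40, 40, 25.40, 15.68, 42, t_crit=1.990)
print(round(lower, 2), round(upper, 2))
print("Interval contains zero:", lower <= 0 <= upper)
```

Because the code carries full precision through every step, it gives bounds of roughly -2.98 and 9.48. Hand calculations that round intermediate values can differ from this in the second decimal place, but the conclusion that the interval contains zero is unaffected.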