10.4: Effect Size
As we discussed earlier (Section 11.8), it’s becoming commonplace to ask researchers to report some measure of effect size. So, let’s suppose that you’ve run your chi-square test, which turns out to be significant. So you now know that there is some association between your variables (independence test) or some deviation from the specified probabilities (goodness of fit test). Now you want to report a measure of effect size. That is, given that there is an association/deviation, how strong is it?
There are several different measures that you can choose to report, and several different tools that you can use to calculate them. I won’t discuss all of them, but will instead focus on the most commonly reported measures of effect size.
By default, the two measures that people tend to report most frequently are the ϕ statistic and the somewhat superior version, known as Cramér’s V. Mathematically, they’re very simple. To calculate the ϕ statistic, you just divide your \(X^{2}\) value by the sample size, and take the square root:
\(\phi=\sqrt{\dfrac{X^{2}}{N}}\)
The idea here is that the ϕ statistic is supposed to range between 0 (no association at all) and 1 (perfect association), but it doesn’t always do this when your contingency table is bigger than 2×2, which is a total pain. For bigger tables it’s actually possible to obtain ϕ>1, which is pretty unsatisfactory. So, to correct for this, people usually prefer to report the V statistic proposed by Cramér (1946). It’s a pretty simple adjustment to ϕ. If you’ve got a contingency table with r rows and c columns, then define k=min(r,c) to be the smaller of the two values. Cramér’s V statistic is then
\(V=\sqrt{\dfrac{X^{2}}{N(k-1)}}\)
And you’re done. This seems to be a fairly popular measure, presumably because it’s easy to calculate, and it gives answers that aren’t completely silly: you know that V really does range from 0 (no association at all) to 1 (perfect association).
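To make the difference between the two formulas concrete, here is a minimal sketch that computes both by hand in base R. The 3×3 table is made up purely for illustration (it isn’t data from this book), and it’s deliberately constructed so that the two variables are perfectly associated:

```r
# Hypothetical 3 x 3 contingency table, invented for illustration only:
# every observation falls on the diagonal, i.e. perfect association
tab <- diag(c(20, 20, 20))

# Pearson chi-square statistic from base R's chisq.test()
X2 <- unname(chisq.test(tab)$statistic)

N <- sum(tab)        # total sample size
k <- min(dim(tab))   # smaller of the number of rows and columns

phi <- sqrt(X2 / N)              # phi statistic
V   <- sqrt(X2 / (N * (k - 1)))  # Cramer's V

print(c(phi = phi, V = V))
```

For this table ϕ comes out at about 1.41, which is above its supposed upper limit, whereas V is exactly 1, as it should be for a perfect association.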
Calculating V or ϕ is obviously pretty straightforward. So much so that the core packages in R don’t seem to have functions to do it, though other packages do. To save you the time and effort of finding one, I’ve included one in the lsr package, called cramersV(). It takes a contingency table as input, and prints out the measure of effect size:
cramersV( chapekFrequencies )
## [1] 0.244058
However, if you’re using the associationTest() function to do your analysis, then you won’t actually need to use this at all, because it reports the Cramér’s V statistic as part of the output.