
11.8: Non-Parametric Analysis Between Multiple Groups


    When you have ranked data, or you suspect that your data are not normally distributed, you use a non-parametric analysis.  When you have three or more independent groups, the Kruskal-Wallis test is the one to use! The test statistic for the Kruskal-Wallis test is \(H\), just as the test statistic for a Student's t-test is \(t\) and the test statistic for an ANOVA is \(F\).

    More on When to Use the Kruskal-Wallis:

    Some people have the attitude that unless you have a large sample size and can clearly demonstrate that your data are normal, you should routinely use Kruskal–Wallis; they think it is dangerous to use a Between Groups ANOVA, which assumes normality, when you don't know for sure that your data are normal. However, a Between Groups ANOVA is generally robust (not very sensitive to violations of its assumptions, such as deviations from normality). Dr. McDonald has done simulations with a variety of non-normal distributions, including flat, highly peaked, highly skewed, and bimodal, and the proportion of false positives is always around \(5\%\) or a little lower, just as it should be. For this reason, he doesn't recommend the Kruskal-Wallis test as an alternative to a Between Groups ANOVA. However, because many people use the Kruskal-Wallis, you should be familiar with it even if Dr. McDonald convinces you that it's overused.

    The Kruskal-Wallis test is a non-parametric test, which means that it does not assume that the data come from a distribution that can be completely described by two parameters, mean and standard deviation (the way a normal distribution can). Like most non-parametric tests, you perform it on ranked data, so you convert the measurement observations to their ranks in the overall data set: the smallest value gets a rank of \(1\), the next smallest gets a rank of \(2\), and so on. You lose information when you substitute ranks for the original values, which can make this a somewhat less powerful test than a Between Groups ANOVA; this is another reason to prefer a Between Groups ANOVA.

    The other assumption of a Between Groups ANOVA is that the variation within the groups is equal (homoscedasticity). While Kruskal-Wallis does not assume that the data are normal, it does assume that the different groups have the same distribution, and groups with different standard deviations have different distributions. If your data are heteroscedastic, Kruskal–Wallis is no better than a Between Groups ANOVA, and may be worse. Instead, you should use Welch's ANOVA for heteroscedastic data (which is not discussed in this textbook).

    The only time I recommend using Kruskal-Wallis is when your original data set actually consists of one nominal variable and one ranked variable.


    Research Hypotheses

    Like a Between Groups ANOVA, the Kruskal-Wallis will have three or more groups, so the research hypothesis will describe how all of these groups relate by predicting their mean ranks.  

    Null Hypothesis

    The null hypothesis of the Kruskal–Wallis test is that the mean ranks of the groups are the same. 

    Critical Value

    \(H\) is approximately chi-square distributed.  We have not discussed chi-square, but we will!  The table of critical values of chi-square can be found on a future page or through the Common Critical Values page at the end of this textbook.  The degrees of freedom is the number of groups minus 1 (\(df = k - 1\)).

    Compute the Test Statistic

    Here are some data on Wright's \(F_{ST}\) (a measure of the amount of geographic variation in a genetic polymorphism) in two populations of the American oyster, Crassostrea virginica. McDonald et al. (1996) collected data on \(F_{ST}\) for six anonymous DNA polymorphisms (variation in random bits of DNA of no known function) and compared the \(F_{ST}\) values of the six DNA polymorphisms to \(F_{ST}\) values on \(13\) proteins from Buroker (1983). The biological question was whether protein polymorphisms would have generally lower or higher \(F_{ST}\) values than anonymous DNA polymorphisms. McDonald et al. (1996) knew that the theoretical distribution of \(F_{ST}\) for two populations is highly skewed, so they analyzed the data with a Kruskal–Wallis test.

    When working with a measurement variable, the Kruskal–Wallis test starts by substituting the rank in the overall data set for each measurement value. The smallest value gets a rank of \(1\), the second-smallest gets a rank of \(2\), etc. Tied observations get average ranks; in this data set, the two \(F_{ST}\) values of \(-0.005\) are tied for second and third, so they get a rank of \(2.5\).

    Table \(\PageIndex{1}\): Oyster Data and Ranks
    gene     class     \(F_{ST}\)   Rank (DNA)   Rank (Protein)
    CVJ5     DNA       -0.006       1
    CVB1     DNA       -0.005       2.5
    6Pgd     protein   -0.005                    2.5
    Pgi      protein   -0.002                    4
    CVL3     DNA        0.003       5
    Est-3    protein    0.004                    6
    Lap-2    protein    0.006                    7
    Pgm-1    protein    0.015                    8
    Aat-2    protein    0.016                    9.5
    Adk-1    protein    0.016                    9.5
    Sdh      protein    0.024                    11
    Acp-3    protein    0.041                    12
    Pgm-2    protein    0.044                    13
    Lap-1    protein    0.049                    14
    CVL1     DNA        0.053       15
    Mpi-2    protein    0.058                    16
    Ap-1     protein    0.066                    17
    CVJ6     DNA        0.095       18
    CVB2m    DNA        0.116       19
    Est-1    protein    0.163                    20
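
    The ranking step can be sketched in a few lines of Python. This is only an illustration, not the method the original authors used: the helper name `average_ranks` and the `oysters` list are made up for this example, and the values are copied from the table above.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) hold ranks start+1..end+1.
        shared_rank = ((start + 1) + (end + 1)) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


# (class, F_ST) pairs copied from the oyster table, in table order.
oysters = [
    ("DNA", -0.006), ("DNA", -0.005), ("protein", -0.005), ("protein", -0.002),
    ("DNA", 0.003), ("protein", 0.004), ("protein", 0.006), ("protein", 0.015),
    ("protein", 0.016), ("protein", 0.016), ("protein", 0.024), ("protein", 0.041),
    ("protein", 0.044), ("protein", 0.049), ("DNA", 0.053), ("protein", 0.058),
    ("protein", 0.066), ("DNA", 0.095), ("DNA", 0.116), ("protein", 0.163),
]

ranks = average_ranks([fst for _, fst in oysters])

# Add up the ranks within each class to get the rank sum for each group.
rank_sums = {}
for (gene_class, _), rank in zip(oysters, ranks):
    rank_sums[gene_class] = rank_sums.get(gene_class, 0.0) + rank

print(rank_sums)  # {'DNA': 60.5, 'protein': 149.5}
```

    The two tied values of -0.005 each receive a rank of 2.5, and the two tied values of 0.016 each receive a rank of 9.5, matching the table.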

    To use the following formula, there is one new notation:  \(R_{group}\).  This is the sum of the ranks for that group.  Otherwise, you've seen all of this before!

    \[H = \left[\left(\dfrac{12}{N * (N + 1)} \right) * \left( \sum{\dfrac{R_{group}^2}{n_{group}}} \right) \right] - \left( 3 * (N + 1) \right) \nonumber \]

    You calculate the sum of the ranks for each group (\(R_{group}\)), then the test statistic, \(H\). For the example data, you add the 6 ranks for the first group (\(\sum = 60.5 \)) and the 14 ranks for the second group (\(\sum = 149.5 \)), then follow the rest of the formula with \(N = 20\) to get \(H = 0.04\).
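
    As a check on the arithmetic, here is a minimal Python sketch of the calculation, using \(N + 1\) in both places where the total sample size is adjusted. The function name `kruskal_wallis_h` is made up for this example, and no correction for tied ranks is applied, so statistical software that does correct for ties may report a slightly different value.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """Compute H from each group's rank sum and sample size (no tie correction)."""
    n_total = sum(group_sizes)
    # Sum of (rank sum squared / group size) over all groups.
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return (12 / (n_total * (n_total + 1))) * weighted - 3 * (n_total + 1)


# Rank sums and group sizes from the oyster example: DNA first, then protein.
h = kruskal_wallis_h([60.5, 149.5], [6, 14])
print(round(h, 2))  # 0.04
```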

    Make the Decision

    The Kruskal-Wallis uses the Table of Critical Values of Chi-Square, which can be found on a page in the Chi-Square chapter, or you can find the link in the Common Tables of Critical Values page at the end of this textbook.  To use this table, the Degrees of Freedom for Kruskal-Wallis is the number of groups minus 1 (\(df = k - 1\)), in which \(k\) is the number of groups.  

    In our example above, we have two groups (protein or DNA), so the Degrees of Freedom would be 1 (\(df = k - 1 = 2 - 1 = 1\)).  The critical value for a probability of 0.05 with \(df = 1\) is 3.841.  Because our calculated \(H\) of 0.04 is smaller than this critical value, we would retain the null hypothesis and say that there is not a difference in mean ranks for these two groups.  
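
    The decision can also be sketched in Python with only the standard library. With \(k = 2\) groups, \(df = k - 1 = 1\), and a chi-square variable with one degree of freedom is the square of a standard normal variable, so the critical value and p-value can both be taken from the normal distribution. This shortcut only works for \(df = 1\); with more groups you would use the chi-square table or statistical software. The variable names are made up for this example.

```python
from math import erfc, sqrt
from statistics import NormalDist

alpha = 0.05
h = 0.04  # test statistic from the oyster example

# For df = 1 only: the chi-square critical value is the square of the
# two-tailed standard normal critical value.
critical_value = NormalDist().inv_cdf(1 - alpha / 2) ** 2

# For df = 1 only: P(chi-square > h) = P(|Z| > sqrt(h)) = erfc(sqrt(h / 2)).
p_value = erfc(sqrt(h / 2))

reject_null = h > critical_value
print(round(critical_value, 3))  # 3.841
print(reject_null)               # False, so we retain the null hypothesis
```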

    If you do find a significant difference in the ranks and you have more than two groups, you should run a Mann-Whitney U (discussed previously in the chapter on independent samples t-test analysis) as a pairwise comparison for each pair of groups. 


    The Kruskal–Wallis test does NOT assume that the data are normally distributed; that is its big advantage. If you're using it to test whether the medians are different, it does assume that the observations in each group come from populations with the same shape of distribution, so if different groups have different shapes (one is skewed to the right and another is skewed to the left, for example, or they have different variances), the Kruskal–Wallis test may give inaccurate results (Fagerland and Sandvik 2009). If you're interested in any difference among the groups that would make the mean ranks be different, then the Kruskal–Wallis test doesn't make any assumptions.

    Heteroscedasticity is one way in which different groups can have different shaped distributions. If the distributions are heteroscedastic, the Kruskal–Wallis test won't help you; instead, you should use Welch's t–test for two groups, or Welch's ANOVA for more than two groups; both of these are beyond the scope of this textbook but many statistical software packages will run these analyses.


    Fagerland, M.W., and L. Sandvik. 2009. The Wilcoxon-Mann-Whitney test under scrutiny. Statistics in Medicine 28: 1487-1497.

    McDonald, J.H., B.C. Verrelli and L.B. Geyer. 1996. Lack of geographic variation in anonymous nuclear polymorphisms in the American oyster, Crassostrea virginica. Molecular Biology and Evolution 13: 1114-1118.


    11.8: Non-Parametric Analysis Between Multiple Groups is shared under a CC BY license and was authored, remixed, and/or curated by Michelle Oja.