13.4: Wilcoxon Signed-Rank Test


    The Wilcoxon Signed-Rank Sum test is the non-parametric alternative to the dependent t-test. It compares the medians of two dependent distributions. The test, developed by Frank Wilcoxon, finds the difference between paired data values and ranks the absolute values of the differences, giving tied values the average of their ranks. The ranks of the negative differences and the ranks of the positive differences are then summed separately. The smaller of the absolute values of these two sums is called \(w_{s}\). Any differences of zero are discarded and are not counted in the sample size.

    Black-and-white portrait photograph of Frank Wilcoxon.
    Frank Wilcoxon
    Small Sample Size Case: \(n < 30\)

    When the sample size is less than 30, the test statistic is \(w_{s}\), the absolute value of the smaller of the sum of ranks. Figure 13-5 provides critical values for the Wilcoxon Signed-Rank test. If the test statistic \(w_{s}\) is greater than the critical value from the table, we fail to reject \(H_{0}\). If the test statistic \(w_{s}\) is less than or equal to the critical value from the table, we reject \(H_{0}\).

    Figure 13-5: Critical Values for the Signed Rank Test. Dashes indicate that the sample is too small to reject \(H_{0}\).
      1-Tailed \(\alpha\)   2-Tailed \(\alpha\)
    \(n\) 0.01 0.05 0.10   0.01 0.05 0.10
    5 - 0 2   - - 0
    6 - 2 3   - 0 2
    7 0 3 5   - 2 3
    8 1 5 8   0 3 5
    9 3 8 10   1 5 8
    10 5 10 14   3 8 10
    11 7 13 17   5 10 13
    12 9 17 21   7 13 17
    13 12 21 26   9 17 21
    14 15 25 31   12 21 25
    15 19 30 36   15 25 30
    16 23 35 42   19 29 35
    17 27 41 48   23 34 41
    18 32 47 55   27 40 47
    19 37 53 62   32 46 53
    20 43 60 69   37 52 60
    21 49 67 77   42 58 67
    22 55 75 86   48 65 75
    23 62 83 94   54 73 83
    24 69 91 104   61 81 91
    25 76 100 113   68 89 100
    26 84 110 124   75 98 110
    27 92 119 134   83 107 119
    28 101 130 145   91 116 130
    29 110 140 157   100 126 140
    Example \(\PageIndex{1}\)

    In an effort to increase production of an automobile part, the factory manager decides to play music in the manufacturing area. Nine workers are selected, and the number of items each produced for a specific day is recorded. After one week of music, the same workers are monitored again. The data are given in the table. At \(\alpha = 0.05\), can the manager conclude that listening to music has increased production? Use the Wilcoxon Signed-Rank Test since there is no mention of the population being normally distributed.

    Worker 1 2 3 4 5 6 7 8 9

    Before 6 8 10 9 5 12 9 5 7

    After 10 12 9 12 8 13 8 5 10

    Solution

    The correct hypotheses are:

    \(H_{0}\): Music in the manufacturing area does not increase production.
    \(H_{1}\): Music in the manufacturing area increases production.

    This is a left-tailed test.

    In order to compute the test statistic, first compute the differences between each of the matched pairs.

    Before \((x_{1})\) 6 8 10 9 5 12 9 5 7
    After \((x_{2})\) 10 12 9 12 8 13 8 5 10
    \(D = x_{1} - x_{2}\) –4 –4 1 –3 –3 –1 1 0 –3

    Take the absolute value of each difference.

    Before \((x_{1})\) 6 8 10 9 5 12 9 5 7
    After \((x_{2})\) 10 12 9 12 8 13 8 5 10
    \(D = x_{1} - x_{2}\) –4 –4 1 –3 –3 –1 1 0 –3
    \(|D|\) 4 4 1 3 3 1 1 0 3

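    Rank the absolute differences from smallest to largest, giving tied values the average of the ranks they would occupy. At this point, if any of the differences are zero, that pair is dropped: it is no longer used and is not ranked.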

    Before \((x_{1})\) 6 8 10 9 5 12 9 5 7
    After \((x_{2})\) 10 12 9 12 8 13 8 5 10
    \(D = x_{1} - x_{2}\) –4 –4 1 –3 –3 –1 1 0 –3
    \(|D|\) 4 4 1 3 3 1 1 0 3
    Rank 7.5 7.5 2 5 5 2 2 drop 5

    The sample size \(n\) is the number of differences that are not zero. So, in this case, \(n = 8\). Next, take the sign of the difference and attach this plus or minus sign to each rank.

    Before \((x_{1})\) 6 8 10 9 5 12 9 7
    After \((x_{2})\) 10 12 9 12 8 13 8 10
    \(D = x_{1} - x_{2}\) –4 –4 1 –3 –3 –1 1 –3
    \(|D|\) 4 4 1 3 3 1 1 3
    Rank 7.5 7.5 2 5 5 2 2 5
    Signed Rank –7.5 –7.5 +2 –5 –5 –2 +2 –5
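    The ranking steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `signed_ranks` is hypothetical, and the rounding step is an assumption added so that equal differences tie exactly even when the data contain decimals.

```python
def signed_ranks(before, after):
    """Return the signed ranks of the nonzero differences D = before - after."""
    # Assumption: rounding suppresses floating-point noise so ties compare equal.
    diffs = [round(b - a, 10) for b, a in zip(before, after)]
    nonzero = [d for d in diffs if d != 0]        # zero differences are dropped
    ordered = sorted(abs(d) for d in nonzero)

    # Average rank for each distinct |D| (ranks are 1-based positions).
    avg_rank = {}
    for value in set(ordered):
        first = ordered.index(value) + 1          # lowest rank in the tied block
        count = ordered.count(value)              # size of the tied block
        avg_rank[value] = first + (count - 1) / 2

    # Attach the sign of each difference to its rank.
    return [avg_rank[abs(d)] if d > 0 else -avg_rank[abs(d)] for d in nonzero]


before = [6, 8, 10, 9, 5, 12, 9, 5, 7]
after = [10, 12, 9, 12, 8, 13, 8, 5, 10]
print(signed_ranks(before, after))
# [-7.5, -7.5, 2.0, -5.0, -5.0, -2.0, 2.0, -5.0]
```

    The output matches the Signed Rank row of the table, with worker 8 (difference of zero) removed.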

    Find the sum of the positive and negative ranks:

    Positive ranks: \(2 + 2 = 4\)

    Negative ranks: \((-7.5) + (-7.5) + (-5) + (-5) + (-2) + (-5) = -32\).

    Take the smaller of the absolute value of the sums of the ranks: \(|4| = 4, |-32| = 32\), so 4 is smaller.

    This is our test statistic called \(w_{s} = 4\).
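    As a quick check of the arithmetic, the two rank sums and \(w_{s}\) can be computed directly from the signed ranks in the table. The variable names here are illustrative only.

```python
# Signed ranks copied from the table for Example 1.
signed = [-7.5, -7.5, 2, -5, -5, -2, 2, -5]

w_plus = sum(r for r in signed if r > 0)     # sum of the positive ranks
w_minus = sum(r for r in signed if r < 0)    # sum of the negative ranks
w_s = min(abs(w_plus), abs(w_minus))         # smaller sum in absolute value

print(w_plus, w_minus, w_s)  # 4 -32.0 4
```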

    Next, use the table in Figure 13-5 to get the critical value. The table provides critical values for both one-tailed and two-tailed tests.

    This is a one-tailed test, with \(\alpha = 0.05\) and \(n = 8\). Figure 13-6 shows which row and column of Figure 13-5 to use to find the critical value.

    Figure 13-6: Critical value for 1-tailed test with \(\alpha = 0.05\) and \(n=8\).
        1-Tailed \(\alpha\)  
    \(n\) 0.01 0.05 0.10
    5 - 0 2
    6 - 2 3
    7 0 3 5
    8 1 5 8

    The critical value = 5. The test statistic \(w_{s} = 4\) is less than the critical value of 5.

    The decision rule for the critical values in Figure 13-5 is to reject the null if the test statistic is less than or equal to the critical value, and do not reject the null hypothesis if the test statistic is larger than the critical value.

    Since \(w_{s} < \text{Critical Value} = 5\), the decision is to reject \(H_{0}\). There is enough evidence to support the claim that listening to music has increased production.
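    The decision rule can be written as a small lookup, sketched below. The dictionary holds only the one-tailed \(\alpha = 0.05\) column of Figure 13-5 for \(n = 5\) to \(10\); the names `ONE_TAILED_05` and `reject_null` are hypothetical.

```python
# One-tailed alpha = 0.05 critical values from Figure 13-5 (rows n = 5..10 only).
ONE_TAILED_05 = {5: 0, 6: 2, 7: 3, 8: 5, 9: 8, 10: 10}


def reject_null(w_s, n, table=ONE_TAILED_05):
    """Reject H0 when the test statistic is at most the critical value."""
    critical = table[n]
    return w_s <= critical


print(reject_null(4, 8))  # True: 4 <= 5, so reject H0
```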

    When the sample size is 30 or more, a paired t-test may be used in most situations. However, if your population is heavily skewed or you are using interval data, then use the large sample size normal approximation Wilcoxon Signed-Rank test.

    Large Sample Size Case: \(n \geq 30\)

    We can use the normal approximation for sample sizes of 30 or more. The formula for the test statistic is: \[z = \frac{\left(w_{s} - \left(\dfrac{n (n+1)}{4}\right) \right)}{\sqrt{\left( \dfrac{n(n+1)(2n+1)}{24} \right)}} \nonumber\]

    where \(n\) is the reduced sample size excluding any differences of zero, and \(w_{s}\) is the smaller in absolute value of the two sums of signed ranks for a two-tailed test, the sum of the positive ranks for a left-tailed test, or the sum of the negative ranks for a right-tailed test.

    Example \(\PageIndex{2}\)

    A pharmaceutical company is testing to see if there is a significant difference in the pain relief for two new pain medications. They randomly assign the two different pain medications for 34 patients with chronic pain and record the pain rating for each patient one hour after each dose. The pain ratings are on a sliding scale from 1 to 10. The results are listed below. Use the Wilcoxon Signed-Rank test to see if there is a significant difference at \(\alpha = 0.05\).

    Patient Drug 1 Drug 2   Patient Drug 1 Drug 2   Patient Drug 1 Drug 2
    1 2.4 2.5   13 4 6.1   25 4 5.1
    2 4.7 3.3   14 2.2 2.9   26 5.5 4.4
    3 1.2 5.3   15 2.7 4.3   27 3.6 3.6
    4 5.9 5.6   16 2.9 3.3   28 3.8 3.5
    5 4.5 5   17 5 5   29 5.4 4.8
    6 4 5.3   18 3.1 5.1   30 2.4 3.2
    7 2.5 4.6   19 3.3 3.3   31 4.1 2.6
    8 3 2.5   20 3 5.9   32 4.5 5.7
    9 5 3.4   21 5.4 3.2   33 4 5.8
    10 5.8 5.4   22 4.2 5.9   34 6 5
    11 1.9 5.1   23 3.6 5.9
    12 3.2 4.3   24 2.2 5.6

    Solution

    The correct hypotheses are:

    \(H_{0}\): There is no difference in the pain scale rating for the two pain medications.
    \(H_{1}\): There is a difference in the pain scale rating for the two pain medications.

    Compute the differences between each of the matched pairs. Rank the absolute value of the differences. Make sure to average the ranks of repeated differences and do not rank any differences of zero. After the differences are ranked, attach the sign of the difference to each rank.

    Patient Drug 1 Drug 2 Difference \(|D|\) Rank Signed Rank
    1 2.4 2.5 –0.1 0.1 1 –1
    2 4.7 3.3 1.4 1.4 17 17
    3 1.2 5.3 –4.1 4.1 31 –31
    4 5.9 5.6 0.3 0.3 2.5 2.5
    5 4.5 5 –0.5 0.5 6.5 –6.5
    6 4 5.3 –1.3 1.3 16 –16
    7 2.5 4.6 –2.1 2.1 24.5 –24.5
    8 3 2.5 0.5 0.5 6.5 6.5
    9 5 3.4 1.6 1.6 19.5 19.5
    10 5.8 5.4 0.4 0.4 4.5 4.5
    11 1.9 5.1 –3.2 3.2 29 –29
    12 3.2 4.3 –1.1 1.1 13 –13
    13 4 6.1 –2.1 2.1 24.5 –24.5
    14 2.2 2.9 –0.7 0.7 9 –9
    15 2.7 4.3 –1.6 1.6 19.5 –19.5
    16 2.9 3.3 –0.4 0.4 4.5 –4.5
    17 5 5 0      
    18 3.1 5.1 –2 2 23 –23
    19 3.3 3.3 0      
    20 3 5.9 –2.9 2.9 28 –28
    21 5.4 3.2 2.2 2.2 26 26
    22 4.2 5.9 –1.7 1.7 21 –21
    23 3.6 5.9 –2.3 2.3 27 –27
    24 2.2 5.6 –3.4 3.4 30 –30
    25 4 5.1 –1.1 1.1 13 –13
    26 5.5 4.4 1.1 1.1 13 13
    27 3.6 3.6 0      
    28 3.8 3.5 0.3 0.3 2.5 2.5
    29 5.4 4.8 0.6 0.6 8 8
    30 2.4 3.2 –0.8 0.8 10 –10
    31 4.1 2.6 1.5 1.5 18 18
    32 4.5 5.7 –1.2 1.2 15 –15
    33 4 5.8 –1.8 1.8 22 –22
    34 6 5 1 1 11 11

    Find the sum of the positive and negative ranks:

    Positive ranks: \(17 + 2.5 + 6.5 + 19.5 + 4.5 + 26 + 13 + 2.5 + 8 + 18 + 11 = 128.5\)

    Negative ranks: \((-1) + (-31) + (-6.5) + (-16) + (-24.5) + (-29) + (-13) + (-24.5) + (-9) + (-19.5) + (-4.5) + (-23) + (-28) + (-21) + (-27) + (-30) + (-13) + (-10) + (-15) + (-22) = -367.5\).

    Take the smaller of the absolute value of the sums of the ranks: \(|128.5| = 128.5, |-367.5| = 367.5\), so 128.5 is smaller.

    The smaller of the absolute value of the sum of the ranks is \(w_{s} = 128.5\).

    Throw out the three differences of zero. The sample size is \(n = 31\).
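    With decimal ratings like these, a program that ranks the differences has to take care that equal differences really compare as equal. The sketch below is one possible approach, not the method used in the text: it reads the ratings as `Decimal` values (an assumption made to avoid binary floating-point noise), sorts the nonzero differences by absolute value, and averages the ranks within each tied block.

```python
from decimal import Decimal
from itertools import groupby

# Ratings for patients 1..34, copied from the data table.
drug1 = ("2.4 4.7 1.2 5.9 4.5 4 2.5 3 5 5.8 1.9 3.2 4 2.2 2.7 2.9 5 "
         "3.1 3.3 3 5.4 4.2 3.6 2.2 4 5.5 3.6 3.8 5.4 2.4 4.1 4.5 4 6").split()
drug2 = ("2.5 3.3 5.3 5.6 5 5.3 4.6 2.5 3.4 5.4 5.1 4.3 6.1 2.9 4.3 3.3 5 "
         "5.1 3.3 5.9 3.2 5.9 5.9 5.6 5.1 4.4 3.6 3.5 4.8 3.2 2.6 5.7 5.8 5").split()

# Decimal keeps one-decimal ratings exact, so equal differences tie exactly.
diffs = [Decimal(a) - Decimal(b) for a, b in zip(drug1, drug2)]
nonzero = sorted((d for d in diffs if d != 0), key=abs)  # drop zeros, sort by |D|
n = len(nonzero)

w_plus = w_minus = 0.0
position = 1                                  # rank of the first item in a block
for _, group in groupby(nonzero, key=abs):
    tied = list(group)
    avg = position + (len(tied) - 1) / 2      # average rank of the tied block
    for d in tied:
        if d > 0:
            w_plus += avg
        else:
            w_minus -= avg
    position += len(tied)

w_s = min(w_plus, -w_minus)
print(n, w_plus, w_minus, w_s)  # 31 128.5 -367.5 128.5
```

    The two sums also add up to \(n(n+1)/2 = 496\) in absolute value, which is a useful consistency check on any hand ranking.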

    The test statistic is: \(z = \frac{\left(w_{s} - \left(\frac{n(n+1)}{4}\right)\right)}{\sqrt{\left( \frac{n (n+1) (2n+1)}{24}\right) }} = \frac{\left(128.5 - \left( \frac{31 \cdot 32}{4}\right) \right)}{\sqrt{\left( \frac{31 \cdot 32 \cdot 63}{24}\right) }} = \frac{(128.5 - 248)}{\sqrt{(2604)}} = -2.341787\)
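    The same arithmetic can be reproduced with a short function. This is a minimal sketch of the normal-approximation formula given above; the function name `wilcoxon_z` is hypothetical.

```python
import math


def wilcoxon_z(w_s, n):
    """Normal-approximation z statistic for the signed-rank test."""
    mean = n * (n + 1) / 4                            # n(n+1)/4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # sqrt(n(n+1)(2n+1)/24)
    return (w_s - mean) / std


z = wilcoxon_z(128.5, 31)
print(round(z, 4))  # -2.3418
```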

    Either the \(z\) critical value or p-value method may be used, similar to how we used previous z-tests.

    Compute the \(z_{\alpha/2}\) critical values. Draw and label the distribution; see Figure 13-7.

    Figure 13-7: \(z_{\alpha/2}\) critical values of \(\pm 1.96\).

    Use the inverse normal function \(\text{invNorm}(0.025,0,1)\) to get \(z_{\alpha/2} = \pm 1.96\). The test statistic \(z = -2.3418\) is in the shaded critical region, so reject \(H_{0}\).
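    Readers without a calculator that offers an inverse normal function could obtain the same critical value, and the two-tailed p-value, from Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available, and takes the test statistic from the calculation above as given.

```python
from statistics import NormalDist

alpha = 0.05
z = -2.341787                     # test statistic computed above

std_normal = NormalDist()         # standard normal: mean 0, standard deviation 1
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # upper critical value
p_value = 2 * std_normal.cdf(-abs(z))        # two-tailed p-value

print(round(z_crit, 2), abs(z) > z_crit)     # 1.96 True
print(p_value < alpha)                       # True
```

    Both methods lead to the same decision: the test statistic falls in the critical region and the p-value is below \(\alpha\), so \(H_{0}\) is rejected.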

    There is enough evidence to support the claim that there is a significant difference in the pain scale rating for the two pain medications.

    These calculations can be done by hand or using the following online calculator: http://www.socscistatistics.com/tests/signedranks.

    The TI calculators and Excel do not have built-in nonparametric tests.

    It is an important and popular fact that things are not always what they seem. For instance, on the planet Earth, man had always assumed that he was more intelligent than dolphins because he had achieved so much – the wheel, New York, wars and so on – whilst all the dolphins had ever done was muck about in the water having a good time. But conversely, the dolphins had always believed that they were far more intelligent than man – for precisely the same reasons.

    (Adams, 2002)


    This page titled 13.4: Wilcoxon Signed-Rank Test is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Rachel Webb via source content that was edited to the style and standards of the LibreTexts platform.