# 9.5.1: Non-Parametric Independent Sample t-Test

The alternatives to all statistical analyses comparing means are non-parametric analyses. A parameter is a value that describes the population (a statistic describes a sample). Non-parametric statistics don’t require the population data to be normally distributed.

If the data are not normally distributed, then we can’t compare means because there is no center! Non-normal distributions may occur when there are:

- Few people (small N)
- Extreme scores (outliers)
- An arbitrary cut-off point on the scale (like if a survey asked for ages, but then just said, “17 and below”)

All of the non-parametric statistics for use with quantitative variables (means) work with the *ranks* of the variables, rather than the values themselves.

From Ch. 1, what is the scale of measurement called for ranked variables?

**Answer**-
Ordinal scale of measurement

If your data started out as a quantitative variable but you need to convert it to ranks for these analyses, you give the smallest score of either of your groups the smallest rank, making each higher score a bigger rank. Tied values each get the mean of the ranks involved. For example, if April’s score and Primavera’s score are tied for the 3rd and 4th ranks, both get a rank of 3.5.
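The tie-averaging rule can be sketched in a few lines of Python (a minimal illustration with made-up scores, not a library implementation):

```python
def average_ranks(scores):
    """Rank scores from smallest to largest; tied scores each get the
    mean of the ranks they would otherwise occupy."""
    ordered = sorted(scores)
    rank_of = {}
    for value in ordered:
        # 1-based positions this value occupies in the sorted order.
        positions = [j + 1 for j, v in enumerate(ordered) if v == value]
        rank_of[value] = sum(positions) / len(positions)
    return [rank_of[s] for s in scores]

# The two middle scores are tied for the 3rd and 4th ranks,
# so both receive a rank of 3.5.
print(average_ranks([10, 12, 15, 15, 20]))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```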

## Mann-Whitney U Test

The Mann-Whitney U-test is a non-parametric alternative to an independent samples \(t\)-test that some people recommend for non-normal data. An independent samples \(t\)-test can usually handle data that are not normally distributed or whose standard deviations differ somewhat, so there's little reason to use the Mann-Whitney U-test unless you have a true ranked variable instead of a quantitative variable.

Despite that fact, this is a behavioral statistics textbook, so we’re going to talk about this statistical alternative.

You can use the Mann-Whitney when:

- Your data are already in ranks (ordinal), *or*
- You’d like to use an independent samples t-test, but the data are probably not normally distributed. (When the data are not normally distributed, the mean is sorta… meaningless.)

### Formula

The formulas are below, but they are so uncommonly used that they won’t be in the Common Formulas page at the back of the textbook. Similarly, there is a critical value table for U-scores, but that will also not be included in the Common Critical Values page.

To calculate this formula, you would need to list out all of the scores in order, and identify which is from which group. Then, you would calculate R1, which is the sum of the *ranks* of all of the scores (not the scores themselves) from the first group.

\[ U_1 = (N_1 * N_2) + \left( \dfrac{(N_1 * (N_1 + 1))}{2} \right) - R_1 \nonumber \]

\[ U_2 = (N_1 * N_2) - U_1 \nonumber \]
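As a sketch, the two formulas can be computed directly from the group sizes and \(R_1\). The scores below are invented for illustration:

```python
# Hypothetical example: Group 1 scored [1, 2, 5] and Group 2 scored [3, 4, 6].
# Combined and ranked smallest-to-largest, Group 1 holds ranks 1, 2, and 5.
n1, n2 = 3, 3
r1 = 1 + 2 + 5  # sum of the RANKS (not scores) of Group 1

u1 = (n1 * n2) + (n1 * (n1 + 1)) / 2 - r1
u2 = (n1 * n2) - u1

# Handy arithmetic check: the two U values always sum to N1 * N2.
print(u1, u2, u1 + u2 == n1 * n2)  # 7.0 2.0 True
```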

### Mann-Whitney steps:

- Calculate the *two* formulas,
- Then compare the *smallest* of the two calculated U values to a critical U from a critical U table.
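These two steps can be sketched as follows. Suppose the formulas gave U values of 7 and 2; the critical value here is a hypothetical placeholder, not taken from an actual table:

```python
# Hypothetical U values from the two formulas.
u1, u2 = 7.0, 2.0
u_obtained = min(u1, u2)  # step 2 uses the SMALLEST of the two

# Hypothetical critical U for illustration only; look up the real value
# for your sample sizes and alpha level in a critical U table.
u_critical = 0

# Unlike most test statistics, the Mann-Whitney null hypothesis is
# rejected when the obtained U is LESS THAN OR EQUAL TO the critical U.
reject_null = u_obtained <= u_critical
print(u_obtained, reject_null)  # 2.0 False
```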

### Interpreting Results

Imagine that I wanted to compare the mean of Exam #1 of two sections of my behavioral statistics classes, one in the morning and one in the evening.

- Research Hypothesis: The morning class’s average Exam #1 score will be higher than the average Exam #1 score of the evening section.
- Symbols: \( \bar{X_M} > \bar{X_E} \)

What’s the null hypothesis in words and symbols?

**Answer**-
- Null Hypothesis: The morning class’s average Exam #1 score will be similar to the average Exam #1 score of the evening section.
- Symbols: \( \bar{X_M} = \bar{X_E} \)

I don’t think that the data are normally distributed, so I would run a Mann-Whitney U. If I got results like these: (U(62) = 1.11, p < .05), I would reject the null hypothesis. After looking at the actual ranks, I would conclude that the morning class’s Exam #1 ranks were higher than the ranks of the evening class on Exam #1.
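In practice this test is rarely computed by hand. As one option, SciPy's `scipy.stats.mannwhitneyu` function runs the whole test; the exam scores below are invented for illustration, not actual class data:

```python
from scipy import stats

# Invented Exam #1 scores for two hypothetical sections.
morning = [88, 92, 75, 95, 81, 90]
evening = [70, 85, 78, 72, 80, 74]

# alternative="greater" matches the directional research hypothesis
# (morning section ranks higher than evening section).
result = stats.mannwhitneyu(morning, evening, alternative="greater")
print(result.statistic, result.pvalue)
```

If the printed p-value is below .05, you would reject the null hypothesis and then inspect the ranks to describe the direction of the difference.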

And that's it!

## Contributors and Attributions

John H. McDonald (University of Delaware)