Nonparametric test: Mann-Whitney U test

The Mann-Whitney U test is also known as the Wilcoxon rank-sum test. It is generally used to compare two independent groups when the dependent variable is measured on an ordinal or continuous scale and does not follow a normal distribution. It is often treated as the nonparametric alternative to the independent-samples t-test, although the two tests do not always address the same hypothesis. When the assumptions of the t-test do hold, the Mann-Whitney U test has somewhat less power than the t-test.
Examples: satisfaction level (ordinal) by gender; income (continuous) by education level (two categories).

  • Compare the medians of manufacturing times (Y = continuous) of two different products (X).
  • Compare the medians of the number of child deaths per month (Y = discrete count) at two different sites (X).
It is always better to check the assumptions before running the test.
Assumptions:

1. The dependent variable should be measured on an ordinal (e.g. Likert scale) or continuous scale.

2. The independent variable should consist of two categorical, independent groups. Examples of independent variables that meet this criterion include gender (2 groups: male or female), employment status (2 groups: employed or unemployed), smoker (2 groups: yes or no), and so forth.

3. There should be independence of observations, which means that there is no relationship between the observations in each group or between the groups themselves. For example, there must be different participants in each group, with no participant being in more than one group. This is more of a study design issue than something anyone can test for, but it is an important assumption of the Mann-Whitney U test. If a study fails this assumption, you will need to use another statistical test instead of the Mann-Whitney U test (e.g., a Wilcoxon signed-rank test for paired observations).

4. A Mann-Whitney U test can be used when your two variables are not normally distributed. However, in order to know how to interpret the results from a Mann-Whitney U test, you have to determine whether your two distributions (i.e., the distribution of scores for both groups of the independent variable; for example, 'males' and 'females' for the independent variable, 'gender') have the same shape. If the two distributions have the same shape, the result can be read as a comparison of medians; if the shapes differ, the test only tells you whether values in one group tend to be larger than values in the other.

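Q. What do we test with the Mann-Whitney U test?
Ans. The medians of the two groups, provided the two distributions have the same shape (see "Do non-parametric tests compare medians?" below).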

Hypothesis:
H0 : There is no difference between X and Y
H1 : There is a difference between X and Y
(If you wish to be mathematically correct, you would use H0 : median1 = median2; H1 : median1 ≠ median2.) We reject the null hypothesis if the value we calculate (the test statistic U) is at or below the value from the tables (the critical value).
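
To make the test statistic concrete, here is a minimal Python sketch of how U can be computed from ranks. It uses only the standard library; the function names (midranks, mann_whitney_u) and the two small samples are made up for illustration, and tied values are given the average of the ranks they span.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y) from the rank sum of the pooled sample."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    return u_x, u_y


# Hypothetical data, for illustration only.
group_a = [12, 15, 11, 19, 14]
group_b = [22, 25, 17, 28, 21]

u_a, u_b = mann_whitney_u(group_a, group_b)
u = min(u_a, u_b)  # the smaller U is the one compared with the table
print("U =", u)
```

For these made-up samples U = 1. Standard tables give a two-sided 5% critical value of 2 for two groups of five, so here the null hypothesis would be rejected.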

The Kruskal-Wallis test is a nonparametric method designed to detect whether two or more samples come from the same distribution; it extends the Mann-Whitney U test to more than two groups.
Code:
R code:
# Independent 2-group Mann-Whitney U test
wilcox.test(Y ~ A)
# where Y is numeric and A is a binary factor
# Independent 2-group Mann-Whitney U test
wilcox.test(Y, X)
# where Y and X are numeric
# Dependent 2-group Wilcoxon signed-rank test
wilcox.test(Y1, Y2, paired = TRUE)
# where Y1 and Y2 are numeric

Unpaired two sample test in R

SPSS procedure

Stata code:
ranksum var1, by(var2)

var1= Continuous/Ordinal variable
var2= Categorical variable (should be binary) 
An interesting Stata user article is "What hypotheses do “nonparametric” two-group tests actually test?"

Data report:

In the analysis report, give the median and interquartile range (IQR) rather than mean ± SD, which is inappropriate for data that are not normally distributed.
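
As a rough illustration of this kind of summary, the Python sketch below computes a median and IQR with the standard library. The helper name median_iqr and the data are hypothetical, and the "inclusive" quartile method is only one of several definitions in use, so other software may give slightly different quartiles.

```python
import statistics


def median_iqr(values):
    """Return (median, first quartile, third quartile) of the values."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical skewed data: a single large value pulls the mean upward.
times = [2, 3, 3, 4, 5, 6, 7, 9, 40]

med, q1, q3 = median_iqr(times)
summary = f"{med:g} (IQR {q1:g} to {q3:g})"
print("median (IQR):", summary)
print("mean:", round(statistics.mean(times), 2))
```

With these numbers the median is 5 with quartiles 3 and 7, while the mean is close to 9 because of the single value of 40.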

Do non-parametric tests compare medians?
It is a commonly held belief that a Mann-Whitney U test is in fact a test for differences in medians. However, two groups could have the same median and yet have a significant Mann-Whitney U test. Consider the following data for two groups, each with 100 observations. Group 1: 98 zeros, one 1 and one 2; Group 2: 51 zeros, one 1 and 48 twos. The median in both cases is 0, but the Mann-Whitney test gives P<0.0001.
Only if we are prepared to make the additional assumption that the difference in the two groups is simply a shift in location (that is, the distribution of the data in one group is simply shifted by a fixed amount from the other) can we say that the test is a test of the difference in medians. However, if the groups have the same distribution apart from that shift, then a shift in location will move medians and means by the same amount, and so the difference in medians is the same as the difference in means. Thus the Mann-Whitney U test is also a test for the difference in means.
How is the Mann-Whitney U test related to the t-test? If one were to input the ranks of the data rather than the data themselves into a two sample t-test program, the P value obtained would be very close to that produced by a Mann-Whitney U test.
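
The two-group example above (98 zeros, a 1 and a 2 against 51 zeros, a 1 and 48 twos) can be checked with a short Python sketch. It assumes the large-sample normal approximation with the usual correction for ties and no continuity correction; the function name is made up, and statistical packages may differ in these details.

```python
import math
from collections import Counter
from statistics import median


def mann_whitney_normal_approx(x, y):
    """Return (U for x, z, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(list(x) + list(y))

    # Assign each distinct value the average rank of its tied block.
    rank_of = {}
    seen = 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = seen + (t + 1) / 2
        seen += t

    u_x = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2

    # Variance of U, reduced to account for ties.
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

    z = (u_x - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return u_x, z, p


group1 = [0] * 98 + [1, 2]
group2 = [0] * 51 + [1] + [2] * 48

u1, z, p = mann_whitney_normal_approx(group1, group2)
print("medians:", median(group1), median(group2))
print("U =", u1, " z =", round(z, 2), " p =", p)
```

Both medians are 0, yet U for group 1 is 2626.5 against an expected 5000 under the null hypothesis, z is about -7.7, and the two-sided p-value is far below 0.0001.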
Other misuses relate to the problems of small samples and tied data. There is an exact test for small samples, but this is only valid if there are few or no ties within or between groups. The test is sometimes applied to heavily tied data which makes the test too liberal in reporting differences. We also find examples where use of the normal approximation is borderline for the sample sizes used. A confidence interval is sometimes attached to the median difference, but this is rarely done except in medical research. This is a pity, because estimation of magnitude of the treatment effect should be a primary component of any statistical analysis.
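
The point estimate usually reported alongside such a confidence interval is the Hodges-Lehmann estimator mentioned in the references further down: the median of all pairwise differences between the two groups. The Python sketch below shows the idea with a hypothetical function name and made-up data; it gives the point estimate only, not the interval.

```python
from statistics import median


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences x_i - y_j (estimated location shift)."""
    return median(xi - yj for xi in x for yj in y)


# Hypothetical data, for illustration only.
treated = [22, 25, 17, 28, 21]
control = [12, 15, 11, 19, 14]

shift = hodges_lehmann_shift(treated, control)
difference_of_medians = median(treated) - median(control)
print("Hodges-Lehmann shift:", shift)
print("difference of medians:", difference_of_medians)
```

Here the Hodges-Lehmann estimate is 9 while the simple difference between the two sample medians is 8: the median of the differences is not, in general, the same as the difference of the medians.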
We give a few examples of another test, the median test, although it is now rarely used. This is a pity because it is less susceptible to differences in distributions, and hence more readily interpretable in terms of differences between medians. Surprisingly, the few examples we have included make the rather obvious error of reporting arithmetic means and standard errors. This can be wildly misleading if distributions are skewed - as the name suggests, the median test compares ... medians!

What the statisticians say

Conover (1999) covers the Wilcoxon-Mann-Whitney as the Mann-Whitney test, although he only gives details on the (Wilcoxon) sum-of-ranks statistic. Table values of W for nA,nB up to 20 are given. Sprent (1998) provides a comprehensive treatment of rank tests of location for two independent samples in Chapter 4. Hollander & Wolfe (1973) and Siegel (1956) both cover the Wilcoxon-Mann-Whitney test in their texts on nonparametric statistics.
Okeh (2009) reviews the application of the Wilcoxon Mann-Whitney U test in medical research studies. Zimmerman (2003) warns that the large-sample Wilcoxon-Mann-Whitney test can be strongly influenced by unequal variances of treatment groups even when sample sizes are equal. Hart (2001) notes that the Wilcoxon-Mann-Whitney test is a test of both location and shape - not as most researchers consider it a test of difference between medians. Freidlin & Gastwirth (2000) advocate the retirement of the median test from general use, being replaced by the Wilcoxon-Mann-Whitney and related tests. Potvin and Roff (1993) propose more general use of non-parametric tests in ecological research, but Johnson (1995) and Smith (1995) take issue with this point of view.
Wilcoxon (1945) first proposed the test for equal sample sizes, and then Mann & Whitney (1947) extended the test to cover different sample sizes. Hodges & Lehmann (1963) discuss the properties of the Hodges-Lehmann estimator of median difference.
Wikipedia (2008) provides a comprehensive account of the Wilcoxon-Mann-Whitney test with a useful section on its relation to other tests; the median test and the Hodges-Lehmann estimator are also covered. Various universities give tables of the Wilcoxon Rank Sum statistic online.
