Wilcoxon-Mann-Whitney test
The Wilcoxon-Mann-Whitney test (also: Mann-Whitney U test, U test, Wilcoxon rank sum test) is the collective name for two nonparametric statistical tests for rank data (ordinally scaled data). They test whether, for two populations, a randomly selected value from one population is equally likely to be greater or smaller than a randomly selected value from the other population. If this hypothesis is rejected, it can be concluded that the values from one population tend to be larger or smaller than those from the other. Unlike the median test, the Mann-Whitney U test (Wilcoxon rank sum test) is not a priori a test for the equality of two medians; it is one only if the shape and the spread of the distribution of the dependent variable are the same in both groups.
The tests were developed by Henry Mann and Donald Whitney (U test, 1947) and by Frank Wilcoxon (Wilcoxon rank sum test, 1945). The central idea of the test had already been developed in 1914 by the German educator Gustaf Deuchler.
In practice, the Wilcoxon rank sum test or U test is used as an alternative to the t-test for independent samples when the assumptions of the t-test are violated. This is the case, for example, if the variable under test is only ordinally scaled, or if interval-scaled variables are not (approximately) normally distributed in the two populations.
The Wilcoxon rank sum test for two independent samples must not be confused with the Wilcoxon signed rank test, which is used for two dependent (paired) samples.
Assumptions
- There are independent samples $X_1, \dots, X_m$ from $X$ and $Y_1, \dots, Y_n$ from $Y$, which are also independent of one another.
Test statistics
For testing the hypotheses of the Wilcoxon-Mann-Whitney test
- $H_0: P(X > Y) = P(Y > X)$ vs. $H_1: P(X > Y) \neq P(Y > X)$
there are two test statistics: the Mann-Whitney U statistic $U$ and the Wilcoxon rank sum statistic $W_{m,n}$. Because of the relationship between the test statistics
- $W_{m,n} = U + \frac{m(m+1)}{2}$
the Wilcoxon rank sum test and the Mann-Whitney U test are equivalent.
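The relationship between the two statistics can be checked directly. The following short derivation is only a sketch and assumes that there are no ties; $R(X_i)$ denotes the rank of $X_i$ in the pooled sample and $U$ counts the pairs with $Y_j < X_i$, as defined in the two subsections below.

```latex
\begin{aligned}
R(X_i) &= 1 + \#\{k : X_k < X_i\} + \#\{j : Y_j < X_i\}, \\
W_{m,n} = \sum_{i=1}^{m} R(X_i)
  &= m + \frac{m(m-1)}{2} + U
   = U + \frac{m(m+1)}{2}.
\end{aligned}
```

The middle term arises because, among the $m$ values of $X$, exactly $m(m-1)/2$ ordered pairs satisfy $X_k < X_i$. Since the two statistics differ only by a constant, they lead to the same test decision.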
Mann-Whitney U statistic
The Mann-Whitney U test statistic is
- $U = \sum_{i=1}^{m} \sum_{j=1}^{n} S(X_i, Y_j)$,
where $S(X, Y)$ is $1$ if $Y < X$, $\tfrac{1}{2}$ if $Y = X$, and $0$ otherwise. Depending on the alternative hypothesis, the null hypothesis is rejected for values of $U$ that are too small or too large. This is the form found in Mann and Whitney, and the test in this form is often referred to as the Mann-Whitney U test.
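The definition translates almost literally into code. The following is a minimal sketch, assuming the convention that a pair scores 1 when the Y value is smaller than the X value and 1/2 for a tie; the function name `mann_whitney_u` and the toy samples are illustrative and not part of the article.

```python
def mann_whitney_u(xs, ys):
    """Sum S(x, y) over all pairs: 1 if y < x, 0.5 if y == x, else 0."""
    u = 0.0
    for x in xs:
        for y in ys:
            if y < x:
                u += 1.0
            elif y == x:
                u += 0.5
    return u


# Hypothetical toy samples (one tie at 3.4), chosen only for illustration.
xs = [1.1, 3.4, 5.0]
ys = [2.0, 3.4, 4.2, 6.3]

u_x = mann_whitney_u(xs, ys)
u_y = mann_whitney_u(ys, xs)
print(u_x, u_y)                       # 4.5 7.5
print(u_x + u_y == len(xs) * len(ys)) # True: the two counts add up to m * n
```

The double loop needs m * n comparisons, which is fine for small samples; for large samples the rank-based formula is the usual choice.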
Exact critical values
Exact critical values are only available in tabular form; for small sample sizes they can be taken from the table below ($\alpha = 5\,\%$ for the two-sided test and $\alpha = 2.5\,\%$ for the one-sided test).
There is a recursion formula that allows the critical values for small sample sizes to be determined step-by-step and with little computing time.
Approximate critical values
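For $m > 3$, $n > 3$ and $m + n > 19$, the distribution of $U$ can be approximated by the normal distribution:
- $U \approx N\left(\frac{m n}{2}, \frac{m n (m + n + 1)}{12}\right)$
The critical values then follow from the quantiles of the approximating normal distribution.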
Wilcoxon rank sum statistic
The Wilcoxon rank sum statistic is
- $W_{m,n} = \sum_{i=1}^{m} R(X_i)$,
with $R(X_i)$ the rank of the $i$-th $X$ in the pooled, ordered sample. In this form, the test is often called the Wilcoxon rank sum test.
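The rank sum can be sketched in a few lines. The example below assumes that tied values receive the mean of the ranks they occupy (midranks), and it recovers U from the rank sum by subtracting m(m+1)/2; the helper names `midranks` and `rank_sum` and the toy samples are hypothetical.

```python
def midranks(values):
    """Rank each value (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_sum(xs, ys):
    """Wilcoxon rank sum of xs within the pooled sample xs + ys."""
    ranks = midranks(list(xs) + list(ys))
    return sum(ranks[: len(xs)])


# Hypothetical toy samples (one tie at 3.4), chosen only for illustration.
xs = [1.1, 3.4, 5.0]
ys = [2.0, 3.4, 4.2, 6.3]

w = rank_sum(xs, ys)
m = len(xs)
u = w - m * (m + 1) / 2
print(w, u)  # 10.5 4.5
```

With midranks, the value of U obtained this way agrees with the pair count in which a tie scores 1/2.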
Exact critical values
The exact distribution of $W_{m,n}$ under the null hypothesis can be found by combinatorial arguments. However, the computational effort increases rapidly for large values of $m$ and $n$. The exact critical values $w_{\alpha; m, n}$ for the significance level $\alpha$ can be calculated using a recursion formula:
- $P(W_{m,n} = w) = \frac{m}{m+n} P(W_{m-1,n} = w - (m+n)) + \frac{n}{m+n} P(W_{m,n-1} = w)$, with the initial conditions $P(W_{0,n} = 0) = 1$ and $P(W_{m,0} = \tfrac{m(m+1)}{2}) = 1$.
The formula arises by conditioning on whether the last (largest) value in the pooled ordering is an X (... X) or a Y (... Y).
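The conditioning argument can be turned into a small memoized recursion. The sketch below assumes no ties and uses exact fractions; the function name `rank_sum_pmf` and the chosen base cases (an empty X sample has rank sum 0, an empty Y sample forces the rank sum m(m+1)/2) are this example's own conventions.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def rank_sum_pmf(m, n, w):
    """P(W_{m,n} = w) under the null hypothesis, assuming no ties."""
    if w < 0:
        return Fraction(0)
    if m == 0:
        return Fraction(1 if w == 0 else 0)
    if n == 0:
        return Fraction(1 if w == m * (m + 1) // 2 else 0)
    # The largest pooled value is an X with probability m/(m+n); it then has
    # rank m+n. Otherwise it is a Y and does not contribute to the rank sum.
    return (Fraction(m, m + n) * rank_sum_pmf(m - 1, n, w - (m + n))
            + Fraction(n, m + n) * rank_sum_pmf(m, n - 1, w))


# Distribution for m = n = 2: the rank sum ranges from 3 to 7.
for w in range(3, 8):
    print(w, rank_sum_pmf(2, 2, w))  # 1/6, 1/6, 1/3, 1/6, 1/6
```

Summing these probabilities from the smallest possible rank sum upwards gives the lower tail from which a critical value for a given significance level can be read off.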
Approximate critical values
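For $m > 25$ or $n > 25$ (also: $m > 10$ or $n > 10$), the distribution of the test statistic can be approximated by the normal distribution:
- $W_{m,n} \approx N\left(\frac{m (m + n + 1)}{2}, \frac{m n (m + n + 1)}{12}\right)$
The critical values then follow from the quantiles of the approximating normal distribution.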
One-sided hypotheses
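The test can also be formulated for the one-sided hypotheses
- $H_0: P(X > Y) \geq P(Y > X)$ vs. $H_1: P(X > Y) < P(Y > X)$, or
- $H_0: P(X > Y) \leq P(Y > X)$ vs. $H_1: P(X > Y) > P(Y > X)$.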
Derived hypotheses
The test is of particular interest because, under the requirements listed below, a decision about its null and alternative hypotheses carries over to the following hypotheses:
- $H_0: E(X) = E(Y)$ vs. $H_1: E(X) \neq E(Y)$,
i.e. under the alternative the means of the distributions of X and Y differ.
- $H_0: \operatorname{med}(X) = \operatorname{med}(Y)$ vs. $H_1: \operatorname{med}(X) \neq \operatorname{med}(Y)$,
i.e. under the alternative the medians of the distributions of X and Y differ.
Requirements:
- The random variables $X$ and $Y$ have continuous distribution functions $F_X$ and $F_Y$, which differ from each other only by a shift $a$, that is:
- $F_X(x) = F_Y(x - a)$.
- Because the two distribution functions are identical except for the shift, $\sigma_X^2 = \sigma_Y^2$ (homogeneity of variance) must hold in particular. That is, if homogeneity of variance is rejected by the Bartlett test or the Levene test, the two random variables X and Y differ by more than a shift.
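Why the shift requirement yields the derived hypotheses can be sketched in a few lines. The sign convention $F_X(x) = F_Y(x - a)$ is an assumption of this sketch, as is the existence of the means and variances and the uniqueness of the medians.

```latex
\begin{aligned}
F_X(x) = F_Y(x - a) \;&\Longleftrightarrow\; X \overset{d}{=} Y + a, \\
E(X) &= E(Y) + a, \qquad \operatorname{med}(X) = \operatorname{med}(Y) + a, \\
\operatorname{Var}(X) &= \operatorname{Var}(Y).
\end{aligned}
```

Under the shift model, equality of the two distributions is therefore the same as $a = 0$, which in turn is the same as equal means and equal medians, while the variances agree for every value of $a$.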
If the prerequisites for the hypothesis about the medians are not met, the median test can be used .
Example
From the data of the General Population Survey of the Social Sciences 2006, 20 people were randomly drawn and their net income was determined:
Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Net income | 0 | 400 | 500 | 550 | 600 | 650 | 750 | 800 | 900 | 950 | 1000 | 1100 | 1200 | 1500 | 1600 | 1800 | 1900 | 2000 | 2200 | 3500 |
Gender | M | W | M | W | M | W | M | M | W | W | M | M | W | M | W | M | M | M | M | M |
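There are two samples: the sample of men with $n_M = 13$ values and the sample of women with $n_W = 7$ values. We can now test whether the incomes of men and women are equal (two-sided test) or whether the income of women is lower (one-sided test), with $F_M$ the distribution function of the income of men and $F_W$ the distribution function of the income of women. We consider the following tests:
Two-sided test | One-sided test |
---|---|
$H_0$: the incomes of men and women do not differ; $H_1$: they differ | $H_0$: the income of women is not lower than that of men; $H_1$: the income of women is lower |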
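First, a test statistic $U$ is formed from the two series of values:
- $U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$ and $U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$.
$n_1$ and $n_2$ are the numbers of values per sample, and $R_1$ and $R_2$ are the respective sums of the ranks per sample. (If several values are tied, each of them is assigned the arithmetic mean, or equivalently the median, of the ranks they occupy.) The following tests require the minimum of $U_1$ and $U_2$, $U = \min(U_1, U_2)$.
For our example we get (index M = men, W = women)
- $n_M = 13$ and $n_W = 7$,
- $R_M = 1 + 3 + 5 + 7 + 8 + 11 + 12 + 14 + 16 + 17 + 18 + 19 + 20 = 151$ and $R_W = 2 + 4 + 6 + 9 + 10 + 13 + 15 = 59$,
- $U_M = 13 \cdot 7 + \frac{13 \cdot 14}{2} - 151 = 31$ and $U_W = 13 \cdot 7 + \frac{7 \cdot 8}{2} - 59 = 60$,
- $U = \min(31, 60) = 31$.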
If the calculation is correct, $U_1 + U_2 = n_1 n_2$ and $R_1 + R_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2}$ must hold. The test statistic is now compared with the critical value(s). The example has been chosen so that a comparison with both the exact and the approximate critical values is possible.
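The arithmetic of the example can be reproduced with a short script. This is a sketch that takes the gender sequence from the income table (the incomes are already sorted and contain no ties, so the rank equals the position) and uses the rank-sum form U = n1 n2 + n(n+1)/2 - R for each group; the variable names are illustrative.

```python
# Gender at ranks 1..20, copied from the income table (M = men, W = women).
genders = "M W M W M W M M W W M M W M W M M M M M".split()

ranks_m = [rank for rank, g in enumerate(genders, start=1) if g == "M"]
ranks_w = [rank for rank, g in enumerate(genders, start=1) if g == "W"]

n_m, n_w = len(ranks_m), len(ranks_w)
r_m, r_w = sum(ranks_m), sum(ranks_w)

# U for each group from its rank sum, then the minimum used by the test.
u_m = n_m * n_w + n_m * (n_m + 1) // 2 - r_m
u_w = n_m * n_w + n_w * (n_w + 1) // 2 - r_w
u = min(u_m, u_w)

print(n_m, n_w)        # 13 7
print(r_m, r_w)        # 151 59
print(u_m, u_w, u)     # 31 60 31

# Consistency checks: the U values sum to n_m * n_w, the rank sums to N(N+1)/2.
n_total = n_m + n_w
print(u_m + u_w == n_m * n_w)                   # True
print(r_m + r_w == n_total * (n_total + 1) // 2)  # True
```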
Two-sided test
Exact critical values
Using the table below, $n_M = 13$ and $n_W = 7$ give a critical value of $U_{\text{crit}} = 20$ for a significance level of $\alpha = 0.05$. The null hypothesis is rejected if $U \leq U_{\text{crit}}$; this is not the case here, since $U = 31$.
Approximate critical values
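Since the test statistic $U$ is approximately normally distributed, it follows that
- $Z = \frac{U - \frac{n_M n_W}{2}}{\sqrt{\frac{n_M n_W (n_M + n_W + 1)}{12}}} = \frac{31 - 45.5}{\sqrt{159.25}} \approx -1.149$
is approximately standard normally distributed. For a significance level of $\alpha = 5\,\%$, the non-rejection region of the null hypothesis in the two-sided test is bounded by the 2.5 % and 97.5 % quantiles of the standard normal distribution, i.e. it is $[-1.96, 1.96]$. Here $z \approx -1.149$, i.e. the test value lies within this interval and the null hypothesis cannot be rejected.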
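The approximate two-sided test can be sketched with the Python standard library. The example standardizes U with mean n_M n_W / 2 and variance n_M n_W (n_M + n_W + 1) / 12 and, as an assumption of this sketch, applies no continuity correction and no correction for ties; the two-sided p-value is added for illustration only.

```python
from math import sqrt
from statistics import NormalDist

n_m, n_w, u = 13, 7, 31  # sample sizes and U from the example

mean_u = n_m * n_w / 2                         # 45.5
sd_u = sqrt(n_m * n_w * (n_m + n_w + 1) / 12)  # sqrt(159.25)
z = (u - mean_u) / sd_u

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 97.5 % quantile of N(0, 1)
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(round(z, 3))       # -1.149
print(round(z_crit, 2))  # 1.96
print(abs(z) > z_crit)   # False: the null hypothesis is not rejected
```

Statistical packages often add a continuity correction at this step, so their z values and p-values can differ slightly from this plain version.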
One-sided test
Exact critical values
Based on the table below, $n_M = 13$ and $n_W = 7$ give a critical value of $U_{\text{crit}} = 20$ for a significance level of $\alpha = 0.025$ (a different significance level than in the two-sided test!). The null hypothesis is rejected if $U \leq U_{\text{crit}}$; this is not the case here.
Approximate critical values
For a significance level of $\alpha = 5\,\%$, the critical value is the 5 % quantile of the standard normal distribution, $z_{0.05} \approx -1.645$, and the non-rejection region of the null hypothesis is $[-1.645, \infty)$. Here $z \approx -1.149$, i.e. the null hypothesis cannot be rejected.
Table of critical values of the Mann-Whitney U statistic
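The following table is valid for $\alpha = 5\,\%$ (two-sided) or $\alpha = 2.5\,\%$ (one-sided), with sample sizes $m \leq 20$ and $m \leq n \leq 40$; each row lists the entries for its sample size $m$, starting at $n = m$. The entry "-" means that the null hypothesis cannot be rejected in any case at the given level of significance. For example, for the sample sizes 7 and 13 the critical value is 20.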
m \ n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 |
2 | | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 7 | 7 |
3 | | | - | - | 0 | 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | 5 | 5 | 6 | 6 | 7 | 7 | 8 | 8 | 9 | 9 | 10 | 10 | 11 | 11 | 12 | 13 | 13 | 14 | 14 | 15 | 15 | 16 | 16 | 17 | 17 | 18 | 18 |
4 | | | | 0 | 1 | 2 | 3 | 4 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 31 |
5 | | | | | 2 | 3 | 5 | 6 | 7 | 8 | 9 | 11 | 12 | 13 | 14 | 15 | 17 | 18 | 19 | 20 | 22 | 23 | 24 | 25 | 27 | 28 | 29 | 30 | 32 | 33 | 34 | 35 | 37 | 38 | 39 | 40 | 41 | 43 | 44 | 45 |
6 | | | | | | 5 | 6 | 8 | 10 | 11 | 13 | 14 | 16 | 17 | 19 | 21 | 22 | 24 | 25 | 27 | 29 | 30 | 32 | 33 | 35 | 37 | 38 | 40 | 42 | 43 | 45 | 46 | 48 | 50 | 51 | 53 | 55 | 56 | 58 | 59 |
7 | | | | | | | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40 | 42 | 44 | 46 | 48 | 50 | 52 | 54 | 56 | 58 | 60 | 62 | 64 | 66 | 68 | 70 | 72 | 74 |
8 | | | | | | | | 13 | 15 | 17 | 19 | 22 | 24 | 26 | 29 | 31 | 34 | 36 | 38 | 41 | 43 | 45 | 48 | 50 | 53 | 55 | 57 | 60 | 62 | 65 | 67 | 69 | 72 | 74 | 77 | 79 | 81 | 84 | 86 | 89 |
9 | | | | | | | | | 17 | 20 | 23 | 26 | 28 | 31 | 34 | 37 | 39 | 42 | 45 | 48 | 50 | 53 | 56 | 59 | 62 | 64 | 67 | 70 | 73 | 76 | 78 | 81 | 84 | 87 | 89 | 92 | 95 | 98 | 101 | 103 |
10 | | | | | | | | | | 23 | 26 | 29 | 33 | 36 | 39 | 42 | 45 | 48 | 52 | 55 | 58 | 61 | 64 | 67 | 71 | 74 | 77 | 80 | 83 | 87 | 90 | 93 | 96 | 99 | 103 | 106 | 109 | 112 | 115 | 119 |
11 | | | | | | | | | | | 30 | 33 | 37 | 40 | 44 | 47 | 51 | 55 | 58 | 62 | 65 | 69 | 73 | 76 | 80 | 83 | 87 | 90 | 94 | 98 | 101 | 105 | 108 | 112 | 116 | 119 | 123 | 127 | 130 | 134 |
12 | | | | | | | | | | | | 37 | 41 | 45 | 49 | 53 | 57 | 61 | 65 | 69 | 73 | 77 | 81 | 85 | 89 | 93 | 97 | 101 | 105 | 109 | 113 | 117 | 121 | 125 | 129 | 133 | 137 | 141 | 145 | 149 |
13 | | | | | | | | | | | | | 45 | 50 | 54 | 59 | 63 | 67 | 72 | 76 | 80 | 85 | 89 | 94 | 98 | 102 | 107 | 111 | 116 | 120 | 125 | 129 | 133 | 138 | 142 | 147 | 151 | 156 | 160 | 165 |
14 | | | | | | | | | | | | | | 55 | 59 | 64 | 69 | 74 | 78 | 83 | 88 | 93 | 98 | 102 | 107 | 112 | 117 | 122 | 127 | 131 | 136 | 141 | 146 | 151 | 156 | 161 | 165 | 170 | 175 | 180 |
15 | | | | | | | | | | | | | | | 64 | 70 | 75 | 80 | 85 | 90 | 96 | 101 | 106 | 111 | 117 | 122 | 127 | 132 | 138 | 143 | 148 | 153 | 159 | 164 | 169 | 174 | 180 | 185 | 190 | 196 |
16 | | | | | | | | | | | | | | | | 75 | 81 | 86 | 92 | 98 | 103 | 109 | 115 | 120 | 126 | 132 | 137 | 143 | 149 | 154 | 160 | 166 | 171 | 177 | 183 | 188 | 194 | 200 | 206 | 211 |
17 | | | | | | | | | | | | | | | | | 87 | 93 | 99 | 105 | 111 | 117 | 123 | 129 | 135 | 141 | 147 | 154 | 160 | 166 | 172 | 178 | 184 | 190 | 196 | 202 | 209 | 215 | 221 | 227 |
18 | | | | | | | | | | | | | | | | | | 99 | 106 | 112 | 119 | 125 | 132 | 138 | 145 | 151 | 158 | 164 | 171 | 177 | 184 | 190 | 197 | 203 | 210 | 216 | 223 | 230 | 236 | 243 |
19 | | | | | | | | | | | | | | | | | | | 113 | 119 | 126 | 133 | 140 | 147 | 154 | 161 | 168 | 175 | 182 | 189 | 196 | 203 | 210 | 217 | 224 | 231 | 238 | 245 | 252 | 258 |
20 | | | | | | | | | | | | | | | | | | | | 127 | 134 | 141 | 149 | 156 | 163 | 171 | 178 | 186 | 193 | 200 | 208 | 215 | 222 | 230 | 237 | 245 | 252 | 259 | 267 | 274 |
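Entries of such a table can be recomputed by counting orderings. The sketch below is not Löffler's recursion from the cited paper; it uses the standard counting recurrence for U under the null hypothesis without ties, and it assumes the convention that the critical value is the largest u whose lower-tail probability does not exceed 2.5 %. The names `count_u` and `critical_u` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(m, n, u):
    """Number of tie-free orderings of m X's and n Y's with U equal to u."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # Largest pooled value is an X (adds n pairs with Y < X) or a Y (adds none).
    return count_u(m - 1, n, u - n) + count_u(m, n - 1, u)


def critical_u(m, n, alpha=Fraction(1, 40)):
    """Largest u with P(U <= u) <= alpha under H0, or None if there is none."""
    total = comb(m + n, m)  # all orderings are equally likely under H0
    cumulative = 0
    crit = None
    for u in range(m * n + 1):
        cumulative += count_u(m, n, u)
        if Fraction(cumulative, total) > alpha:
            break
        crit = u
    return crit


print(critical_u(7, 13))  # 20, the entry used in the example
print(critical_u(3, 3))   # None, shown as "-" in the table
```

Exact fractions avoid rounding problems in borderline cells such as sample sizes 1 and 39, where the tail probability equals 2.5 % exactly.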
Implementation
In many software packages, the Mann-Whitney-Wilcoxon test (of the hypothesis of equal distributions against suitable alternatives) is poorly documented. Some packages handle ties incorrectly or fail to document asymptotic techniques (e.g. the continuity correction). A review published in 2000 discussed some of the following packages:
- MATLAB implements the test as the ranksum function in its Statistics Toolbox.
- R implements the test as wilcox.test in its "stats" package.
- SAS implements the test in its PROC NPAR1WAY procedure.
- Python has an implementation of this test in SciPy (scipy.stats.mannwhitneyu).
- SigmaStat (SPSS Inc., Chicago, IL)
- SYSTAT (SPSS Inc., Chicago, IL)
- Java has an implementation of the test in Apache Commons Math (MannWhitneyUTest).
- JMP (SAS Institute Inc., Cary, NC)
- S-Plus (MathSoft, Inc., Seattle, WA)
- STATISTICA (StatSoft, Inc., Tulsa, OK)
- UNISTAT (Unistat Ltd, London)
- SPSS (SPSS Inc, Chicago)
- StatsDirect (StatsDirect Ltd, Manchester, UK) implements the test via Analysis_Nonparametric_Mann-Whitney .
- Stata (Stata Corporation, College Station, TX) implements the test in its ranksum command.
- StatXact (Cytel Software Corporation, Cambridge, Massachusetts).
- PSPP implements the test in its WILCOXON function.
References
- Frank Wilcoxon: Individual Comparisons by Ranking Methods. In: Biometrics Bulletin. 1, 1945, pp. 80-83, JSTOR 3001968.
- Henry Mann, Donald Whitney: On a test of whether one of two random variables is stochastically larger than the other. In: Annals of Mathematical Statistics. 18, 1947, pp. 50-60, doi:10.1214/aoms/1177730491.
- William H. Kruskal: Historical Notes on the Wilcoxon Unpaired Two-Sample Test. In: Journal of the American Statistical Association. Vol. 52, 1957, pp. 356-360, JSTOR 2280906.
- A. Löffler: About a partition of natural numbers and their application in the U-test. In: Wiss. Z. Univ. Halle. Volume XXXII, Issue 5, 1983, pp. 87-89. (lms.fu-berlin.de)
- B. Rönz, H. G. Strohe (eds.): Lexikon Statistik. Gabler, Wiesbaden 1994, ISBN 3-409-19952-7.
- H. Rinne: Taschenbuch der Statistik. 3rd edition. Verlag Harri Deutsch, 2003, p. 534.
- S. Kotz, C. B. Read, N. Balakrishnan: Encyclopedia of Statistical Sciences. Wiley, Volume?, 2003, p. 208.
- Reinhard Bergmann, John Ludbrook, Will P. J. M. Spooren: Different Outcomes of the Wilcoxon-Mann-Whitney Test from Different Statistics Packages. In: The American Statistician. Vol. 54, no. 1, 2000, pp. 72-77, doi:10.1080/00031305.2000.10474513, JSTOR 2685616.
- scipy.stats.mannwhitneyu. In: SciPy v0.16.0 Reference Guide. The Scipy community, July 24, 2015: "scipy.stats.mannwhitneyu(x, y, use_continuity=True): Computes the Mann-Whitney rank test on samples x and y."
- org.apache.commons.math3.stat.inference.MannWhitneyUTest.
Literature
- Herbert Büning, Götz Trenkler: Nichtparametrische statistische Methoden. de Gruyter, 1998, ISBN 3-11-016351-9.
- Sidney Siegel: Nichtparametrische statistische Methoden. 2nd edition. Fachbuchhandlung für Psychologie, Eschborn near Frankfurt am Main 1985, ISBN 3-88074-102-6.
Web links
- Social Science Statistics: Mann-Whitney test (online calculator)
- VassarStats: Mann-Whitney test (English, online calculator)
- Mann-Whitney U test (English)