Simpson's paradox
![](https://upload.wikimedia.org/wikipedia/commons/thumb/9/9d/Simpsons-vector.svg/220px-Simpsons-vector.svg.png)
Simpson's paradox (also the Simpson paradox, named after Edward Hugh Simpson) is a paradox in statistics. The evaluation of several groups appears to turn out differently depending on whether the results of the groups are combined or not. The phenomenon often occurs in statistical evaluations in the social sciences and medicine. Simpson's paradox is possible when several fourfold (2×2) tables, each with an odds ratio smaller (greater) than 1, are combined into an overall table whose odds ratio is greater (smaller) than 1.
History
Edward Hugh Simpson described the phenomenon in 1951, but he was not the first to study it. Karl Pearson et al. in 1899 and George Udny Yule in 1903 had already made similar observations. The name Simpson's paradox was probably introduced by Colin R. Blyth in 1972.
Examples
An exam
A driving school has two test days with the following results:
Test day | Men: passed | Men: total | Men: failure rate | Women: passed | Women: total | Women: failure rate
---|---|---|---|---|---|---
Day 1 | 1 | 1 | 0% | 7 | 8 | 12.5%
Day 2 | 2 | 3 | 33.3% | 1 | 2 | 50%
Total | 3 | 4 | 25% | 8 | 10 | 20%
Although the men had a lower failure rate than the women on each of the two days, their overall failure rate was higher.
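The odds-ratio condition stated in the introduction can be checked on these numbers. The following is a minimal sketch, not part of the original article: the helper names `odds_ratio` and `pooled` are chosen here for illustration, and the counts are read off the table as (failed, passed) pairs.

```python
from fractions import Fraction

# (failed, passed) counts derived from the driving-school table above.
men = {"day 1": (0, 1), "day 2": (1, 2)}
women = {"day 1": (1, 7), "day 2": (1, 1)}

def odds_ratio(a, b):
    """Odds of failing in group a divided by the odds of failing in group b."""
    (fail_a, pass_a), (fail_b, pass_b) = a, b
    return Fraction(fail_a * pass_b, pass_a * fail_b)

def pooled(group):
    """Add up the failed and passed counts over all test days."""
    return tuple(sum(column) for column in zip(*group.values()))

# One odds ratio per day, and one for the combined fourfold table.
per_day = {day: odds_ratio(men[day], women[day]) for day in men}
overall = odds_ratio(pooled(men), pooled(women))

for day, ratio in per_day.items():
    print(day, float(ratio))
print("combined", float(overall))
```

Both daily odds ratios (men versus women) are below 1, while the odds ratio of the combined table is above 1, which is exactly the reversal described above.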
The reason is that the individual results enter the overall result with different weights: most of the men took the test on the second day, most of the women on the first. This is easy to see in the following numerically exaggerated variant of the table above:
Test day | Men: passed | Men: total | Men: failure rate | Women: passed | Women: total | Women: failure rate
---|---|---|---|---|---|---
Day 1 | 1 | 1 | 0% | 999 | 1000 | 0.1%
Day 2 | 2 | 3 | 33.3% | 1 | 2 | 50%
Total | 3 | 4 | 25% | 1000 | 1002 | 0.2%
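The weighting argument can be made explicit by writing each overall failure rate as a weighted average of the daily failure rates, with each day weighted by its share of the candidates. The sketch below does this for the first driving-school table; the function names are illustrative and not taken from the article.

```python
# (passed, total) per test day, taken from the first driving-school table.
men = [(1, 1), (2, 3)]
women = [(7, 8), (1, 2)]

def failure_rate(passed, total):
    return (total - passed) / total

def weights(days):
    """Share of all candidates of a group who were tested on each day."""
    candidates = sum(total for _, total in days)
    return [total / candidates for _, total in days]

def overall_failure_rate(days):
    """Overall rate as the weighted average of the daily failure rates."""
    return sum(w * failure_rate(passed, total)
               for w, (passed, total) in zip(weights(days), days))

print("men:  ", weights(men), overall_failure_rate(men))
print("women:", weights(women), overall_failure_rate(women))
```

The men's overall rate is dominated by the second day, on which both groups did worse, whereas the women's overall rate is dominated by the first day.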
Discrimination lawsuit against the University of California, Berkeley
One of the best-known cases of Simpson's paradox came to light in a study of graduate school admissions at the University of California, Berkeley. The figures for autumn 1973 showed that a larger share of male than of female applicants was admitted. The difference was so large that it could no longer be explained by chance (significance test):
Sex | Applicants | Of which admitted
---|---|---
Men | 8442 | 44%
Women | 4321 | 35%
A male applicant thus had a 44 percent chance of being admitted, a female applicant only a 35 percent chance.
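The claim that this difference cannot be explained by chance can be illustrated with a two-proportion z-test. This is only a sketch under stated assumptions: the article does not say which significance test was used, and the admitted counts below are reconstructed from the rounded percentages in the table, so they are approximations rather than figures from the study.

```python
import math

# Applicant counts from the table above. The admitted counts are
# approximations reconstructed from the rounded percentages (assumption).
men_applicants, women_applicants = 8442, 4321
men_admitted = round(0.44 * men_applicants)
women_admitted = round(0.35 * women_applicants)

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: both proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

z, p_value = two_proportion_z(men_admitted, men_applicants,
                              women_admitted, women_applicants)
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
```

With samples of this size, a gap of nine percentage points lies many standard errors away from zero, which is consistent with the statement above.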
However, the breakdown by department showed hardly any significant discrimination against women. Of the university's 101 departments, 16 either admitted all of their applicants or had applicants of only one sex. For the other 85 departments, the following picture emerged:
- In four departments, men had a significantly better success rate than women.
- In six departments, women had a significantly better success rate than men.
A chi-square test clearly shows that the applications from women and men were not randomly distributed among the 101 departments from the outset (χ² = 3091; p < 0.0001).
This led to the explanation that there was no discrimination, but that women tended to apply where there were lower admission rates for both sexes, while men tended to send their applications where there were generally higher admission rates. The original interpretation of the overall success rate of 44 versus 35 percent ignores this.
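The mechanism can be reproduced with a small example. The numbers below are hypothetical and were invented purely for illustration; they are not the Berkeley data. Women are admitted at a higher rate than men in each of two made-up departments, yet at a lower rate overall, because most of them apply to the department with the low admission rate.

```python
# Hypothetical numbers for illustration only: (applicants, admitted).
departments = {
    "A (high admission rate)": {"men": (800, 480), "women": (100, 65)},
    "B (low admission rate)": {"men": (200, 40), "women": (900, 207)},
}

def rate(applicants, admitted):
    return admitted / applicants

def overall_rate(sex):
    """Admission rate of one sex after pooling all departments."""
    applicants = sum(d[sex][0] for d in departments.values())
    admitted = sum(d[sex][1] for d in departments.values())
    return rate(applicants, admitted)

for name, d in departments.items():
    print(name, "men:", rate(*d["men"]), "women:", rate(*d["women"]))
print("overall", "men:", overall_rate("men"), "women:", overall_rate("women"))
```

Comparing the pooled rates alone would suggest a disadvantage for women that none of the individual departments shows.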
Undiscovered influencing factors
If there are significantly different results depending on the assessment method, this can be attributed to non-recorded influencing factors. If evaluators want to avoid possible false conclusions, they have to find these influencing factors, as far as they are available. The presence of a Simpson paradox can serve as an indicator here.
One method of searching for further influencing factors is to evaluate separately those subgroups in which specific behavior is expected, for example subgroups defined by the patient's disease stage. In the Berkeley example above, the subgroups would be departments with low admission rates and departments with high admission rates.
Literature
- Hans-Peter Beck-Bornholdt: With certainty bordering on probability. Logical thinking and chance. Rowohlt, Reinbek near Hamburg 2005, ISBN 3-499-61902-4.
- Thomas R. Knapp: Instances of Simpson's paradox. In: College Mathematics Journal. Vol. 16, 1985, pp. 209-211, doi:10.1080/07468342.1985.11972882, JSTOR 2686573.
- Walter Krämer: Think! Fallacies from the world of numbers and chance. Piper Verlag, Munich 2011, ISBN 978-3-492-26460-0. Chapter 7, pp. 161–186 (The basic trap and other fallacies from conditional probabilities).
- Edward H. Simpson: The Interpretation of Interaction in Contingency Tables. In: Journal of the Royal Statistical Society. Series B. Vol. 13, No. 2, 1951, pp. 238-241, doi:10.1111/j.2517-6161.1951.tb00088.x, JSTOR 2984065.
- Clifford H. Wagner: Simpson's Paradox in Real Life. In: The American Statistician. Vol. 36, No. 1, 1982, pp. 46-48, doi:10.1080/00031305.1982.10482778, JSTOR 2684093.
- Howard Wainer: Minority contributions to the SAT score turnaround: an example of Simpson's paradox. In: Journal of Educational Statistics. Vol. 11, 1986, pp. 239-244, doi:10.3102/10769986011004239, JSTOR 1164696.
Web links
- Entry in Edward N. Zalta (Ed.): Stanford Encyclopedia of Philosophy.
- Judea Pearl: Simpson's Paradox: An Anatomy. University of California, 1999, pp. 1–11 (ucla.edu [PDF; accessed October 16, 2007]).
- Ulrich Kühne: Blinded by numbers - Simpson's paradox: apparently clear relationships are turned into their opposite. A warning against naive trust in statistics. Friday No. 42, 2009, p. 18 (freitag.de [accessed April 17, 2010]).
- Björn Christensen & Sören Christensen: Simpson's Paradox: This statistic cannot be true. Or is it? In: Spiegel Online. December 18, 2015, accessed May 20, 2019.
Footnotes and individual references
- Edward Hugh Simpson: The Interpretation of Interaction in Contingency Tables. In: Journal of the Royal Statistical Society, Ser. B. Vol. 13, 1951, pp. 238-241, doi:10.1111/j.2517-6161.1951.tb00088.x, JSTOR 2984065.
- Karl Pearson; Alice Lee; Leslie Bramley-Moore: Mathematical Contributions to the Theory of Evolution - VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and of Fecundity in Thoroughbred Race-Horses. In: Philosophical Transactions of the Royal Society, Series A. Vol. 192, 1899, pp. 257-330, doi:10.1098/rsta.1899.0006.
- George Udny Yule: Notes on the Theory of Association of Attributes in Statistics. In: Biometrika. Vol. 2, 1903, pp. 121-134, doi:10.1093/biomet/2.2.121, JSTOR 2331677.
- Colin R. Blyth: On Simpson's Paradox and the Sure-Thing Principle. In: Journal of the American Statistical Association. Vol. 67, No. 338, 1972, pp. 364-366, doi:10.1080/01621459.1972.10482387, JSTOR 2284382.
- P. J. Bickel; E. A. Hammel; J. W. O'Connell: Sex Bias in Graduate Admissions: Data from Berkeley. In: Science. Vol. 187, No. 4175, 1975, pp. 398-404, doi:10.1126/science.187.4175.398.