Simpson's paradox

Figure: Graphical representation of Simpson's paradox. Of the two vectors labeled 1, the red one has the greater slope, and the same holds for the two vectors labeled 2. Nevertheless, the sum of the red vectors has a smaller slope than the sum of the blue vectors.

Simpson's paradox (also the Simpson paradox, named after Edward Hugh Simpson) is a paradox from statistics. The evaluation of several groups appears to turn out differently depending on whether the results of the groups are combined or considered separately. The phenomenon often occurs in statistical evaluations in the social sciences and medicine. Simpson's paradox is possible when several 2×2 contingency tables, each with an odds ratio smaller (greater) than 1, are combined into an overall table whose odds ratio is greater (smaller) than 1.
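The odds-ratio condition can be written out as follows. The notation is an assumption introduced here for illustration: table k has the cells a_k, b_k in its first row and c_k, d_k in its second row. This is a sketch of the "smaller than 1" case; the "greater than 1" case follows by reversing both inequalities.

```latex
\mathrm{OR}_k = \frac{a_k\, d_k}{b_k\, c_k} < 1 \quad \text{for every table } k,
\qquad \text{and yet} \qquad
\mathrm{OR}_{\text{combined}}
  = \frac{\left(\sum_k a_k\right)\left(\sum_k d_k\right)}
         {\left(\sum_k b_k\right)\left(\sum_k c_k\right)} > 1 .
```

The combined table is obtained by adding the individual tables cell by cell, so the combined odds ratio is not an average of the individual odds ratios.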

History

Edward Hugh Simpson described the phenomenon in 1951, but he was not the first to study it. Karl Pearson et al. in 1899 and George Udny Yule in 1903 had already made similar observations. The term "Simpson's paradox" was probably introduced by Colin R. Blyth in 1972.

Examples

An exam

A driving school has two test days with the following results:

        Male                           Female
        Passed  Total  Failure rate    Passed  Total  Failure rate
Day 1   1       1      0%              7       8      12.5%
Day 2   2       3      33.3%           1       2      50%
Total   3       4      25%             8       10     20%

Although the men had a lower failure rate than the women on each of the two days, their overall failure rate was higher.
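The figures in the table can be checked with a few lines of Python. This is a minimal sketch; the names `results`, `failure_rate`, and `overall_failure_rate` are illustrative and do not come from the source.

```python
from fractions import Fraction

# (passed, total) per test day, copied from the table above
results = {
    "male":   {"day 1": (1, 1), "day 2": (2, 3)},
    "female": {"day 1": (7, 8), "day 2": (1, 2)},
}

def failure_rate(passed, total):
    """Share of candidates who failed, as an exact fraction."""
    return Fraction(total - passed, total)

def overall_failure_rate(days):
    """Pool all days by summing the counts before dividing."""
    passed = sum(p for p, _ in days.values())
    total = sum(t for _, t in days.values())
    return failure_rate(passed, total)

# Per-day comparison
for day in ("day 1", "day 2"):
    m = failure_rate(*results["male"][day])
    f = failure_rate(*results["female"][day])
    print(f"{day}: male {float(m):.1%}, female {float(f):.1%}")

# Pooled comparison
m_all = overall_failure_rate(results["male"])
f_all = overall_failure_rate(results["female"])
print(f"total: male {float(m_all):.1%}, female {float(f_all):.1%}")
```

On each day the male failure rate is the lower one, while the pooled male rate (1/4) exceeds the pooled female rate (1/5).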

The reason is that the individual results enter the overall result with different weights. This is easy to see in the following variant of the table above, in which the numbers are made more extreme:

        Male                           Female
        Passed  Total  Failure rate    Passed  Total  Failure rate
Day 1   1       1      0%              999     1000   0.1%
Day 2   2       3      33.3%           1       2      50%
Total   3       4      25%             1000    1002   0.2%
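The weighting can be made explicit. In the sketch below the symbols are assumptions introduced for illustration: n_d is the number of candidates on day d and r_d is the failure rate on that day. The overall failure rate is then the weighted mean of the daily rates, worked out here for both tables.

```latex
r_{\text{total}} = \frac{\sum_d n_d\, r_d}{\sum_d n_d}

% First table
r_{\text{male}}   = \frac{1 \cdot 0 + 3 \cdot \tfrac{1}{3}}{1 + 3} = \frac{1}{4} = 25\,\%
\qquad
r_{\text{female}} = \frac{8 \cdot \tfrac{1}{8} + 2 \cdot \tfrac{1}{2}}{8 + 2} = \frac{2}{10} = 20\,\%

% Second table (the male figures are unchanged)
r_{\text{female}} = \frac{1000 \cdot \tfrac{1}{1000} + 2 \cdot \tfrac{1}{2}}{1000 + 2} = \frac{2}{1002} \approx 0.2\,\%
```

The men's total is dominated by day 2, on which three of the four male candidates were tested, whereas the women's total is dominated by day 1, on which failure rates were low for both groups.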

Discrimination lawsuit against the University of California, Berkeley

One of the best-known cases of Simpson's paradox emerged from a study of graduate admissions at the University of California, Berkeley. The figures for autumn 1973 showed that men were admitted at a higher rate than women, and the difference was too large to be explained by chance (significance test):

        Applicants  Admitted
Men     8442        44%
Women   4321        35%

Thus a male applicant had a 44 percent chance of being admitted, whereas a female applicant had only a 35 percent chance.

However, the breakdown by department showed hardly any significant discrimination against women. Of the university's 101 departments, 16 either admitted every applicant or had applicants of only one sex. The remaining 85 departments gave the following picture:

  • In four departments, men had a significantly higher success rate than women.
  • In six departments, women had a significantly higher success rate than men.

A chi-square test clearly shows that the applications from women and men were not randomly distributed among the 101 departments in the first place (χ² = 3091; p < 0.0001).

This led to the explanation that there was no discrimination, but that women tended to apply where there were lower admission rates for both sexes, while men tended to send their applications where there were generally higher admission rates. The original interpretation of the overall success rate of 44 versus 35 percent ignores this.

Undiscovered influencing factors

If the results differ significantly depending on the method of evaluation, this can be attributed to influencing factors that were not recorded. To avoid false conclusions, evaluators have to identify these factors, as far as they are available. The presence of Simpson's paradox can serve as an indicator that such factors exist.

One way to search for further influencing factors is to evaluate separately those subgroups in which specific behavior is expected, for example subgroups defined by a patient's disease stage. In the Berkeley example above, these would be the subgroups "departments with low admission rates" and "departments with high admission rates".
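A separate evaluation of subgroups can be sketched in code. The helper names `pooled` and `shows_reversal` are hypothetical, and the input is the second driving-school table from above, since the source gives no per-department figures for Berkeley. The check below only compares rates for two groups; it is not a significance test.

```python
from fractions import Fraction

def pooled(strata, group):
    """Rate for one group after summing the counts over all strata."""
    events = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return Fraction(events, total)

def shows_reversal(strata, a, b):
    """True if group a has a lower rate than group b in every stratum,
    yet a higher rate than b once the strata are pooled."""
    lower_everywhere = all(
        Fraction(*s[a]) < Fraction(*s[b]) for s in strata.values()
    )
    return lower_everywhere and pooled(strata, a) > pooled(strata, b)

# (failed, total) per test day, from the second driving-school table
driving_school = {
    "day 1": {"male": (0, 1), "female": (1, 1000)},
    "day 2": {"male": (1, 3), "female": (1, 2)},
}

print(shows_reversal(driving_school, "male", "female"))
```

For this data the function reports a reversal, which suggests that the stratifying variable (here the test day) is an influencing factor worth recording.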

Literature

  • Hans-Peter Beck-Bornholdt: With certainty bordering on probability. Logical thinking and chance. Rowohlt, Reinbek bei Hamburg 2005, ISBN 3-499-61902-4.
  • Thomas R. Knapp: Instances of Simpson's paradox. In: College Mathematics Journal. Vol. 16, 1985, pp. 209-211, doi:10.1080/07468342.1985.11972882, JSTOR 2686573.
  • Walter Krämer: Think! Fallacies from the world of numbers and chance. Piper Verlag, Munich 2011, ISBN 978-3-492-26460-0. Chapter 7, pp. 161–186 (The basic trap and other fallacies from conditional probabilities).
  • Edward H. Simpson: The Interpretation of Interaction in Contingency Tables. In: Journal of the Royal Statistical Society, Series B. Vol. 13, No. 2, 1951, pp. 238-241, doi:10.1111/j.2517-6161.1951.tb00088.x, JSTOR 2984065.
  • Clifford H. Wagner: Simpson's Paradox in Real Life. In: The American Statistician. Vol. 36, No. 1, 1982, pp. 46-48, doi:10.1080/00031305.1982.10482778, JSTOR 2684093.
  • Howard Wainer: Minority contributions to the SAT score turnaround: an example of Simpson's paradox. In: Journal of Educational Statistics. Vol. 11, 1986, pp. 239-244, doi:10.3102/10769986011004239, JSTOR 1164696.

Footnotes and individual references

  1. Edward Hugh Simpson: The Interpretation of Interaction in Contingency Tables. In: Journal of the Royal Statistical Society, Series B. Vol. 13, 1951, pp. 238-241, doi:10.1111/j.2517-6161.1951.tb00088.x, JSTOR 2984065.
  2. Karl Pearson; Alice Lee; Leslie Bramley-Moore: Mathematical Contributions to the Theory of Evolution - VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and of Fecundity in Thoroughbred Race-Horses. In: Philosophical Transactions of the Royal Society, Series A. Vol. 192, 1899, pp. 257-330, doi:10.1098/rsta.1899.0006.
  3. George Udny Yule: Notes on the Theory of Association of Attributes in Statistics. In: Biometrika. Vol. 2, 1903, pp. 121-134, doi:10.1093/biomet/2.2.121, JSTOR 2331677.
  4. Colin R. Blyth: On Simpson's Paradox and the Sure-Thing Principle. In: Journal of the American Statistical Association. Vol. 67, No. 338, 1972, pp. 364-366, doi:10.1080/01621459.1972.10482387, JSTOR 2284382.
  5. P. J. Bickel; E. A. Hammel; J. W. O'Connell: Sex Bias in Graduate Admissions: Data from Berkeley. In: Science. Vol. 187, No. 4175, 1975, pp. 398-404, doi:10.1126/science.187.4175.398.