Fisher's exact test

from Wikipedia, the free encyclopedia

Fisher's exact test (also Fisher-Yates test or exact chi-squared test) is a significance test for independence in contingency tables. Unlike the chi-squared test of independence, it places no requirements on the sample size and gives reliable results even with a small number of observations. It goes back to the British statistician Ronald Aylmer Fisher. It was originally developed for two dichotomous variables, i.e. for 2×2 contingency tables, but it can also be extended to larger contingency tables.

Idea

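Expected frequencies if the null hypothesis is valid (with $n = a + b + c + d$):

         A                        not A
B        $e_a = (a+b)(a+c)/n$     $e_b = (a+b)(b+d)/n$
not B    $e_c = (c+d)(a+c)/n$     $e_d = (c+d)(b+d)/n$

Observed frequencies in the sample:

         A      not A
B        $a$    $b$
not B    $c$    $d$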

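Fisher's exact test is an alternative to the chi-squared test of independence for a 2×2 contingency table. The table of observed frequencies contains the counts $a$, $b$, $c$ and $d$ for the four combinations of features, while the table of expected frequencies contains the counts $e_a$, $e_b$, $e_c$ and $e_d$ expected if the null hypothesis holds (each is the row total times the column total divided by $n = a + b + c + d$). The value of the test statistic for the chi-squared test of independence would be

$$\chi^2 = \frac{(a - e_a)^2}{e_a} + \frac{(b - e_b)^2}{e_b} + \frac{(c - e_c)^2}{e_c} + \frac{(d - e_d)^2}{e_d},$$

and the associated test statistic would then be approximately $\chi^2$-distributed with one degree of freedom if the hypothesis of independence is correct. For the approximation to be valid, however, all four expected frequencies $e_a$, $e_b$, $e_c$ and $e_d$ must be sufficiently large (a common rule of thumb is at least 5 each).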
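As a rough illustration of the chi-squared statistic described above, the following Python sketch computes the expected frequencies and the statistic for a 2×2 table. The function name `chi_squared_2x2` and the counts 3, 1, 2, 2 are chosen here for illustration only, and the sketch assumes that no row or column total is zero.

```python
def chi_squared_2x2(a, b, c, d):
    """Return (expected, chi2) for the 2x2 table [[a, b], [c, d]].

    Assumes every row and column total is positive.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    # Expected frequency of a cell = row total * column total / n
    expected = [[r * k / n for k in col_totals] for r in row_totals]
    observed = [[a, b], [c, d]]
    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return expected, chi2


# Illustrative counts (an assumption of this sketch)
expected, chi2 = chi_squared_2x2(3, 1, 2, 2)
print(expected)
print(round(chi2, 4))
```

With these counts every expected frequency is below 5, which is the situation in which the chi-squared approximation is considered unreliable.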

If the four marginal frequencies $a+b$, $c+d$, $a+c$ and $b+d$ are fixed, however, it is enough to look at one of the cells. As soon as, for example, the value of $a$ is fixed, the values of $b$, $c$ and finally $d$ are also fixed because of the fixed marginal frequencies.
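The following minimal Python sketch shows how a single cell determines the rest of a 2×2 table once the margins are fixed. The helper name `complete_table` and its parameter names are hypothetical, and the numbers in the call are illustrative.

```python
def complete_table(a, row1_total, col1_total, n):
    """Fill in b, c, d of a 2x2 table from the upper-left cell a and fixed margins."""
    b = row1_total - a                    # rest of the first row
    c = col1_total - a                    # rest of the first column
    d = n - row1_total - col1_total + a   # whatever remains of n
    if min(a, b, c, d) < 0:
        raise ValueError("a is not compatible with the given margins")
    return a, b, c, d


print(complete_table(3, row1_total=4, col1_total=5, n=8))
```

Because only one cell is free, the whole test can be phrased in terms of the distribution of that one count.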

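Fisher showed that the number of observations $X$ in the upper left cell follows a hypergeometric distribution whose parameters are given by the marginal distributions.

The unknown marginal distributions are estimated from the sample by its marginal frequencies, so that $X$ is treated as the number of items of type B among $a+c$ items drawn without replacement from $n = a+b+c+d$ items, of which $a+b$ are of type B. The probability that $X = a$ is then

$$P(X = a) = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}}.$$

Alternatively, according to Bortz, Lienert and Boehnke (1990), the probability can be written as

$$P(X = a) = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}.$$

If the value of $a$ in the sample is too small or too large, the null hypothesis must be rejected.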
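To make the hypergeometric probability concrete, the sketch below evaluates it once with binomial coefficients and once with factorials (the form attributed to Bortz, Lienert and Boehnke) and checks that both agree. Here the four arguments are the cell counts of the 2×2 table, with the first one the upper-left cell; the function names and the example counts are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb, factorial


def pmf_binomial(a, b, c, d):
    """Probability of the table [[a, b], [c, d]] given its margins, binomial form."""
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))


def pmf_factorial(a, b, c, d):
    """The same probability written with factorials only."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return Fraction(numerator, denominator)


# Illustrative counts; exact fractions avoid rounding differences
p1 = pmf_binomial(3, 1, 2, 2)
p2 = pmf_factorial(3, 1, 2, 2)
print(p1, p2, p1 == p2)
```

Exact rational arithmetic is used only so that the two forms can be compared with `==`; floating point would serve equally well for practical work.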

Method

Figure: probability distribution of the number of observations in the upper left cell for the student example.
Achievements of the students in a small class

               male   female   total
sufficient     3      1        4
insufficient   2      2        4
total          5      3        8

In this example the chi-squared test (four-field test) cannot be used to test whether student performance is independent of gender, because the expected frequencies (2.5 and 1.5) are too small for the chi-squared approximation. Fisher's exact test, on the other hand, keeps to the required significance level even with few observations.

If one chooses, for example, a significance level $\alpha$ at which the critical values come out as 2 and 3, the null hypothesis of independence of student performance from gender cannot be rejected if the upper left cell count is $a = 2$ or $a = 3$. If $a = 1$ or $a = 4$, the null hypothesis can be rejected. In the example $a = 3$, i.e. the null hypothesis that student performance is independent of gender cannot be rejected.
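The decision above can be traced numerically. The following sketch tabulates the hypergeometric distribution of the upper-left cell for the student table (first row total 4, first column total 5, 8 students in all) and applies the critical values 2 and 3 quoted in the text. The constant and variable names are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb

ROW1, COL1, N = 4, 5, 8  # margins of the student table


def pmf(a):
    """P(X = a) for the upper-left cell, with all margins held fixed."""
    return Fraction(comb(ROW1, a) * comb(N - ROW1, COL1 - a), comb(N, COL1))


# Only these values of a keep all four cells non-negative
support = range(max(0, ROW1 + COL1 - N), min(ROW1, COL1) + 1)
distribution = {a: pmf(a) for a in support}
for a, p in distribution.items():
    print(a, p, round(float(p), 4))

observed_a = 3
non_rejection = {2, 3}  # critical values quoted in the text
reject = observed_a not in non_rejection
print("reject H0:", reject)
```

The outcomes 2 and 3 each carry probability 3/7, and the extreme outcomes 1 and 4 each carry 1/14, so the observed value lies in the bulk of the distribution.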

Besides the observed table, there are three other tables (see below) whose row and column sums equal the observed ones.

               male   female                  male   female                  male   female
sufficient     1      3        sufficient     2      2        sufficient     4      0
insufficient   4      0        insufficient   3      1        insufficient   1      3
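The tables compatible with the observed margins can also be listed mechanically. This is a small sketch with a hypothetical helper `tables_with_margins`; it returns each table as a tuple of the four cell counts read row by row.

```python
def tables_with_margins(row1_total, col1_total, n):
    """List every 2x2 table (a, b, c, d) that has the given margins."""
    lowest = max(0, row1_total + col1_total - n)   # keeps d non-negative
    highest = min(row1_total, col1_total)          # keeps b and c non-negative
    tables = []
    for a in range(lowest, highest + 1):
        b = row1_total - a
        c = col1_total - a
        d = n - a - b - c
        tables.append((a, b, c, d))
    return tables


# Margins of the student example: first row 4, first column 5, 8 students
for table in tables_with_margins(4, 5, 8):
    print(table)
```

For the student example this yields four tables: the observed one and the three alternatives.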

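This example also shows that Fisher's exact test is a conservative test. With the null hypothesis rejected for the values 1 and 4 of the upper left cell count $X$, the probability of wrongly accepting the alternative hypothesis (type I error) is

$$P(X = 1) + P(X = 4) = \frac{5}{70} + \frac{5}{70} = \frac{1}{7} \approx 0.143,$$

which is smaller than the given significance level.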
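The gap between nominal level and actual error probability can be reproduced with a short sketch. It builds a rejection region by adding the least probable outcomes, with ties taken together, for as long as their total probability stays at or below the level. This construction is one possible convention and an assumption of the sketch, as is the level of 0.2, which is hypothetical and not taken from the text.

```python
from fractions import Fraction
from itertools import groupby
from math import comb


def null_distribution(row1_total, col1_total, n):
    """Hypergeometric distribution of the upper-left cell for fixed margins."""
    lowest = max(0, row1_total + col1_total - n)
    highest = min(row1_total, col1_total)
    return {
        a: Fraction(comb(row1_total, a) * comb(n - row1_total, col1_total - a),
                    comb(n, col1_total))
        for a in range(lowest, highest + 1)
    }


def rejection_region(dist, alpha):
    """Add the least probable outcomes (ties together) while staying <= alpha."""
    region, size = set(), Fraction(0)
    by_probability = sorted(dist.items(), key=lambda item: item[1])
    for prob, group in groupby(by_probability, key=lambda item: item[1]):
        values = [a for a, _ in group]
        if size + prob * len(values) > alpha:
            break
        region.update(values)
        size += prob * len(values)
    return region, size


dist = null_distribution(4, 5, 8)   # margins of the student example
alpha = Fraction(1, 5)              # hypothetical level, not from the text
region, size = rejection_region(dist, alpha)
print(sorted(region), size, round(float(size), 4))
```

Because the distribution is discrete, the attained error probability jumps in steps and generally falls short of the nominal level rather than matching it exactly.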

Web links

Wikibooks: Perform Fisher Test with R  - Learning and Teaching Materials

References

  1. http://isi.cbs.nl/glossary/term1276.htm
  2. Mehta, C. R. and Patel, N. R. (1986). Algorithm 643: FEXACT: A Fortran subroutine for Fisher's exact test on unordered r×c contingency tables. ACM Transactions on Mathematical Software, 12, pp. 154-161, doi:10.1145/6497.214326.