Behrens-Fisher problem
The Behrens-Fisher problem is a problem of mathematical statistics whose exact solutions have been shown to have undesirable properties, which is why approximations are preferred.
The problem asks for a non-randomized similar test of the null hypothesis of equal means, $H_0\colon \mu_1 = \mu_2$, of two normally distributed populations whose variances $\sigma_1^2$ and $\sigma_2^2$ are unknown and not assumed to be equal. A test is similar if the null hypothesis, when it is true, is rejected with probability exactly $\alpha$, the prescribed significance level, no matter how large or how different the unknown variances are. For reasons of power, the test is based on the following "Behrens-Fisher" test statistic:
$$T = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$
where $\bar{x}_1$ and $\bar{x}_2$ are the means and $s_1$ and $s_2$ the standard deviations of the two samples, and $n_1$ and $n_2$ denote their respective sizes.
The Behrens-Fisher problem generalizes the setting of the t-test for two independent samples, which assumes that the two population variances are equal.
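As a minimal sketch, the Behrens-Fisher statistic can be computed next to the pooled statistic of the ordinary two-sample t-test. The function names and the sample data are made up for illustration and do not come from the literature cited here.

```python
import math
import statistics


def behrens_fisher_statistic(x, y):
    """Mean difference divided by sqrt(s_x^2/n_x + s_y^2/n_y)."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # unbiased sample variances
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(v1 / n1 + v2 / n2)


def pooled_t_statistic(x, y):
    """Statistic of Student's two-sample t-test, which assumes equal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled * (1 / n1 + 1 / n2))


# Hypothetical data, chosen only to make the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0]
t_bf = behrens_fisher_statistic(x, y)
t_pooled = pooled_t_statistic(x, y)
print(t_bf, t_pooled)
```

With equal sample sizes the two statistics coincide. They differ only when the sample sizes differ, as in the example data.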
History
In 1935 Ronald Fisher introduced "fiducial inference" in order to solve this problem, building on an earlier paper by Walter-Ulrich Behrens from 1929. Behrens and Fisher proposed determining the distribution of the test statistic $T$ given above.
Fisher approximated this distribution by ignoring the random variation of the relative sizes of the standard deviations, $\frac{s_1/\sqrt{n_1}}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$. As a result, the test did not have the desired property of rejecting the null hypothesis, whenever it is true, with probability exactly equal to the significance level $\alpha$. This sparked the controversy commonly known as the Behrens-Fisher problem.
Non-existence of a desirable solution
Linnik (1968, Theorem 8.3.1) showed that the boundary between the acceptance and rejection regions of the Behrens-Fisher test statistic $T$ cannot be a continuous function that depends only on the ratio of the empirical variances of the means, $(s_1^2/n_1)/(s_2^2/n_2)$ (and, of course, on constants such as $n_1$, $n_2$ and the significance level $\alpha$). For any exact solution of the Behrens-Fisher problem, this boundary is necessarily discontinuous in that ratio. Moreover, an exact solution requires the rejection region of the test statistic to contain neighborhoods of points at which $\bar{x}_1 = \bar{x}_2$, which is an intolerable property (Linnik, 1968). Linnik states the result in terms of quantities other than $T$ and this variance ratio, but that is immaterial, because they describe the problem equivalently.
Best approximation using a non-convergent series approach
Linnik (1968) does not mention the work of B. L. Welch (1947). Two decades earlier, Welch, who like Fisher worked at University College London, had proposed an approach to the exact solution of the Behrens-Fisher problem that would describe the boundary between the acceptance and rejection regions of the test statistic $T$ as a continuous function of the variance ratio. For a given significance level, Welch (1947) first characterizes this boundary for the empirical difference of the means, as a function of the empirical variances $s_1^2$ and $s_2^2$, by a partial differential equation of infinite order. He also describes how to approximate its solution to any desired accuracy by means of three Taylor expansions. The series expansion shows that the solution factors into the estimated standard deviation of the difference of the means, $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, and a function that depends only on the variance ratio (and on constants). Standardized to the scale of the test statistic $T$, the boundary therefore depends, as desired, only on the variance ratio.
If Welch's series converged uniformly, its limit would be a continuous function, which would contradict Linnik's proof that no such function exists. Hence Welch's series cannot converge uniformly. Plots of the function for expansions of different orders, with very small as well as somewhat larger $n_1$, $n_2$ and $\alpha$, make this conclusion quite plausible. For $n_1$, $n_2$ and $\alpha$ that are not too small, however, the results are remarkably good, both in the smoothness of the function and in the accuracy of the numerically computed type I error probabilities.
Aspin's (1948) extension of Welch's series to the fourth power of the reciprocals of the degrees of freedom gives by far the most accurate approximation, unless $n_1$, $n_2$ and $\alpha$ are much smaller than usual. The resulting Welch-Aspin test is described in detail, in German, by Bachmaier (2000).
The approximation in the so-called Welch test
There are several approximate approaches to the Behrens-Fisher problem. One of the most widely used approximations (for example, in Microsoft Excel) is also due to Welch. The test based on this approximation is known as the Welch test.
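The variance of the difference of the means, $\bar{x}_1 - \bar{x}_2$, is $\sigma_1^2/n_1 + \sigma_2^2/n_2$, and it is estimated by $s_1^2/n_1 + s_2^2/n_2$. Welch (1938) approximated the distribution of this estimate by a Pearson type III curve (a scaled chi-square distribution) whose first two moments (expectation and variance) agree with those of the estimate. This is the case for the following number of degrees of freedom (df), which is in general not an integer:
$$\nu = \frac{\left(\sigma_1^2/n_1 + \sigma_2^2/n_2\right)^2}{\dfrac{\sigma_1^4}{n_1^2 (n_1 - 1)} + \dfrac{\sigma_2^4}{n_2^2 (n_2 - 1)}}$$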
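The moment matching behind this number of degrees of freedom can be sketched as follows. The symbols $V$ for the estimated variance of the mean difference and $c$ for the scale factor of the chi-square distribution are notation introduced here for illustration. The sketch assumes independent normal samples, for which $\mathrm{Var}(s_i^2) = 2\sigma_i^4/(n_i - 1)$.

```latex
% V and c are notation assumed for this sketch, not taken from Welch (1938).
\begin{align*}
V &= \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}, \qquad
\mathrm{E}[V] = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \\
\mathrm{Var}(V) &= \frac{2\sigma_1^4}{n_1^2 (n_1 - 1)} + \frac{2\sigma_2^4}{n_2^2 (n_2 - 1)}, \\
\text{for } c\,\chi^2_\nu:\quad
\mathrm{E} &= c\,\nu, \qquad \mathrm{Var} = 2 c^2 \nu, \\
\text{matching both moments:}\quad
\nu &= \frac{2\,\mathrm{E}[V]^2}{\mathrm{Var}(V)}
  = \frac{\left(\sigma_1^2/n_1 + \sigma_2^2/n_2\right)^2}
         {\dfrac{\sigma_1^4}{n_1^2 (n_1 - 1)} + \dfrac{\sigma_2^4}{n_2^2 (n_2 - 1)}}.
\end{align*}
```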
If the null hypothesis of equal means, $\mu_1 = \mu_2$, holds, the distribution of the Behrens-Fisher test statistic $T$ introduced at the beginning, which depends slightly on the ratio of the population standard deviations, $\sigma_1/\sigma_2$, can be approximated by Student's t-distribution with this number $\nu$ of degrees of freedom. However, $\nu$ involves the population variances, which are unknown. In practice, the following estimate, obtained simply by replacing the population variances with the sample variances, has become standard:
$$\hat{\nu} = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{s_1^4}{n_1^2 (n_1 - 1)} + \dfrac{s_2^4}{n_2^2 (n_2 - 1)}}$$
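The estimated number of degrees of freedom is easy to compute from two samples. The following is a small sketch. The function name `welch_df` and the data are hypothetical.

```python
import statistics


def welch_df(x, y):
    """Estimated degrees of freedom with sample variances plugged in."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1  # s_1^2 / n_1
    b = statistics.variance(y) / n2  # s_2^2 / n_2
    # a^2 / (n1 - 1) equals s_1^4 / (n_1^2 (n_1 - 1)), and likewise for b.
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0]
df = welch_df(x, y)
print(df)  # about 4.75, a non-integer value
```

The result always lies between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$.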
This estimate, however, makes the number of degrees of freedom a random variable, and there is no t-distribution with a random number of degrees of freedom. Nothing prevents one, though, from comparing the test statistic $T$ with the corresponding quantiles of the t-distribution with the estimated number of degrees of freedom. The boundary between the acceptance and rejection regions of $T$ is then an infinitely differentiable function of the empirical variances.
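The comparison with a t-quantile can be sketched using only the Python standard library. The t-distribution function is obtained here by Simpson integration of the density and the quantile by bisection. Both are numerical conveniences assumed for this sketch, not part of Welch's method, and all function names and data are hypothetical.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of Student's t-distribution; df may be non-integer."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """CDF via composite Simpson integration of the density over [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_quantile(p, df, upper=200.0):
    """Quantile for 0.5 <= p < 1 by bisection; the search is capped at `upper`."""
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welch_test(x, y, alpha=0.05):
    """Two-sided test: compare |T| with the t-quantile at the estimated df."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    t_stat = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    crit = t_quantile(1 - alpha / 2, df)
    return t_stat, df, crit, abs(t_stat) > crit


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0]
t_stat, df, crit, reject = welch_test(x, y)
print(t_stat, df, crit, reject)
```

In practice a statistics library would normally supply the t-quantile. The hand-rolled integration is used here only to keep the sketch self-contained.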
This procedure does not hold the significance level exactly, but it does not miss it by much. Student's usual t-test is the better choice only if the population variances $\sigma_1^2$ and $\sigma_2^2$ are equal or, for rather small sample sizes, can at least be assumed to be nearly equal.
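How closely the level is held can be explored with a small Monte Carlo experiment. The sketch below draws both samples from normal populations with equal means and counts how often the Welch test and the pooled t-test reject. The scenario (a small sample with large variance against a larger sample with small variance), the seed, the number of repetitions and all names are assumptions chosen for illustration.

```python
import math
import random
import statistics


def two_sided_p(t, df, steps=200):
    """P(|T_df| > |t|) by Simpson integration of the t density over [0, |t|]."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = a / steps  # steps must be even
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def rejection_rates(n1, sd1, n2, sd2, alpha=0.05, reps=2000, seed=1):
    """Empirical type I error rates of the Welch test and the pooled t-test."""
    rng = random.Random(seed)
    welch = pooled = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, sd1) for _ in range(n1)]  # the null hypothesis holds:
        y = [rng.gauss(0.0, sd2) for _ in range(n2)]  # both population means are 0
        d = statistics.mean(x) - statistics.mean(y)
        v1, v2 = statistics.variance(x), statistics.variance(y)
        a, b = v1 / n1, v2 / n2
        df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
        welch += int(two_sided_p(d / math.sqrt(a + b), df) < alpha)
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        t_pooled = d / math.sqrt(sp2 * (1 / n1 + 1 / n2))
        pooled += int(two_sided_p(t_pooled, n1 + n2 - 2) < alpha)
    return welch / reps, pooled / reps


welch_rate, pooled_rate = rejection_rates(n1=5, sd1=4.0, n2=25, sd2=1.0)
print(welch_rate, pooled_rate)
```

In a scenario like this one, the Welch rate should stay close to the nominal level, whereas the pooled t-test, whose equal-variance assumption is violated, should reject far too often.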
Literature
- A. A. Aspin: An Examination and Further Development of a Formula Arising in the Problem of Comparing Two Mean Values. Biometrika 35, 1948, pp. 88-96, doi:10.1093/biomet/35.1-2.88, JSTOR 2332631.
- M. Bachmaier: The Behrens-Fisher problem. In: M. Bachmaier: Classic, robust and nonparametric Bartlett tests and robust analysis of variance for heterogeneous scale parameters. Shaker, Aachen 2000, pp. 231-245.
- W. U. Behrens: Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen [A contribution to the calculation of errors with few observations]. Landwirtschaftliche Jahrbücher 68, 1929, pp. 807-837.
- R. A. Fisher: The fiducial argument in statistical inference. Annals of Eugenics 8, 1935, pp. 391-398, doi:10.1111/j.1469-1809.1935.tb02120.x.
- Yu. V. Linnik: Statistical problems with nuisance parameters. American Mathematical Society, Providence, Rhode Island, 1968.
- H. Ruben: A simple conservative and robust solution of the Behrens-Fisher problem. In: The Indian Journal of Statistics. Series A, Volume 64, Part 1, 2002, pp. 139-155, JSTOR 25051377.
- Kam-Wah Tsui, Shijie Tang: Distributional Property of the Generalized p-value for the Behrens-Fisher Problem with Applications to Multiple Testing. University of Wisconsin, October 31, 2005.
- B. L. Welch: The Significance of the Difference between Two Means When the Population Variances Are Unequal. Biometrika 29, 1938, pp. 350-362, doi:10.1093/biomet/29.3-4.350, JSTOR 2332010.
- B. L. Welch: The Generalization of Student's Problem When Several Different Population Variances Are Involved. Biometrika 34, 1947, pp. 28-35, doi:10.1093/biomet/34.1-2.28, JSTOR 2332510.