Student's t-distribution

From Wikipedia, the free encyclopedia

Figure: Densities of $t$-distributed random variables

Student's t-distribution (also Student t-distribution or, for short, t-distribution) is a probability distribution that was developed in 1908 by William Sealy Gosset and named after his pseudonym, Student.

Gosset had found that the standardized estimator of the sample mean of normally distributed data is no longer normally distributed but t-distributed when the variance of the characteristic, which is needed to standardize the mean, is unknown and has to be estimated by the sample variance. His t-distribution makes it possible, especially for small sample sizes, to calculate the distribution of the difference between the sample mean and the true mean of the population.

The t-values depend on the significance level and on the sample size $n$; they determine the confidence interval and thus the informative value of the estimate of the mean. The t-distribution becomes narrower with increasing $n$ and converges to the normal distribution for $n \to \infty$ (see the figure of densities). Hypothesis tests that use the t-distribution are called t-tests.

The derivation was first published in 1908, when Gosset was working at the Guinness brewery in Dublin. Since his employer did not permit publication, Gosset published it under the pseudonym Student. The t-factor and the associated theory were first substantiated by the work of R. A. Fisher, who called the distribution Student's distribution.

The t-distribution also appears in earlier publications by other authors. It was first derived in 1876 by Jacob Lüroth as a posterior distribution when treating a problem in the adjustment of observations, and in 1883 by Edgeworth in a similar context.

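Definition

A continuous random variable $X$ follows Student's t-distribution with $n > 0$ degrees of freedom if it has the probability density

$$f_n(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}}, \qquad -\infty < x < +\infty.$$

Here

$$\Gamma(x) = \int_0^{+\infty} t^{x-1} e^{-t}\,\mathrm{d}t$$

is the gamma function. For natural numbers $n$ the following holds in particular (where $n!$ denotes the factorial of $n$):

$$\Gamma(n+1) = n!, \qquad \Gamma\left(n + \tfrac{1}{2}\right) = \frac{(2n)!}{n!\,4^n}\sqrt{\pi}.$$

Alternatively, the t-distribution with $n$ degrees of freedom can be defined as the distribution of the quantity

$$t_n \equiv \frac{Z}{\sqrt{\chi_n^2 / n}},$$

where $Z$ is a standard normally distributed random variable and $\chi_n^2$ is a chi-square distributed random variable with $n$ degrees of freedom that is independent of $Z$.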
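As a minimal sketch, the density $f_n(x) = \Gamma((n+1)/2) / (\sqrt{n\pi}\,\Gamma(n/2)) \cdot (1 + x^2/n)^{-(n+1)/2}$ can be evaluated with the Python standard library alone. The function name `t_pdf` is chosen here for illustration; the log-gamma function is used only to avoid overflow for large $n$.

```python
import math


def t_pdf(x: float, n: float) -> float:
    """Density f_n(x) of Student's t-distribution with n > 0 degrees of freedom."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Logarithm of the normalizing constant Gamma((n+1)/2) / (sqrt(n*pi) * Gamma(n/2)).
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    # Logarithm of (1 + x^2/n)^(-(n+1)/2), added before exponentiating.
    return math.exp(log_norm - (n + 1) / 2 * math.log1p(x * x / n))


print(t_pdf(0.0, 1))  # for n = 1 this is 1/pi
print(t_pdf(0.0, 4))  # for n = 4 this is 3/8
```

The two printed values follow from $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(5/2) = \tfrac{3}{4}\sqrt{\pi}$.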

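Distribution function

The distribution function can be expressed in closed form as

$$F_n(t) = I\left(\frac{t + \sqrt{t^2 + n}}{2\sqrt{t^2 + n}},\ \frac{n}{2},\ \frac{n}{2}\right)$$

or as

$$F_n(t) = \frac{1}{2}\left(1 + \operatorname{sgn}(t)\, I\left(\frac{t^2}{t^2 + n},\ \frac{1}{2},\ \frac{n}{2}\right)\right)$$

with

$$I(z, a, b) = \frac{1}{B(a, b)} \int_0^z u^{a-1} (1 - u)^{b-1}\,\mathrm{d}u,$$

where $B$ denotes the beta function.

$F_n(t)$ gives the probability that a random variable $X$ distributed according to $f_n$ takes a value less than or equal to $t$.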
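The distribution function can also be checked numerically without the incomplete beta function. The following sketch uses the symmetry of the density, $F_n(t) = \tfrac12 + \int_0^t f_n(x)\,\mathrm{d}x$, and composite Simpson integration; the names `t_pdf` and `t_cdf` and the default number of steps are assumptions made for this example, not part of any library.

```python
import math


def t_pdf(x: float, n: float) -> float:
    """Density of Student's t-distribution with n degrees of freedom."""
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    return math.exp(log_norm - (n + 1) / 2 * math.log1p(x * x / n))


def t_cdf(t: float, n: float, steps: int = 2000) -> float:
    """F_n(t) = 1/2 + integral of f_n over [0, t], by composite Simpson's rule.

    `steps` must be even. A negative t gives a negative step width, so the
    integral is subtracted from 1/2 automatically.
    """
    h = t / steps
    total = t_pdf(0.0, n) + t_pdf(t, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    return 0.5 + total * h / 3


print(round(t_cdf(1.0, 1), 6))    # n = 1: 1/2 + arctan(1)/pi = 0.75
print(round(t_cdf(2.776, 4), 4))  # close to 0.975
```

For very large $|t|$ a fixed number of steps becomes inaccurate, so this is suitable for illustration rather than production use.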

Properties

Let $X$ be a t-distributed random variable with $n$ degrees of freedom and density $f_n$.

Inflection points

The density has inflection points at

$$x = \pm\sqrt{\frac{n}{n+2}}.$$

Median

The median is

$$\tilde{x} = 0.$$

Mode

The mode is

$$x_D = 0.$$

Symmetry

Student's t-distribution is symmetric about 0.

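Expected value

For $n > 1$ the expected value is

$$\operatorname{E}(X) = 0.$$

For $n \le 1$ the expected value does not exist.

Variance

For $n > 2$ the variance is

$$\operatorname{Var}(X) = \frac{n}{n-2}.$$

Skewness

For $n > 3$ the skewness is

$$v(X) = 0.$$

Kurtosis

For the kurtosis $\beta_2$ and the excess kurtosis $\gamma_2$ one obtains, for $n > 4$,

$$\beta_2(X) = \frac{\mu_4}{\mu_2^2} = \frac{3(n-2)}{n-4}, \qquad \gamma_2(X) = \beta_2(X) - 3 = \frac{6}{n-4}.$$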

Moments

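For the $k$-th moments $m_k = \operatorname{E}(X^k)$ and the $k$-th central moments $\mu_k = \operatorname{E}\big((X - \operatorname{E}(X))^k\big)$ the following holds:

$$m_k = \mu_k = \begin{cases} 0, & n > k,\ k \text{ odd}, \\[4pt] n^{k/2} \cdot \dfrac{1 \cdot 3 \cdot 5 \cdots (k-1)}{(n-2)(n-4) \cdots (n-k)}, & n > k,\ k \text{ even}. \end{cases}$$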
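The moment formula, $m_k = 0$ for odd $k$ and $m_k = n^{k/2} \cdot 1 \cdot 3 \cdots (k-1) / ((n-2)(n-4) \cdots (n-k))$ for even $k$ with $n > k$, can be evaluated exactly for integer degrees of freedom. The sketch below uses rational arithmetic; the function name `t_moment` and the restriction to integer $n$ are assumptions made for this example.

```python
from fractions import Fraction


def t_moment(k: int, n: int) -> Fraction:
    """k-th moment of the t-distribution with integer n > k degrees of freedom."""
    if k < 0 or n <= k:
        raise ValueError("the k-th moment requires n > k >= 0")
    if k % 2 == 1:
        return Fraction(0)          # odd moments vanish by symmetry
    numerator, denominator = 1, 1
    for j in range(1, k // 2 + 1):
        numerator *= 2 * j - 1      # 1 * 3 * 5 * ... * (k - 1)
        denominator *= n - 2 * j    # (n - 2)(n - 4) ... (n - k)
    return Fraction(n ** (k // 2) * numerator, denominator)


n = 6
variance = t_moment(2, n)                  # n / (n - 2)
kurtosis = t_moment(4, n) / variance ** 2  # 3(n - 2) / (n - 4)
print(variance, kurtosis)
```

For $n = 6$ this reproduces the variance $6/4 = 3/2$ and the kurtosis $3 \cdot 4 / 2 = 6$.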

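Relationship to the beta distribution

For $t \ge 0$, the integral

$$\int_{-t}^{t} f_n(u)\,\mathrm{d}u$$

is, up to normalization, the incomplete beta function

$$B\left(z; \tfrac{1}{2}, \tfrac{n}{2}\right) = \int_0^z u^{\frac{1}{2} - 1} (1 - u)^{\frac{n}{2} - 1}\,\mathrm{d}u,$$

in which

$$B\left(\tfrac{1}{2}, \tfrac{n}{2}\right) = \frac{\Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n+1}{2}\right)}$$

provides the link to the complete beta function. For $t > 0$ it then follows that

$$F_n(t) = \frac{1}{2} + \frac{1}{2} \cdot \frac{B\left(z; \frac{1}{2}, \frac{n}{2}\right)}{B\left(\frac{1}{2}, \frac{n}{2}\right)}$$

with

$$z = \frac{t^2}{t^2 + n}.$$

As $t$ tends to infinity, $z$ tends to 1. In the limit, the numerator and the denominator of the fraction above are equal, so that one obtains

$$\lim_{t \to \infty} F_n(t) = \frac{1}{2} + \frac{1}{2} = 1.$$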

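Non-central t-distribution

The quantity

$$\frac{Z + \delta}{\sqrt{\chi_n^2 / n}}$$

with $Z \sim \mathcal{N}(0, 1)$ and $\delta$ as non-centrality parameter follows the so-called non-central t-distribution. This distribution is mainly used to determine the β-error (type II error) in hypothesis tests with a t-distributed test statistic. Its probability density is:

$$f(x) = \frac{n^{n/2}\, n!}{2^n e^{\delta^2/2} (n + x^2)^{n/2}\, \Gamma\left(\frac{n}{2}\right)} \left[ \frac{\sqrt{2}\,\delta x\ {}_1F_1\left(\frac{n}{2} + 1; \frac{3}{2}; \frac{\delta^2 x^2}{2(n + x^2)}\right)}{(n + x^2)\, \Gamma\left(\frac{n+1}{2}\right)} + \frac{{}_1F_1\left(\frac{n+1}{2}; \frac{1}{2}; \frac{\delta^2 x^2}{2(n + x^2)}\right)}{\sqrt{n + x^2}\, \Gamma\left(\frac{n}{2} + 1\right)} \right]$$

Figure: Some densities of non-central t-distributions

The bracket with the sum of hypergeometric functions can be written more simply, which gives a shorter alternative expression for the density:

$$f(x) = \frac{2^n n^{n/2 + 1}\, \Gamma\left(\frac{n+1}{2}\right)}{\pi (n + x^2)^{(n+1)/2}}\, e^{-\delta^2/2}\, H_{-n-1}\left(-\frac{\delta x}{\sqrt{2} \sqrt{n + x^2}}\right),$$

where $H_{-n-1}(z)$ denotes a Hermite polynomial with negative index, with $H_{-n-1}(0) = \dfrac{\sqrt{\pi}}{2^{n+1}\, \Gamma\left(\frac{n}{2} + 1\right)}$.

For $n > 1$ the expected value is

$$\operatorname{E}(X) = \delta \sqrt{\frac{n}{2}}\, \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}$$

and the variance (for $n > 2$) is

$$\operatorname{Var}(X) = \frac{n (1 + \delta^2)}{n - 2} - \frac{\delta^2 n}{2} \left(\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\right)^2.$$

With $\delta = 0$ one obtains the characteristic values of the central t-distribution.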
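The reduction to the central case can be made concrete with a short sketch. It assumes the standard expressions for the non-central t-distribution, $\operatorname{E}(X) = \delta \sqrt{n/2}\, \Gamma((n-1)/2) / \Gamma(n/2)$ for $n > 1$ and $\operatorname{Var}(X) = n(1 + \delta^2)/(n - 2) - \operatorname{E}(X)^2$ for $n > 2$; the function names `nct_mean` and `nct_var` are hypothetical.

```python
import math


def nct_mean(n: float, delta: float) -> float:
    """Expected value of the non-central t-distribution (requires n > 1)."""
    if n <= 1:
        raise ValueError("the expected value requires n > 1")
    # Gamma((n-1)/2) / Gamma(n/2), computed on the log scale for stability.
    gamma_ratio = math.exp(math.lgamma((n - 1) / 2) - math.lgamma(n / 2))
    return delta * math.sqrt(n / 2) * gamma_ratio


def nct_var(n: float, delta: float) -> float:
    """Variance of the non-central t-distribution (requires n > 2)."""
    if n <= 2:
        raise ValueError("the variance requires n > 2")
    return n * (1 + delta ** 2) / (n - 2) - nct_mean(n, delta) ** 2


# With delta = 0 the central values 0 and n / (n - 2) are recovered.
print(nct_mean(5, 0.0), nct_var(5, 0.0))
print(nct_mean(3, 1.0))
```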

Relationship to other distributions

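Relationship to the Cauchy distribution

For $n = 1$, and using $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$, the Cauchy distribution results as a special case of Student's t-distribution:

$$f_1(x) = \frac{\Gamma(1)}{\sqrt{\pi}\, \Gamma\left(\frac{1}{2}\right)} \left(1 + x^2\right)^{-1} = \frac{1}{\pi (1 + x^2)}.$$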

Relationship to the chi-square distribution and standard normal distribution

The t-distribution describes the distribution of an expression

$$t_n = \frac{Z}{\sqrt{\chi_n^2 / n}},$$

where $Z$ denotes a standard normally distributed random variable and $\chi_n^2$ a chi-square distributed random variable with $n$ degrees of freedom. The numerator variable must be independent of the denominator variable. The density function of the t-distribution is then symmetric about its expected value $0$. The values of the distribution function are usually tabulated.
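The construction from a standard normal variable and an independent chi-square variable can be illustrated by simulation. In this sketch the chi-square variable is built as a sum of $n$ squared standard normal draws; the seed, the sample size and the helper name `sample_t` are arbitrary choices for the example. The value 2.228 is the tabulated two-sided 95% quantile for $n = 10$.

```python
import math
import random


def sample_t(n: int, rng: random.Random) -> float:
    """Draw one value of Z / sqrt(chi2_n / n) from independent standard normals."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))  # chi-square, n d.o.f.
    return z / math.sqrt(chi2 / n)


rng = random.Random(12345)  # fixed seed so that the run is reproducible
n = 10
draws = [sample_t(n, rng) for _ in range(50_000)]

# The mean is 0 by symmetry, so the mean square estimates the variance.
variance = sum(x * x for x in draws) / len(draws)
inside = sum(abs(x) <= 2.228 for x in draws) / len(draws)

print(f"empirical variance {variance:.3f}, theoretical {n / (n - 2):.3f}")
print(f"share with |T| <= 2.228: {inside:.3f}, expected about 0.95")
```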

Heavy-tailed distribution

The t-distribution belongs to the heavy-tailed distributions.

Approximation by the normal distribution

With an increasing number of degrees of freedom, the values of the distribution function of the t-distribution can be approximated using the normal distribution. As a rule of thumb, from 30 degrees of freedom onward the distribution function of the t-distribution can be approximated by that of the normal distribution.
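To give a feeling for the rule of thumb, the following sketch compares a few tabulated 0.975 quantiles of the t-distribution (taken from the quantile table of this article) with the corresponding quantile of the standard normal distribution. The dictionary `T_975` and the helper `relative_excess` are names introduced only for this example.

```python
from statistics import NormalDist

# One-sided 0.975 quantiles of the t-distribution, copied from the quantile table.
T_975 = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984, 500: 1.965}

# Corresponding quantile of the standard normal distribution (about 1.960).
z_975 = NormalDist().inv_cdf(0.975)


def relative_excess(t_value: float) -> float:
    """Relative amount by which a t-quantile exceeds the normal quantile."""
    return t_value / z_975 - 1.0


for n, t_value in T_975.items():
    print(f"n = {n:3d}: t = {t_value:.3f}, excess over normal = "
          f"{relative_excess(t_value):.1%}")
```

At 30 degrees of freedom the quantile is still roughly 4% larger than the normal one, so whether the approximation is adequate depends on the accuracy required.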

Use in mathematical statistics

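Various estimators are t-distributed.

If the independent random variables $X_1, X_2, \dots, X_n$ are identically normally distributed with mean $\mu$ and standard deviation $\sigma$, it can be proven that the sample mean

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

and the sample variance

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

are stochastically independent.

Because the random variable $\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ has a standard normal distribution and $\dfrac{(n-1) S^2}{\sigma^2}$ follows a chi-square distribution with $n - 1$ degrees of freedom, it follows that the quantity

$$t_{n-1} = \frac{\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}}{\sqrt{\dfrac{(n-1) S^2}{\sigma^2 (n-1)}}} = \sqrt{n}\, \frac{\bar{X} - \mu}{S}$$

is, by definition, t-distributed with $n - 1$ degrees of freedom.

So the distance between the measured mean and the mean of the population is distributed as $t_{n-1}\, S / \sqrt{n}$. The 95% confidence interval for the mean $\mu$ is therefore

$$\bar{X} - t\, \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t\, \frac{S}{\sqrt{n}},$$

where $t$ is determined by $F_{n-1}(t) = 0.975$. For finite $n$, this interval is somewhat wider than the one that would have resulted, for known $\sigma$, from the distribution function of the normal distribution at the same confidence level, namely $\bar{X} - 1.96\, \sigma / \sqrt{n} \le \mu \le \bar{X} + 1.96\, \sigma / \sqrt{n}$.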
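A small worked example of the 95% confidence interval, mean plus or minus $t \cdot S / \sqrt{n}$ with $t$ the 0.975 quantile for $n - 1$ degrees of freedom, may help. The measurements below are made up for illustration, the quantiles are copied from the quantile table of this article, and the names `T_975` and `mean_ci95` are hypothetical.

```python
import math
import statistics

# 0.975 quantiles of the t-distribution for 1 to 10 degrees of freedom (from the table).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci95(sample: list[float]) -> tuple[float, float]:
    """95% confidence interval for the mean of a normally distributed sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # uses the divisor n - 1
    half_width = T_975[n - 1] * s / math.sqrt(n)
    return mean - half_width, mean + half_width


measurements = [9.8, 10.2, 10.4, 9.9, 10.2]   # hypothetical data
low, high = mean_ci95(measurements)
print(f"{low:.3f} <= mu <= {high:.3f}")
```

Here the sample mean is 10.1 and $S^2 = 0.06$, so with $t = 2.776$ for 4 degrees of freedom the half width is about 0.304.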

Derivation of the density

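The probability density of the t-distribution can be derived from the joint density of the two independent random variables $Z$ and $\chi_n^2$, which are standard normally and chi-square distributed, respectively:

$$f_{Z, \chi_n^2}(z, y) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \cdot \frac{1}{2^{n/2}\, \Gamma\left(\frac{n}{2}\right)}\, y^{\frac{n}{2} - 1} e^{-y/2}.$$

With the transformation

$$t = \frac{z}{\sqrt{y/n}}, \qquad v = y$$

we get the joint density of $T = Z / \sqrt{\chi_n^2 / n}$ and $\chi_n^2$, where $-\infty < t < \infty$ and $0 \le v < \infty$.

The Jacobian determinant of this transformation is:

$$\det \frac{\partial(z, y)}{\partial(t, v)} = \begin{vmatrix} \sqrt{v/n} & * \\ 0 & 1 \end{vmatrix} = \sqrt{\frac{v}{n}}.$$

The value $*$ is not important because it is multiplied by 0 when calculating the determinant. So the new density function is

$$f_{T, \chi_n^2}(t, v) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{v t^2}{2n}} \cdot \frac{1}{2^{n/2}\, \Gamma\left(\frac{n}{2}\right)}\, v^{\frac{n}{2} - 1} e^{-v/2} \cdot \sqrt{\frac{v}{n}}.$$

We are now looking for the marginal distribution $f_n(t)$ as an integral over the variable $v$, which is not of interest:

$$f_n(t) = \int_0^\infty f_{T, \chi_n^2}(t, v)\,\mathrm{d}v = \frac{1}{\sqrt{2\pi n}\; 2^{n/2}\, \Gamma\left(\frac{n}{2}\right)} \int_0^\infty v^{\frac{n-1}{2}}\, e^{-\frac{v}{2}\left(1 + \frac{t^2}{n}\right)}\,\mathrm{d}v = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\, \Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}.$$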

Selected quantiles of the t-distribution

Values of $t$ for various degrees of freedom $n$ and common probabilities $P$ (0.75 to 0.999) are tabulated, for which the following applies:

$$P_{\text{one-sided}} = F_n(t) = P(T \le t).$$

Owing to the mirror symmetry of the density, only the probability scale needs to be adapted for the case of an interval bounded symmetrically on both sides. For the same $t$ the probabilities decrease, because the integration interval is reduced by cutting away the range from $-\infty$ to $-t$:

$$P_{\text{two-sided}} = F_n(t) - F_n(-t) = 2 F_n(t) - 1 = P(-t < T \le t).$$

If $N$ observations are made on a sample and $m$ parameters are estimated from the sample, the number of degrees of freedom is $n = N - m$.

For the number of degrees of freedom $n$ in the first column and the significance level $\alpha$ (shown as $1 - \alpha$ in the second row), each cell of the following table gives the value of the (one-sided) quantile $t_{n,\alpha}$ according to DIN 1319-3. It satisfies the following equations for the density $f_n$ of the $t_n$-distribution:

One-sided: $\int_{-\infty}^{t_{n,\alpha}} f_n(x)\,\mathrm{d}x = 1 - \alpha$
Two-sided: $\int_{-t_{n,\alpha/2}}^{t_{n,\alpha/2}} f_n(x)\,\mathrm{d}x = 1 - \alpha$

For example, with $n = 4$ and $\alpha = 0.05$ one finds the t-values 2.776 (two-sided) and 2.132 (one-sided).

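The quantile function $x_p$ of the t-distribution is the solution of the equation $p = F_n(x_p)$ and can therefore in principle be calculated via the inverse function. Specifically,

$$x_p = \operatorname{sgn}\left(p - \tfrac{1}{2}\right) \sqrt{n \left(\frac{1}{I^{-1}\left(\min(2p,\, 2 - 2p),\ \frac{n}{2},\ \frac{1}{2}\right)} - 1\right)},$$

with $I^{-1}$ as the inverse of the regularized incomplete beta function, that is, $I^{-1}(w, a, b)$ is the value $z$ for which $I(z, a, b) = w$. This value $x_p$ is entered in the quantile table under the coordinates $p$ and $n$.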
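Where no inverse incomplete beta function is available, the defining equation $p = F_n(x_p)$ can be solved numerically. The sketch below is one possible approach, not the method used to produce the table: it integrates the density with Simpson's rule and finds the root by bisection. All function names and tolerances are assumptions for this example.

```python
import math


def t_pdf(x: float, n: float) -> float:
    """Density of Student's t-distribution with n degrees of freedom."""
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    return math.exp(log_norm - (n + 1) / 2 * math.log1p(x * x / n))


def t_cdf(t: float, n: float, steps: int = 1000) -> float:
    """F_n(t) = 1/2 + integral of the density over [0, t] (Simpson, steps even)."""
    h = t / steps
    total = t_pdf(0.0, n) + t_pdf(t, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    return 0.5 + total * h / 3


def t_quantile(p: float, n: float, tol: float = 1e-10) -> float:
    """Solve p = F_n(x) for x by bisection; intended for moderate p and n."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p < 0.5:
        return -t_quantile(1.0 - p, n, tol)   # symmetry about 0
    lo, hi = 0.0, 1.0
    while t_cdf(hi, n) < p:                   # widen the bracket until it contains x_p
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(t_quantile(0.975, 4), 3))   # table value: 2.776
print(round(t_quantile(0.95, 10), 3))   # table value: 1.812
```

For probabilities extremely close to 0 or 1 the bracket grows large and the fixed-step integration loses accuracy, so the sketch should not be used there.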

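For a few values of $n$ (1, 2, 4) the quantile function simplifies:

$$n = 1: \quad x_p = \tan\left(\pi \left(p - \tfrac{1}{2}\right)\right)$$

$$n = 2: \quad x_p = \left(p - \tfrac{1}{2}\right) \sqrt{\frac{2}{p (1 - p)}}$$

$$n = 4: \quad x_p = \operatorname{sgn}\left(p - \tfrac{1}{2}\right) \cdot 2 \sqrt{\frac{\cos\left(\frac{1}{3} \arccos\left(2 \sqrt{p (1 - p)}\right)\right)}{2 \sqrt{p (1 - p)}} - 1}$$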
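The closed forms for $n = 1, 2, 4$ are easy to evaluate directly. The sketch below assumes the commonly cited expressions $x_p = \tan(\pi(p - \tfrac12))$ for $n = 1$, $x_p = (2p - 1)/\sqrt{2p(1 - p)}$ for $n = 2$, and $x_p = \operatorname{sgn}(p - \tfrac12) \cdot 2\sqrt{q - 1}$ with $q = \cos(\tfrac13 \arccos\sqrt{a})/\sqrt{a}$, $a = 4p(1 - p)$, for $n = 4$; the function name is hypothetical.

```python
import math


def t_quantile_closed_form(p: float, n: int) -> float:
    """Quantile x_p of the t-distribution for n in {1, 2, 4} via closed forms."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n == 1:
        return math.tan(math.pi * (p - 0.5))
    if n == 2:
        return (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))
    if n == 4:
        a = min(1.0, 4.0 * p * (1.0 - p))     # guard against rounding above 1
        q = math.cos(math.acos(math.sqrt(a)) / 3.0) / math.sqrt(a)
        return math.copysign(2.0 * math.sqrt(max(0.0, q - 1.0)), p - 0.5)
    raise ValueError("closed forms are implemented for n = 1, 2, 4 only")


for n in (1, 2, 4):
    print(n, round(t_quantile_closed_form(0.975, n), 3))
```

The printed values can be compared with the column for the one-sided probability 0.975 in the table of t-quantiles (12.706, 4.303 and 2.776).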

Table of some t-quantiles

The first column gives the number of degrees of freedom n. The two header rows give the probability P for the two-sided and for the one-sided confidence interval.

P two-sided:    0.5      0.75     0.8      0.9      0.95     0.98     0.99     0.998
P one-sided:    0.75     0.875    0.90     0.95     0.975    0.99     0.995    0.999
n
1               1.000    2.414    3.078    6.314    12.706   31.821   63.657   318.309
2               0.816    1.604    1.886    2.920    4.303    6.965    9.925    22.327
3               0.765    1.423    1.638    2.353    3.182    4.541    5.841    10.215
4               0.741    1.344    1.533    2.132    2.776    3.747    4.604    7.173
5               0.727    1.301    1.476    2.015    2.571    3.365    4.032    5.893
6               0.718    1.273    1.440    1.943    2.447    3.143    3.707    5.208
7               0.711    1.254    1.415    1.895    2.365    2.998    3.499    4.785
8               0.706    1.240    1.397    1.860    2.306    2.896    3.355    4.501
9               0.703    1.230    1.383    1.833    2.262    2.821    3.250    4.297
10              0.700    1.221    1.372    1.812    2.228    2.764    3.169    4.144
11              0.697    1.214    1.363    1.796    2.201    2.718    3.106    4.025
12              0.695    1.209    1.356    1.782    2.179    2.681    3.055    3.930
13              0.694    1.204    1.350    1.771    2.160    2.650    3.012    3.852
14              0.692    1.200    1.345    1.761    2.145    2.624    2.977    3.787
15              0.691    1.197    1.341    1.753    2.131    2.602    2.947    3.733
16              0.690    1.194    1.337    1.746    2.120    2.583    2.921    3.686
17              0.689    1.191    1.333    1.740    2.110    2.567    2.898    3.646
18              0.688    1.189    1.330    1.734    2.101    2.552    2.878    3.610
19              0.688    1.187    1.328    1.729    2.093    2.539    2.861    3.579
20              0.687    1.185    1.325    1.725    2.086    2.528    2.845    3.552
21              0.686    1.183    1.323    1.721    2.080    2.518    2.831    3.527
22              0.686    1.182    1.321    1.717    2.074    2.508    2.819    3.505
23              0.685    1.180    1.319    1.714    2.069    2.500    2.807    3.485
24              0.685    1.179    1.318    1.711    2.064    2.492    2.797    3.467
25              0.684    1.178    1.316    1.708    2.060    2.485    2.787    3.450
26              0.684    1.177    1.315    1.706    2.056    2.479    2.779    3.435
27              0.684    1.176    1.314    1.703    2.052    2.473    2.771    3.421
28              0.683    1.175    1.313    1.701    2.048    2.467    2.763    3.408
29              0.683    1.174    1.311    1.699    2.045    2.462    2.756    3.396
30              0.683    1.173    1.310    1.697    2.042    2.457    2.750    3.385
40              0.681    1.167    1.303    1.684    2.021    2.423    2.704    3.307
50              0.679    1.164    1.299    1.676    2.009    2.403    2.678    3.261
60              0.679    1.162    1.296    1.671    2.000    2.390    2.660    3.232
70              0.678    1.160    1.294    1.667    1.994    2.381    2.648    3.211
80              0.678    1.159    1.292    1.664    1.990    2.374    2.639    3.195
90              0.677    1.158    1.291    1.662    1.987    2.368    2.632    3.183
100             0.677    1.157    1.290    1.660    1.984    2.364    2.626    3.174
200             0.676    1.154    1.286    1.653    1.972    2.345    2.601    3.131
300             0.675    1.153    1.284    1.650    1.968    2.339    2.592    3.118
400             0.675    1.152    1.284    1.649    1.966    2.336    2.588    3.111
500             0.675    1.152    1.283    1.648    1.965    2.334    2.586    3.107
∞               0.674    1.150    1.282    1.645    1.960    2.326    2.576    3.090


References

  1. Student: The Probable Error of a Mean. In: Biometrika. 6, No. 1, 1908, pp. 1-25. JSTOR 2331554. doi:10.1093/biomet/6.1.1.
  2. Josef Bleymüller, Günther Gehlert, Herbert Gülicher: Statistics for economists. 14th edition. Vahlen, 2004, ISBN 978-3-8006-3115-5, p. 16.
  3. J. Pfanzagl, O. Sheynin: A forerunner of the t-Distribution (Studies in the history of probability and statistics XLIV). In: Biometrika. 83, No. 4, 1996, pp. 891-898. doi:10.1093/biomet/83.4.891.
  4. P. Gorroochurn: Classic Topics on the History of Modern Mathematical Statistics from Laplace to More Recent Times. Wiley, 2016, doi:10.1002/9781119127963.
  5. N. L. Johnson, B. L. Welch: Applications of the Non-Central t-Distribution. In: Biometrika. 31, No. 3/4, 1940, pp. 362-389. JSTOR 2332616. doi:10.1093/biomet/31.3-4.362.
  6. Eric W. Weisstein: Noncentral Student's t-Distribution. In: MathWorld.
  7. HermiteH. At: functions.wolfram.com.
  8. Frodesen, Skjeggestad, Tofte: Probability and Statistics in Particle Physics. Universitetsforlaget, Bergen / Oslo / Tromsø, p. 141.
  9. W. T. Shaw: Sampling Student's T distribution - Use of the inverse cumulative distribution function. In: Journal of Computational Finance. 9, No. 4, 2006, pp. 37-73. doi:10.21314/JCF.2006.150.