Figure: Densities of $t$-distributed random variables.
Student's t-distribution (also Student t-distribution or, for short, t-distribution) is a probability distribution that was developed by William Sealy Gosset in 1908 and named after his pseudonym, Student.
Gosset had found that the standardized estimator of the sample mean of normally distributed data is no longer normally distributed but $t$-distributed when the variance of the characteristic, which is needed to standardize the mean, is unknown and has to be estimated by the sample variance. His $t$-distribution makes it possible, especially for small sample sizes, to calculate the distribution of the difference between the sample mean and the true mean of the population.
The $t$ values depend on the significance level and on the sample size $n$; they determine the confidence interval and thus the informative value of the estimate of the mean. The $t$-distribution becomes narrower with increasing $n$ and converges to the normal distribution (see the figure). Hypothesis tests that use the $t$-distribution are called t-tests.
The derivation was first published in 1908, when Gosset was working at the Guinness brewery in Dublin. Since his employer did not allow publication, Gosset published it under the pseudonym Student. The t-factor and the associated theory were first substantiated by the work of R. A. Fisher, who called the distribution Student's distribution.
The $t$-distribution also appears in earlier publications by other authors. It was first derived in 1876 by Jacob Lüroth as a posterior distribution in the treatment of a problem of adjustment calculus, and in 1883 by Edgeworth in a similar context.
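Definition
A continuous random variable $X$ follows Student's $t$-distribution with $n > 0$ degrees of freedom if it has the probability density
$$f_n(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}} \quad \text{for } -\infty < x < +\infty.$$
Here
$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$$
is the gamma function. For natural numbers $n$ the following holds in particular (where $n!$ denotes the factorial of $n$):
$$\Gamma(n+1) = n!, \qquad \Gamma\left(n + \tfrac12\right) = \frac{(2n)!}{n!\,4^n}\,\sqrt{\pi}.$$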
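As an illustration, the density can be evaluated with Python's standard library alone. This is a minimal sketch, not a reference implementation; the function name `t_pdf` is chosen here, and the log-gamma function is used only to avoid overflow of the gamma function for large $n$.

```python
import math


def t_pdf(x: float, n: float) -> float:
    """Density of Student's t-distribution with n > 0 degrees of freedom."""
    if n <= 0:
        raise ValueError("the number of degrees of freedom n must be positive")
    # log of Gamma((n+1)/2) / (sqrt(n*pi) * Gamma(n/2))
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    # log of (1 + x^2/n)^(-(n+1)/2)
    log_kernel = -(n + 1) / 2 * math.log1p(x * x / n)
    return math.exp(log_norm + log_kernel)


# For n = 1 the value at x = 0 should agree with 1/pi (Cauchy density).
print(t_pdf(0.0, 1), 1 / math.pi)
```

The density is symmetric, so `t_pdf(x, n)` and `t_pdf(-x, n)` return the same value.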
Alternatively, the $t$-distribution with $n$ degrees of freedom can be defined as the distribution of the quantity
$$t_n \equiv \frac{Z}{\sqrt{\chi_n^2 / n}},$$
where $Z$ is a standard normally distributed random variable and $\chi_n^2$ is a chi-square distributed random variable with $n$ degrees of freedom that is independent of $Z$.
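The construction as a ratio of a standard normal variable and the root of a scaled chi-square variable can be tried out by simulation. The sketch below assumes that a chi-square variable with $n$ degrees of freedom may be drawn as a gamma variable with shape $n/2$ and scale 2; the function name `sample_t`, the seed and the sample size are arbitrary choices made here.

```python
import math
import random


def sample_t(n: int, rng: random.Random) -> float:
    """Draw one value of Z / sqrt(chi2_n / n)."""
    z = rng.gauss(0.0, 1.0)
    # chi-square with n degrees of freedom = Gamma(shape n/2, scale 2)
    chi2 = rng.gammavariate(n / 2, 2.0)
    return z / math.sqrt(chi2 / n)


rng = random.Random(12345)  # fixed seed so that the run is reproducible
n = 10
draws = [sample_t(n, rng) for _ in range(200_000)]

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)

# The empirical variance should be close to n / (n - 2).
print(mean, var, n / (n - 2))
```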
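Distribution function
The distribution function can be expressed in closed form as
$$F_n(t) = I\left(\frac{t + \sqrt{t^2+n}}{2\sqrt{t^2+n}},\ \frac n2,\ \frac n2\right)$$
or as
$$F_n(t) = \frac12\left(1 + \operatorname{sgn}(t)\; I\left(\frac{t^2}{t^2+n},\ \frac12,\ \frac n2\right)\right)$$
with
$$I(z,a,b) = \frac{1}{B(a,b)} \int_0^z u^{a-1}(1-u)^{b-1}\,du,$$
where $B$ denotes the beta function.
$F_n(t)$ is the probability that a $t_n$-distributed random variable $X$ takes a value less than or equal to $t$.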
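One way to cross-check values of the distribution function without an implementation of the incomplete beta function is to integrate the density numerically. The following sketch uses the composite Simpson rule on $[0, |t|]$ together with the symmetry of the density; it swaps in numerical integration for the closed forms, and the names `t_pdf`, `t_cdf` and the default number of steps are assumptions made here.

```python
import math


def t_pdf(x: float, n: float) -> float:
    """Density of the t-distribution with n degrees of freedom."""
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    return math.exp(log_norm - (n + 1) / 2 * math.log1p(x * x / n))


def t_cdf(t: float, n: float, steps: int = 2000) -> float:
    """F_n(t) = 1/2 + sgn(t) * (integral of the density over [0, |t|])."""
    if t == 0:
        return 0.5
    if steps % 2:            # Simpson's rule needs an even number of intervals
        steps += 1
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, n) + t_pdf(a, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    area = total * h / 3
    return 0.5 + math.copysign(area, t)


# For n = 1 (Cauchy case) the exact value is 1/2 + arctan(t)/pi.
print(t_cdf(1.0, 1), 0.5 + math.atan(1.0) / math.pi)
```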
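Properties
Let $X$ be a $t$-distributed random variable with $n$ degrees of freedom and density $f_n$.
Inflection points
The density has inflection points at
$$x = \pm\sqrt{\frac{n}{n+2}}.$$
Median
The median is
$$\tilde{x} = 0.$$
Mode
The mode is likewise
$$x_D = 0.$$
Symmetry
Student's $t$-distribution is symmetric about 0.
Expected value
For $n > 1$ the expected value is
$$\operatorname{E}(X) = 0.$$
For $n = 1$ the expected value does not exist.
Variance
For $n > 2$ the variance is
$$\operatorname{Var}(X) = \frac{n}{n-2}.$$
Skewness
For $n > 3$ the skewness is
$$\gamma_1(X) = 0.$$
Kurtosis
For the kurtosis $\beta_2$ and the excess kurtosis $\gamma_2$ one obtains, for $n > 4$,
$$\beta_2(X) = \frac{\mu_4}{\mu_2^2} = \frac{3n-6}{n-4}, \qquad \gamma_2(X) = \frac{\mu_4}{\mu_2^2} - 3 = \frac{6}{n-4}.$$
Moments
For the $k$-th moments $m_k = \operatorname{E}(X^k)$ and the $k$-th central moments $\mu_k = \operatorname{E}\left((X - \operatorname{E}(X))^k\right)$ the following holds for $n > k$:
$$m_k = \mu_k = \begin{cases} 0, & k \text{ odd}, \\[4pt] \dfrac{1 \cdot 3 \cdot 5 \cdots (k-1)}{(n-2)(n-4)\cdots(n-k)}\; n^{k/2}, & k \text{ even}. \end{cases}$$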
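The product formula for the even moments, $n^{k/2} \cdot 1 \cdot 3 \cdots (k-1) / ((n-2)(n-4)\cdots(n-k))$ for $n > k$, lends itself to an exact evaluation with rational arithmetic. The sketch below is restricted to integer $n$ so that `fractions.Fraction` can be used; the function name `t_moment` is an assumption made here.

```python
from fractions import Fraction


def t_moment(k: int, n: int) -> Fraction:
    """k-th (central) moment of the t-distribution with n degrees of freedom."""
    if k < 0 or n <= k:
        raise ValueError("the k-th moment exists only for n > k >= 0")
    if k % 2 == 1:
        return Fraction(0)          # odd moments vanish by symmetry
    result = Fraction(1)
    # product over j = 1, ..., k/2 of n * (2j - 1) / (n - 2j)
    for j in range(1, k // 2 + 1):
        result *= Fraction(n * (2 * j - 1), n - 2 * j)
    return result


n = 10
variance = t_moment(2, n)                 # equals n / (n - 2)
kurtosis = t_moment(4, n) / variance**2   # equals (3n - 6) / (n - 4)
print(variance, kurtosis)
```

For $k = 2$ and $k = 4$ the function reproduces the variance $n/(n-2)$ and, after division by the squared variance, the kurtosis $(3n-6)/(n-4)$.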
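Relationship to beta distribution
The integral
$$\int_0^t \left(1+\frac{x^2}{n}\right)^{-\frac{n+1}{2}} dx = \frac{\sqrt{n}}{2}\, B\left(z;\ \tfrac12,\ \tfrac n2\right), \qquad t > 0,$$
is, up to the factor $\sqrt{n}/2$, the incomplete beta function
$$B(z; a, b) = \int_0^z u^{a-1}(1-u)^{b-1}\,du,$$
where
$$B\left(\tfrac12, \tfrac n2\right) = B\left(1;\ \tfrac12,\ \tfrac n2\right) = \frac{\Gamma\left(\frac12\right)\Gamma\left(\frac n2\right)}{\Gamma\left(\frac{n+1}{2}\right)}$$
establishes the connection to the complete beta function. Then, for $t > 0$,
$$F_n(t) = \frac12 + \frac12\, \frac{B\left(z;\ \frac12,\ \frac n2\right)}{B\left(\frac12,\ \frac n2\right)}$$
with
$$z = \frac{t^2}{t^2+n}.$$
As $t$ tends to infinity, $z$ tends to 1. In the limit, the numerator and denominator of the above fraction coincide, so that one obtains
$$\lim_{t\to\infty} F_n(t) = 1.$$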
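The link between the integral of the density kernel and the incomplete beta function can be made explicit with a substitution. The following is a sketch of that step; the integration variable $u$ and the notation $B(z; a, b)$ for the non-regularized incomplete beta function are conventions assumed here.

```latex
% Substitution u = x^2 / (x^2 + n), valid for x >= 0
\begin{aligned}
u &= \frac{x^2}{x^2+n}
  \;\Longrightarrow\;
  1 + \frac{x^2}{n} = \frac{1}{1-u}, \qquad
  dx = \frac{\sqrt{n}}{2}\, u^{-1/2} (1-u)^{-3/2}\, du, \\
\int_0^t \left(1+\frac{x^2}{n}\right)^{-\frac{n+1}{2}} dx
  &= \frac{\sqrt{n}}{2} \int_0^{z} u^{\frac12 - 1} (1-u)^{\frac n2 - 1}\, du
   = \frac{\sqrt{n}}{2}\, B\!\left(z;\ \tfrac12,\ \tfrac n2\right),
  \qquad z = \frac{t^2}{t^2+n}.
\end{aligned}
```

Multiplying by the normalizing constant of the density, $\Gamma\left(\frac{n+1}{2}\right) / \left(\sqrt{n\pi}\,\Gamma\left(\frac n2\right)\right) = 1 / \left(\sqrt{n}\, B\left(\frac12, \frac n2\right)\right)$, turns the right-hand side into one half of the ratio of the incomplete to the complete beta function.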
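Non-central t-distribution
The quantity
$$\frac{Z + \delta}{\sqrt{\chi_n^2 / n}}$$
with $Z \sim \mathcal{N}(0,1)$ and $\delta$ as non-centrality parameter follows the so-called non-central $t$-distribution. This distribution is mainly used to determine the β-error (type II error) in hypothesis tests with a $t$-distributed test statistic. Its probability density is:
$$f(x) = \frac{n^{n/2}\, n!\; e^{-\delta^2/2}}{2^n\, \Gamma\left(\frac n2\right) (n+x^2)^{n/2}} \left\{ \frac{\sqrt{2}\,\delta x\; {}_1F_1\left(\frac n2 + 1;\ \frac32;\ \frac{\delta^2 x^2}{2(n+x^2)}\right)}{(n+x^2)\,\Gamma\left(\frac{n+1}{2}\right)} + \frac{{}_1F_1\left(\frac{n+1}{2};\ \frac12;\ \frac{\delta^2 x^2}{2(n+x^2)}\right)}{\sqrt{n+x^2}\;\Gamma\left(\frac n2 + 1\right)} \right\}$$
Figure: Some densities of non-central $t$-distributions.
The bracket containing the sum of hypergeometric functions can be written more compactly, which gives a shorter alternative expression for the density:
$$f(x) = \frac{2^n\, n^{n/2+1}\, \Gamma\left(\frac{n+1}{2}\right)}{\pi\,(n+x^2)^{(n+1)/2}}\; e^{-\delta^2/2}\; H_{-n-1}\left(-\frac{\delta x}{\sqrt{2}\,\sqrt{n+x^2}}\right),$$
where $H_{-n-1}$ denotes a Hermite polynomial with negative index, with $H_{-n-1}(0) = \frac{\sqrt{\pi}}{2^{n+1}\,\Gamma\left(\frac n2 + 1\right)}$.
For $n > 1$ the expected value is
$$\operatorname{E}(X) = \delta\,\sqrt{\frac n2}\;\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac n2\right)}$$
and for $n > 2$ the variance is
$$\operatorname{Var}(X) = \frac{n\,(1+\delta^2)}{n-2} - \frac{\delta^2 n}{2} \left(\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac n2\right)}\right)^2.$$
With $\delta = 0$ one obtains the characteristic values of the central $t$-distribution.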
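The expected value and variance of the non-central $t$-distribution, $\operatorname{E}(X) = \delta\sqrt{n/2}\;\Gamma\left(\frac{n-1}{2}\right)/\Gamma\left(\frac n2\right)$ for $n > 1$ and $\operatorname{Var}(X) = \frac{n(1+\delta^2)}{n-2} - \frac{\delta^2 n}{2}\left(\Gamma\left(\frac{n-1}{2}\right)/\Gamma\left(\frac n2\right)\right)^2$ for $n > 2$, are easy to evaluate numerically. The sketch below does so with the standard library; the function names `nct_mean` and `nct_var` are chosen here for illustration.

```python
import math


def _gamma_ratio(n: float) -> float:
    """Gamma((n-1)/2) / Gamma(n/2), computed via log-gamma."""
    return math.exp(math.lgamma((n - 1) / 2) - math.lgamma(n / 2))


def nct_mean(n: float, delta: float) -> float:
    """Expected value of the non-central t-distribution (requires n > 1)."""
    if n <= 1:
        raise ValueError("the expected value exists only for n > 1")
    return delta * math.sqrt(n / 2) * _gamma_ratio(n)


def nct_var(n: float, delta: float) -> float:
    """Variance of the non-central t-distribution (requires n > 2)."""
    if n <= 2:
        raise ValueError("the variance exists only for n > 2")
    return (n * (1 + delta**2) / (n - 2)
            - delta**2 * n / 2 * _gamma_ratio(n) ** 2)


# With delta = 0 the central values 0 and n / (n - 2) are recovered.
print(nct_mean(5, 0.0), nct_var(5, 0.0), 5 / 3)
```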
Relationship to other distributions
Relationship to the Cauchy distribution
For $n = 1$, and using $\Gamma\left(\tfrac12\right) = \sqrt{\pi}$, the Cauchy distribution results as a special case of Student's $t$-distribution.
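Written out, the special case follows by inserting one degree of freedom into the density of the $t$-distribution, $f_n(x) = \Gamma\left(\frac{n+1}{2}\right) / \left(\sqrt{n\pi}\,\Gamma\left(\frac n2\right)\right) \cdot \left(1 + x^2/n\right)^{-(n+1)/2}$, and using $\Gamma(1) = 1$:

```latex
f_1(x) = \frac{\Gamma(1)}{\sqrt{\pi}\;\Gamma\left(\tfrac12\right)} \left(1 + x^2\right)^{-1}
       = \frac{1}{\pi \left(1 + x^2\right)}
```

This is the density of the standard Cauchy distribution.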
Relationship to the chi-square distribution and standard normal distribution
The $t$-distribution describes the distribution of an expression
$$t_n = \frac{\mathcal{N}(0,1)}{\sqrt{\chi_n^2 / n}},$$
where $\mathcal{N}(0,1)$ denotes a standard normally distributed random variable and $\chi_n^2$ a chi-square distributed random variable with $n$ degrees of freedom. The numerator variable must be independent of the denominator variable. The density function of the $t$-distribution is then symmetric about its expected value 0. The values of the distribution function are usually tabulated.
Heavy-tailed distribution
The $t$-distribution belongs to the heavy-tailed distributions.
Approximation by the normal distribution
As the number of degrees of freedom increases, the values of the distribution function of the $t$-distribution can be approximated by the normal distribution. As a rule of thumb, from about 30 degrees of freedom onward the distribution function of the $t$-distribution can be approximated by that of the normal distribution.
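How quickly the approximation improves can be illustrated with a few quantiles. The sketch below compares 97.5 % quantiles of the $t$-distribution, copied from the quantile table of this article, with the corresponding quantile of the standard normal distribution; the selection of rows and the variable names are choices made here.

```python
from statistics import NormalDist

# Quantiles t with F_n(t) = 0.975, copied from the quantile table of this article.
T_975 = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984, 500: 1.965}

# Corresponding quantile of the standard normal distribution.
z_975 = NormalDist().inv_cdf(0.975)

# Relative amount by which the t quantile exceeds the normal quantile.
excess = {n: (t - z_975) / z_975 for n, t in T_975.items()}

for n, e in excess.items():
    print(f"n = {n:3d}: t = {T_975[n]:.3f}, excess over normal = {100 * e:.1f} %")
```

At 30 degrees of freedom the $t$ quantile is still a few percent larger than the normal quantile, so the rule of thumb trades a small error for convenience.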
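Use in mathematical statistics
Various estimators are $t$-distributed.
If the independent random variables $X_1, X_2, \dots, X_n$ are identically normally distributed with mean $\mu$ and standard deviation $\sigma$, it can be proven that the sample mean
$$\bar{X} = \frac1n \sum_{i=1}^n X_i$$
and the sample variance
$$S^2 = \frac{1}{n-1} \sum_{i=1}^n \left(X_i - \bar{X}\right)^2$$
are stochastically independent.
Because the random variable $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution and $\frac{(n-1)S^2}{\sigma^2}$ follows a chi-square distribution with $n-1$ degrees of freedom, it follows that the quantity
$$t_{n-1} = \frac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2\,(n-1)}}} = \sqrt{n}\;\frac{\bar{X} - \mu}{S}$$
is, by definition, $t$-distributed with $n-1$ degrees of freedom.
So the distance between the measured mean and the mean of the population is distributed as $t_{n-1}\, S/\sqrt{n}$. The 95 % confidence interval for the mean $\mu$ is therefore
$$\bar{x} - t\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t\,\frac{s}{\sqrt{n}},$$
where $t$ is determined by $F_{n-1}(t) = 0.975$. For finite $n$ this interval is somewhat larger than the one that would have resulted, with known $\sigma$, from the distribution function of the normal distribution at the same confidence level, $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.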
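A small worked example may make the interval concrete. The sketch below computes a 95 % confidence interval from five hypothetical measurements (the data are invented for illustration) and takes the quantile for $n - 1 = 4$ degrees of freedom, 2.776, from the quantile table of this article; the function name is an assumption made here.

```python
import math
from statistics import mean, stdev


def t_confidence_interval(data, t_value):
    """Interval mean -/+ t * s / sqrt(n) for a given t quantile."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation, denominator n - 1
    half_width = t_value * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.2, 10.4, 9.9, 10.2]

# n = 5 observations -> 4 degrees of freedom; F_4(t) = 0.975 gives t = 2.776.
low, high = t_confidence_interval(sample, 2.776)
print(low, high)
```

With the normal quantile 1.96 in place of 2.776 the interval would be noticeably narrower, which understates the uncertainty when the standard deviation is estimated from so few values.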
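Derivation of the density
The probability density of the $t$-distribution can be derived from the joint density of the two independent random variables $Z$ and $\chi_n^2$, which are standard normally and chi-square distributed, respectively:
$$f_{Z,\chi_n^2}(z, y) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} \cdot \frac{1}{2^{n/2}\,\Gamma\left(\frac n2\right)}\, y^{\frac n2 - 1} e^{-\frac y2}$$
With the transformation
$$t = \frac{z}{\sqrt{y/n}}, \qquad v = y$$
we get the joint density of $T = Z / \sqrt{\chi_n^2/n}$ and $\chi_n^2$, where $-\infty < t < \infty$ and $0 \le v < \infty$.
The Jacobian determinant of this transformation is:
$$\det \frac{\partial(z, y)}{\partial(t, v)} = \begin{vmatrix} \sqrt{v/n} & * \\ 0 & 1 \end{vmatrix} = \sqrt{\frac vn}$$
The value $*$ is not important because it is multiplied by 0 when the determinant is calculated. The new density function is therefore
$$f_{T,\chi_n^2}(t, v) = \frac{1}{\sqrt{2\pi}}\, \frac{1}{2^{n/2}\,\Gamma\left(\frac n2\right)}\, v^{\frac n2 - 1} \exp\left(-\frac12\left(v + \frac{t^2 v}{n}\right)\right) \sqrt{\frac vn}.$$
We now seek the marginal distribution $f_n(t)$ as the integral over the variable $v$ that is not of interest:
$$f_n(t) = \int_0^\infty f_{T,\chi_n^2}(t, v)\,dv = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac n2\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}.$$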
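The integration over $v$ reduces to a gamma integral. The following sketch spells out that step for the joint density $\frac{1}{\sqrt{2\pi}}\frac{1}{2^{n/2}\Gamma(n/2)}\, v^{n/2-1} e^{-(v + t^2 v/n)/2} \sqrt{v/n}$; the abbreviations $a$ and $b$ are introduced here only for this purpose.

```latex
\begin{aligned}
\int_0^\infty f_{T,\chi_n^2}(t, v)\, dv
  &= \frac{1}{\sqrt{2\pi n}\; 2^{n/2}\, \Gamma\left(\frac n2\right)}
     \int_0^\infty v^{a-1} e^{-b v}\, dv,
     \qquad a = \frac{n+1}{2}, \quad b = \frac12 \left(1 + \frac{t^2}{n}\right), \\
\int_0^\infty v^{a-1} e^{-b v}\, dv
  &= \frac{\Gamma(a)}{b^{a}}
   = \Gamma\left(\frac{n+1}{2}\right) 2^{\frac{n+1}{2}}
     \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}.
\end{aligned}
```

Combining both lines, the powers of 2 cancel up to $2^{(n+1)/2} / \left(\sqrt{2\pi n}\; 2^{n/2}\right) = 1/\sqrt{n\pi}$, which reproduces the density given in the definition.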
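Selected quantiles of the t-distribution
Values $t$ are tabulated for various degrees of freedom $n$ and common probabilities $P$ (0.75 to 0.999), for which the following applies:
$$P_{\text{one-sided}} = F_n(t) = \int_{-\infty}^{t} f_n(x)\,dx$$
Because of the mirror symmetry of the density, only the probability scale has to be adapted for the case of an interval bounded symmetrically on both sides. For the same $t$ the probabilities decrease, because the integration interval is reduced by cutting away the range from $-\infty$ to $-t$:
$$P_{\text{two-sided}} = \int_{-t}^{t} f_n(x)\,dx = 2\,P_{\text{one-sided}} - 1$$
If $N$ observations are made on a sample and $m$ parameters are estimated from the sample, the number of degrees of freedom is $n = N - m$.
For the number of degrees of freedom $n$ in the first column and the significance level $\alpha$ (shown as $1-\alpha$ in the second row), each cell of the following table gives the value of the (one-sided) quantile $t_{n,\alpha}$ according to DIN 1319-3. It satisfies the following equations for the density $f_n$ of the $t_n$-distribution:
- One-sided: $\int_{-\infty}^{t_{n,\alpha}} f_n(x)\,dx = 1 - \alpha$
- Two-sided: $\int_{-t_{n,\alpha/2}}^{t_{n,\alpha/2}} f_n(x)\,dx = 1 - \alpha$
For example, with $n = 4$ and $\alpha = 0.05$ one finds the $t$ values 2.776 (two-sided) or 2.132 (one-sided).
The quantile function $x_p$ of the $t$-distribution is the solution of the equation $p = F_n(x_p)$ and can therefore in principle be calculated using the inverse function. Specifically,
$$x_p = \operatorname{sgn}\left(p - \tfrac12\right) \sqrt{\frac{n\, z_p}{1 - z_p}}, \qquad z_p = I^{-1}\left(|2p - 1|,\ \tfrac12,\ \tfrac n2\right),$$
with $I^{-1}$ as the inverse of the regularized incomplete beta function with respect to its first argument. The value $x_p$ is entered in the quantile table under the coordinates $p$ and $n$.
For a few values of $n$ (1, 2, 4) the quantile function simplifies:
$$n = 1: \quad x_p = \tan\left(\pi \left(p - \tfrac12\right)\right)$$
$$n = 2: \quad x_p = \frac{2p - 1}{\sqrt{2p\,(1-p)}}$$
$$n = 4: \quad x_p = \operatorname{sgn}\left(p - \tfrac12\right) \cdot 2\sqrt{q - 1}, \qquad q = \frac{\cos\left(\tfrac13 \arccos\sqrt{\alpha}\right)}{\sqrt{\alpha}}, \quad \alpha = 4p\,(1-p)$$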
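For $n = 1$, $2$ and $4$ the quantile function has elementary closed forms: $\tan(\pi(p - \tfrac12))$ for $n = 1$, $(2p-1)/\sqrt{2p(1-p)}$ for $n = 2$, and for $n = 4$ the expression $\operatorname{sgn}(p - \tfrac12)\, 2\sqrt{q-1}$ with $q = \cos\left(\tfrac13 \arccos\sqrt{\alpha}\right)/\sqrt{\alpha}$ and $\alpha = 4p(1-p)$. The sketch below implements these three cases so that they can be compared with tabulated values; the function name is an assumption made here, and other values of $n$ are deliberately rejected.

```python
import math


def t_quantile_closed_form(p: float, n: int) -> float:
    """Quantile x_p of the t-distribution for n in {1, 2, 4}."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n == 1:
        return math.tan(math.pi * (p - 0.5))
    if n == 2:
        return (2 * p - 1) / math.sqrt(2 * p * (1 - p))
    if n == 4:
        alpha = 4 * p * (1 - p)
        q = math.cos(math.acos(math.sqrt(alpha)) / 3) / math.sqrt(alpha)
        # max() guards against a tiny negative value caused by rounding near p = 1/2
        return math.copysign(2 * math.sqrt(max(q - 1, 0.0)), p - 0.5)
    raise ValueError("closed forms are implemented only for n = 1, 2, 4")


# Compare with the tabulated values for a one-sided probability of 0.975.
for n in (1, 2, 4):
    print(n, round(t_quantile_closed_form(0.975, n), 3))
```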
Table of some t-quantiles
Rows: number of degrees of freedom n. Columns: probability P for the two-sided and for the one-sided confidence interval.
| P for two-sided confidence interval | 0.5 | 0.75 | 0.8 | 0.9 | 0.95 | 0.98 | 0.99 | 0.998 |
| P for one-sided confidence interval | 0.75 | 0.875 | 0.90 | 0.95 | 0.975 | 0.99 | 0.995 | 0.999 |
| 1 | 1.000 | 2.414 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 | 318.309 |
| 2 | 0.816 | 1.604 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 | 22.327 |
| 3 | 0.765 | 1.423 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 | 10.215 |
| 4 | 0.741 | 1.344 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 | 7.173 |
| 5 | 0.727 | 1.301 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 | 5.893 |
| 6 | 0.718 | 1.273 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 | 5.208 |
| 7 | 0.711 | 1.254 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 | 4.785 |
| 8 | 0.706 | 1.240 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 | 4.501 |
| 9 | 0.703 | 1.230 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 | 4.297 |
| 10 | 0.700 | 1.221 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 | 4.144 |
| 11 | 0.697 | 1.214 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 | 4.025 |
| 12 | 0.695 | 1.209 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 | 3.930 |
| 13 | 0.694 | 1.204 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 | 3.852 |
| 14 | 0.692 | 1.200 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 | 3.787 |
| 15 | 0.691 | 1.197 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 | 3.733 |
| 16 | 0.690 | 1.194 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 | 3.686 |
| 17 | 0.689 | 1.191 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 | 3.646 |
| 18 | 0.688 | 1.189 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 | 3.610 |
| 19 | 0.688 | 1.187 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 | 3.579 |
| 20 | 0.687 | 1.185 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 | 3.552 |
| 21 | 0.686 | 1.183 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 | 3.527 |
| 22 | 0.686 | 1.182 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 | 3.505 |
| 23 | 0.685 | 1.180 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 | 3.485 |
| 24 | 0.685 | 1.179 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 | 3.467 |
| 25 | 0.684 | 1.178 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 | 3.450 |
| 26 | 0.684 | 1.177 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 | 3.435 |
| 27 | 0.684 | 1.176 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 | 3.421 |
| 28 | 0.683 | 1.175 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 | 3.408 |
| 29 | 0.683 | 1.174 | 1.311 | 1.699 | 2.045 | 2.462 | 2.756 | 3.396 |
| 30 | 0.683 | 1.173 | 1.310 | 1.697 | 2.042 | 2.457 | 2.750 | 3.385 |
| 40 | 0.681 | 1.167 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 | 3.307 |
| 50 | 0.679 | 1.164 | 1.299 | 1.676 | 2.009 | 2.403 | 2.678 | 3.261 |
| 60 | 0.679 | 1.162 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 | 3.232 |
| 70 | 0.678 | 1.160 | 1.294 | 1.667 | 1.994 | 2.381 | 2.648 | 3.211 |
| 80 | 0.678 | 1.159 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 | 3.195 |
| 90 | 0.677 | 1.158 | 1.291 | 1.662 | 1.987 | 2.368 | 2.632 | 3.183 |
| 100 | 0.677 | 1.157 | 1.290 | 1.660 | 1.984 | 2.364 | 2.626 | 3.174 |
| 200 | 0.676 | 1.154 | 1.286 | 1.653 | 1.972 | 2.345 | 2.601 | 3.131 |
| 300 | 0.675 | 1.153 | 1.284 | 1.650 | 1.968 | 2.339 | 2.592 | 3.118 |
| 400 | 0.675 | 1.152 | 1.284 | 1.649 | 1.966 | 2.336 | 2.588 | 3.111 |
| 500 | 0.675 | 1.152 | 1.283 | 1.648 | 1.965 | 2.334 | 2.586 | 3.107 |
| ∞ | 0.674 | 1.150 | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 | 3.090 |