Unbiasedness
Unbiasedness refers, in mathematical statistics, to a property of an estimating function (in short: an estimator). An estimator is called unbiased if its expected value equals the true value of the parameter to be estimated. If an estimator is not unbiased, it is called biased. The extent to which its expected value deviates from the true value is called its bias. The bias expresses the systematic error of the estimator.
In addition to consistency, sufficiency and (asymptotic) efficiency, unbiasedness is one of the four common criteria for assessing the quality of estimators. Furthermore, together with sufficiency and invariance/equivariance, it belongs to the typical reduction principles of mathematical statistics.
Significance
Unbiasedness is an important property of an estimator because the variance of most estimators converges to zero as the sample size increases. In other words, the distribution concentrates around the expected value of the estimator, and hence, for unbiased estimators, around the desired true parameter of the population. For unbiased estimators one can therefore expect that, the larger the sample size, the smaller the difference between the estimate computed from the sample and the true parameter.
Beyond the practical assessment of the quality of estimators, the concept of unbiasedness is also of great importance for mathematical estimation theory. Within the class of all unbiased estimators it is possible, under suitable conditions on the underlying distribution model, to prove the existence and uniqueness of best estimators: unbiased estimators that have minimal variance among all unbiased estimators.
Basic idea and introductory examples
To estimate an unknown real parameter of a population, mathematical statistics computes an estimate from a random sample by means of a suitably chosen function, the estimator. In general, suitable estimators can be obtained using estimation methods, for example the maximum likelihood method.
Since the sample variables are random variables, the estimator itself is also a random variable. It is called unbiased if the expected value of this random variable always equals the parameter to be estimated, regardless of its actual value.
Example: sample mean
To estimate the expected value $\mu$ of the population, the sample mean

$$\bar{X} = \frac{1}{n}\left(X_1 + \dotsb + X_n\right)$$

is usually used. If all sample variables $X_i$ are drawn randomly from the population, then they all have expected value $\operatorname{E}(X_i) = \mu$. The expected value of the sample mean is therefore

$$\operatorname{E}(\bar{X}) = \frac{1}{n}\left(\operatorname{E}(X_1) + \dotsb + \operatorname{E}(X_n)\right) = \frac{1}{n}\, n\mu = \mu.$$

The sample mean is thus an unbiased estimator of the unknown distribution parameter $\mu$.
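The calculation above can also be checked empirically. The following is a minimal simulation sketch; the exponential population with mean 2, the sample size, and the number of repetitions are arbitrary illustration choices, not taken from the text:

```python
import numpy as np

# Minimal sketch: the sample mean as an unbiased estimator of mu.
# The exponential population with mean mu = 2 is an arbitrary illustration choice.
rng = np.random.default_rng(seed=0)
mu = 2.0            # true expected value of the population
n = 50              # sample size
repetitions = 100_000

# Draw many samples and compute the sample mean of each one.
sample_means = rng.exponential(scale=mu, size=(repetitions, n)).mean(axis=1)

# Averaging the estimates approximates E(X_bar), which should be close to mu.
print(f"true mu          : {mu}")
print(f"mean of estimates: {sample_means.mean():.4f}")
```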
If the population is normally distributed with expected value $\mu$ and variance $\sigma^2$, then the distribution of $\bar{X}$ can be specified exactly. In this case

$$\bar{X} \sim \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right),$$

that is, the sample mean is itself normally distributed with expectation $\mu$ and variance $\sigma^2 / n$. If the sample size is large, this distributional statement is at least approximately valid by the central limit theorem, even if the population is not normally distributed. The variance of this estimator thus converges to 0 as the sample size tends to infinity. For increasing sample sizes the distribution of the sample mean therefore contracts around a fixed value, and unbiasedness ensures that this value is the parameter $\mu$ being sought.
Example: relative frequency
To estimate the probability $p$ with which a certain characteristic occurs in the population, a sample of size $n$ is drawn at random and the absolute frequency $X$ of the characteristic in the sample is counted. The random variable $X$ is then binomially distributed with parameters $n$ and $p$; in particular, its expected value is $\operatorname{E}(X) = np$. For the relative frequency

$$\hat{p} = \frac{X}{n}$$

it then follows that $\operatorname{E}(\hat{p}) = \frac{1}{n}\operatorname{E}(X) = p$, so it is an unbiased estimator of the unknown probability $p$.
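The same kind of check works for the relative frequency; the values of p and n below are arbitrary illustration choices:

```python
import numpy as np

# Minimal sketch: the relative frequency X/n as an unbiased estimator of p.
rng = np.random.default_rng(seed=1)
p, n, repetitions = 0.3, 40, 100_000

# X ~ Binomial(n, p); the estimator is the relative frequency X / n.
estimates = rng.binomial(n, p, size=repetitions) / n

print(f"true p           : {p}")
print(f"mean of estimates: {estimates.mean():.4f}")  # close to p
```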
Definition
In modern, measure-theoretically founded mathematical statistics, a statistical experiment is described by a statistical model. This consists of a set $\mathcal{X}$, the sample space, together with a σ-algebra $\mathcal{A}$ and a family of probability measures $(P_\vartheta)_{\vartheta \in \Theta}$ on $(\mathcal{X}, \mathcal{A})$.
Furthermore, a point estimator

$$T \colon \mathcal{X} \to \mathbb{R}$$

and a function

$$g \colon \Theta \to \mathbb{R}$$

are given (in the parametric case the so-called parameter function), which assigns to each probability distribution the quantity to be estimated (variance, median, expected value, etc.). The estimator $T$ is then called unbiased if

$$\operatorname{E}_\vartheta(T) = g(\vartheta) \quad \text{for all } \vartheta \in \Theta$$

holds. Here $\operatorname{E}_\vartheta$ denotes the expected value with respect to the probability measure $P_\vartheta$.
In applications, $P_\vartheta$ is often the distribution of a (real- or vector-valued) random variable $X$ on a probability space with an unknown parameter or parameter vector $\vartheta$. A point estimator for $g(\vartheta)$ in the above sense is then given by a function $T(X)$, and it is called an unbiased estimator if

$$\operatorname{E}_\vartheta\big(T(X)\big) = g(\vartheta) \quad \text{for all } \vartheta \in \Theta,$$

where the expected value is now formed with respect to the distribution of $X$ under $P_\vartheta$.
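The definition suggests a simple Monte-Carlo check of unbiasedness at one fixed parameter value. The helper below is a purely illustrative sketch; the names sampler, estimator, and g are hypothetical and not part of any standard library:

```python
import numpy as np

# Hypothetical helper: approximate E_theta(T) - g(theta) for one fixed theta.
def approx_bias(sampler, estimator, g, theta, n, repetitions=20_000, seed=0):
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [estimator(sampler(rng, theta, n)) for _ in range(repetitions)]
    )
    return estimates.mean() - g(theta)

# Example: T = sample mean, g(theta) = theta, observations from N(theta, 1).
bias = approx_bias(
    sampler=lambda rng, theta, n: rng.normal(theta, 1.0, size=n),
    estimator=np.mean,
    g=lambda theta: theta,
    theta=1.5,
    n=20,
)
print(f"estimated bias: {bias:.4f}")  # close to 0
```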
Properties
Existence
Unbiased estimators need not exist in general. The choice of the function $g$ to be estimated is decisive here. If the function to be estimated is not chosen appropriately, the set of unbiased estimators can be small, have nonsensical properties, or be empty.
In the binomial model

$$\big(\{0, 1, \dots, n\},\ \mathcal{P}(\{0, 1, \dots, n\}),\ (\operatorname{Bin}_{n,p})_{p \in (0,1)}\big),$$

for example, only polynomials in $p$ of degree less than or equal to $n$ can be estimated unbiasedly. For functions to be estimated that are not of the form

$$g(p) = \sum_{k=0}^{n} a_k p^k,$$

there is thus no unbiased estimator.
Even if an unbiased estimator exists, it need not be a practically meaningful estimator. For example, in the Poisson model

$$\big(\mathbb{N}_0,\ \mathcal{P}(\mathbb{N}_0),\ (\operatorname{Poi}_\lambda)_{\lambda > 0}\big),$$

using the function to be estimated

$$g(\lambda) = e^{-3\lambda},$$

the only unbiased estimator turns out to be

$$T(x) = (-2)^x.$$

Obviously, this estimator is nonsensical: it takes negative values and values greater than one, although it estimates a probability. Note that the choice of the function to be estimated is not exotic: it estimates the probability that no event occurs three times in a row (with independent repetition).
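Assuming the estimator T(x) = (-2)^x given above, its unbiasedness for g(λ) = e^{-3λ} can also be checked numerically; the value λ = 0.8 is an arbitrary illustration choice:

```python
import numpy as np

# Sketch: Monte-Carlo check that T(x) = (-2)**x is unbiased for exp(-3*lambda)
# when X is Poisson(lambda) distributed.
rng = np.random.default_rng(seed=2)
lam = 0.8
x = rng.poisson(lam, size=1_000_000)

estimate = np.mean((-2.0) ** x)   # Monte-Carlo approximation of E[T(X)]
target = np.exp(-3 * lam)         # the quantity g(lambda) to be estimated

print(f"E[T(X)] approx.: {estimate:.4f}")
print(f"exp(-3*lambda) : {target:.4f}")
```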
Structure
A fixed statistical model is given. Let $E_g$ be the set of unbiased estimators for the function $g$ to be estimated and $N$ the set of all null estimators, that is,

$$E_g = \{T \mid \operatorname{E}_\vartheta(T) = g(\vartheta) \text{ for all } \vartheta \in \Theta\} \quad \text{and} \quad N = \{T \mid \operatorname{E}_\vartheta(T) = 0 \text{ for all } \vartheta \in \Theta\}.$$

If one now selects a $T_0 \in E_g$, then

$$E_g = \{T_0 + T_N \mid T_N \in N\}.$$

The set of all unbiased estimators for $g$ thus arises from one unbiased estimator for $g$ in combination with the null estimators.
Relationship to bias and mean squared error
Unbiased estimators, by definition, have a bias of zero:

$$\operatorname{Bias}(T) = \operatorname{E}_\vartheta(T) - g(\vartheta) = 0.$$

This reduces the mean squared error to the variance of the estimator:

$$\operatorname{MSE}(T) = \operatorname{Var}_\vartheta(T).$$
Optimality
Unbiasedness is in itself a quality criterion, since unbiased estimators always have a bias of zero and thus deliver, on average, the value to be estimated; they have no systematic error. Within the set of unbiased estimators, the central quality criterion for estimators, the mean squared error, reduces to the variance of the estimator. Accordingly, the two common optimality criteria compare the variances of point estimators.
- Locally minimal estimators compare the variances of point estimators for a given $\vartheta_0 \in \Theta$. An estimator $T_0$ is called a locally minimal estimator in $\vartheta_0$ if $\operatorname{Var}_{\vartheta_0}(T_0) \le \operatorname{Var}_{\vartheta_0}(T)$ holds for all further unbiased estimators $T$.
- Uniformly best unbiased estimators tighten this requirement in that one estimator should have a smaller variance for all $\vartheta \in \Theta$ than any other unbiased estimator. It then holds that $\operatorname{Var}_{\vartheta}(T_0) \le \operatorname{Var}_{\vartheta}(T)$ for all $\vartheta \in \Theta$ and all unbiased estimators $T$.
Unbiasedness vs. mean squared error
Unbiased estimators can be viewed as "good" in two respects:
- On the one hand, their bias is always zero; accordingly, they have the desirable property of not exhibiting any systematic error.
- On the other hand, due to the decomposition of the mean squared error into bias and variance, the mean squared error of an unbiased estimator consists of its variance alone, since the bias term vanishes.
However, it is not always possible to achieve both goals (unbiasedness and minimal mean squared error) at the same time. In the binomial model, for instance, a uniformly best unbiased estimator is given by the relative frequency

$$T(x) = \frac{x}{n}.$$

An estimator that shrinks the relative frequency toward $1/2$ is not unbiased and therefore biased, but has a smaller mean squared error for values of $p$ close to $1/2$ (see the sketch below).

It is therefore not always possible to minimize bias and mean squared error at the same time.
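The following sketch illustrates this trade-off numerically. The shrinkage estimator (x + 1)/(n + 2) is a common textbook choice of a biased estimator and is assumed here purely for illustration; it is not necessarily the estimator originally meant in the text:

```python
import numpy as np

# Compare the MSE of the unbiased estimator x/n with that of the biased
# shrinkage estimator (x+1)/(n+2) for p far from and close to 1/2.
rng = np.random.default_rng(seed=3)
n, repetitions = 20, 200_000

for p in (0.1, 0.5):
    x = rng.binomial(n, p, size=repetitions)
    mse_unbiased = np.mean((x / n - p) ** 2)
    mse_shrunk = np.mean(((x + 1) / (n + 2) - p) ** 2)
    print(f"p = {p}: MSE x/n = {mse_unbiased:.5f}, "
          f"MSE (x+1)/(n+2) = {mse_shrunk:.5f}")
```

Near p = 1/2 the biased estimator wins; far away from 1/2 the unbiased estimator does.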
Estimator with bias
It follows from the definition that "good" estimators should at least be approximately unbiased, that is, they should be characterized by being, on average, close to the value to be estimated. Usually, however, unbiasedness is not the only important criterion for the quality of an estimator; for example, it should also have a small variance, i.e. fluctuate as little as possible around the value to be estimated. Taken together, these requirements lead to the classic criterion of a minimal mean squared deviation for optimal estimators.
The bias of an estimator $T$ for $g(\vartheta)$ is defined as the difference between its expected value and the quantity to be estimated:

$$\operatorname{Bias}(T) = \operatorname{E}_\vartheta(T) - g(\vartheta).$$

Its mean squared error is

$$\operatorname{MSE}(T) = \operatorname{E}_\vartheta\big((T - g(\vartheta))^2\big).$$

The mean squared error is equal to the sum of the squared bias and the variance of the estimator:

$$\operatorname{MSE}(T) = \big(\operatorname{Bias}(T)\big)^2 + \operatorname{Var}_\vartheta(T).$$
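For completeness, a short derivation of this decomposition, obtained by adding and subtracting $\operatorname{E}_\vartheta(T)$ inside the square (the cross term vanishes because $\operatorname{E}_\vartheta\big(T - \operatorname{E}_\vartheta(T)\big) = 0$):

$$
\begin{aligned}
\operatorname{E}_\vartheta\big((T - g(\vartheta))^2\big)
&= \operatorname{E}_\vartheta\Big(\big(T - \operatorname{E}_\vartheta(T) + \operatorname{E}_\vartheta(T) - g(\vartheta)\big)^2\Big) \\
&= \operatorname{E}_\vartheta\big((T - \operatorname{E}_\vartheta(T))^2\big) + \big(\operatorname{E}_\vartheta(T) - g(\vartheta)\big)^2 \\
&= \operatorname{Var}_\vartheta(T) + \big(\operatorname{Bias}(T)\big)^2 .
\end{aligned}
$$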
In practice, bias can have two causes:
- a systematic error, for example a non-random measurement error of the apparatus, or
- a random error whose expected value is not equal to zero.
Random errors can be tolerable if they lead to the estimator having a smaller mean squared deviation than an unbiased one.
Asymptotic unbiasedness
As a rule, it is not essential that an estimator be unbiased. Most results of mathematical statistics are valid only asymptotically, i.e. as the sample size tends to infinity. It is therefore usually sufficient if unbiasedness holds in the limit, i.e. if for a sequence of estimators $T_n$ the convergence statement

$$\lim_{n \to \infty} \operatorname{E}_\vartheta(T_n) = g(\vartheta)$$

holds.
Another example: sample variance in the normal distribution model
A typical example are estimators for the parameters of normal distributions. In this case one considers the parametric family

$$P_\vartheta = \mathcal{N}(\mu, \sigma^2) \quad \text{with } \vartheta = (\mu, \sigma^2) \text{ and } \Theta = \mathbb{R} \times (0, \infty),$$

where $\mathcal{N}(\mu, \sigma^2)$ is the normal distribution with expectation $\mu$ and variance $\sigma^2$. Usually, observations $X_1, \dots, X_n$ are given that are stochastically independent and each have the distribution $\mathcal{N}(\mu, \sigma^2)$.
As already seen, the sample mean $\bar{X}$ is an unbiased estimator of $\mu$.
For the variance $\sigma^2$, one obtains the maximum likelihood estimator

$$S_n^2 = \frac{1}{n} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2.$$

However, this estimator is not unbiased, since $\operatorname{E}(S_n^2) = \frac{n-1}{n}\,\sigma^2$ can be shown (see sample variance (estimator)). The bias is therefore $-\frac{\sigma^2}{n}$. Since it vanishes asymptotically, i.e. for $n \to \infty$, the estimator is nevertheless asymptotically unbiased.
Moreover, in this case the bias can be specified exactly and corrected by multiplying by $\frac{n}{n-1}$ (the so-called Bessel correction), which yields an estimator for the variance that is unbiased even for small samples.
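A small simulation sketch of this correction; σ² = 4, n = 10, and the number of repetitions are arbitrary illustration choices:

```python
import numpy as np

# Compare the biased 1/n variance estimator with the Bessel-corrected 1/(n-1) version.
rng = np.random.default_rng(seed=5)
sigma2, n, repetitions = 4.0, 10, 200_000

samples = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(repetitions, n))
ml_estimates = samples.var(axis=1, ddof=0)    # divides by n (maximum likelihood)
corrected = samples.var(axis=1, ddof=1)       # divides by n - 1 (Bessel correction)

print(f"true sigma^2              : {sigma2}")
print(f"mean of 1/n estimates     : {ml_estimates.mean():.4f}")   # about (n-1)/n * sigma^2
print(f"mean of 1/(n-1) estimates : {corrected.mean():.4f}")      # about sigma^2
```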
In general, however, the bias cannot be determined exactly and therefore cannot be corrected completely. There are, however, methods to at least reduce the bias of an asymptotically unbiased estimator for finite samples, for example the so-called jackknife method.
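A minimal sketch of the jackknife idea, applied to the biased 1/n variance estimator from above; the function names are illustrative, not from a standard library:

```python
import numpy as np

def biased_var(x):
    # Maximum likelihood variance estimator, divides by n.
    return np.mean((x - x.mean()) ** 2)

def jackknife_corrected(x, statistic):
    # Jackknife bias reduction: estimate the bias from leave-one-out recomputations.
    n = len(x)
    full = statistic(x)
    loo = np.array([statistic(np.delete(x, i)) for i in range(n)])
    bias_estimate = (n - 1) * (loo.mean() - full)
    return full - bias_estimate

rng = np.random.default_rng(seed=4)
x = rng.normal(loc=0.0, scale=2.0, size=15)   # sigma^2 = 4, arbitrary example data

print(f"biased estimate    : {biased_var(x):.4f}")
print(f"jackknife corrected: {jackknife_corrected(x, biased_var):.4f}")
print(f"Bessel corrected   : {x.var(ddof=1):.4f}")  # coincides with the jackknife value here
```

For this particular statistic, the jackknife correction reproduces the Bessel-corrected estimator exactly.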
Constructive terms
An unbiased estimator $T$ is called a regular unbiased estimator if

$$\frac{\partial}{\partial \vartheta} \int T(x)\, f(x, \vartheta)\, \mathrm{d}x = \int T(x)\, \frac{\partial}{\partial \vartheta} f(x, \vartheta)\, \mathrm{d}x$$

holds. Here $f(x, \vartheta)$ denotes the density function for the parameter $\vartheta$; differentiation and integration may thus be interchanged. Regular unbiased estimators play an important role in the Cramér-Rao inequality.
Generalizations
A generalization of unbiasedness is L-unbiasedness; it generalizes unbiasedness by means of more general loss functions. Using the Gaussian loss, one obtains unbiasedness as a special case; using the Laplace loss, one obtains median unbiasedness.
Literature
- Hans-Otto Georgii: Stochastics: Introduction to Probability Theory and Statistics. de Gruyter, 2004, ISBN 3-11-018282-3.
- Hermann Witting: Mathematical Statistics, Vol. 1: Parametric Methods with Fixed Sample Size. Vieweg + Teubner, Stuttgart 1985, ISBN 978-3-519-02026-4.
- M. Hardy: "An Illuminating Counterexample" (PDF; 63 kB).
- Ludger Rüschendorf: Mathematical Statistics. Springer, Berlin Heidelberg 2014, ISBN 978-3-642-41996-6, doi:10.1007/978-3-642-41997-3.
- Claudia Czado, Thorsten Schmidt: Mathematical Statistics. Springer, Berlin Heidelberg 2011, ISBN 978-3-642-17260-1, doi:10.1007/978-3-642-17261-8.
References
- ↑ Bernd Rönz, Hans G. Strohe (1994): Lexicon Statistics. Gabler Verlag, pp. 110, 363.
- ↑ Horst Rinne: Pocket Book of Statistics. 3rd edition. Verlag Harri Deutsch, 2003, p. 435.
- ↑ Kauermann, G. and Küchenhoff, H.: Samples: Methods and Practical Implementation with R. Springer, 2011, ISBN 978-3-642-12318-4, p. 21 (Google Books).
- ↑ Rüschendorf: Mathematical Statistics. 2014, p. 126.
- ↑ Georgii: Stochastics. 2009, p. 209.