# Bias of an estimator

In estimation theory, a subfield of mathematical statistics, the bias (also called the systematic error) of an estimator is the characteristic quantity that measures the systematic overestimation or underestimation by the estimator.

Unbiased estimators have, by definition, a bias of ${\displaystyle 0}$.

## Definition

Given a function to be estimated

${\displaystyle g\colon \Theta \to \mathbb {R} }$,

a statistical model ${\displaystyle (X,{\mathcal {A}},(P_{\vartheta })_{\vartheta \in \Theta })}$ and a point estimator

${\displaystyle T\colon X\to \mathbb {R} }$,

the quantity

${\displaystyle \mathbb {B} _{T}(\vartheta ):=\operatorname {E} _{\vartheta }(T)-g(\vartheta )}$

is called the bias of the estimator ${\displaystyle T}$ at ${\displaystyle \vartheta }$.

Here ${\displaystyle \operatorname {E} _{\vartheta }}$ denotes the expected value with respect to the probability measure ${\displaystyle P_{\vartheta }}$. The subscript ${\displaystyle \vartheta }$ in ${\displaystyle \mathbb {B} _{T}(\vartheta )}$ and ${\displaystyle \operatorname {E} _{\vartheta }(T)}$ emphasizes that these quantities depend on the true value ${\displaystyle \vartheta }$.

The notation for the bias is not uniform; common alternatives include ${\displaystyle b(\vartheta )}$, ${\displaystyle b(\vartheta ,T)}$ and ${\displaystyle \operatorname {Bias} _{\vartheta }(T)}$.

## Example

Given are ${\displaystyle n}$ random numbers uniformly distributed on an interval ${\displaystyle [0,\vartheta ]}$. The task is to estimate ${\displaystyle \vartheta }$. The statistical model is

${\displaystyle ([0,\infty )^{n},{\mathcal {B}}([0,\infty )^{n}),(U_{\vartheta }^{n})_{\vartheta \in \Theta })}$,

where ${\displaystyle \Theta =(0,\infty )}$ and ${\displaystyle U_{\vartheta }}$ is the continuous uniform distribution on ${\displaystyle [0,\vartheta ]}$.

The function to be estimated is ${\displaystyle g(\vartheta )=\vartheta }$; a possible estimator is

${\displaystyle T(X)=\max(X_{1},\dots ,X_{n})}$,

since the largest observed random number is intuitively "close" to the unknown upper limit ${\displaystyle \vartheta }$. Then

${\displaystyle P_{\vartheta }(T\leq c)=\left({\frac {c}{\vartheta }}\right)^{n}}$

for all ${\displaystyle c\in [0,\vartheta ]}$. It follows that

${\displaystyle \operatorname {E} _{\vartheta }(T)={\frac {n}{n+1}}\vartheta }$,

and thus the bias is

${\displaystyle \mathbb {B} _{T}(\vartheta )={\frac {n}{n+1}}\vartheta -\vartheta =-{\frac {\vartheta }{n+1}}}$.

The bias arises here because the estimator almost surely underestimates the true value: ${\displaystyle P_{\vartheta }(T<\vartheta )=1}$.
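The negative bias derived above can be checked by simulation. The following sketch (the function name `simulate_bias` is my own, not from the text) estimates ${\displaystyle \operatorname {E} _{\vartheta }(T)-\vartheta }$ by Monte Carlo:

```python
import random

def simulate_bias(theta, n, trials=200_000, seed=0):
    """Monte Carlo estimate of the bias of T = max(X_1, ..., X_n),
    where the X_i are i.i.d. uniform on [0, theta]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.uniform(0, theta) for _ in range(n))
    return total / trials - theta  # approximates E[T] - theta

# For theta = 1 and n = 5, theory predicts a bias of -theta/(n+1) = -1/6.
print(simulate_bias(1.0, 5))
```

The simulated value should lie close to ${\displaystyle -1/6\approx -0{.}167}$, and it is always negative, reflecting the systematic underestimation.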

## Properties

If the bias of an estimator is zero for all ${\displaystyle \vartheta \in \Theta }$, that is, if

${\displaystyle \operatorname {E} _{\vartheta }(T)=g(\vartheta )\quad {\text{for all }}\vartheta \in \Theta }$,

then the estimator is called an unbiased estimator.

The mean squared error

${\displaystyle \mathbb {F} _{T}(\vartheta )=\operatorname {E} _{\vartheta }\left(\left(T-g(\vartheta )\right)^{2}\right)}$

decomposes, by the shift theorem, into variance and squared bias:

${\displaystyle \mathbb {F} _{T}(\vartheta )=\operatorname {Var} _{\vartheta }(T)+\left(\mathbb {B} _{T}(\vartheta )\right)^{2}}$.

Thus, for unbiased estimators the mean squared error equals the variance of the estimator.
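The decomposition can be illustrated numerically with the uniform-distribution example from above. In this sketch (the helper name `mse_decomposition` is an assumption) both sides are computed from the same sample with the same divisor, so they agree up to floating-point rounding:

```python
import random

def mse_decomposition(theta, n, trials=100_000, seed=1):
    """Empirical check of MSE = Var + Bias^2 for T = max of n uniforms
    on [0, theta]. Both sides use the same sample and divisor `trials`,
    so the identity holds exactly up to floating-point error."""
    rng = random.Random(seed)
    t = [max(rng.uniform(0, theta) for _ in range(n)) for _ in range(trials)]
    mean = sum(x for x in t) / trials
    var = sum((x - mean) ** 2 for x in t) / trials
    bias = mean - theta                        # g(theta) = theta here
    mse = sum((x - theta) ** 2 for x in t) / trials
    return mse, var + bias ** 2

mse, decomposed = mse_decomposition(1.0, 5)
print(mse, decomposed)  # the two values coincide
```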

Both the bias and the mean squared error are important quality criteria for point estimators; consequently, one tries to keep both as small as possible. However, there are cases in which it makes sense to allow some bias in order to reduce the mean squared error.

For instance, in the binomial model ${\displaystyle X=\{0,\dots ,n\}}$, ${\displaystyle {\mathcal {A}}={\mathcal {P}}(X)}$, ${\displaystyle P_{\vartheta }=\operatorname {Bin} _{n,\vartheta }}$ with ${\displaystyle \vartheta \in [0,1]}$, a uniformly minimum-variance unbiased estimator is given by

${\displaystyle T_{1}(x)={\frac {x}{n}}}$,

meaning its variance (and thus also its mean squared error) is, for every ${\displaystyle \vartheta }$, smaller than that of any other unbiased estimator. The estimator

${\displaystyle T_{2}(x)={\frac {x+1}{n+2}}}$

is biased, but it has a smaller mean squared error for values of ${\displaystyle \vartheta }$ close to ${\displaystyle 0{.}5}$.

It is therefore not always possible to minimize the bias and the mean squared error at the same time; see also the bias-variance tradeoff.
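The binomial comparison can be made concrete with exact MSE formulas. The variance and bias of ${\displaystyle T_{2}}$ follow from ${\displaystyle \operatorname {E} (X)=n\vartheta }$ and ${\displaystyle \operatorname {Var} (X)=n\vartheta (1-\vartheta )}$; the function names below are my own:

```python
def mse_t1(theta, n):
    """Exact MSE of the unbiased estimator T1(x) = x/n under Bin(n, theta):
    since T1 is unbiased, the MSE equals its variance theta*(1-theta)/n."""
    return theta * (1 - theta) / n

def mse_t2(theta, n):
    """Exact MSE of the biased estimator T2(x) = (x+1)/(n+2):
    variance n*theta*(1-theta)/(n+2)^2 plus squared bias ((1-2*theta)/(n+2))^2."""
    variance = n * theta * (1 - theta) / (n + 2) ** 2
    bias = (1 - 2 * theta) / (n + 2)
    return variance + bias ** 2

n = 10
print(mse_t2(0.5, n) < mse_t1(0.5, n))    # near 0.5 the biased T2 wins
print(mse_t1(0.05, n) < mse_t2(0.05, n))  # near the boundary the unbiased T1 wins
```

Both comparisons print `True` for `n = 10`: neither estimator dominates the other over the whole parameter range, which is exactly the tradeoff described above.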