Identifiability

From Wikipedia, the free encyclopedia

In statistics, and in econometrics in particular, identifiability is a property that a statistical model must satisfy in order for precise inference about its parameters to be possible.

A model is identifiable if it is theoretically possible to determine the true values of the parameters on which the model is based after making (drawing) an infinite number of observations from it. Mathematically, this means that different values of the model's parameters must produce different probability distributions of the observable variables.

In practice, where only a finite number of observations is available, the identifiability of a model is limited by the number of parameters to be estimated, the number of observations, and the number of associated degrees of freedom.

History of the term

The term identifiability was coined by the econometrician Tjalling Koopmans around 1945,[1] with reference to the identification of an economic relation within a system of relations. The term immediately appeared in the econometrics literature, although Koopmans's own account of the topic, "Identification Problems in Economic Model Construction", did not appear until 1949. Around 1950 the term was picked up by statisticians and used in a more general sense; see, for example, Jerzy Neyman's Existence of Consistent Estimates of the Directional Parameter in a Linear Structural Relation Between Two Variables.

Definition

Let $\mathcal{P} = \{\, P_\theta : \theta \in \Theta \,\}$ be a statistical model with a (possibly infinite-dimensional) parameter space $\Theta$. The model $\mathcal{P}$ is called identifiable if the mapping $\theta \mapsto P_\theta$ is injective, that is, if

$$P_{\theta_1} = P_{\theta_2} \;\Longrightarrow\; \theta_1 = \theta_2 \qquad \text{for all } \theta_1, \theta_2 \in \Theta.$$

Different values of $\theta$ therefore correspond to different probability distributions $P_\theta$.

If the distributions are defined via probability density functions, then two densities are considered different only if they differ on a set of positive Lebesgue measure. (For example, two functions that differ at only a single point are not considered different probability density functions in this sense.)
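To make the definition concrete, the following small Python sketch (not part of the original article; the model is a hypothetical illustration) shows a non-identifiable family: the observable depends on the parameter only through its square, so $\theta$ and $-\theta$ induce the same distribution and the mapping $\theta \mapsto P_\theta$ is not injective.

    import numpy as np
    from scipy.stats import norm

    # Hypothetical toy model (illustration only): X ~ N(theta**2, 1), so the
    # distribution of X depends on theta only through theta**2.
    def density(x, theta):
        return norm.pdf(x, loc=theta**2, scale=1.0)

    x_grid = np.linspace(-5.0, 5.0, 201)

    # theta = 2 and theta = -2 are different parameter values ...
    f_plus = density(x_grid, theta=2.0)
    f_minus = density(x_grid, theta=-2.0)

    # ... yet they induce exactly the same density, so the model is not
    # identifiable: theta -> P_theta is not injective.
    print(np.allclose(f_plus, f_minus))  # True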

Identifiability of the model in the sense of invertibility of the mapping $\theta \mapsto P_\theta$ is equivalent to the fact that the true parameter of the model can be determined if the model can be observed indefinitely long. Indeed, if $\{X_t\} \subseteq S$ is the sequence of observations from the model, then it follows from the strong law of large numbers that

$$\hat{P}_T(A) \;=\; \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}_{\{X_t \in A\}} \;\xrightarrow{\text{a.s.}}\; P(A)$$

for every measurable set $A \subseteq S$, where $\mathbf{1}_{\{\cdot\}}$ denotes the indicator function of a set. With an infinite number of observations one can therefore determine the true probability distribution $P_\theta$ and, because the mapping $\theta \mapsto P_\theta$ is invertible, also the true value of the parameter $\theta$.
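A minimal numerical sketch of this argument in Python (the "true" distribution $N(1.5, 1)$ and the set $A = (-\infty, 0]$ are arbitrary choices for illustration):

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    # Draw many observations from the 'true' distribution P = N(1.5, 1).
    true_mu = 1.5
    samples = rng.normal(loc=true_mu, scale=1.0, size=200_000)

    # Empirical frequency of the measurable set A = (-inf, 0]; by the strong
    # law of large numbers this converges almost surely to P(A).
    empirical = np.mean(samples <= 0.0)
    exact = norm.cdf(0.0, loc=true_mu, scale=1.0)

    print(empirical, exact)  # the two values nearly agree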

Examples

Normal distributions

Let $\mathcal{P}$ be the family of normal distributions, which forms a location-scale family:

$$\mathcal{P} = \left\{\, f_\theta(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \;\Big|\; \theta = (\mu, \sigma),\ \mu \in \mathbb{R},\ \sigma > 0 \,\right\}.$$

Then, comparing the logarithms of two densities,

$$f_{\theta_1} = f_{\theta_2} \;\text{(a.e.)} \;\Longleftrightarrow\; \left(\frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2}\right) x^2 \;-\; 2\left(\frac{\mu_1}{\sigma_1^2} - \frac{\mu_2}{\sigma_2^2}\right) x \;+\; \left(\frac{\mu_1^2}{\sigma_1^2} - \frac{\mu_2^2}{\sigma_2^2} + \ln\frac{\sigma_1^2}{\sigma_2^2}\right) = 0 \quad \text{for almost all } x.$$

This expression is zero for almost all $x$ if and only if all of its coefficients are zero, which is possible only for $|\sigma_1| = |\sigma_2|$ and $\mu_1 = \mu_2$. Because the scale parameter $\sigma$ is restricted to be positive, the model is identifiable: $f_{\theta_1} = f_{\theta_2} \Leftrightarrow \theta_1 = \theta_2$.
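As a hedged cross-check, the coefficient equations above can be solved symbolically in Python with sympy (the variable names are ours):

    import sympy as sp

    mu1, mu2 = sp.symbols('mu1 mu2', real=True)
    s1, s2 = sp.symbols('sigma1 sigma2', positive=True)

    # The three coefficients of the quadratic polynomial in x; all of them
    # must vanish for the two normal densities to agree almost everywhere.
    c2 = 1/s1**2 - 1/s2**2
    c1 = -2*(mu1/s1**2 - mu2/s2**2)
    c0 = mu1**2/s1**2 - mu2**2/s2**2 + sp.log(s1**2/s2**2)

    # c2 = 0 forces sigma2 = sigma1 (both are positive) ...
    print(sp.solve(sp.Eq(c2, 0), s2))                 # [sigma1]
    # ... and then c1 = 0 forces mu2 = mu1 ...
    print(sp.solve(sp.Eq(c1.subs(s2, s1), 0), mu2))   # [mu1]
    # ... after which c0 vanishes automatically, confirming identifiability.
    print(sp.simplify(c0.subs({s2: s1, mu2: mu1})))   # 0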

Multiple linear regression model

Let $y = X\beta + \varepsilon$ be the classical multiple linear regression model, with $\beta$ the vector of unknown regression parameters, $X$ the experimental design matrix, $y$ the vector of the dependent variable, and $\varepsilon$ the vector of disturbances. Then the parameter $\beta$ is identifiable if and only if the matrix $X^{\top}X$ is invertible, i.e., if $X$ has full column rank.
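A short Python sketch of this criterion (the design matrices are made-up examples):

    import numpy as np

    def beta_identifiable(X):
        # beta is identifiable iff X^T X is invertible, i.e. iff X has
        # full column rank.
        return np.linalg.matrix_rank(X) == X.shape[1]

    X_good = np.array([[1.0, 0.0],
                       [1.0, 1.0],
                       [1.0, 2.0]])
    X_bad = np.column_stack([X_good[:, 0], X_good[:, 0]])  # collinear columns

    print(beta_identifiable(X_good))  # True
    print(beta_identifiable(X_bad))   # False: many beta give the same X @ beta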

Classical errors-in-variables model

Consider the classical errors-in-variables model

$$\begin{cases} y = \alpha + \beta x^{*} + \varepsilon, \\ x = x^{*} + \eta, \end{cases}$$

where $(\varepsilon, \eta, x^{*})$ are jointly normally distributed, independent random variables with expected value zero and unknown variances, and where only the variables $(x, y)$ are observed.

This model is not identifiable. However, the product $\beta \sigma_{x^*}^2$ (where $\sigma_{x^*}^2$ is the variance of the latent regressor $x^*$) is identifiable.

In this example, the exact value of $\beta$ cannot be identified, but it is guaranteed to lie in the interval $(\beta_{yx},\, 1/\beta_{xy})$, where $\beta_{yx}$ is the coefficient obtained from the ordinary least-squares regression of $y$ on $x$, and $\beta_{xy}$ the coefficient obtained from the ordinary least-squares regression of $x$ on $y$.
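A hedged simulation sketch of this bracketing in Python (the parameter values are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(42)
    n = 100_000
    alpha, beta = 0.5, 2.0            # true (non-identifiable) parameters
    x_star = rng.normal(0.0, 1.0, n)  # latent regressor, variance 1
    eps = rng.normal(0.0, 1.0, n)
    eta = rng.normal(0.0, 0.5, n)     # measurement error

    y = alpha + beta * x_star + eps
    x = x_star + eta                  # only (x, y) are observed

    b_yx = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # OLS slope of y on x
    b_xy = np.cov(x, y)[0, 1] / np.var(y, ddof=1)  # OLS slope of x on y

    # The true beta = 2 lies between roughly 1.6 and 2.5:
    print(b_yx, 1.0 / b_xy)
    # The product beta * var(x_star) = 2 is identifiable; it equals cov(x, y):
    print(np.cov(x, y)[0, 1])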

Literature

  • Hans-Friedrich Eckey, Reinhold Kosfeld, Christian Dreger: Econometrics: Basics, Methods, Examples. Gabler Verlag, 2004, ISBN 978-3-409-33732-8, p. 321 (limited preview in Google Book Search).

References

  1. Earliest Known Uses of Some of the Words of Mathematics: Identifiability