The Fisher information (named after the statistician Ronald Fisher) is a quantity from mathematical statistics that can be defined for a family of probability densities and provides statements about the best achievable quality of parameter estimates in this model.
Definition
Let a one-parameter statistical standard model $(X, \mathcal{A}, (P_\vartheta)_{\vartheta \in \Theta})$ be given, that is,

- $\Theta \subset \mathbb{R}$ holds,
- the $P_\vartheta$ all have a density function $f(\cdot, \vartheta)$ with respect to a fixed σ-finite measure $\mu$, that is, they form a dominated distribution class.

Furthermore, let $\Theta$ be an open set, and let the score function

$$S_\vartheta(x) := \frac{\partial}{\partial \vartheta} \ln f(x, \vartheta) = \frac{\frac{\partial}{\partial \vartheta} f(x, \vartheta)}{f(x, \vartheta)}$$

exist and be finite. Then the Fisher information of the model is defined either as
$$I(\vartheta) := \operatorname{Var}_\vartheta(S_\vartheta)$$

or as

$$I(\vartheta) := \operatorname{E}_\vartheta\left(S_\vartheta^2\right).$$
Here the variance is taken with respect to the probability distribution $P_\vartheta$. Under the regularity condition

$$\int \frac{\partial}{\partial \vartheta} f(x, \vartheta) \,\mathrm{d}\mu(x) = \frac{\partial}{\partial \vartheta} \int f(x, \vartheta) \,\mathrm{d}\mu(x)$$

the two definitions coincide. If in addition the regularity condition

$$\int \frac{\partial^2}{\partial \vartheta^2} f(x, \vartheta) \,\mathrm{d}\mu(x) = \frac{\partial^2}{\partial \vartheta^2} \int f(x, \vartheta) \,\mathrm{d}\mu(x)$$

holds, then the Fisher information is given by

$$I(\vartheta) = -\operatorname{E}_\vartheta\left( \frac{\partial^2}{\partial \vartheta^2} \ln f(X, \vartheta) \right).$$
Comments on the definition
The following points should be noted regarding the definition:
- The fact that the model is one-parameter does not mean that the probability distributions live on a one-dimensional base space. One-parameter only means that the distributions are determined by a one-dimensional parameter; no requirements are placed on the dimension of the base space.
- In most cases the measure $\mu$ with respect to which the density functions are defined is either the Lebesgue measure $\lambda$ or the counting measure. In the case of the counting measure the density functions are probability mass functions, and the integral is accordingly replaced by a sum. In the case of the Lebesgue measure the integral is a Lebesgue integral, but in most cases it can be replaced by the classical Riemann integral; one then writes $\mathrm{d}x$ instead of $\mathrm{d}\lambda(x)$.
- Sufficient for the existence of the score function is, for example, that $f(x, \vartheta)$ is strictly positive on $X \times \Theta$ and continuously differentiable with respect to $\vartheta$.
- The first regularity condition holds, for example, by definition in regular statistical models. Usually the interchangeability of integration and differentiation is established with the classical theorems of analysis.
- Under the first regularity condition the score function is centered, that is, $\operatorname{E}_\vartheta(S_\vartheta) = 0$ holds. The equivalence of the first two definitions of the Fisher information then follows from the shift formula for the variance.
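The equivalence of these characterizations can also be checked symbolically for a concrete family. The following sketch (an illustration only; it assumes the SymPy library and uses a normal distribution with unknown mean $\vartheta$ and known variance $v$, anticipating the example further below) computes $\operatorname{Var}_\vartheta(S_\vartheta)$, $\operatorname{E}_\vartheta(S_\vartheta^2)$ and $-\operatorname{E}_\vartheta\bigl(\tfrac{\partial^2}{\partial\vartheta^2}\ln f\bigr)$ and obtains $1/v$ in all three cases.

```python
# Plausibility check (illustrative sketch, assumes SymPy is installed):
# for the family N(theta, v) with known v, the three characterizations of the
# Fisher information from the definition coincide and equal 1/v.
import sympy as sp

x, theta = sp.symbols('x theta', real=True)
v = sp.symbols('v', positive=True)

f = sp.exp(-(x - theta)**2 / (2 * v)) / sp.sqrt(2 * sp.pi * v)  # density w.r.t. Lebesgue measure
score = sp.diff(sp.log(f), theta)                               # score function S_theta(x)

E_score   = sp.integrate(score * f, (x, -sp.oo, sp.oo))         # = 0, the score is centered
E_score2  = sp.integrate(score**2 * f, (x, -sp.oo, sp.oo))      # E[S^2]
var_score = sp.simplify(E_score2 - E_score**2)                  # Var(S)
neg_hess  = -sp.integrate(sp.diff(sp.log(f), theta, 2) * f, (x, -sp.oo, sp.oo))

print(var_score, sp.simplify(E_score2), sp.simplify(neg_hess))  # 1/v  1/v  1/v
```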
Examples
Discrete base space: Poisson distribution
As a statistical model, let the base space $X = \mathbb{N}_0$ be given, equipped with the σ-algebra $\mathcal{A} = \mathcal{P}(X)$, the power set. For $\lambda \in (0, \infty)$ let $P_\lambda$ be the Poisson distribution. Accordingly, the density function, here with respect to the counting measure, is given by

$$f(x, \lambda) = \mathrm{e}^{-\lambda} \frac{\lambda^x}{x!}.$$
This yields the score function

$$S_\lambda(x) = \frac{\partial}{\partial \lambda} \ln f(x, \lambda) = \frac{\partial}{\partial \lambda} \left( x \ln(\lambda) - \ln(x!) - \lambda \right) = \frac{x}{\lambda} - 1.$$
By the rules for the variance under linear transformations, the Fisher information is thus

$$I(\lambda) = \operatorname{Var}_\lambda(S_\lambda) = \frac{1}{\lambda^2} \operatorname{Var}_\lambda(X) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}.$$
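The value $I(\lambda) = 1/\lambda$ can also be checked by simulation. The following sketch (an illustration only; it assumes NumPy, and the choice $\lambda = 3$ is arbitrary) estimates the variance of the score $S_\lambda(X) = X/\lambda - 1$ from Poisson samples.

```python
# Monte Carlo check (illustrative sketch, assumes NumPy; lambda = 3 is arbitrary):
# the empirical variance of the score X/lambda - 1 should be close to 1/lambda.
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0
x = rng.poisson(lam, size=1_000_000)   # samples from Poi(lambda)
score = x / lam - 1.0                  # score function evaluated at the samples
print(score.var(), 1 / lam)            # both approximately 0.333
```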
Continuous base space: exponential distribution
This time $X = (0, \infty)$ and $\mathcal{A} = \mathcal{B}((0, \infty))$ are chosen as the statistical model. The $P_\lambda$ are exponentially distributed with parameter $\lambda \in (0, \infty)$. Thus they have the density function (with respect to the Lebesgue measure)

$$f(x, \lambda) = \lambda \mathrm{e}^{-\lambda x}.$$
Hence the score function is

$$S_\lambda(x) = \frac{\partial}{\partial \lambda} \ln f(x, \lambda) = \frac{\partial}{\partial \lambda} \left( \ln(\lambda) - \lambda x \right) = \frac{1}{\lambda} - x,$$
and hence the Fisher information is

$$I(\lambda) = \operatorname{Var}_\lambda(S_\lambda) = \frac{1}{\lambda^2}.$$
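The same value is obtained from the alternative representation $I(\lambda) = -\operatorname{E}_\lambda\bigl(\tfrac{\partial^2}{\partial\lambda^2}\ln f(X,\lambda)\bigr)$ of the definition section; a short symbolic check (an illustration only, assuming SymPy):

```python
# Symbolic check via the second-derivative formula (illustrative sketch, assumes SymPy):
# I(lambda) = -E[ d^2/d lambda^2  ln f(X, lambda) ] = 1/lambda^2 for Exp(lambda).
import sympy as sp

x = sp.symbols('x', positive=True)
lam = sp.symbols('lam', positive=True)

f = lam * sp.exp(-lam * x)                      # density of the exponential distribution
d2_logf = sp.diff(sp.log(f), lam, 2)            # equals -1/lam**2, independent of x
I = -sp.integrate(d2_logf * f, (x, 0, sp.oo))   # expectation under Exp(lambda)
print(sp.simplify(I))                           # 1/lam**2
```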
Fisher information of an exponential family
If $P_\vartheta$ is given by a one-parameter exponential family, that is, it has the density function

$$f(x, \vartheta) = h(x) A(\vartheta) \exp\left( \eta(\vartheta) T(x) \right),$$
then the score function is given by

$$S_\vartheta(x) = \eta'(\vartheta) T(x) + \frac{A'(\vartheta)}{A(\vartheta)}.$$
From this it follows for the Fisher information

$$I(\vartheta) = \operatorname{Var}_\vartheta(S_\vartheta) = \eta'(\vartheta)^2 \operatorname{Var}_\vartheta(T(X)).$$
If the exponential family is given in the natural parameterization $\eta(\vartheta) = \vartheta$, then this simplifies to

$$S_\vartheta(x) = T(x) + \frac{A'(\vartheta)}{A(\vartheta)} \quad \text{and} \quad I(\vartheta) = \operatorname{Var}_\vartheta(T(X)).$$

In this case the Fisher information is the variance of the canonical statistic $T$.
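As a consistency check with the example above: the exponential distribution is of this form with $h(x) = 1$, $A(\lambda) = \lambda$, $\eta(\lambda) = -\lambda$ and $T(x) = x$, so the formula yields $I(\lambda) = \eta'(\lambda)^2 \operatorname{Var}_\lambda(X) = \operatorname{Var}_\lambda(X) = 1/\lambda^2$, in agreement with the direct calculation.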
Properties and uses
Additivity
Under the first regularity condition, the Fisher information is additive in the case of independent and identically distributed random variables; that is, for the Fisher information $\mathcal{I}^{(n)}$ of a sample of independent and identically distributed random variables $X_1, \dotsc, X_n$, each with Fisher information $\mathcal{I}$, it holds that

$$\mathcal{I}^{(n)}(\vartheta) = n \, \mathcal{I}(\vartheta).$$
This property follows directly from the Bienaymé equation.
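For example, a sample of $n$ independent observations that are exponentially distributed with parameter $\lambda$ has, by the calculation above, the Fisher information $\mathcal{I}^{(n)}(\lambda) = n/\lambda^2$.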
Sufficiency
Furthermore, for a sufficient statistic $T$, the Fisher information with respect to $f_\vartheta(X)$ is the same as that with respect to $g_\vartheta(T(X))$, where $f_\vartheta(x) = h(x) g_\vartheta(T(x))$ holds.
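For example, for $n$ independent Poisson-distributed observations with parameter $\lambda$, the statistic $T(X) = X_1 + \dotsb + X_n$ is sufficient and itself Poisson-distributed with parameter $n\lambda$; its Fisher information $\operatorname{Var}_\lambda(T/\lambda - n) = n\lambda/\lambda^2 = n/\lambda$ coincides with the Fisher information $n \, \mathcal{I}(\lambda) = n/\lambda$ of the full sample.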
Use
The Fisher information is used in particular in the Cramér-Rao inequality, where, provided the regularity condition mentioned above holds, its reciprocal provides a lower bound for the variance of an estimator of $\vartheta$: If $T(X)$ is an unbiased estimator for the unknown parameter $\vartheta$, then

$$\operatorname{Var}_\vartheta(T(X)) \geq \mathcal{I}(\vartheta)^{-1}$$

holds.
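For instance, in the Poisson model above with $n$ independent observations, the sample mean $\bar{X}$ is an unbiased estimator of $\lambda$ with $\operatorname{Var}_\lambda(\bar{X}) = \lambda/n = \bigl(n \, \mathcal{I}(\lambda)\bigr)^{-1}$, so it attains the Cramér-Rao bound.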
Extensions to higher dimensions
If the model depends on several parameters $\vartheta_i$, $i = 1, \dotsc, k$, the Fisher information can be defined as a symmetric matrix $\mathcal{I}(\vartheta) = (\mathcal{I}_{ij}(\vartheta))_{i,j = 1, \dotsc, k}$ with entries

$$\mathcal{I}_{ij}(\vartheta) = \operatorname{E}_\vartheta\left[ \frac{\partial}{\partial \vartheta_i} \log f_\vartheta(X) \cdot \frac{\partial}{\partial \vartheta_j} \log f_\vartheta(X) \right].$$

It is called the Fisher information matrix. Its properties essentially carry over from the one-parameter case. Under the regularity condition, $\mathcal{I}(\vartheta)$ is the covariance matrix of the score function.
Example: normal distribution
If $X$ is normally distributed with the expected value $\vartheta$ as parameter and known variance $v > 0$, then

$$f_\vartheta(x) = \frac{1}{\sqrt{2 \pi v}} \mathrm{e}^{-\frac{(x - \vartheta)^2}{2v}}.$$

It follows that

$$S_\vartheta(x) = \frac{\partial}{\partial \vartheta} \ln f_\vartheta(x) = \frac{x - \vartheta}{v},$$

so

$$\mathcal{I}(\vartheta) = \operatorname{Var}_\vartheta(S_\vartheta) = \frac{\operatorname{Var}_\vartheta(X)}{v^2} = \frac{1}{v}.$$
If, on the other hand, one regards both the expected value $\mu$ and the variance $v$ as unknown parameters, the result is

$$\mathcal{I}(\mu, v) = \begin{pmatrix} \dfrac{1}{v} & 0 \\ 0 & \dfrac{1}{2v^2} \end{pmatrix}$$

as the Fisher information matrix.
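This matrix can also be estimated numerically as the second moment matrix of the score vector. The following sketch (an illustration only; it assumes NumPy, and the values $\mu = 1$, $v = 2$ are arbitrary) compares a Monte Carlo estimate of $\operatorname{E}\bigl[\nabla \log f_{\mu,v}(X) \, \nabla \log f_{\mu,v}(X)^{\top}\bigr]$ with the matrix above.

```python
# Monte Carlo check of the Fisher information matrix (illustrative sketch, assumes NumPy;
# mu = 1 and v = 2 are arbitrary). The gradient of log f(x) with respect to (mu, v) is
# ( (x - mu)/v , (x - mu)^2/(2 v^2) - 1/(2 v) ).
import numpy as np

rng = np.random.default_rng(0)
mu, v = 1.0, 2.0
x = rng.normal(mu, np.sqrt(v), size=1_000_000)

score = np.column_stack([
    (x - mu) / v,                               # d/d mu  log f
    (x - mu)**2 / (2 * v**2) - 1 / (2 * v),     # d/d v   log f
])
I_hat = score.T @ score / len(x)                # estimate of E[grad grad^T]
print(I_hat)    # approximately [[1/v, 0], [0, 1/(2 v^2)]] = [[0.5, 0], [0, 0.125]]
```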
Literature
- Hans-Otto Georgii: Stochastics. Introduction to Probability Theory and Statistics. 4th edition. Walter de Gruyter, Berlin 2009, ISBN 978-3-11-021526-7, doi:10.1515/9783110215274.
- Ludger Rüschendorf: Mathematical Statistics. Springer, Berlin/Heidelberg 2014, ISBN 978-3-642-41996-6, doi:10.1007/978-3-642-41997-3.
- Claudia Czado, Thorsten Schmidt: Mathematical Statistics. Springer, Berlin/Heidelberg 2011, ISBN 978-3-642-17260-1, doi:10.1007/978-3-642-17261-8.
- Helmut Pruscha: Lectures on Mathematical Statistics. B. G. Teubner, Stuttgart 2000, ISBN 3-519-02393-8, Section V.1.
References
1. Georgii: Stochastics. 2009, p. 210.
2. Czado, Schmidt: Mathematical Statistics. 2011, p. 116.