Generalized linear models

from Wikipedia, the free encyclopedia

Generalized linear models (GLM or GLiM) are an important class of non-linear models in statistics, introduced by John Nelder and Robert Wedderburn (1972), that generalize the classical linear regression model of regression analysis. Whereas classical linear models assume that the error term (the unobservable random component) is normally distributed, in GLMs the response variable may follow any distribution from the exponential family. Besides the normal distribution, this class of distributions includes the binomial, Poisson, gamma and inverse Gaussian distributions, so the exponential family offers a uniform framework for all of them. The larger class of vector generalized linear models (VGLMs) contains generalized linear models as a special case. Also included in this larger class are log-linear models for categorical data and the Poisson regression model for count data. To overcome the limitations of generalized linear models and generalized additive models, generalized additive models for location, scale and shape (GAMLSS) were developed.

Disambiguation

Generalized linear models should not be confused with the general linear model, whose natural English abbreviation is also GLM but which, in contrast to generalized linear models, assumes a normally distributed response variable. Because the abbreviation GLM is already taken by the general linear model, many statistical software packages use other abbreviations to tell the two apart, such as GLZ (for "GeneraLiZed linear models", in STATISTICA) or GzLM (for "GeneraLiZed Linear Models", in SPSS). Some authors write GLiM instead of GLM for the same reason.

Likewise, generalized linear models should not be confused with the generalized linear regression model that underlies generalized least squares (GLS) estimation; that model instead generalizes the structure of the error terms.

Model components

The model class of generalized linear models consists of three components:

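  • Random component: the response variables $y_i$ are (conditionally) independent and follow a distribution from the exponential family with expected value $\mu_i = \operatorname{E}(y_i \mid x_i)$.
  • Systematic component: the explanatory variables enter the model through the linear predictor $\eta_i = x_i^\top \beta$. The linear predictor is thus what introduces the vector of regression coefficients $\beta$ into the model.
  • Link function: a generalized linear model has an (often non-linear) link function $g$ that couples the systematic component, described by the linear predictor, to the stochastic component, described by the expected value of the response variable: $g(\mu_i) = \eta_i$. The inverse of the link function, $h = g^{-1}$, is called the response function; it transforms the linear combination of the explanatory variables into the (conditional) expected value: $\mu_i = h(\eta_i)$.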
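
The interplay of the three components can be illustrated with a small numerical sketch. It assumes a logit link (a common choice for binary responses, not one prescribed by the text above), and the design matrix `X`, the coefficient vector `beta` and all function names are made up for illustration.

```python
import numpy as np

def linear_predictor(X, beta):
    """Systematic component: eta_i = x_i^T beta for every row x_i of X."""
    return X @ beta

def logit_link(mu):
    """Link function g: maps an expected value in (0, 1) to the real line."""
    return np.log(mu / (1.0 - mu))

def logistic_response(eta):
    """Response function h = g^{-1}: maps the linear predictor back to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical data: an intercept column plus one explanatory variable.
X = np.array([[1.0, -2.0],
              [1.0,  0.0],
              [1.0,  3.0]])
beta = np.array([0.5, 1.0])  # assumed regression coefficients

eta = linear_predictor(X, beta)   # systematic component
mu = logistic_response(eta)       # conditional expected values E(y_i | x_i)

# Applying the link to mu recovers the linear predictor: g(mu) = eta.
print(eta)
print(mu)
print(np.allclose(logit_link(mu), eta))
```

The only non-linear step is the response function; the coefficients themselves still enter linearly through `eta`, which is what the name "generalized linear" refers to.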

Distributions from the family of generalized linear models

The normal, binomial, Poisson, gamma and inverse Gaussian distributions, as well as the Bernoulli, scaled Poisson, scaled binomial and scaled negative binomial distributions, can all be embedded in the model class of generalized linear models.

Exponential family

The distribution of a response variable $y$ belongs to the one-dimensional exponential family if its density or probability function can be written in the following form:

$$f(y \mid \theta, \phi, w) = \exp\left( \frac{y\theta - b(\theta)}{\phi}\, w + c(y, \phi, w) \right).$$

Here:

  • $y$: the observed values of the response variable (known)
  • $w$: the specified weights (known)
  • $b(\theta)$: a pre-specified, twice differentiable function (known)
  • $\theta$: the real-valued parameter of the density, the so-called canonical (natural) parameter (unknown)
  • $\phi$: a scale parameter that is independent of the expected value (also known as the dispersion parameter) and that governs the variance (known)
  • $c(y, \phi, w)$: a suitable function that normalizes the density (normalization constant) and does not depend on $\theta$ (known)

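The function $b(\theta)$ must be such that the density can be normalized and that its first and second derivatives, $b'(\theta)$ and $b''(\theta)$, exist. Together with the scale parameter $\phi$, the second derivative $b''(\theta)$ determines the variance of the distribution and is therefore called the variance function. For all distributions of the exponential family:

$$\operatorname{E}(y) = \mu = b'(\theta), \qquad \operatorname{Var}(y) = \frac{\phi\, b''(\theta)}{w}.$$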
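
The identities E(y) = b'(θ) and Var(y) = φ·b''(θ)/w can be checked numerically. The sketch below assumes the standard choices b(θ) = exp(θ) for the Poisson distribution and b(θ) = log(1 + exp(θ)) for the Bernoulli distribution, with φ = 1 and w = 1, and approximates the derivatives by central finite differences; all function names and step sizes are illustrative.

```python
import math

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def b_poisson(theta):
    """Assumed b(theta) for the Poisson distribution."""
    return math.exp(theta)

def b_bernoulli(theta):
    """Assumed b(theta) for the Bernoulli distribution."""
    return math.log1p(math.exp(theta))

# Poisson with rate lam: canonical parameter theta = log(lam).
lam = 3.0
theta_pois = math.log(lam)
mean_pois = derivative(b_poisson, theta_pois)         # close to lam
var_pois = second_derivative(b_poisson, theta_pois)   # close to lam

# Bernoulli with success probability p: theta = log(p / (1 - p)).
p = 0.3
theta_bern = math.log(p / (1.0 - p))
mean_bern = derivative(b_bernoulli, theta_bern)        # close to p
var_bern = second_derivative(b_bernoulli, theta_bern)  # close to p * (1 - p)

print(mean_pois, var_pois)
print(mean_bern, var_bern)
```

Both cases reproduce the familiar moments: mean and variance equal to the rate for the Poisson distribution, and mean p with variance p(1 − p) for the Bernoulli distribution.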

The scale parameter $\phi$ is not of primary interest and is therefore treated as a nuisance parameter. Examples of distributions belonging to the exponential family:

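Each entry lists the canonical parameter $\theta$, the scale parameter $\phi$, the pre-specified function $b(\theta)$, the normalization constant $c$, and the probability or density function $f(y)$, with weight $w = 1$ throughout.

  • Normal distribution: $\theta = \mu$; $\phi = \sigma^2$; $b(\theta) = \theta^2/2$; $c = -\frac{y^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)$; $f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$
  • Bernoulli distribution: $\theta = \log\frac{\pi}{1-\pi}$; $\phi = 1$; $b(\theta) = \log(1 + e^{\theta})$; $c = 0$; $f(y) = \pi^{y}(1-\pi)^{1-y}$, with $\pi = \operatorname{P}(y = 1)$
  • Binomial distribution: $\theta = \log\frac{\pi}{1-\pi}$; $\phi = 1$; $b(\theta) = n\log(1 + e^{\theta})$; $c = \log\binom{n}{y}$; $f(y) = \binom{n}{y}\pi^{y}(1-\pi)^{n-y}$, with $n$ known and $\pi$ the success probability
  • Poisson distribution: $\theta = \log\lambda$; $\phi = 1$; $b(\theta) = e^{\theta}$; $c = -\log(y!)$; $f(y) = \frac{\lambda^{y} e^{-\lambda}}{y!}$, with $\lambda = \operatorname{E}(y)$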
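
As a sanity check on such a parameterization, the sketch below evaluates the generic exponential family form exp((yθ − b(θ))/φ · w + c(y, φ, w)) for the Poisson case and compares it with the Poisson probability function computed directly. The choices θ = log λ, φ = 1, w = 1, b(θ) = exp(θ) and c = −log(y!) are the standard ones assumed here; the function names and the rate 2.5 are purely illustrative.

```python
import math

def exp_family_density(y, theta, phi, w, b, c):
    """Generic one-dimensional exponential family density or probability function."""
    return math.exp((y * theta - b(theta)) / phi * w + c(y, phi, w))

def poisson_pmf(y, lam):
    """Poisson probability function computed directly from its definition."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def c_poisson(y, phi, w):
    """Assumed normalization term for the Poisson case: -log(y!)."""
    return -math.lgamma(y + 1)

lam = 2.5                 # illustrative rate
theta = math.log(lam)     # canonical parameter

# Evaluate both forms on the counts 0..5 and compare them.
via_family = [exp_family_density(y, theta, 1.0, 1.0, math.exp, c_poisson)
              for y in range(6)]
direct = [poisson_pmf(y, lam) for y in range(6)]

for y, (a, d) in enumerate(zip(via_family, direct)):
    print(y, a, d)
```

The two columns agree up to floating-point rounding, which is all that "belonging to the exponential family" requires: the same probabilities, rewritten in terms of θ, φ, b and c.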

Literature

  • Peter McCullagh, John Nelder: Generalized Linear Models. 2nd edition. Chapman and Hall / CRC Press, 1989.

References

  1. generalized linear model. Glossary of statistical terms. In: International Statistical Institute. June 1, 2011, accessed July 4, 2020.
  2. John Nelder, Robert Wedderburn: Generalized Linear Models. In: Journal of the Royal Statistical Society, Series A (General). 135, 1972, pp. 370-384. doi:10.2307/2344614.
  3. Alvin C. Rencher, G. Bruce Schaalje: Linear Models in Statistics. John Wiley & Sons, 2008, p. 513.
  4. Alvin C. Rencher, G. Bruce Schaalje: Linear Models in Statistics. John Wiley & Sons, 2008, p. 514.
  5. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: Models, Methods and Applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 301.
  6. Torsten Becker et al.: Stochastic risk modeling and statistical methods. Springer Spektrum, 2016, p. 308.
  7. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: Models, Methods and Applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 301.
  8. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: Models, Methods and Applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 302.