Deviance (statistics)

In statistics, the deviance (deviation from the ideal value) is a central measure for evaluating the goodness of fit of estimates in the linear model and is often used in hypothesis testing. It generalizes the residual sum of squares of least-squares regression to cases in which the model is fitted by maximum likelihood estimation. In modeling terms, the deviance plays the role that the sum of squared deviations plays in linear regression models. It is particularly important in generalized linear models.

Origin of the term

The term deviance has its origins in sociology, where it describes the deviation (from French dévier, "to deviate") from general norms and values.

Definition

The deviance is a statistic that indicates how much the fit of the model currently under consideration deviates from the model that fits the data perfectly (the so-called saturated model). The saturated model allows different regression parameters for each individual. The deviance is given by

$$D = -2 \left\{ \log \hat{\mathcal{L}}_a - \log \hat{\mathcal{L}}_g \right\},$$

where $\hat{\mathcal{L}}_a$ is the maximized partial likelihood function (also called the plausibility function) under the current model and $\hat{\mathcal{L}}_g$ is the maximized partial likelihood function under the saturated model (a model with as many parameters as there are observation pairs). Using the laws of logarithms, the deviance can also be expressed through a likelihood ratio (plausibility quotient):

$$D = -2 \log \left( \frac{\hat{\mathcal{L}}_a}{\hat{\mathcal{L}}_g} \right).$$
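The equivalence of the log-difference form and the likelihood-ratio form can be checked numerically. The following is a minimal sketch, not taken from the cited sources: it assumes a Poisson model with a full (not partial) likelihood, made-up counts `y`, and an intercept-only current model whose maximum likelihood fit is the sample mean. The function names `poisson_loglik` and `deviance` are hypothetical helpers introduced only for illustration.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y under means mu (every mu must be > 0)."""
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1)
        for yi, mi in zip(y, mu)
    )


def deviance(loglik_current, loglik_saturated):
    """D = -2 * (log L_a - log L_g)."""
    return -2.0 * (loglik_current - loglik_saturated)


# Made-up, strictly positive counts (illustrative data only).
y = [2, 3, 6, 7, 8, 9]

# Current model (assumption): one common mean; its MLE is the sample mean.
mean_y = sum(y) / len(y)
mu_current = [mean_y] * len(y)

# Saturated model: one parameter per observation, so the fitted mean equals y_i.
mu_saturated = [float(v) for v in y]

ll_a = poisson_loglik(y, mu_current)
ll_g = poisson_loglik(y, mu_saturated)

# Form 1: difference of the maximized log-likelihoods.
d_log_difference = deviance(ll_a, ll_g)

# Form 2: logarithm of the likelihood ratio.
d_ratio = -2.0 * math.log(math.exp(ll_a) / math.exp(ll_g))

# Deviance of the saturated model measured against itself.
d_saturated = deviance(ll_g, ll_g)

print(d_log_difference, d_ratio, d_saturated)
```

Both forms agree up to floating-point rounding. In practice the log-difference form is usually preferable, because exponentiating very negative log-likelihoods can underflow for larger data sets.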

The prefactor $-2$ is necessary to obtain a quantity that has a known distribution and can therefore be used for hypothesis tests. The smaller the value of the deviance $D$, the better the model. For the saturated model, the deviance is zero. The deviance can be understood as a generalization of the residual sum of squares used for normally distributed data (see Classical linear model of normal regression) to the analysis of non-normally distributed data in generalized linear models. Note that the difference in deviance between two alternative models equals the difference in the value of the statistic $-2 \log \hat{\mathcal{L}}$, because the contribution of the saturated model cancels.
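The link to the residual sum of squares, and the cancellation of the saturated model in deviance differences, can be made explicit with a short derivation. This is a sketch under stated assumptions that are not part of the text above: independent observations $y_i \sim N(\mu_i, \sigma^2)$ with known variance $\sigma^2$, fitted means $\hat{\mu}_i$ under the current model, and two alternative models labelled 1 and 2.

```latex
% Assumption: y_i ~ N(mu_i, sigma^2), independent, sigma^2 known, i = 1, ..., n.
\begin{align*}
\log \mathcal{L}(\mu)
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu_i)^2, \\
% Saturated model: \hat{\mu}_i = y_i, so the sum of squares vanishes.
\log \hat{\mathcal{L}}_g
  &= -\frac{n}{2}\log(2\pi\sigma^2), \\
D
  &= -2\left\{ \log \hat{\mathcal{L}}_a - \log \hat{\mathcal{L}}_g \right\}
   = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \hat{\mu}_i)^2 . \\[1ex]
% Difference between two alternative models: the saturated term cancels.
D_1 - D_2
  &= \left(-2\log \hat{\mathcal{L}}_1 + 2\log \hat{\mathcal{L}}_g\right)
   - \left(-2\log \hat{\mathcal{L}}_2 + 2\log \hat{\mathcal{L}}_g\right)
   = -2\log \hat{\mathcal{L}}_1 - \left(-2\log \hat{\mathcal{L}}_2\right).
\end{align*}
```

Under these assumptions the deviance is the residual sum of squares divided by $\sigma^2$, which is the sense in which it generalizes that quantity.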

References

Lothar Sachs, Jürgen Hedderich: Applied Statistics: Collection of Methods with R. 8th, revised and expanded edition. Springer Spectrum, Berlin/Heidelberg 2018, ISBN 978-3-662-56657-2, p. 834.
David Collett: Modeling survival data in medical research. Chapman and Hall/CRC, 2015, pp. 154 ff.
