Over-dispersion

from Wikipedia, the free encyclopedia

In statistics, when modeling count data, overdispersion (from Latin dispersio, "scattering", or dispergere, "to distribute, spread out, scatter") is present if the data show greater variability (variance) than would be expected under the model (e.g. a binomial or Poisson model).

In a Poisson regression, overdispersion means that the variance is greater than the expected value. The counterpart of overdispersion is underdispersion, which occurs when the data show less variation than the model assumes.
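A quick informal check of this definition compares the sample variance of the counts with their sample mean. The sketch below is only an illustration: the function name `dispersion_index` and both data sets are invented for this example, and a ratio well above 1 merely suggests overdispersion relative to a Poisson model with a common mean.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical count data, invented for illustration only.
overdispersed = [0, 0, 1, 0, 12, 0, 2, 0, 9, 0]
underdispersed = [2, 3, 2, 3, 2, 3, 2, 3, 2, 3]

# A ratio above 1 points to overdispersion, a ratio below 1 to underdispersion.
print(dispersion_index(overdispersed))
print(dispersion_index(underdispersed))
```

For a Poisson model the ratio should be close to 1, so the first data set looks overdispersed and the second underdispersed.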

Causes

Essentially the main causes of overdispersion are the presence of

Theoretical examples

Using the example of a Poisson regression

A characteristic of the Poisson distribution is that the expected value and the variance are equal:

$\operatorname{E}(y_i) = \operatorname{Var}(y_i) = \lambda_i$.

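For reasons similar to those for binomial data, a markedly higher empirical variance is often observed when using Poisson regression. It is therefore often useful to introduce an over-dispersion parameter $\phi$ and to assume that

$\operatorname{Var}(y_i) = \phi \lambda_i$,

where $\lambda_i$ is the expected value of $y_i$ and $\phi > 1$ indicates overdispersion.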
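One common way to estimate such an over-dispersion parameter (written φ here, which is an assumed symbol) is to divide the Pearson chi-square statistic by the residual degrees of freedom. The sketch below assumes the fitted means come from some Poisson regression that is not shown. The function name `pearson_dispersion`, the data, and the intercept-only "model" are hypothetical choices for illustration.

```python
from statistics import mean, variance


def pearson_dispersion(y, mu, n_params):
    """Estimate phi as the Pearson chi-square statistic over (n - n_params)."""
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    dof = len(y) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    if any(m <= 0 for m in mu):
        raise ValueError("fitted means must be positive")
    chi_square = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi_square / dof


# Hypothetical counts. The model is intercept-only, so every fitted
# mean equals the sample mean and one parameter is estimated.
y = [0, 0, 1, 0, 12, 0, 2, 0, 9, 0]
mu = [mean(y)] * len(y)
phi_hat = pearson_dispersion(y, mu, n_params=1)
print(phi_hat)
```

In this intercept-only case the estimate reduces to the sample variance divided by the sample mean. With covariates, `mu` would hold the fitted values of the regression instead.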

Estimated over-dispersion parameter κ in selected infectious diseases

The over-dispersion parameter κ (incorrectly also called the dispersion parameter or dispersion factor, and written k instead of κ) indicates in epidemiology how strongly the spread is concentrated on individual organisms that are already infected with a pathogen; the basic reproduction number enables statements about the distribution of κ. At a value of κ = 1, all infected organisms infect the same number of uninfected organisms. With values of κ < 1, infection with an infectious disease occurs increasingly through so-called super spreaders.

Degree of dispersion of selected infectious diseases

Infectious disease          Estimated over-dispersion parameter κ
Seasonal flu                1
SARS pandemic 2002/2003     0.16
MERS                        0.25
Spanish flu                 1
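The effect of a small κ can be illustrated with a simulation. The sketch below assumes the negative binomial model of secondary cases that is often used in epidemiology, which the text above does not specify: each case draws an individual infectiousness from a gamma distribution with shape κ and mean R0, and then infects a Poisson-distributed number of others. The value R0 = 2, the sample size, and all function names are hypothetical choices for illustration, not estimates for any disease.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson variate by Knuth's method (suitable for moderate lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_secondary_cases(r0, kappa, n_cases, seed=0):
    """Simulate secondary cases per infected individual (gamma-Poisson mixture)."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n_cases):
        # Individual infectiousness: gamma with shape kappa and mean r0.
        individual_rate = rng.gammavariate(kappa, r0 / kappa)
        cases.append(sample_poisson(individual_rate, rng))
    return cases


def share_from_top(cases, fraction=0.2):
    """Share of all secondary cases caused by the most infectious fraction."""
    ranked = sorted(cases, reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Assumed R0 of 2; the kappa values are taken from the table above.
for kappa in (1.0, 0.16):
    cases = simulate_secondary_cases(r0=2.0, kappa=kappa, n_cases=20000)
    print(kappa, round(share_from_top(cases), 2))
```

Under these assumptions, the most infectious fifth of cases accounts for a clearly larger share of transmission at κ = 0.16 than at κ = 1, which is the concentration on super spreaders described above.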

Remarks

  1. In statistics, Greek letters are conventionally used for population quantities (parameters) and Latin letters for sample quantities (statistics).

References

  1. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: Models, Methods and Applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 294.
  2. How super spreaders are affecting the pandemic. Accessed May 31, 2020.