Goodness of fit

The goodness of fit (also quality of fit) indicates how well an estimated model can explain a set of observations. Measures of goodness of fit allow a statement to be made about the discrepancy between the theoretical values of the random variables under study, which are expected or predicted on the basis of the model, and the values actually measured.

The quality of a model's fit to existing data can be assessed with the help of statistical tests or suitable summary measures.

Goodness-of-fit measures can be used in hypothesis testing, for example to test residuals for normality, to check whether two samples come from populations with the same distribution, or to test whether observed frequencies follow a certain distribution (see also Pearson's chi-square test).

Regression analysis

Linear regression

In linear regression, goodness of fit is measured by the coefficient of determination $R^2$. It measures how well the measured values fit a regression model. It is defined as the proportion of the "explained variation" in the "total variation", $R^2 = 1 - SQR/SQT$, where $SQR$ is the residual sum of squares and $SQT$ is the total sum of squares, and therefore lies between:

$0\,\%$ (or $0$): no linear relationship and
$100\,\%$ (or $1$): perfect linear relationship.

The closer the coefficient of determination is to one, the higher the "specificity" or "quality" of the fit. If $R^2 = 0$, then the "best" linear regression model consists only of the intercept $\hat{\beta}_0$, while $\hat{\beta}_1 = 0$. The closer the value of the coefficient of determination is to $1$, the better the regression line explains the true model. If $R^2 = 1$, then the dependent variable $Y$ can be fully explained by the linear regression model. Visually, the measurement points $(x_1, y_1), \ldots, (x_n, y_n)$ then lie on the non-horizontal regression line. In this case there is no stochastic relationship, but a deterministic one.
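The two boundary cases can be illustrated with a minimal sketch in plain Python. The function name `r_squared` and the three small data sets are assumptions made for this illustration, not part of any library; the fit is an ordinary least-squares line with one explanatory variable.

```python
def r_squared(xs, ys):
    """Fit y = b0 + b1 * x by least squares and return (b0, b1, R^2)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx                    # estimated slope
    b0 = y_mean - b1 * x_mean         # estimated intercept
    # SQR: residual sum of squares, SQT: total sum of squares
    sqr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sqt = sum((y - y_mean) ** 2 for y in ys)
    return b0, b1, 1 - sqr / sqt


# Hypothetical data, chosen only to show the three situations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
exact = [2.0 * x + 1.0 for x in xs]       # points exactly on a line
noisy = [3.1, 4.8, 7.2, 8.9, 11.0]        # roughly linear
no_trend = [1.0, 3.0, 2.0, 3.0, 1.0]      # no linear relationship

for name, ys in [("exact", exact), ("noisy", noisy), ("no_trend", no_trend)]:
    b0, b1, r2 = r_squared(xs, ys)
    print(f"{name}: intercept={b0:.3f}, slope={b1:.3f}, R^2={r2:.4f}")
```

For the `no_trend` data the estimated slope is zero, so the fitted model reduces to the intercept alone, which matches the case $R^2 = 0$ described above.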

Goodness-of-fit tests

In inferential statistics, a goodness-of-fit test is a nonparametric hypothesis test that checks whether the unknown probability distribution of a random variable (approximately) follows a particular distribution model (often the normal distribution). It concerns the hypothesis that a given sample comes from a distribution with a certain distribution function. This is often realized through asymptotic considerations of the empirical distribution function (see also the Glivenko-Cantelli theorem). Well-known goodness-of-fit tests are, for example:

Example

In Pearson's chi-square test, the chi-square statistic, also known as the chi-square sum (goodness-of-fit statistic), is the sum of the squared differences between the observed and expected frequencies, each divided by the expected frequency:

$X^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = N \sum_{i=1}^{n} \frac{\left(O_i/N - p_i\right)^2}{p_i}$

where

$O_i$ = number of observations of type $i$

$N$ = total number of observations

$E_i = N p_i$ = expected frequency of type $i$

$n$ = number of cells in the table

The result can be compared to the chi-square distribution to determine the goodness of fit.
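As a rough illustration, the statistic can be computed in a few lines of plain Python. The counts below are hypothetical results of 60 rolls of a die tested against a uniform distribution, and the function names are chosen for this sketch only. The critical value is the tabulated 95 % quantile of the chi-square distribution with 5 degrees of freedom, which assumes a fully specified distribution with $n - 1$ degrees of freedom.

```python
def chi_square_statistic(observed, probabilities):
    """X^2 as the sum of (O_i - E_i)^2 / E_i with E_i = N * p_i."""
    n_total = sum(observed)
    expected = [n_total * p for p in probabilities]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_from_proportions(observed, probabilities):
    """Same statistic written as N * sum((O_i / N - p_i)^2 / p_i)."""
    n_total = sum(observed)
    return n_total * sum(
        (o / n_total - p) ** 2 / p for o, p in zip(observed, probabilities)
    )


# Hypothetical counts for the faces 1..6 in 60 rolls (assumption).
observed = [8, 12, 9, 11, 10, 10]
probabilities = [1 / 6] * 6          # fair die under the null hypothesis

x2 = chi_square_statistic(observed, probabilities)
x2_alt = chi_square_from_proportions(observed, probabilities)

# Tabulated 95 % quantile of the chi-square distribution, 5 degrees of freedom.
CRITICAL_VALUE_DF5 = 11.0705

print(f"X^2 = {x2:.4f} (proportion form: {x2_alt:.4f})")
print("reject uniformity at 5 %:", x2 > CRITICAL_VALUE_DF5)
```

Both forms of the formula give the same value, and because the statistic lies well below the critical value, these counts give no reason to reject the hypothesis of a fair die.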

Quality criteria

Various quality criteria have become established for structural equation models:

Chi-square value

Goodness-of-fit index (GFI)

Adjusted goodness-of-fit index (AGFI)

Comparative fit index (CFI)

Normed fit index (NFI)

Root mean square error of approximation (RMSEA)

Standardized root mean square residual (SRMR)
