Post hoc test

from Wikipedia, the free encyclopedia

Post-hoc tests are significance tests from mathematical statistics. A one-way analysis of variance, the Kruskal-Wallis test, or the median test only establishes that significant differences exist somewhere within a set of group means. Post-hoc tests use pairwise mean comparisons to show which means differ significantly from one another, or group-wise comparisons to show which group means do not differ significantly.

Overview of the post-hoc tests

The post-hoc tests differ in several criteria, e.g. whether the sample sizes are the same in all groups (balanced case) or not (unbalanced case), and whether the variance is the same in all groups (homogeneity of variance) or not (heterogeneity of variance). Homogeneity of variance can be checked with the Levene test.

Test                                              Comparison of  Homogeneity of variance  Sample sizes
Least significant difference                      Mean pairs     No                       Unequal
Bonferroni test for least significant difference  Mean pairs     Yes                      Unequal
Šidák                                             Mean pairs     No
Tamhane                                           Mean pairs     No
Games-Howell                                      Mean pairs     No
Dunnett's                                         Mean pairs     No                       With small sample sizes
Dunnett's                                         Mean pairs     No                       With large sample sizes
Ryan-Einot-Gabriel-Welch                          Spanned means  Yes
Duncan                                            Spanned means  Yes                      Equal
Tukey b                                           Spanned means  Yes
Student-Newman-Keuls                              Spanned means  Yes                      Equal
Tukey                                             Spanned means  Yes                      Equal
Hochberg                                          Spanned means  Yes
Gabriel                                           Spanned means  Yes
Scheffé                                           Mean pairs     Yes                      Unequal
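
As a rough illustration of the Levene check mentioned above, the following sketch computes the Levene statistic W by hand. The function name and the choice of the median as centre (the Brown-Forsythe variant) are assumptions made for this example, not something the text specifies. W would then be compared with a quantile of the F distribution with k - 1 and n - k degrees of freedom, which requires tables or a statistics library.

```python
from statistics import mean, median


def levene_statistic(groups, center=median):
    """Levene-type W statistic for k groups of observations.

    center=median gives the Brown-Forsythe variant, center=mean the
    original Levene test (both choices are assumptions of this sketch).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its group centre.
    z = [[abs(x - center(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # One-way ANOVA decomposition applied to the absolute deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


w = levene_statistic([[1, 2, 3], [2, 4, 6]])
```

A large W indicates that the spread differs between the groups, i.e. evidence against homogeneity of variance.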

The tests can be partly ordered by how conservative they are:

From conservative to not conservative: Duncan > Scheffé > Tukey > Newman-Keuls > least significant difference.

Requirements and notation

It is assumed that, for the mean comparisons in $k$ groups and at a significance level $\alpha$, the alternative hypothesis was accepted. That is, at least two group means differ. The hypotheses for all of the following tests are

* for the pairwise tests: $H_0: \mu_i = \mu_j$ vs. $H_1: \mu_i \neq \mu_j$, and
* for the spanned ordered means: $H_0: \mu_{(i)} = \mu_{(i+1)} = \dots = \mu_{(j)}$ vs. $H_1$: at least two of these means differ.

Furthermore, let $n_i$ be the number of observations in group $i$ and $n$ the total number of observations. The tests are divided into tests for the balanced case ($n_1 = n_2 = \dots = n_k$) and tests for the unbalanced case (the sample sizes in the groups can differ).

Tests for the unbalanced case

Test for the smallest significant difference

In the least significant difference test (LSD test for short), the test statistic is:

$$T = \frac{\bar{X}_i - \bar{X}_j}{S \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} \sim t_{n-k}$$

with

$$S^2 = \frac{1}{n-k} \sum_{l=1}^{k} (n_l - 1) S_l^2$$

and $S_l^2$ the group variance of group $l$.

The least significant difference test is based on the two-sample t-test , but the variance is calculated using all groups.
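
A minimal sketch of this statistic, assuming the usual form of the LSD test: the difference of two group means divided by the pooled standard deviation times the square root of 1/n_i + 1/n_j. The function names and the toy data are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(groups):
    """Pooled variance: sum((n_l - 1) * S_l^2) / (n - k) over all k groups."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    return sum((len(g) - 1) * variance(g) for g in groups) / (n - k)


def lsd_statistic(groups, i, j):
    """t statistic for H0: mu_i = mu_j and its n - k degrees of freedom."""
    s = sqrt(pooled_variance(groups))
    gi, gj = groups[i], groups[j]
    t = (mean(gi) - mean(gj)) / (s * sqrt(1 / len(gi) + 1 / len(gj)))
    df = sum(len(g) for g in groups) - len(groups)
    return t, df


# Toy data (hypothetical): three groups with three observations each.
t_value, df = lsd_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]], 0, 2)
```

The only difference from an ordinary two-sample t-test is that `pooled_variance` uses every group, not just the two being compared, which is also why the degrees of freedom are n - k.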

Bonferroni test for smallest significant difference

In the Bonferroni test for the least significant difference, the test statistic is identical to that of the least significant difference test. However, the significance level is corrected using the Bonferroni method. If the analysis of variance is carried out at significance level $\alpha$, the corrected significance level $\alpha^*$ is used for the $k(k-1)/2$ pairwise mean comparisons:

$$\alpha^* = \frac{2\alpha}{k(k-1)}.$$

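The critical values for the corrected significance level can be found in special tables or can be determined using the approximation

$$t_{n-k;\,1-\alpha^*/2} \approx z_{1-\alpha^*/2} + \frac{z_{1-\alpha^*/2}^3 + z_{1-\alpha^*/2}}{4(n-k)},$$

where $z_{1-\alpha^*/2}$ is the $(1-\alpha^*/2)$ quantile of the standard normal distribution.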

The test should only be used if $k$ is not too large; otherwise the corrected significance level becomes too small and the non-rejection regions of the t-tests overlap. For example, $k = 5$ groups give 10 pairwise comparisons, so $\alpha = 5\%$ yields $\alpha^* = 0.5\%$.
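
The correction can be sketched in a few lines. The sketch assumes that the level is divided by the number of pairwise comparisons, k(k-1)/2, and that the approximation meant for the critical value is the common first-order expansion z + (z^3 + z) / (4 df) of the t quantile around the normal quantile z. Both function names are made up for this example.

```python
from statistics import NormalDist


def bonferroni_level(alpha, k):
    """Corrected level when all k*(k-1)/2 pairs of means are compared."""
    return alpha / (k * (k - 1) / 2)


def approx_t_quantile(p, df):
    """Approximate p-quantile of the t distribution with df degrees of freedom.

    Assumed formula: z + (z**3 + z) / (4 * df), with z the normal p-quantile.
    """
    z = NormalDist().inv_cdf(p)
    return z + (z ** 3 + z) / (4 * df)


alpha_star = bonferroni_level(0.05, 5)
# Two-sided critical value for a hypothetical n - k = 20 degrees of freedom.
critical = approx_t_quantile(1 - alpha_star / 2, 20)
```

The approximation improves as the degrees of freedom grow; for small n - k an exact quantile from tables or a statistics library is preferable.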

Scheffé test

The Scheffé test formally requires homogeneity of variance across the groups, but it is robust against violations of this requirement.

Simple Scheffé test

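The simple Scheffé test checks $H_0: \mu_i = \mu_j$ vs. $H_1: \mu_i \neq \mu_j$ using the test statistic

$$F = \frac{(\bar{X}_i - \bar{X}_j)^2}{(k-1)\, S^2 \left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \sim F_{k-1,\,n-k},$$

where $S^2$ is the pooled within-group variance.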

The simple Scheffé test is a special case of the general Scheffé test for a linear contrast for two mean values.

Linear contrast

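A linear contrast of one or more means is defined as

$$\Lambda = \sum_{i=1}^{k} c_i \mu_i \quad \text{with} \quad \sum_{i=1}^{k} c_i = 0.$$

For the simple Scheffé test, the linear contrast is:

$$\Lambda = \mu_i - \mu_j.$$

Two contrasts $\Lambda_1 = \sum_i c_i \mu_i$ and $\Lambda_2 = \sum_i d_i \mu_i$ are called orthogonal if

$$\sum_{i=1}^{k} \frac{c_i d_i}{n_i} = 0,$$

which reduces to $\sum_i c_i d_i = 0$ in the balanced case.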

General Scheffé test

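For the general Scheffé test, the hypotheses are $H_0: \Lambda = 0$ for all (orthogonal) contrasts vs. $H_1: \Lambda \neq 0$ for at least one contrast. With the estimated contrast $\hat{\Lambda} = \sum_i c_i \bar{X}_i$, the test statistic is

$$F = \frac{\hat{\Lambda}^2}{(k-1)\, S^2 \sum_{i=1}^{k} \frac{c_i^2}{n_i}} \sim F_{k-1,\,n-k}.$$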

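The idea is based on the variance decomposition of the estimated contrast $\hat{\Lambda} = \sum_i c_i \bar{X}_i$:

$$\operatorname{Var}(\hat{\Lambda}) = \sum_{i=1}^{k} c_i^2 \operatorname{Var}(\bar{X}_i) = \sigma^2 \sum_{i=1}^{k} \frac{c_i^2}{n_i},$$

since, under the null hypothesis with a common variance $\sigma^2$, $\operatorname{Var}(\bar{X}_i) = \sigma^2 / n_i$.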
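
A minimal sketch of the general Scheffé statistic for one contrast, assuming the usual form: the squared estimated contrast divided by (k - 1) times the pooled variance times the sum of c_i^2 / n_i. The function name and the toy data are hypothetical; the resulting value would be compared with an F quantile with k - 1 and n - k degrees of freedom.

```python
from statistics import mean, variance


def scheffe_statistic(groups, c):
    """F statistic for the contrast sum(c_i * mu_i); the c_i must sum to zero."""
    if len(c) != len(groups) or abs(sum(c)) > 1e-12:
        raise ValueError("need one coefficient per group, summing to zero")
    k = len(groups)
    n = sum(len(g) for g in groups)
    # Pooled within-group variance over all k groups.
    s2 = sum((len(g) - 1) * variance(g) for g in groups) / (n - k)
    estimate = sum(ci * mean(g) for ci, g in zip(c, groups))
    scale = sum(ci ** 2 / len(g) for ci, g in zip(c, groups))
    f = estimate ** 2 / ((k - 1) * s2 * scale)
    return f, (k - 1, n - k)


# The simple Scheffé test is the contrast with coefficients +1 and -1.
f_value, dfs = scheffe_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]], [1, 0, -1])
```

Passing coefficients such as `[1, -0.5, -0.5]` would instead compare the first group with the average of the other two.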

Tests for the balanced case

These tests are intended for the balanced case, i.e. the sample size is the same in every group ($n_1 = \dots = n_k = m$). SPSS also performs these tests when the group sizes are unequal; $m$ is then calculated as the harmonic mean of the sample sizes.

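For the following tests the test statistic is always the same:

$$Q = \frac{\bar{X}_{(j)} - \bar{X}_{(i)}}{S / \sqrt{m}},$$

where $\bar{X}_{(i)} \le \bar{X}_{(j)}$ are two of the ordered group means, $S^2$ is the pooled within-group variance, and $m$ is the common group size.

The critical values $q_{r,\,n-k}(\alpha)$ of the studentized range distribution are available only in tabular form (mostly for $\alpha = 0.05$ or $\alpha = 0.01$). Here $r$ is the number of spanned means: between the means $\bar{x}_{(i)}$ and $\bar{x}_{(j)}$ lie a further $r - 2$ means.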
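
The common statistic of these range tests can be sketched as follows, assuming the usual studentized range form: the difference between two group means divided by S / sqrt(m). Taking the harmonic mean over all group sizes is an assumption of this sketch, as are the function name and the toy data.

```python
from math import sqrt
from statistics import harmonic_mean, mean, variance


def q_statistic(groups, a, b):
    """Studentized range statistic for groups a and b, plus the number r
    of ordered means spanned by the comparison."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    # Pooled within-group variance over all k groups.
    s2 = sum((len(g) - 1) * variance(g) for g in groups) / (n - k)
    # Common group size; the harmonic mean equals it in the balanced case.
    m = harmonic_mean(sizes)
    means = [mean(g) for g in groups]
    q = abs(means[a] - means[b]) / sqrt(s2 / m)
    # r counts the two compared means and every ordered mean between them.
    order = sorted(range(k), key=lambda idx: means[idx])
    r = abs(order.index(a) - order.index(b)) + 1
    return q, r


q_value, spanned = q_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]], 0, 2)
```

The value `spanned` is what distinguishes the individual range tests: they differ in whether and how it enters the critical value.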

Tukey test

In the Tukey test, the critical values are the studentized range quantiles

$$q_{k,\,n-k}(\alpha),$$

i.e. there is no Bonferroni correction and the number of spanned means is not taken into account.

Student-Newman-Keuls test

In the Student-Newman-Keuls test, the critical values are

$$q_{r,\,n-k}(\alpha),$$

where $r$ is the number of spanned means, i.e. there is no Bonferroni correction and the number of spanned means is taken into account.

Duncan's test

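In Duncan's test, the critical values are

$$q_{r,\,n-k}(\alpha_r) \quad \text{with} \quad \alpha_r = 1 - (1 - \alpha)^{r-1},$$

where $r$ is the number of spanned means, i.e. a correction of the significance level takes place and the number of spanned means is taken into account.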

When using Duncan's test, note that it only carries out group-wise comparisons, so unambiguous statements about significance are not always possible.
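
The three range tests above differ only in which number of means and which significance level are used to look up the critical value. The sketch below encodes the standard textbook rules (Tukey: all k means at level alpha; Student-Newman-Keuls: the r spanned means at level alpha; Duncan: the r spanned means at level 1 - (1 - alpha)^(r-1)). These rules and the function name are assumptions of this example, and the lookup of the studentized range quantile itself is left to tables or a statistics library.

```python
def range_test_parameters(test, r, k, alpha):
    """Return (number of means, significance level) for the critical value.

    test  -- "tukey", "snk" or "duncan"
    r     -- number of ordered means spanned by the comparison
    k     -- total number of groups
    alpha -- nominal significance level
    """
    if not 2 <= r <= k:
        raise ValueError("r must lie between 2 and k")
    if test == "tukey":
        return k, alpha                        # ignores r
    if test == "snk":
        return r, alpha                        # uses r, level unchanged
    if test == "duncan":
        return r, 1 - (1 - alpha) ** (r - 1)   # uses r, level adjusted
    raise ValueError(f"unknown test: {test}")


params = {name: range_test_parameters(name, 3, 5, 0.05)
          for name in ("tukey", "snk", "duncan")}
```

For r = 2 all three rules use the level alpha; they only diverge for wider ranges of means.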

Example

Rent burden ratio in %
State                          Number  Median  Mean  Std. dev.
Saxony                         1356    19.0    22.3  12.5
Brandenburg                    803     19.0    23.4  13.2
Mecklenburg-Western Pomerania  491     20.0    22.1  10.3
Thuringia                      744     21.0    24.0  13.3
Berlin                         998     22.0    24.4  11.9
Baden-Württemberg              3246    22.0    24.8  14.2
Bavaria                        3954    22.0    25.4  14.2
North Rhine-Westphalia         5266    23.0    25.8  13.8
Hesse                          1904    23.0    26.3  14.3
Saxony-Anhalt                  801     23.0    26.6  14.3
Rhineland-Palatinate           1276    24.0    26.1  13.5
Lower Saxony                   2374    24.0    27.9  15.7
Hamburg                        528     24.5    29.3  18.9
Schleswig-Holstein             890     25.0    27.9  14.8
Saarland                       312     26.0    26.7  11.9
Bremen                         194     27.0    29.2  15.8
Germany                        9527    22.0    25.5  14.0

For the rent burden ratio (Mietbelastungsquote, the ratio of gross rent to household income), taken from the CAMPUS file of the 2002 microcensus of the Federal Statistical Office, both the nonparametric median test and the parametric one-way analysis of variance (one-way ANOVA) show highly significant differences in the medians and means of the federal states. In other words, the federal states differ in their mean rental expenditure relative to income.
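
The one-way ANOVA F statistic can be approximated from the summary table alone. The sketch below is not the original analysis: it works from the rounded group sizes, means and standard deviations in the table (the Germany row is left out), so it only approximates a computation on the raw microcensus data, and the function name is made up for this example.

```python
# (number, mean, standard deviation) per federal state, copied from the table.
STATES = {
    "Saxony": (1356, 22.3, 12.5),
    "Brandenburg": (803, 23.4, 13.2),
    "Mecklenburg-Western Pomerania": (491, 22.1, 10.3),
    "Thuringia": (744, 24.0, 13.3),
    "Berlin": (998, 24.4, 11.9),
    "Baden-Württemberg": (3246, 24.8, 14.2),
    "Bavaria": (3954, 25.4, 14.2),
    "North Rhine-Westphalia": (5266, 25.8, 13.8),
    "Hesse": (1904, 26.3, 14.3),
    "Saxony-Anhalt": (801, 26.6, 14.3),
    "Rhineland-Palatinate": (1276, 26.1, 13.5),
    "Lower Saxony": (2374, 27.9, 15.7),
    "Hamburg": (528, 29.3, 18.9),
    "Schleswig-Holstein": (890, 27.9, 14.8),
    "Saarland": (312, 26.7, 11.9),
    "Bremen": (194, 29.2, 15.8),
}


def anova_from_summary(summary):
    """One-way ANOVA F statistic from per-group (n, mean, sd) triples."""
    k = len(summary)
    n = sum(size for size, _, _ in summary)
    grand_mean = sum(size * m for size, m, _ in summary) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(size * (m - grand_mean) ** 2 for size, m, _ in summary)
    ss_within = sum((size - 1) * sd ** 2 for size, _, sd in summary)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, (k - 1, n - k)


f_value, dfs = anova_from_summary(list(STATES.values()))
```

The value of `f_value` would be compared with an F quantile with `dfs` degrees of freedom; a value far above 1 points to differences between the state means.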

Since the Levene test rejects the null hypothesis of homogeneity of variance and the group sizes in the sample differ considerably, only the following test procedures remain for locating the differences:

  • least significant difference
  • Bonferroni test for least significant difference
  • Scheffé

Because the Scheffé test in SPSS both performs pairwise comparisons and outputs homogeneous subgroups, its results are examined below.

Pairwise comparisons

The pairwise comparisons provide information about significant differences between the means of the individual groups. In the present example, the following are output for each pair of federal states:

  • the difference $\bar{x}_i - \bar{x}_j$,
  • the standard error,
  • the p-value (column: Significance), which leads to rejection of the equality of the means if it falls below the specified significance level, and
  • a 95% confidence interval for the difference in means. If the confidence interval does not contain zero, the null hypothesis is rejected at the 5% significance level.

At a significance level of 5%, only the difference between the means of Schleswig-Holstein and Saxony is significant (p-value of 2.1%); none of the other comparisons with Schleswig-Holstein is.

[Figure: SPSS output of the pairwise Scheffé comparisons (ScheffePaar.PNG)]

Group comparisons

The group-wise comparison allows detailed statements about the homogeneity of the group means. However, it allows only limited statements about significant differences between the groups.

In the present example an iterative procedure is carried out to find homogeneous subgroups, i.e. groups in which the null hypothesis of equal means is not rejected. For this purpose, the observed means are sorted by size and a series of tests is carried out.

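Spanned means  Tested null hypotheses
16             $H_0: \mu_{(1)} = \dots = \mu_{(16)}$
15             $H_0: \mu_{(1)} = \dots = \mu_{(15)}$;  $H_0: \mu_{(2)} = \dots = \mu_{(16)}$
14             $H_0: \mu_{(1)} = \dots = \mu_{(14)}$;  $H_0: \mu_{(2)} = \dots = \mu_{(15)}$;  $H_0: \mu_{(3)} = \dots = \mu_{(16)}$
13             $H_0: \mu_{(1)} = \dots = \mu_{(13)}$;  ...;  $H_0: \mu_{(4)} = \dots = \mu_{(16)}$
...            In general, further tests are carried out with fewer and fewer groups

Colour key (for example): green = not rejected; yellow = contained in a previously non-rejected hypothesis; red = rejected.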

In the first step, the null hypothesis $H_0: \mu_{(1)} = \dots = \mu_{(16)}$ is tested and rejected; we already know that the means differ. Then

  • the state with the largest mean is removed and the null hypothesis $H_0: \mu_{(1)} = \dots = \mu_{(15)}$ is tested, and
  • the state with the smallest mean is removed and the null hypothesis $H_0: \mu_{(2)} = \dots = \mu_{(16)}$ is tested.

Both tests cover groups of only 15 federal states. If the null hypothesis is rejected in one of these tests (red in the table), the state with the largest mean and the state with the smallest mean are removed in turn from that group and the test is repeated. This builds up a sequence of null hypotheses to be tested with an ever decreasing number of means.

The procedure stops if

  • either the null hypothesis cannot be rejected in one of the tests (green in the table) or
  • the considered null hypothesis is already part of a null hypothesis that has not been rejected (yellow in the table) or
  • only one state is left.

The "green" subgroups are output by SPSS.
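
The search for homogeneous subgroups described above can be sketched as follows. The function works on positions in the sorted list of means and takes the actual range test as a callback, so it does not implement the Scheffé test itself; the callback, the function name and the toy threshold rule in the usage lines are all hypothetical.

```python
def homogeneous_subsets(k, is_rejected):
    """Return the maximal ranges (lo, hi) of ordered means whose null
    hypothesis of equal means is not rejected.

    k           -- number of groups, with means sorted in ascending order
    is_rejected -- callback: is_rejected(lo, hi) is True if
                   H0: mu_(lo) = ... = mu_(hi) is rejected (0-based, inclusive)
    """
    accepted = []
    # Start with all k means, then test ever shorter ranges.
    for size in range(k, 1, -1):
        for lo in range(0, k - size + 1):
            hi = lo + size - 1
            # "Yellow": already contained in a range that was not rejected.
            if any(a <= lo and hi <= b for a, b in accepted):
                continue
            # "Green": not rejected, so the range is a homogeneous subgroup.
            if not is_rejected(lo, hi):
                accepted.append((lo, hi))
            # "Red": rejected, so its sub-ranges are tested in the next round.
    return accepted


# Toy stand-in for a real range test: reject if the means differ by more than 2.5.
toy_means = [1.0, 2.0, 3.0, 6.0]
subsets = homogeneous_subsets(
    len(toy_means), lambda lo, hi: toy_means[hi] - toy_means[lo] > 2.5
)
```

In this toy run the first three means form the only homogeneous subgroup; the fourth mean is left on its own because single means are not tested.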

[Figure: SPSS output of the homogeneous subgroups from the Scheffé test (ScheffeGruppe.PNG)]

In the example there are two homogeneous subgroups with 14 federal states each, i.e. the null hypothesis of equal means could not be rejected within them. Bremen and Hamburg are excluded from homogeneous subgroup 1, and Saxony and Mecklenburg-Western Pomerania from homogeneous subgroup 2. In this case no statement can be made about which federal states have significantly different means.

References

  1. Ajit C. Tamhane: Multiple comparisons in model I one-way ANOVA with unequal variances. In: Communications in Statistics - Theory and Methods. Vol. 6, no. 1, 1977, pp. 15-32, doi:10.1080/03610927708827466.
  2. Werner Timischl: Applied Statistics. An introduction for biologists and medical professionals. 3rd edition, 2013, p. 373.

Literature

  • Bernd Rönz: Lecture notes: Computational Statistics I. Humboldt University of Berlin, Chair of Statistics, Berlin 2001.