Post hoc test
Post-hoc tests are significance tests from mathematical statistics. One-way analysis of variance, the Kruskal-Wallis test, or the median test only establishes that significant differences exist somewhere within a group of means. Post-hoc tests use pairwise comparisons of means to show which means differ significantly from one another. Alternatively, they use group-wise comparisons to show which group means are not significantly different.
Overview of the post-hoc tests
The post-hoc tests differ in several criteria, for example whether the sample sizes are the same in all groups (balanced case) or not (unbalanced case), and whether the variance is the same in all groups (homogeneity of variance) or not (heterogeneity of variance). Homogeneity of variance can be checked with the Levene test.
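As a rough illustration of the homogeneity check mentioned above, the following sketch computes Levene's statistic by hand. It assumes the classical variant based on absolute deviations from each group mean; the function name `levene_statistic` and the data are made up for illustration.

```python
from statistics import mean


def levene_statistic(groups):
    """Levene's W based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = []
    for g in groups:
        centre = mean(g)
        z.append([abs(y - centre) for y in g])
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Made-up data: the second group is visibly more spread out.
w = levene_statistic([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```

The value of `w` would be compared with a quantile of the F distribution with k - 1 and N - k degrees of freedom; large values speak against homogeneity of variance.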
Test | Comparison of | Homogeneity of variance | Sample sizes |
---|---|---|---|
Least significant difference | Pairs of means | No | Unequal |
Bonferroni test for the least significant difference | Pairs of means | Yes | Unequal |
Šidák | Pairs of means | No | |
Tamhane | Pairs of means | No | |
Games-Howell | Pairs of means | No | |
Dunnett's T3 | Pairs of means | No | Small sample sizes |
Dunnett's C | Pairs of means | No | Large sample sizes |
Ryan-Einot-Gabriel-Welsch | Spanned means | Yes | |
Duncan | Spanned means | Yes | Equal |
Tukey b | Spanned means | Yes | |
Student-Newman-Keuls | Spanned means | Yes | Equal |
Tukey | Spanned means | Yes | Equal |
Hochberg | Spanned means | Yes | |
Gabriel | Spanned means | Yes | |
Scheffé | Pairs of means | Yes | Unequal |
Some of the tests can be ranked by how conservative they are:
- Duncan > Scheffé > Tukey > Newman-Keuls > least significant difference (from conservative to not conservative).
Requirements and notation
It is assumed that, when the means of $k$ groups were compared at significance level $\alpha$, the alternative hypothesis was accepted. That is, at least two group means differ. The hypotheses for all of the following tests are
- for the pairwise tests: $H_0\colon \mu_i = \mu_j$ vs. $H_1\colon \mu_i \neq \mu_j$, and
- for the spanned ordered means: $H_0\colon \mu_{(i)} = \dots = \mu_{(j)}$ vs. $H_1$: at least two of these means differ.
Furthermore, let $n_i$ be the number of observations in group $i$ and $n = \sum_{i=1}^{k} n_i$ the total number of observations. The tests are divided into tests for the balanced case ($n_1 = \dots = n_k$) and tests for the unbalanced case (the sample sizes in the groups may differ).
Tests for the unbalanced case
Least significant difference test
In the least significant difference test (LSD test for short), the test statistic is
- $T = \dfrac{\bar{X}_i - \bar{X}_j}{\sqrt{S^2 \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}} \sim t_{n-k}$
with
- $S^2 = \dfrac{1}{n-k} \sum_{l=1}^{k} (n_l - 1)\, S_l^2$
and $S_l^2$ the sample variance of group $l$.
The least significant difference test is based on the two-sample t-test , but the variance is calculated using all groups.
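A minimal sketch of this idea, assuming the statistic is the difference of two group means divided by a standard error built from the variance pooled over all groups. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(groups):
    """Variance pooled over all groups (not only the two being compared)."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    return sum((len(g) - 1) * variance(g) for g in groups) / (n_total - k)


def lsd_statistic(groups, i, j):
    """t-type statistic for comparing group i with group j."""
    s2 = pooled_variance(groups)
    gi, gj = groups[i], groups[j]
    return (mean(gi) - mean(gj)) / sqrt(s2 * (1 / len(gi) + 1 / len(gj)))


# Made-up data for three groups
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
t_01 = lsd_statistic(groups, 0, 1)
t_02 = lsd_statistic(groups, 0, 2)
```

Under these assumptions the statistic would be compared with a quantile of the t distribution with n - k degrees of freedom (here 9 - 3 = 6).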
Bonferroni test for the least significant difference
In the Bonferroni test for the least significant difference, the test statistic is identical to that of the least significant difference test. However, the significance level is corrected using the Bonferroni method. If the analysis of variance is carried out at significance level $\alpha$, the corrected significance level $\alpha^*$ is used for the pairwise comparisons of means:
- $\alpha^* = \dfrac{\alpha}{k(k-1)/2}$.
The critical values for the corrected significance level can be found in special tables or determined using the approximation
- $t_{n-k;\,1-\alpha^*/2} \approx z_{1-\alpha^*/2} + \dfrac{z_{1-\alpha^*/2}^3 + z_{1-\alpha^*/2}}{4\,(n-k)}$,
where $z_{1-\alpha^*/2}$ is the $(1-\alpha^*/2)$ quantile of the standard normal distribution.
The test should only be used if $k$ is not too large; otherwise the corrected significance level becomes too small and the non-rejection regions of the t-tests overlap. For example, if $k = 5$ and $\alpha = 5\%$, then $\alpha^* = 0.5\%$.
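The correction can be sketched as follows. The sketch assumes the usual Bonferroni correction over all k(k-1)/2 pairwise comparisons and approximates the t quantile by the common expansion z + (z³ + z)/(4·df); both function names are illustrative.

```python
from statistics import NormalDist


def bonferroni_level(alpha, k):
    """Corrected level when all k*(k-1)/2 pairs of means are compared."""
    n_pairs = k * (k - 1) // 2
    return alpha / n_pairs


def approx_t_quantile(p, df):
    """Approximate p-quantile of the t distribution from the normal quantile."""
    z = NormalDist().inv_cdf(p)
    return z + (z ** 3 + z) / (4 * df)


alpha_star = bonferroni_level(0.05, 5)  # 10 comparisons, so 0.05 / 10
critical = approx_t_quantile(1 - alpha_star / 2, df=100)
```

With 16 groups there would already be 120 comparisons, which shows how quickly the corrected level shrinks as k grows.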
Scheffé test
Strictly speaking, the Scheffé test requires homogeneity of variance across the groups, but it is robust to violations of this requirement.
Simple Scheffé test
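The simple Scheffé test checks $H_0\colon \mu_i = \mu_j$ vs. $H_1\colon \mu_i \neq \mu_j$ using the test statistic
- $F = \dfrac{(\bar{X}_i - \bar{X}_j)^2}{(k-1)\, S^2 \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} \sim F_{k-1,\,n-k}$,
where $S^2$ is the variance pooled over all groups.
The simple Scheffé test is the special case of the general Scheffé test in which the linear contrast involves only two means.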
Linear contrast
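A linear contrast of one or more means is defined as
- $\Lambda = \sum_{i=1}^{k} c_i \mu_i$ with $\sum_{i=1}^{k} c_i = 0$.
For the simple Scheffé test, the linear contrast is
- $\Lambda = \mu_i - \mu_j$.
Two contrasts $\Lambda_1 = \sum_{i} c_i \mu_i$ and $\Lambda_2 = \sum_{i} d_i \mu_i$ are called orthogonal if
- $\sum_{i=1}^{k} \dfrac{c_i d_i}{n_i} = 0$, which reduces to $\sum_{i=1}^{k} c_i d_i = 0$ for equal sample sizes.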
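The two conditions can be checked numerically. The sketch below assumes that a contrast is a coefficient vector summing to zero and that orthogonality is weighted by the group sizes (with plain products as the equal-size special case). The helper names are hypothetical.

```python
def is_contrast(c, tol=1e-12):
    """A coefficient vector defines a contrast if its entries sum to zero."""
    return abs(sum(c)) <= tol


def are_orthogonal(c, d, sizes=None, tol=1e-12):
    """Check sum(c_i * d_i / n_i) == 0; equal sizes are assumed if none given."""
    if sizes is None:
        sizes = [1] * len(c)
    return abs(sum(ci * di / ni for ci, di, ni in zip(c, d, sizes))) <= tol


c = [1, -1, 0]   # compares group 1 with group 2
d = [1, 1, -2]   # compares groups 1 and 2 together with group 3
balanced = are_orthogonal(c, d)                        # equal sample sizes
unbalanced = are_orthogonal(c, d, sizes=[10, 20, 30])  # unequal sample sizes
```

In this example the same pair of contrasts is orthogonal for equal sample sizes but not for the unequal sizes chosen here.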
General Scheffé test
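For the general Scheffé test, the hypotheses are $H_0\colon \Lambda = 0$ for all (orthogonal) contrasts vs. $H_1\colon \Lambda \neq 0$ for at least one contrast. With the estimated contrast $\hat{\Lambda} = \sum_{i=1}^{k} c_i \bar{X}_i$, the test statistic is
- $F = \dfrac{\hat{\Lambda}^2}{(k-1)\, S^2 \sum_{i=1}^{k} c_i^2 / n_i} \sim F_{k-1,\,n-k}$.
The idea is based on the variance decomposition of the estimated contrast
- $\operatorname{Var}(\hat{\Lambda}) = \sum_{i=1}^{k} c_i^2 \operatorname{Var}(\bar{X}_i) = \sigma^2 \sum_{i=1}^{k} \dfrac{c_i^2}{n_i}$,
since under the null hypothesis $E(\hat{\Lambda}) = 0$, so that $\hat{\Lambda}^2$ relative to its estimated variance measures the departure from $H_0$.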
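A small sketch of the general statistic from summary data, assuming it is the squared estimated contrast divided by (k - 1) times its estimated variance, with the variance pooled over all groups. The function name and all numbers are made up.

```python
def scheffe_statistic(means, sizes, pooled_var, c):
    """Scheffé-type F statistic for the contrast with coefficients c."""
    if abs(sum(c)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    k = len(means)
    # Estimated contrast: weighted sum of the group means
    estimate = sum(ci * m for ci, m in zip(c, means))
    # Variance factor sum(c_i^2 / n_i) from the variance decomposition
    var_factor = sum(ci ** 2 / ni for ci, ni in zip(c, sizes))
    return estimate ** 2 / ((k - 1) * pooled_var * var_factor)


# Made-up summary data for three groups of size 3
means = [2.0, 3.0, 6.0]
sizes = [3, 3, 3]
f_pair = scheffe_statistic(means, sizes, pooled_var=1.0, c=[1, 0, -1])
f_mixed = scheffe_statistic(means, sizes, pooled_var=1.0, c=[0.5, 0.5, -1])
```

For a contrast of two means, as in `f_pair`, this reduces to the simple Scheffé test; the result would be compared with a quantile of the F distribution with k - 1 and n - k degrees of freedom.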
Tests for the balanced case
These tests are intended for the balanced case; that is, the sample size in each group equals $m$. SPSS also performs the tests if the sample sizes differ between groups, but $m$ is then calculated as the harmonic mean of the sample sizes.
The test statistic is the same for all of the following tests:
- $Q = \dfrac{\bar{X}_{(j)} - \bar{X}_{(i)}}{\sqrt{S^2 / m}}$ for ordered means with $i < j$.
The critical values $q_{\alpha;\,r,\,n-k}$ are available only in tabular form (mostly for $\alpha = 0.05$ or $\alpha = 0.01$). Here $r$ is the number of spanned means, so $r - 2$ further means lie between the means $\bar{x}_{(i)}$ and $\bar{x}_{(j)}$.
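The following sketch illustrates the common statistic under the assumption that it is the studentized range form, that is, the difference of two ordered means divided by the square root of the pooled variance over the common group size. The function name and numbers are hypothetical; the harmonic mean stands in for the common group size when the sizes differ.

```python
from math import sqrt
from statistics import harmonic_mean


def range_statistic(mean_high, mean_low, pooled_var, m):
    """Studentized range type statistic for two ordered group means."""
    return (mean_high - mean_low) / sqrt(pooled_var / m)


# Balanced case: every group has m = 4 observations (made-up numbers)
q_balanced = range_statistic(6.0, 2.0, pooled_var=1.0, m=4)

# Unbalanced case: replace m by the harmonic mean of the group sizes
m_harmonic = harmonic_mean([10, 20, 40])
q_unbalanced = range_statistic(6.0, 2.0, pooled_var=1.0, m=m_harmonic)
```

The harmonic mean is pulled toward the smallest group size, so the resulting statistic is smaller than it would be with the arithmetic mean of the sizes.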
Tukey test
For the Tukey test, the critical values are
- $q_{\alpha;\,k,\,n-k}$,
that is, there is no Bonferroni correction and the number of spanned means is not taken into account.
Student-Newman-Keuls test
In the Student-Newman-Keuls test, the critical values are
- $q_{\alpha;\,r,\,n-k}$,
that is, there is no Bonferroni correction and the number of spanned means is taken into account.
Duncan's test
For the Duncan test, the critical values are
- $q_{\alpha_r;\,r,\,n-k}$ with $\alpha_r = 1 - (1-\alpha)^{r-1}$,
that is, the significance level is corrected and the number of spanned means is taken into account.
When using the Duncan test, note that it only carries out group-wise comparisons, so unambiguous statements about significance are not always possible.
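The difference between the three tests lies only in which significance level and which number of means are used to look up the tabulated critical value. The sketch below makes that explicit. It assumes Duncan's usual level 1 - (1 - α)^(r-1); the function name is hypothetical, and the table lookup itself is not shown.

```python
def critical_value_parameters(method, alpha, r, k):
    """Return (significance level, number of means) for the table lookup.

    r is the number of means spanned by the comparison, k the number of groups.
    """
    if method == "tukey":
        return alpha, k   # always uses all k groups
    if method == "snk":
        return alpha, r   # uses the number of spanned means
    if method == "duncan":
        return 1 - (1 - alpha) ** (r - 1), r   # adjusted level, spanned means
    raise ValueError(f"unknown method: {method}")


# Comparison spanning r = 3 ordered means out of k = 5 groups
params = {m: critical_value_parameters(m, 0.05, r=3, k=5)
          for m in ("tukey", "snk", "duncan")}
```

For r = 2 all three levels coincide with α, and the Tukey and Student-Newman-Keuls lookups differ only when fewer than k means are spanned.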
Example
Federal state | Number of observations | Median | Mean | Std. dev. |
---|---|---|---|---|
Saxony | 1356 | 19.0 | 22.3 | 12.5 |
Brandenburg | 803 | 19.0 | 23.4 | 13.2 |
Mecklenburg-Western Pomerania | 491 | 20.0 | 22.1 | 10.3 |
Thuringia | 744 | 21.0 | 24.0 | 13.3 |
Berlin | 998 | 22.0 | 24.4 | 11.9 |
Baden-Württemberg | 3246 | 22.0 | 24.8 | 14.2 |
Bavaria | 3954 | 22.0 | 25.4 | 14.2 |
North Rhine-Westphalia | 5266 | 23.0 | 25.8 | 13.8 |
Hesse | 1904 | 23.0 | 26.3 | 14.3 |
Saxony-Anhalt | 801 | 23.0 | 26.6 | 14.3 |
Rhineland-Palatinate | 1276 | 24.0 | 26.1 | 13.5 |
Lower Saxony | 2374 | 24.0 | 27.9 | 15.7 |
Hamburg | 528 | 24.5 | 29.3 | 18.9 |
Schleswig-Holstein | 890 | 25.0 | 27.9 | 14.8 |
Saarland | 312 | 26.0 | 26.7 | 11.9 |
Bremen | 194 | 27.0 | 29.2 | 15.8 |
Germany | 9527 | 22.0 | 25.5 | 14.0 |
For the rent burden ratio (Mietbelastungsquote, the ratio of gross rent to household income), taken from the CAMPUS Files of the 2002 microcensus of the Federal Statistical Office, both the nonparametric median test and the parametric one-way analysis of variance (one-way ANOVA) show highly significant differences in the medians and means of the federal states, respectively. In other words, the federal states differ in their mean rental expenditure relative to income.
Since the Levene test rejects the null hypothesis of homogeneity of variance and the numbers of observations differ considerably between the groups, only the following test procedures remain for locating the differences:
- least significant difference
- Bonferroni test for the least significant difference
- Scheffé
Because SPSS outputs both pairwise comparisons and homogeneous subgroups for the Scheffé test, its results are examined here.
Pairwise comparisons
The pairwise comparisons provide information about significant differences between the means of the individual groups. In the present example, the following are output for each pair of federal states:
- the difference of the means $\bar{x}_i - \bar{x}_j$,
- the standard error,
- the p-value (column: Significance), which leads to rejection of the equality of the means if it falls below the specified significance level, and
- a 95% confidence interval for the difference of the means. If the confidence interval does not contain zero, the null hypothesis is rejected at the 5% significance level.
At a significance level of 5%, only the difference between the means of Schleswig-Holstein and Saxony is significant (p-value of 2.1%); none of the other comparisons involving Schleswig-Holstein is significant.
Group comparisons
Group-wise comparisons allow detailed statements about the homogeneity of the group means. However, they allow only limited statements about significant differences between the groups.
In the present example, an iterative procedure is used to find homogeneous subgroups, that is, groups for which the null hypothesis of equal means is not rejected. For this purpose, the observed means are sorted by size and a series of tests is carried out.
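Number of spanned means | Tested null hypotheses |
---|---|
16 | $\mu_{(1)} = \dots = \mu_{(16)}$ |
15 | $\mu_{(1)} = \dots = \mu_{(15)}$; $\mu_{(2)} = \dots = \mu_{(16)}$ |
14 | $\mu_{(1)} = \dots = \mu_{(14)}$; $\mu_{(2)} = \dots = \mu_{(15)}$; $\mu_{(3)} = \dots = \mu_{(16)}$ |
13 | $\mu_{(1)} = \dots = \mu_{(13)}$; $\mu_{(2)} = \dots = \mu_{(14)}$; $\mu_{(3)} = \dots = \mu_{(15)}$; $\mu_{(4)} = \dots = \mu_{(16)}$ |
... | In general, further tests are carried out with fewer and fewer groups |
Legend (cell colours in the original table): green = not rejected; yellow = contained in a previously non-rejected hypothesis; red = rejected.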
In the first step, the null hypothesis $H_0\colon \mu_{(1)} = \dots = \mu_{(16)}$ is tested and rejected; we already know that the means differ. Then
- the state with the largest mean is removed and the null hypothesis $H_0\colon \mu_{(1)} = \dots = \mu_{(15)}$ is tested, and
- the state with the smallest mean is removed and the null hypothesis $H_0\colon \mu_{(2)} = \dots = \mu_{(16)}$ is tested.
Both tests involve groups of only 15 federal states. If the null hypothesis is rejected in one of these tests (red in the table), the state with the largest mean and the state with the smallest mean are again removed from that group and the tests are repeated. This builds up a sequence of null hypotheses to be tested with an ever decreasing number of means.
The procedure stops if
- either the null hypothesis cannot be rejected in one of the tests (green in the table) or
- the considered null hypothesis is already part of a null hypothesis that has not been rejected (yellow in the table) or
- only one state is left.
The "green" subgroups are output by SPSS.
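The search described above can be sketched as follows. The sketch is generic: the actual significance test is passed in as a function, and the rule used in the usage example is an arbitrary toy threshold on the range of the means, not the Scheffé test or any SPSS procedure. The means are the rounded state means from the table; all function names are hypothetical.

```python
def homogeneous_subsets(sorted_means, is_rejected):
    """Return index ranges (lo, hi), inclusive, of homogeneous subgroups.

    is_rejected(values) is a caller-supplied test of equal means.
    """
    accepted = []                          # "green" ranges
    current = [(0, len(sorted_means) - 1)]
    while current:
        next_level = []
        for lo, hi in current:
            if lo == hi:
                continue                   # only one group left
            if any(a <= lo and hi <= b for a, b in accepted):
                continue                   # "yellow": inside an accepted range
            if is_rejected(sorted_means[lo:hi + 1]):
                # "red": drop the largest mean, or the smallest mean
                for candidate in ((lo, hi - 1), (lo + 1, hi)):
                    if candidate not in next_level:
                        next_level.append(candidate)
            else:
                accepted.append((lo, hi))
        current = next_level
    return accepted


# Rounded state means from the table, sorted in ascending order
means = [22.1, 22.3, 23.4, 24.0, 24.4, 24.8, 25.4, 25.8,
         26.1, 26.3, 26.6, 26.7, 27.9, 27.9, 29.2, 29.3]


def toy_rule(values):
    """Toy stand-in for a real test: reject if the range exceeds 6.0."""
    return max(values) - min(values) > 6.0


subsets = homogeneous_subsets(means, toy_rule)
```

With this toy rule the search ends with two overlapping subgroups of 14 means each, one without the two largest means and one without the two smallest, which mirrors the structure of the SPSS result described here. A real analysis would replace `toy_rule` with the chosen post-hoc test.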
For the example there are two homogeneous subgroups with 14 federal states each. In other words, the null hypothesis of equality of the means could not be rejected here. Bremen and Hamburg are excluded from homogeneous subgroup 1, and Saxony and Mecklenburg-Western Pomerania are excluded from homogeneous subgroup 2. Statements about which mean values of which federal states are significantly different cannot be made in this case.
References
- Ajit C. Tamhane: Multiple comparisons in model I one-way ANOVA with unequal variances. In: Communications in Statistics - Theory and Methods. Vol. 6, no. 1, 1977, pp. 15-32, doi:10.1080/03610927708827466.
- Werner Timischl: Applied Statistics. An introduction for biologists and medical professionals. 3rd edition, 2013, p. 373.
Literature
- Bernd Rönz: Lecture notes: Computational Statistics I. Humboldt University of Berlin, Chair of Statistics, Berlin 2001.