Quantile-quantile diagram
A quantile-quantile diagram, QQ diagram for short (also quantile-quantile plot or QQ plot), is an exploratory graphical tool in which the quantiles of two statistical variables are plotted against each other in order to compare their distributions.
A PP diagram, or probability-probability plot, is an exploratory graphical tool in which the distribution functions of two statistical variables are plotted against each other in order to compare their distributions.
QQ diagram
Comparison of the distribution of two statistical features
The observed values of the two features whose distributions are to be compared are each sorted by size. The ordered values are combined into pairs and plotted in a coordinate system. If the points form (approximately) a straight line, one can assume that both features follow the same type of distribution. The procedure becomes problematic when the two features have different numbers of observations; in that case interpolation can be used as a remedy.
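The pairing step, including interpolation for samples of unequal size, can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names `quantile` and `qq_pairs`, the probability levels `(i + 0.5) / m` and the linear interpolation rule are all assumptions chosen for the example.

```python
def quantile(sorted_values, p):
    """Linearly interpolated quantile of already sorted data, 0 <= p <= 1."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def qq_pairs(sample_a, sample_b):
    """Return (quantile of a, quantile of b) pairs for a two-sample QQ plot."""
    a, b = sorted(sample_a), sorted(sample_b)
    m = min(len(a), len(b))
    # Assumed choice: one probability level per observation of the smaller sample.
    probabilities = [(i + 0.5) / m for i in range(m)]
    # Both samples are evaluated at the same levels, so unequal sizes are no obstacle.
    return [(quantile(a, p), quantile(b, p)) for p in probabilities]


# Hypothetical data: 4 observations of one feature, 7 of the other.
pairs = qq_pairs([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
```

Because both made-up samples are evenly spaced, the resulting pairs fall on a straight line, which is the pattern the diagram is meant to reveal.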
As an example, consider about 110 warships at the outbreak of the Second World War, for which the variables length and width were recorded. The scatter plot shows two clearly distinct groups that stand out as clusters. The data for the quantile-quantile diagram were standardized to make the two variables comparable. The split of the data into two clusters is visible as a gap in the sequence of points. For the lower-left cluster, the type of distribution appears to be the same for both variables. In the second cluster at the top right, the width tends to be larger than in the first cluster. The "bulge" in the plot shows that the distributions of length and width are not equal.
Checking the distribution of a feature
The observed values of a feature are sorted by size. They are compared with the quantiles of the theoretical distribution that belong to the corresponding cumulative probabilities. If the values come from the comparison distribution, the empirical and theoretical quantiles approximately agree, i.e. the points lie on a diagonal.
Large systematic deviations from this diagonal indicate that the theoretical and empirical distributions differ. However, the quantile-quantile diagram cannot replace a distribution test.
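A check against a normal distribution might look like the sketch below. Several choices here are assumptions made for illustration only: the function name `normal_qq_points`, the use of the sample mean and standard deviation as parameters of the comparison distribution, and the cumulative proportion `(i - 0.5) / n` for the i-th smallest value. The sample values are invented.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(observations):
    """Pair each sorted observation with the normal quantile at the same level."""
    x = sorted(observations)
    n = len(x)
    # Assumed comparison distribution: normal with the sample's mean and spread.
    reference = NormalDist(mean(x), stdev(x))
    # Theoretical quantile for the cumulative proportion of each ordered value.
    z = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(x, z))


# Hypothetical measurements; each point is (observed value, theoretical quantile).
points = normal_qq_points([4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5])
```

If the points lie close to the diagonal, the data are compatible with the comparison distribution; the plot itself gives no significance level.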
Formal definition
For each of the $n$ observations $x_i$, an empirical cumulative proportion $p_i$ is determined. With the help of the inverse distribution function (or quantile function) $F^{-1}$ of the theoretical distribution, the quantile
$$z_i = F^{-1}(p_i)$$
is calculated. The plot then shows $x_i$ versus $z_i$.
The cumulative proportion $p_i$ is calculated from the rank $r_i$ of the observation:
Method | Formula for $p_i$
---|---
Blom | $(r_i - 3/8) / (n + 1/4)$
Rankit | $(r_i - 1/2) / n$
Tukey | $(r_i - 1/3) / (n + 1/3)$
Van der Waerden | $r_i / (n + 1)$
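The four rank-based estimators named in the table are commonly written in one shared form, $p = (r - a) / (n + 1 - 2a)$, with a method-specific offset $a$. The sketch below uses that form; the names `OFFSETS` and `cumulative_proportion` are hypothetical, and exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction

# Offset a in the shared form p = (r - a) / (n + 1 - 2a).
OFFSETS = {
    "Blom": Fraction(3, 8),
    "Rankit": Fraction(1, 2),
    "Tukey": Fraction(1, 3),
    "Van der Waerden": Fraction(0),
}


def cumulative_proportion(rank, n, method="Blom"):
    """Empirical cumulative proportion for an observation with the given rank."""
    a = OFFSETS[method]
    return (rank - a) / (n + 1 - 2 * a)


# Proportions for the smallest of ten observations under each method.
smallest_of_ten = {name: cumulative_proportion(1, 10, name) for name in OFFSETS}
```

All four variants keep the proportions strictly between 0 and 1, so the quantile function can be evaluated even for the smallest and largest observation.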
Trend-adjusted QQ diagram
In the trend-adjusted (detrended) quantile-quantile diagram, the points $(x_i, x_i - z_i)$ are plotted instead of $(x_i, z_i)$, where $x_i$ is the $i$-th sorted observation and $z_i$ the corresponding theoretical quantile. If the empirical and theoretical distributions match, all points lie on the zero line. Any deviations from it stem only from the differences between the theoretical and empirical distributions. In the ordinary quantile-quantile plot the points always run from bottom left to top right, i.e. deviations between the theoretical and empirical distributions are shown relative to the full range of values of both distributions. The trend-adjusted QQ diagram therefore gives a better view of the structure of the deviations than the QQ diagram.
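A trend-adjusted diagram against the standard normal distribution could be computed as in the sketch below. It assumes that the observations are standardized first so that they are on the same scale as the theoretical quantiles, and it uses the Blom proportion `(r - 3/8) / (n + 1/4)`; both choices, the function name `detrended_qq_points` and the sample values are illustrative assumptions.

```python
from statistics import NormalDist, mean, stdev


def detrended_qq_points(observations):
    """Return (standardized value, deviation from the normal quantile) pairs."""
    x = sorted(observations)
    n = len(x)
    center, spread = mean(x), stdev(x)
    # Standardize so that observed values and normal quantiles are comparable.
    standardized = [(value - center) / spread for value in x]
    standard_normal = NormalDist()
    # Theoretical quantiles at the Blom cumulative proportions.
    z = [standard_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in range(1, n + 1)]
    # Vertical axis: deviation instead of the quantile itself.
    return [(xi, xi - zi) for xi, zi in zip(standardized, z)]


# Hypothetical measurements.
detrended = detrended_qq_points([4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5])
```

Plotting the second coordinate against the first shows the deviations around zero without the dominant upward trend of the ordinary diagram.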
PP diagram
Checking the distribution of a feature
For the observed values $x_i$, the empirical cumulative proportions $p_i$ are calculated according to Blom or one of the other rank-based formulas. For the comparison distribution, the observed values are inserted into the theoretical cumulative distribution function $F$, which yields the theoretical cumulative proportion $F(x_i)$. If the values come from the comparison distribution, $p_i$ and $F(x_i)$ approximately agree, i.e. the points lie on a diagonal.
In contrast to the QQ diagram, the tails of the distribution have less visual impact in the PP diagram. However, the probability-probability plot cannot replace a distribution test.
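The construction of the PP points can be sketched in a few lines. As before, this is only an illustration under stated assumptions: the comparison distribution is a normal distribution fitted with the sample mean and standard deviation, the empirical proportions use the Blom formula, and the function name `pp_points` and the data are invented.

```python
from statistics import NormalDist, mean, stdev


def pp_points(observations):
    """Return (empirical proportion, theoretical proportion) pairs."""
    x = sorted(observations)
    n = len(x)
    # Assumed comparison distribution: normal with the sample's mean and spread.
    reference = NormalDist(mean(x), stdev(x))
    # Empirical cumulative proportions from the ranks (Blom).
    empirical = [(r - 0.375) / (n + 0.25) for r in range(1, n + 1)]
    # Theoretical proportions: observed values inserted into the CDF.
    theoretical = [reference.cdf(value) for value in x]
    return list(zip(empirical, theoretical))


# Hypothetical measurements.
pp = pp_points([4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.5])
```

Both coordinates are confined to the interval from 0 to 1, which is why extreme observations in the tails are compressed compared with the QQ diagram.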
Trend-adjusted PP diagram
In the trend-adjusted probability-probability plot, the points $(p_i, p_i - F(x_i))$ are plotted instead of $(p_i, F(x_i))$, where $p_i$ is the empirical cumulative proportion and $F(x_i)$ the theoretical one. If the empirical and theoretical distributions match, all points lie on the zero line. As with the trend-adjusted QQ diagram, this plot gives a better overview of the deviations.
Application examples
- Comparison of an empirical frequency distribution with a theoretical or hypothetical distribution:
  - Graphical inspection of regression residuals for normality
  - Visual check of distributional assumptions before carrying out a parametric test