The scree test, also known as the elbow criterion, is a graphical method for determining the optimal number of factors in factor analysis. The criterion was developed in the 1960s by the American psychologist Raymond Bernard Cattell and is still used today due to its simplicity.
Background to factor selection
In factor analysis, only those factors should be extracted that explain a substantial part of the variance and therefore have a high eigenvalue. This holds for the first factor and, as a rule, for a few further factors, although the eigenvalues usually decrease sharply. Beyond a certain factor, the additional variance explained by each further factor remains at a low level.
The selection of factors primarily serves to obtain meaningful, easily interpretable results and can therefore be made objective only to a limited extent.
The basic assumption is that only those factors are significant that represent a stronger correlation than would arise among random numbers. The scree test exploits the fact that, in contrast to the eigenvalues of correlated data, the eigenvalues of random numbers are typically approximately constant.
To apply the scree test, the eigenvalues of the candidate factors are sorted in descending order and plotted in a so-called eigenvalue diagram, or scree plot.
The eigenvalues of correlated data initially drop steeply, after which a kink ("elbow") is typically evident. The values to the right of the kink level off at a low value; they are considered insignificant because they are roughly at (or even below) the level of random correlations.
The eigenvalues to the left of the kink, on the other hand, are considered significant, and the corresponding factors are extracted in the factor analysis. If there are several kinks, the more pronounced one, or the one further to the right, is used. If there is no kink, the elbow criterion does not help.
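The procedure described above can be sketched in Python with NumPy. This is only an illustration, not Cattell's method itself, which is a visual judgment. The function names `scree_eigenvalues` and `elbow_factor_count` are hypothetical, and locating the kink at the largest second difference of the sorted eigenvalues is an assumed heuristic standing in for reading the plot by eye.

```python
import numpy as np


def scree_eigenvalues(data):
    """Return the eigenvalues of the correlation matrix, sorted descending.

    `data` is an (observations x variables) array.
    """
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh is appropriate because a correlation matrix is symmetric.
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def elbow_factor_count(eigenvalues):
    """Count the eigenvalues to the left of the sharpest kink.

    Assumed heuristic: the kink is the interior point with the largest
    second difference (the strongest bend from steep to flat).
    """
    eig = np.asarray(eigenvalues, dtype=float)
    if eig.size < 3:
        raise ValueError("need at least three eigenvalues to locate a kink")
    # Second difference at each interior point of the scree curve.
    curvature = eig[:-2] - 2.0 * eig[1:-1] + eig[2:]
    kink_index = int(np.argmax(curvature)) + 1  # 0-based position of the kink
    # Everything strictly to the left of the kink is retained.
    return kink_index


# Example: two large eigenvalues followed by a flat tail.
print(elbow_factor_count([4.0, 2.5, 0.5, 0.4, 0.3, 0.3]))
```

In the example the kink falls on the third eigenvalue, so two factors lie to its left. With several kinks or a smooth decline, such a heuristic is no more reliable than the visual test.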
Criticism and further developments
Criticism of objectivity
The scree test, first published in Cattell (1966), is often criticized for its poor objectivity. If no clear kink can be identified, there is room for interpretation.
The modification presented by J. L. Horn (1965), often referred to as parallel analysis, superimposes a second eigenvalue diagram, computed from random data, on the eigenvalue diagram of the correlated data. Only those observed eigenvalues that are higher than the corresponding random eigenvalues are considered significant. Despite the strong similarity, the two variants often produce different results. Although Horn's modification can be applied objectively, it has never displaced Cattell's scree test.
The second set of eigenvalues is calculated assuming that
- the variables are uncorrelated, i.e. the correlation or covariance matrix is a diagonal matrix, and
- the data follow a multivariate normal distribution.
On this basis, B random data sets with the same number of variables and observations as the data set under consideration are generated, and the eigenvalues of the associated empirical correlation or covariance matrix are calculated for each of them. The B largest eigenvalues approximate the distribution of the largest eigenvalue, the B second-largest eigenvalues approximate the distribution of the second-largest eigenvalue, and so on. Then, for example, the 95% quantile of the B largest eigenvalues is taken as the limit for the largest eigenvalue. If the largest eigenvalue of the data is greater than this limit, this eigenvalue is significant. The second figure on the right shows the Horn criterion for the Boston Housing data set as the falling gray line.
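The simulation just described can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `parallel_analysis` and its parameters `n_sims` (the B of the text), `quantile` and `seed` are hypothetical, the correlation matrix is used rather than the covariance matrix, and the count stops at the first eigenvalue that fails to exceed its limit.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, quantile=0.95, seed=0):
    """Compare observed eigenvalues with quantiles of random eigenvalues.

    Returns (observed eigenvalues, per-rank limits, number of factors).
    """
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    rng = np.random.default_rng(seed)

    def sorted_eigenvalues(x):
        corr = np.corrcoef(x, rowvar=False)
        return np.sort(np.linalg.eigvalsh(corr))[::-1]

    observed = sorted_eigenvalues(data)

    # B data sets of uncorrelated, normally distributed variables with
    # the same shape as the observed data.
    simulated = np.empty((n_sims, n_vars))
    for b in range(n_sims):
        noise = rng.standard_normal((n_obs, n_vars))
        simulated[b] = sorted_eigenvalues(noise)

    # Column k holds the B simulated k-th largest eigenvalues; its
    # quantile is the limit for the k-th observed eigenvalue.
    limits = np.quantile(simulated, quantile, axis=0)

    significant = observed > limits
    # Count leading eigenvalues above their limit (stop at first failure).
    if significant.all():
        n_factors = n_vars
    else:
        n_factors = int(np.argmin(significant))
    return observed, limits, n_factors
```

Plotting `observed` and `limits` against the eigenvalue rank gives the two superimposed eigenvalue diagrams. The result depends on the number of simulations and on the chosen quantile.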
Standard Error Scree
Among the further developments and improvements made in the following decades is the standard error scree presented by Zoski and Jurs (1996).
Criticism of the basic assumption
The basic assumption that eigenvalues below the level of random eigenvalues are meaningless has also been questioned by some scientists. In their view, equating such eigenvalues with random or error results simply because of their size is inadmissible.
The more rigid Kaiser-Guttman criterion is a possible alternative, but it sometimes leads to solutions that are difficult to interpret.
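For comparison, the Kaiser-Guttman criterion in its usual formulation (not spelled out in this article) retains every factor whose eigenvalue of the correlation matrix exceeds 1. The following is a short sketch; the function name `kaiser_guttman_count` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def kaiser_guttman_count(eigenvalues, threshold=1.0):
    """Count the eigenvalues strictly greater than the threshold.

    Assumes the eigenvalues come from a correlation matrix, so that an
    eigenvalue of 1 corresponds to the variance of one original variable.
    """
    eig = np.asarray(eigenvalues, dtype=float)
    return int(np.sum(eig > threshold))


# Example: two of the four eigenvalues exceed 1.
print(kaiser_guttman_count([2.6, 1.2, 0.9, 0.3]))
```

The fixed cut-off makes the rule objective, but an eigenvalue of 1.01 is retained while one of 0.99 is dropped, which is one reason the resulting solutions can be hard to interpret.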
In principle, several criteria should be used. Particularly in cases of doubt, it is advisable to compute solutions with several different numbers of factors and to compare them with regard to loadings and interpretability.
- Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276, doi:10.1207/s15327906mbr0102_10.
- Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179-185, doi:10.1007/BF02289447.
- Zoski, K. W. & Jurs, S. G. (1996). An objective counterpart to the visual scree test for factor analysis: The standard error scree. Educational and Psychological Measurement, 56, 443-451, doi:10.1177/0013164496056003006.
- Bortz, J. & Schuster, C. (2010). Factor analysis. In: Statistics for human and social scientists. 7th edition (pp. 385-433). Berlin and Heidelberg: Springer, ISBN 978-3-642-12769-4, doi:10.1007/978-3-642-12770-0_23.