Point-biserial correlation
The point-biserial correlation is the correlation coefficient for the relationship between an interval-scaled variable $X$ and a dichotomous (Bernoulli-distributed) variable $D$. It is not an independent measure but a special case of the usual Pearson correlation coefficient, which in this case can be calculated as
$$ r_{pb} = (\bar{x}_1 - \bar{x}_0) \sqrt{\frac{n \, p \, q}{QS_X}} , $$
where $\bar{x}_1$ and $\bar{x}_0$ denote the means of $X$ among the units with and without the property recorded in $D$, $QS_X = \sum_{i=1}^{n} (x_i - \bar{x})^2$ the sum of squares of $X$, $n$ the sample size, $p$ the proportion of units of investigation with the property recorded in $D$, and $q = 1 - p$ the proportion of units without it.
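The claim that the point-biserial coefficient is just a Pearson correlation can be checked numerically. The following is a minimal sketch, assuming the formula $r_{pb} = (\bar{x}_1 - \bar{x}_0)\sqrt{n p q / QS_X}$ with $D$ coded as 0 and 1; the function names and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Ordinary Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def point_biserial_r(xs, ds):
    """Point-biserial formula; ds must contain only 0 and 1, both present."""
    n = len(xs)
    group1 = [x for x, d in zip(xs, ds) if d == 1]  # units with the property
    group0 = [x for x, d in zip(xs, ds) if d == 0]  # units without it
    p = len(group1) / n
    q = len(group0) / n
    mean_x = sum(xs) / n
    qs_x = sum((x - mean_x) ** 2 for x in xs)       # sum of squares of X
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) * math.sqrt(n * p * q / qs_x)

# Made-up illustration data: an interval-scaled score and a 0/1 indicator.
scores = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.5]
group = [0, 0, 1, 0, 1, 1, 0, 1]

r_formula = point_biserial_r(scores, group)
r_pearson = pearson_r(scores, group)
print(r_formula, r_pearson)  # the two values agree up to rounding error
```

Both functions receive the same data; the only difference is that the second one exploits the fact that one variable takes just two values.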
Derivation from the Pearson correlation
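For simplicity, assume that the dichotomous variable $D$ takes the values 0 and 1, so that the mean of $D$ equals $p$. The correlation between the interval-scaled variable $X$ and $D$ over the $n$ units of investigation is calculated according to the general formula
$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(d_i - p)}{\sqrt{QS_X \cdot QS_D}} , \qquad QS_X = \sum_{i=1}^{n} (x_i - \bar{x})^2 , \quad QS_D = \sum_{i=1}^{n} (d_i - p)^2 . $$
A case distinction can now be made: $n p$ units of investigation have $D = 1$ and lie $q = 1 - p$ above the mean of $D$; the other $n q$ units have $D = 0$ and lie $p$ below the mean of $D$. It follows that
$$ r = \frac{q \sum_{d_i = 1} (x_i - \bar{x}) - p \sum_{d_i = 0} (x_i - \bar{x})}{\sqrt{QS_X \, (n p q^2 + n q p^2)}} , $$
which, using $\sum_{d_i = 1} (x_i - \bar{x}) = n p (\bar{x}_1 - \bar{x})$, $\sum_{d_i = 0} (x_i - \bar{x}) = n q (\bar{x}_0 - \bar{x})$ and $p + q = 1$,
can be simplified to the equation above, $r = (\bar{x}_1 - \bar{x}_0) \sqrt{n p q / QS_X}$, where $\bar{x}_1$ and $\bar{x}_0$ are the means of $X$ in the groups with $D = 1$ and $D = 0$.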
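The individual steps of this derivation can be verified on a small example. The sketch below assumes $D$ is coded as 0 and 1 and checks two intermediate identities, namely that the sum of squares of $D$ equals $n p q$ and that the cross-product sum equals $n p q (\bar{x}_1 - \bar{x}_0)$; the data and variable names are invented for illustration.

```python
import math

# Made-up data: xs is interval-scaled, ds is the 0/1 indicator.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ds = [0, 1, 0, 1, 1]

n = len(xs)
n1 = sum(ds)          # number of units with D = 1
n0 = n - n1           # number of units with D = 0
p = n1 / n
q = n0 / n

mean_x = sum(xs) / n
mean1 = sum(x for x, d in zip(xs, ds) if d == 1) / n1
mean0 = sum(x for x, d in zip(xs, ds) if d == 0) / n0

qs_x = sum((x - mean_x) ** 2 for x in xs)
qs_d = sum((d - p) ** 2 for d in ds)
cross = sum((x - mean_x) * (d - p) for x, d in zip(xs, ds))

# Step 1: the sum of squares of D collapses to n * p * q.
print(qs_d, n * p * q)

# Step 2: the cross-product sum collapses to n * p * q * (mean1 - mean0).
print(cross, n * p * q * (mean1 - mean0))

# Step 3: general Pearson formula versus the simplified expression.
r_general = cross / math.sqrt(qs_x * qs_d)
r_short = (mean1 - mean0) * math.sqrt(n * p * q / qs_x)
print(r_general, r_short)
```

Each printed pair should agree up to floating-point rounding, which mirrors the algebraic simplification term by term.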
Use in common statistics software
SPSS and R automatically produce the point-biserial correlation when the commands CORRELATE (SPSS) or cor and cor.test (R) are invoked and one of the variables takes only two values (e.g., 0 and 1) that are treated as valid for the calculation. Other codes, such as −7 or 99, can be marked as missing values in SPSS and are then ignored.
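The handling of missing-value codes described here can be imitated outside SPSS and R. The following is a rough sketch in Python, not a reproduction of either program's behaviour: the codes −7 and 99 are taken from the example above, while the function names and the data are hypothetical. Pairs whose dichotomous value is a missing code are dropped, and an ordinary Pearson correlation is computed on the rest.

```python
import math

MISSING_CODES = {-7, 99}  # assumed missing-value codes, as in the example above

def pearson_r(xs, ys):
    """Ordinary Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def correlate_ignoring_missing(xs, ds, missing=MISSING_CODES):
    """Drop pairs whose d value is a missing code, then correlate the rest."""
    kept = [(x, d) for x, d in zip(xs, ds) if d not in missing]
    valid_values = {d for _, d in kept}
    if len(valid_values) != 2:
        raise ValueError("expected exactly two valid values in the dichotomous variable")
    kept_x = [x for x, _ in kept]
    kept_d = [d for _, d in kept]
    return pearson_r(kept_x, kept_d), len(kept)

# Hypothetical survey data: 99 and -7 stand for "no answer" and "not asked".
income = [12.0, 15.0, 9.0, 20.0, 18.0, 11.0, 14.0]
answer = [0, 1, 0, 1, 99, -7, 1]

r, n_valid = correlate_ignoring_missing(income, answer)
print(n_valid, r)  # five valid pairs remain; r is their point-biserial correlation
```

Because only two valid values remain after filtering, the Pearson coefficient returned here is the point-biserial correlation of the valid cases.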
Literature
- Jürgen Bortz: Statistik für Human- und Sozialwissenschaftler. 6th edition. Springer, Berlin et al. 2005, ISBN 3-540-21271-X.
- J. Cohen, P. Cohen, S. G. West, L. S. Aiken: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. London 2003, ISBN 0-8058-2223-2.