Point-biserial correlation

from Wikipedia, the free encyclopedia

The point-biserial correlation is the correlation coefficient of the relationship between an interval-scaled variable $X$ and a dichotomous (Bernoulli-distributed) variable $D$. It is not an independent measure but a special case of the usual Pearson correlation coefficient, which in this case can be calculated as

$$r_{pb} = \frac{\bar{x}_1 - \bar{x}_0}{\sqrt{SS_X}} \sqrt{n \, p \, q},$$

where $SS_X = \sum_{i=1}^{n} (x_i - \bar{x})^2$ denotes the sum of squares of $X$, $n$ the sample size, $p$ the proportion of observation units with the property recorded in $D$, and $q = 1 - p$ the proportion of observation units without the property recorded in $D$. $\bar{x}_1$ and $\bar{x}_0$ are the means of $X$ in these two groups.
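The following is a minimal sketch in Python, assuming the closed form $r_{pb} = (\bar{x}_1 - \bar{x}_0)\sqrt{n p q / SS_X}$ with $D$ coded as 0 and 1. It compares that value with a plain Pearson correlation computed on the same data. The function names `pearson_r` and `point_biserial` and the sample data are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Plain Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def point_biserial(x, d):
    """Closed-form point-biserial correlation; d must be coded 0/1."""
    n = len(x)
    group1 = [a for a, flag in zip(x, d) if flag == 1]
    group0 = [a for a, flag in zip(x, d) if flag == 0]
    p = len(group1) / n          # proportion with the property
    q = len(group0) / n          # proportion without the property
    mean_x = sum(x) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)   # sum of squares of X
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) * math.sqrt(n * p * q / ss_x)


# Made-up illustrative data, not taken from any study.
x = [2.0, 3.5, 1.0, 4.0, 5.5, 6.0, 4.5, 7.0]
d = [0, 0, 0, 0, 1, 1, 1, 1]

r_closed = point_biserial(x, d)
r_pearson = pearson_r(x, d)
print(r_closed, r_pearson)  # the two values should agree up to rounding
```

Both functions assume that each group contains at least one unit and that $X$ is not constant; otherwise a division by zero occurs.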

Derivation from the Pearson correlation

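For simplicity, assume that the dichotomous variable $D$ takes the values 0 and 1, so that the mean of $D$ equals $p$. The correlation between $X$ and $D$ over the $n$ observation units is calculated according to the general formula

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(d_i - p)}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (d_i - p)^2}}.$$

A case distinction can now be made: $np$ observation units have $D = 1$ and lie $q = 1 - p$ above the mean of $D$, while the other $nq$ observation units have $D = 0$ and lie $p$ below the mean of $D$. Writing $SS_X = \sum_{i=1}^{n} (x_i - \bar{x})^2$, it follows that

$$r = \frac{q \sum_{D=1} (x_i - \bar{x}) - p \sum_{D=0} (x_i - \bar{x})}{\sqrt{SS_X \, (np \, q^2 + nq \, p^2)}},$$

which, using

$$\sum_{D=1} (x_i - \bar{x}) = np \, (\bar{x}_1 - \bar{x}), \qquad \sum_{D=0} (x_i - \bar{x}) = nq \, (\bar{x}_0 - \bar{x}), \qquad np \, q^2 + nq \, p^2 = npq,$$

where $\bar{x}_1$ and $\bar{x}_0$ are the means of $X$ in the groups with $D = 1$ and $D = 0$, can be simplified to the above equation.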
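The individual steps of this derivation can be checked numerically. The sketch below assumes $D$ is coded as 0 and 1 and uses made-up data with unequal group sizes. It verifies that the cross-product sum splits into the two group sums weighted by $q$ and $-p$, that it equals $npq(\bar{x}_1 - \bar{x}_0)$, and that the sum of squares of $D$ equals $npq$. All variable names are hypothetical.

```python
import math

# Made-up illustrative data with unequal group sizes.
x = [1.2, 0.7, 2.9, 3.1, 2.2, 4.0, 3.6]
d = [0, 0, 1, 1, 0, 1, 1]

n = len(x)
p = sum(d) / n      # mean of D = proportion of units with D = 1
q = 1 - p
mean_x = sum(x) / n

# Numerator of the general Pearson formula.
cross = sum((xi - mean_x) * (di - p) for xi, di in zip(x, d))

# Case distinction: deviations of X summed separately within each group.
sum_dev1 = sum(xi - mean_x for xi, di in zip(x, d) if di == 1)
sum_dev0 = sum(xi - mean_x for xi, di in zip(x, d) if di == 0)
split = q * sum_dev1 - p * sum_dev0

# Group means and the simplified numerator n*p*q*(mean1 - mean0).
mean1 = sum(xi for xi, di in zip(x, d) if di == 1) / sum(d)
mean0 = sum(xi for xi, di in zip(x, d) if di == 0) / (n - sum(d))
simplified = n * p * q * (mean1 - mean0)

# Sum of squares of D, which should equal n*p*q.
ss_d = sum((di - p) ** 2 for di in d)

print(math.isclose(cross, split, abs_tol=1e-12))
print(math.isclose(cross, simplified, abs_tol=1e-12))
print(math.isclose(ss_d, n * p * q, abs_tol=1e-12))
```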

Use in common statistics software

SPSS and R automatically yield the point-biserial correlation when the commands CORRELATE (SPSS) or cor and cor.test (R) are called and one of the variables takes only two values (e.g. 0 and 1) that enter the calculation. Other codes, such as −7 or 99, can for example be marked as missing values in SPSS and are then ignored.

Literature

  • Jürgen Bortz: Statistik für Human- und Sozialwissenschaftler. 6th edition. Springer, Berlin et al. 2005, ISBN 3-540-21271-X.
  • J. Cohen, P. Cohen, S. G. West, L. S. Aiken: Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. London 2003, ISBN 0-8058-2223-2.