Principal component regression

from Wikipedia, the free encyclopedia

Principal component regression (PCR) is a regression analysis method that is based on principal component analysis (PCA).

Typically, a regression tries to explain a dependent variable in terms of a set of independent variables, for example by means of a linear regression model. PCR uses PCA in an intermediate step to estimate the regression coefficients.

PCR is useful, among other things, when the data matrix has a high degree of multicollinearity.
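The following is a minimal sketch of what a high degree of multicollinearity looks like numerically, not a definitive diagnostic. It assumes synthetic data in which one column is almost a copy of another; the variable names, the noise level and the use of the eigenvalue ratio as a measure are illustrative assumptions and do not come from the article.

```python
import numpy as np

# Synthetic data (an assumption for illustration): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Eigenvalues of the sample covariance matrix are the variances of the
# principal components; eigvalsh returns them in ascending order.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))

# A very large ratio of largest to smallest eigenvalue indicates that the
# columns are nearly linearly dependent.
cond = eigvals[-1] / eigvals[0]
print(eigvals, cond)
```

A principal component with almost no variance, as in this sketch, is the kind of direction that PCR can leave out when selecting components.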

General procedure

Principal component regression can be roughly divided into three steps:

  • Perform a PCA on the data matrix of the explanatory variables to extract the principal components. Usually only a subset of these is selected for further analysis, using a suitable selection criterion.
  • The observed values of the dependent variable are then regressed on the selected principal components using ordinary least squares. The result is a vector of estimated regression coefficients whose dimension equals the number of selected principal components.
  • In the last step, this vector is transformed back to the scale of the original variables. This is done using the PCA loadings (the eigenvectors corresponding to the selected principal components). The result is the final PCR estimator, whose dimension again equals the number of all independent variables.
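The three steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `pcr_fit`, the parameter `k` (the number of retained components, chosen here by hand instead of by a selection criterion), the centering of the data and the synthetic example are all assumptions made for this sketch.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Hypothetical helper: PCR with the first k principal components."""
    # Center the data so that the intercept can be handled separately.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Step 1: PCA via SVD. The rows of Vt are the eigenvectors of Xc.T @ Xc,
    # ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T           # shape (p, k): selected directions
    Z = Xc @ V_k             # shape (n, k): component scores

    # Step 2: ordinary least squares of y on the selected components.
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)   # shape (k,)

    # Step 3: transform back to the original variables.
    beta = V_k @ gamma       # shape (p,)
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Synthetic example (assumed data, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=100)

beta_pcr, b0 = pcr_fit(X, y, k=2)        # two components retained
beta_full, b0_full = pcr_fit(X, y, k=4)  # all components retained

# Ordinary least squares on the centered data, for comparison.
beta_ols, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)
```

When all components are retained (`k` equal to the number of independent variables), the back-transformed coefficients coincide with the ordinary least-squares estimate. With fewer components, the estimator still has one entry per original variable, but it is restricted to the subspace spanned by the selected eigenvectors.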
