Prediction matrix

from Wikipedia, the free encyclopedia

In statistics, the prediction matrix P is a symmetric and idempotent matrix and hence a projection matrix. The prediction matrix is sometimes called the hat matrix or roof matrix because it maps y onto ŷ ("y-hat"); accordingly, it is denoted by P or H. The term "prediction matrix" was coined by Hoaglin & Welsch (1978) and Chatterjee & Hadi (1986) and stems from the fact that applying the matrix to the observed values y generates the predicted values ŷ. Another matrix that is important in statistics is the residual matrix Q = I − P, which is defined via the prediction matrix and is also a projection matrix.

Definition

Given a typical multiple linear regression model y = Xβ + ε, with β the vector of unknown regression parameters, X the (n × p) design matrix, y the vector of the dependent variable, and ε the vector of disturbances, the prediction matrix is defined by

P = X(X^T X)^{-1} X^T = X X^+ ,

with X^+ = (X^T X)^{-1} X^T.

The matrix X^+ = (X^T X)^{-1} X^T is also called the Moore–Penrose inverse of X (here assuming X has full column rank).

The regression hyperplane estimated by the method of least squares is given by the sample regression function ŷ = Xb, where b = (X^T X)^{-1} X^T y is the least-squares estimator. The prediction matrix P is the matrix of the orthogonal projection onto the column space of X and has rank at most p (where p is the number of parameters of the regression model). If A is a matrix with A² = A, then rank(A) = tr(A); since P is a projection matrix, it follows that

rank(P) = tr(P) = tr(X(X^T X)^{-1} X^T) = tr((X^T X)^{-1} X^T X) = tr(I_p) = p.

The idempotence and symmetry properties (P² = P and P^T = P) imply that P is an orthogonal projector onto the column space of X. The direction of projection is determined by the matrix Q = I − P, whose columns are perpendicular to the column space of X. The matrix P is called the prediction matrix because the predicted values ŷ are obtained by multiplying the vector y from the left by this matrix. This can be shown by inserting the least-squares estimator b as follows:

ŷ = Xb = X(X^T X)^{-1} X^T y = Py.

The predicted values ŷ (the "hat" values) can thus be understood as a function of the observed values y. Numerous statistical results can also be expressed with the prediction matrix. For example, the residual vector can be represented by means of the prediction matrix as ε̂ = y − ŷ = (I − P)y = Qy. The (non-trivial) covariance matrix of the residual vector is Cov(ε̂) = σ²(I − P) and plays a role in the analysis of leverage values.
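As a concrete illustration, the following NumPy sketch builds P from a small design matrix and checks that ŷ = Py and ε̂ = Qy; the data set and all variable names are made up for this example:

```python
import numpy as np

# Hypothetical data: n = 5 observations, intercept plus one regressor (p = 2).
X = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])  # design matrix
y = np.array([1.0, 2.1, 2.9, 4.2, 4.8])                 # observed values

# Prediction matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

y_hat = P @ y        # predicted values: P "puts the hat on" y
Q = np.eye(5) - P    # residual matrix Q = I - P
residuals = Q @ y    # identical to y - y_hat

print(np.allclose(residuals, y - y_hat))  # True
```

Left-multiplying by P gives the same fitted values as solving the least-squares problem directly, which is exactly the identity ŷ = Xb = Py shown above.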

Properties

Idempotence

The prediction matrix is idempotent. This can be interpreted to mean that "applying the regression twice leads to the same result". The idempotence of the prediction matrix can be shown as follows:

P² = X(X^T X)^{-1} X^T X(X^T X)^{-1} X^T = X [(X^T X)^{-1} (X^T X)] (X^T X)^{-1} X^T = X I (X^T X)^{-1} X^T = X(X^T X)^{-1} X^T = P,

where I is the identity matrix.

Symmetry

The prediction matrix is symmetric. The symmetry property of the prediction matrix can be shown as follows:

P^T = (X(X^T X)^{-1} X^T)^T = X ((X^T X)^{-1})^T X^T = X ((X^T X)^T)^{-1} X^T = X(X^T X)^{-1} X^T = P.
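Both properties, as well as the rank–trace identity from the definition section, can be checked numerically. A minimal sketch, using an arbitrary full-column-rank design matrix made up for illustration:

```python
import numpy as np

# Hypothetical design matrix with full column rank (n = 6, p = 3).
X = np.array([[1., 0., 1.],
              [1., 1., 4.],
              [1., 2., 9.],
              [1., 3., 7.],
              [1., 4., 2.],
              [1., 5., 5.]])

P = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(P @ P, P))         # idempotence: P^2 = P
print(np.allclose(P.T, P))           # symmetry: P^T = P
print(np.isclose(np.trace(P), 3.0))  # rank(P) = tr(P) = p = 3
```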

Leverage values

The diagonal elements of the prediction matrix can be interpreted as leverage values and play a major role in regression diagnostics. They are given by

h_{ii} = p_{ii} = x_i^T (X^T X)^{-1} x_i,

where x_i^T denotes the i-th row of the design matrix X.

These leverage values are used in calculating Cook's distance and can be used to identify influential observations. It holds that h_{ii} ≤ 1/c, where c is the number of rows of the design matrix that are identical to the i-th row. If all rows are different, then h_{ii} ≤ 1.
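A short NumPy sketch (with made-up data) shows how the diagonal of P flags a high-leverage point: the outlying x value receives by far the largest h_{ii}, and the leverages sum to p:

```python
import numpy as np

# Hypothetical simple regression; the last x value lies far from the rest.
x = np.array([1., 2., 3., 4., 10.])
X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept (p = 2)

P = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(P)                             # leverage values h_ii

print(h.argmax())                # 4: the outlying observation has maximal leverage
print(np.isclose(h.sum(), 2.0))  # True: sum of leverages = tr(P) = p
```

For this model, h_{ii} = 1/n + (x_i − x̄)² / Σ(x_j − x̄)², so the point at x = 10 gets h = 0.2 + 36/50 = 0.92, close to the upper bound of 1.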

References

  1. David C. Hoaglin & Roy E. Welsch: The Hat Matrix in Regression and ANOVA. In: The American Statistician, 32 (1), 1978, pp. 17–22, doi:10.1080/00031305.1978.10479237, JSTOR 2683469.
  2. Samprit Chatterjee & Ali S. Hadi: Influential observations, high leverage points, and outliers in linear regression. In: Statistical Science, 1 (3), 1986, pp. 379–393, doi:10.1214/ss/1177013622, JSTOR 2245477.
  3. Wilhelm Caspary: Error-tolerant evaluation of measurement data, p. 124.
  4. Rainer Schlittgen: Regression analyses with R. ISBN 978-3-486-73967-1, p. 27 (accessed via De Gruyter Online).
  5. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: models, methods and applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 122.
  6. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: models, methods and applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 108.