Projection matrix (statistics)

from Wikipedia, the free encyclopedia

In statistics, a projection matrix is a symmetric and idempotent matrix. Furthermore, all eigenvalues of a projection matrix are either 0 or 1, and the rank and trace of a projection matrix are identical. The only nonsingular projection matrix is the identity matrix; all other projection matrices are singular. The most important projection matrices in statistics are the prediction matrix $P = X(X^\top X)^{-1}X^\top$ and the residual-generating matrix (or residual matrix) $Q = I - P$. They are examples of an orthogonal projection in the sense of linear algebra, where every vector $v$ of a vector space with a scalar product can, for a given projection matrix $P$, be uniquely decomposed as $v = Pv + (I - P)v$. Another important projection matrix in statistics is the centering matrix.
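The defining properties listed above can be checked numerically. The following is a minimal sketch using NumPy and the centering matrix $C = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top$ as the example; the function name `centering_matrix`, the size `n = 5` and the vector `v` are illustrative assumptions, not part of the article.

```python
import numpy as np

def centering_matrix(n):
    """Return the n x n centering matrix C = I - (1/n) * 1 1^T."""
    return np.eye(n) - np.ones((n, n)) / n

n = 5                       # illustrative size
C = centering_matrix(n)

# A projection matrix in the statistical sense is symmetric and idempotent
is_symmetric = np.allclose(C, C.T)
is_idempotent = np.allclose(C @ C, C)

# Its eigenvalues are 0 or 1, and its rank equals its trace
eigenvalues = np.linalg.eigvalsh(C)
rank = np.linalg.matrix_rank(C, tol=1e-8)
trace = np.trace(C)

# Decomposition v = C v + (I - C) v into two orthogonal parts
v = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
centered = C @ v                     # v minus its mean
remainder = (np.eye(n) - C) @ v      # the mean of v, repeated n times
```

For the centering matrix, rank and trace both equal $n - 1$, and the two parts of the decomposition are the mean-centered vector and the constant vector of means.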

Setup

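As a starting point, consider a typical multiple linear regression model with given data $\{y_i, x_{i1}, \ldots, x_{ik}\}_{i=1,\ldots,n}$ for $n$ statistical units and $k$ regressors. The relationship between the dependent variable and the independent variables can be written as

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \ldots + x_{ik}\beta_k + \varepsilon_i, \quad i = 1, \ldots, n.$$

In matrix notation this reads

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

with $p = k + 1$ coefficients, so that the data matrix $X$ has dimension $n \times p$. In compact notation,

$$y = X\beta + \varepsilon.$$

Here $\beta$ represents a vector of unknown parameters (known as regression coefficients) that must be estimated from the data. It is also assumed that the error terms are zero on average, $\operatorname{E}(\varepsilon) = 0$, which means that the model can be assumed to be correct on average.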
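To make the notation concrete, the model can be simulated. The following is a minimal sketch under stated assumptions: the sample size, the coefficient values, the noise scale and the random seed are all hypothetical, the data matrix `X` carries a leading column of ones for the intercept, and `p = k + 1` counts the coefficients including the intercept.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, arbitrary choice

n, k = 50, 2                      # n statistical units, k regressors
p = k + 1                         # coefficients including the intercept

# Data matrix X: a column of ones for the intercept plus k regressors
regressors = rng.normal(size=(n, k))
X = np.column_stack([np.ones(n), regressors])

beta = np.array([1.0, 2.0, -0.5])       # hypothetical "true" coefficients
eps = rng.normal(scale=0.3, size=n)     # error terms with mean zero

# Compact form of the model: y = X beta + eps
y = X @ beta + eps

# The same model written row by row:
# y_i = beta_0 + x_i1 * beta_1 + ... + x_ik * beta_k + eps_i
y_rowwise = np.array([
    beta[0] + sum(regressors[i, j] * beta[j + 1] for j in range(k)) + eps[i]
    for i in range(n)
])
```

Both ways of writing the model produce the same response vector; the compact form is the one used in the rest of the article.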

Prediction matrix

One of the most important projection matrices in statistics is the prediction matrix. It is defined as

$$P = X(X^\top X)^{-1}X^\top \quad \text{with} \quad P \in \mathbb{R}^{n \times n},$$

where $X$ is the data matrix, assumed to have full column rank so that $X^\top X$ is invertible. The diagonal elements of the prediction matrix are denoted $p_{ii}$ and can be interpreted as leverage values.
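The definition translates directly into code. The following is a minimal sketch assuming a data matrix with full column rank; the helper name `prediction_matrix`, the dimensions and the random data are hypothetical. It solves a linear system instead of forming the inverse explicitly, which is a common numerical choice rather than something the definition prescribes.

```python
import numpy as np

def prediction_matrix(X):
    """Return P = X (X^T X)^{-1} X^T, assuming X has full column rank."""
    # Solve (X^T X) B = X^T for B instead of inverting X^T X explicitly
    return X @ np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(1)    # fixed seed, arbitrary choice
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

P = prediction_matrix(X)
leverage = np.diag(P)             # diagonal elements p_ii

# Multiplying y by P gives the least-squares fitted values
y = rng.normal(size=n)            # arbitrary response, for illustration only
y_hat = P @ y
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The leverage values lie between 0 and 1 and sum to the trace of $P$, which equals the number of columns of $X$.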

Residual-generating matrix

The residual-generating matrix (also called the residual maker matrix or residual matrix) is defined as

$$Q = I - P,$$

where $P$ represents the prediction matrix. The name results from the fact that this projection matrix, multiplied by the vector $y$, yields the residual vector $\hat{\varepsilon}$. This can be expressed compactly via the prediction matrix as

$$\hat{\varepsilon} = y - \hat{y} = y - Py = (I - P)y = Qy.$$

In linear models, the rank and trace of a projection matrix are identical. For the rank of the residual-generating matrix, the following holds:

$$\operatorname{rank}(Q) = \operatorname{tr}(Q) = \operatorname{tr}(I_n) - \operatorname{tr}\left(X(X^\top X)^{-1}X^\top\right) = n - \operatorname{tr}\left((X^\top X)^{-1}X^\top X\right) = n - \operatorname{tr}(I_p) = n - p.$$
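As a numerical illustration, the residuals obtained from the residual-generating matrix can be compared with those from an ordinary least-squares fit. This is a sketch only: the matrix is written here as `Q = I - P` (a notational choice), and the dimensions, seed and random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)    # fixed seed, arbitrary choice
n, p = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)            # arbitrary response, for illustration only

P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction matrix
Q = np.eye(n) - P                       # residual-generating matrix

# Residuals obtained by multiplying y with Q
residuals_via_Q = Q @ y

# Residuals obtained from an explicit least-squares fit
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals_direct = y - X @ beta_hat

# Rank and trace of Q both equal n - p
rank_Q = np.linalg.matrix_rank(Q, tol=1e-8)
trace_Q = np.trace(Q)
```

Both routes give the same residual vector, and the rank and trace of `Q` agree with the number of observations minus the number of columns of `X`.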

Idempotence

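The idempotence of the residual-generating matrix $Q = I - P$ follows from the idempotence of the prediction matrix, $P^2 = P$:

$$Q^2 = (I - P)(I - P) = I - P - P + P^2 = I - 2P + P = I - P = Q.$$

Symmetry

The symmetry of the residual-generating matrix $Q = I - P$ follows directly from the symmetry of the prediction matrix, $P^\top = P$:

$$Q^\top = (I - P)^\top = I^\top - P^\top = I - P = Q.$$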
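Both properties are needed for a projection matrix in the statistical sense, and they are independent of each other. The following sketch checks them numerically for the residual-generating matrix and contrasts it with an oblique projection, which is idempotent but not symmetric. The helper names `is_symmetric` and `is_idempotent`, the tolerance and the example matrices are assumptions made for illustration.

```python
import numpy as np

def is_symmetric(M, tol=1e-10):
    """Check M == M^T up to a numerical tolerance."""
    return bool(np.allclose(M, M.T, atol=tol))

def is_idempotent(M, tol=1e-10):
    """Check M @ M == M up to a numerical tolerance."""
    return bool(np.allclose(M @ M, M, atol=tol))

rng = np.random.default_rng(3)    # fixed seed, arbitrary choice
n, p = 12, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction matrix
Q = np.eye(n) - P                       # residual-generating matrix

# An oblique projection: idempotent, but not symmetric
oblique = np.array([[1.0, 1.0],
                    [0.0, 0.0]])

q_is_projection = is_symmetric(Q) and is_idempotent(Q)
oblique_is_projection = is_symmetric(oblique) and is_idempotent(oblique)
```

Here `q_is_projection` is true, while `oblique_is_projection` is false because the oblique example fails the symmetry check.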

Other properties

The projection matrix has a wealth of useful algebraic properties. In the language of linear algebra, the projection matrix $P$ is an orthogonal projection onto the column space of the data matrix $X$. Further properties of the projection matrices are summarized below:

  • $\hat{\varepsilon} = (I - P)y$ and $\hat{\varepsilon} = y - Py \perp X$, that is, the residual vector is orthogonal to the columns of $X$.
  • $X$ is invariant under $P$: $PX = X$; consequently $(I - P)X = 0$.
  • $(I - P)P = P(I - P) = 0$ ("Applying the regression to the residuals yields $0$").
  • $P$ is unique for a given subspace.
  • All eigenvalues of a projection matrix are either 0 or 1.
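The properties in this list can be verified numerically. The following is a minimal sketch with hypothetical dimensions and random data; the invertible matrix `A` is an arbitrary choice used only to produce a second data matrix with the same column space.

```python
import numpy as np

rng = np.random.default_rng(4)    # fixed seed, arbitrary choice
n, p = 15, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction matrix
I = np.eye(n)

# X is invariant under P, so I - P maps X to zero
invariant = bool(np.allclose(P @ X, X))
annihilates = bool(np.allclose((I - P) @ X, 0))

# P and I - P project onto mutually orthogonal subspaces
products_vanish = bool(np.allclose((I - P) @ P, 0)
                       and np.allclose(P @ (I - P), 0))

# P depends only on the column space of X: X @ A spans the same
# space for any invertible A and therefore gives the same P
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])   # invertible (determinant 1)
X2 = X @ A
P2 = X2 @ np.linalg.solve(X2.T @ X2, X2.T)
same_projection = bool(np.allclose(P, P2))

# Eigenvalues: p ones and n - p zeros
eigenvalues = np.linalg.eigvalsh(P)
num_ones = int(np.sum(np.isclose(eigenvalues, 1.0)))
num_zeros = int(np.sum(np.isclose(eigenvalues, 0.0)))
```

The eigenvalue counts match the rank statements: the eigenvalue 1 occurs as often as $X$ has columns, and the remaining eigenvalues are 0.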

Applications

Estimation of the variance parameter using least squares estimation

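The residual sum of squares, abbreviated SQR (from the German Summe der Quadrate der Restabweichungen) or SSR (sum of squared residuals), is given in matrix notation by

$$SQR = \hat{\varepsilon}^\top \hat{\varepsilon} = (y - X\hat{\beta})^\top (y - X\hat{\beta}),$$

where $\hat{\beta}$ is the least squares estimate of $\beta$. This can also be written as

$$SQR = y^\top y - \hat{\beta}^\top X^\top y = y^\top (I - P)\, y.$$

An unbiased estimator of the variance of the disturbance variables is the "mean residual square":

$$\hat{\sigma}^2 = \frac{SQR}{n - p} = \frac{\hat{\varepsilon}^\top \hat{\varepsilon}}{n - p}.$$

With the help of the residual-generating matrix $Q = I - P$, the estimator of the error variance can also be written as

$$\hat{\sigma}^2 = \frac{y^\top Q y}{n - p} = \frac{\varepsilon^\top Q \varepsilon}{n - p},$$

where the second equality uses $Qy = Q(X\beta + \varepsilon) = Q\varepsilon$, since $QX = 0$.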
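The equivalent expressions for the residual sum of squares can be compared on simulated data. This is a sketch under stated assumptions: the coefficient vector, the error standard deviation `sigma`, the dimensions and the seed are hypothetical, and the true errors `eps` are available only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(5)    # fixed seed, arbitrary choice
n, p = 200, 3
sigma = 0.5                       # hypothetical true error standard deviation
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -2.0, 0.5]) # hypothetical true coefficients
eps = rng.normal(scale=sigma, size=n)
y = X @ beta + eps

P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction matrix
Q = np.eye(n) - P                       # residual-generating matrix

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# Four equivalent ways to compute the residual sum of squares
ssr_residuals = residuals @ residuals        # eps_hat^T eps_hat
ssr_expanded = y @ y - beta_hat @ X.T @ y    # y^T y - beta_hat^T X^T y
ssr_quadratic = y @ Q @ y                    # y^T Q y
ssr_errors = eps @ Q @ eps                   # eps^T Q eps

# Unbiased estimate of the error variance
sigma2_hat = ssr_quadratic / (n - p)
```

All four values agree up to rounding error. Forming the full $n \times n$ matrix `Q` is convenient for illustration, but for large `n` the residual-based form avoids the quadratic memory cost.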
