# Prediction matrix

In statistics, the prediction matrix is a symmetric and idempotent matrix and hence a projection matrix. It is sometimes called the hat matrix or roof matrix because it maps $y$ onto $\hat{y}$ ("puts a hat on $y$"). Accordingly, it is denoted either $\mathbf{P}$ or $\mathbf{H}$. The term "prediction matrix" was coined by Hoaglin & Welsch (1978) and Chatterjee & Hadi (1986) and stems from the fact that applying the matrix to the observed $y$ values generates the predicted values (the $\hat{y}$ values). Another matrix that is important in statistics is the residual matrix $\mathbf{I} - \mathbf{P}$, which is defined via the prediction matrix and is likewise a projection matrix.

## Definition

Given a typical multiple linear regression model $\mathbf{y} = \mathbf{X}{\boldsymbol{\beta}} + {\boldsymbol{\varepsilon}}$, with the $p \times 1$ vector of unknown regression parameters ${\boldsymbol{\beta}}$, the $n \times p$ design matrix $\mathbf{X}$, the $n \times 1$ vector of the dependent variables $\mathbf{y}$, and the $n \times 1$ vector of the disturbance variables ${\boldsymbol{\varepsilon}}$. The prediction matrix is then defined by

$$\mathbf{P} \equiv \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top} \quad \text{with} \quad \mathbf{P} \in \mathbb{R}^{n \times n}.$$

The matrix $\mathbf{X}^{+} = \left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}$ is also called the Moore–Penrose inverse of $\mathbf{X}$ (it coincides with the Moore–Penrose inverse when $\mathbf{X}$ has full column rank).
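As a numerical illustration, the definition above can be checked with a small made-up design matrix (a minimal sketch using NumPy; the data values are arbitrary and chosen only for illustration):

```python
import numpy as np

# Hypothetical design matrix: intercept column plus one regressor (n = 4, p = 2)
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])

# Prediction matrix P = X (X^T X)^{-1} X^T, an n x n matrix
P = X @ np.linalg.inv(X.T @ X) @ X.T
assert P.shape == (4, 4)

# For X of full column rank, (X^T X)^{-1} X^T is the Moore-Penrose inverse of X
X_plus = np.linalg.inv(X.T @ X) @ X.T
assert np.allclose(X_plus, np.linalg.pinv(X))
```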

The regression (hyper)plane estimated using the least squares method is then given by the sample regression function $\hat{\mathbf{y}} = \widehat{\operatorname{E}(\mathbf{y})} = \mathbf{X}\hat{\boldsymbol{\beta}}$, where $\hat{\boldsymbol{\beta}} = \left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}$ is the least squares estimate vector. The prediction matrix $\mathbf{P}$ is the matrix of the orthogonal projection onto the column space of $\mathbf{X}$ and has rank at most $p$ ($p = k + 1$ is the number of parameters of the regression model). If $\mathbf{X}$ is an $(n \times p)$ matrix with $\operatorname{rank}(\mathbf{X}) = p$, then $\operatorname{rank}(\mathbf{P}) = p$. Since $\mathbf{P}$ is a projection matrix, $\operatorname{rank}(\mathbf{P}) = \operatorname{trace}(\mathbf{P}) = p$ holds. The idempotency and symmetry properties ($\mathbf{P}\cdot\mathbf{P} = \mathbf{P}$ and $\mathbf{P}^{\top} = \mathbf{P}$) imply that $\mathbf{P}$ is an orthogonal projector onto the column space $S(\mathbf{X}) = S(\mathbf{P})$. The direction of projection results from the matrix $(\mathbf{I} - \mathbf{P})$, whose columns are perpendicular to $S(\mathbf{X})$. The matrix $\mathbf{P}$ is called the prediction matrix because the predicted values $\hat{\mathbf{y}}$ are obtained by multiplying the vector $\mathbf{y}$ from the left by this matrix. This can be shown by inserting the least squares estimator as follows:

$$\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \underbrace{\mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}}_{=\,\mathbf{P}}\,\mathbf{y} = \mathbf{P}\mathbf{y}.$$

The predicted values $\hat{y}$ can thus be understood as a linear function of the observed $y$ values. Numerous statistical results can also be expressed with the prediction matrix. For example, the residual vector can be represented by means of the prediction matrix as $\hat{\boldsymbol{\varepsilon}} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}} = \left(\mathbf{I} - \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\right)\mathbf{y} = (\mathbf{I} - \mathbf{P})\mathbf{y}$. The (non-trivial) covariance matrix of the residual vector is $\operatorname{Cov}(\hat{\boldsymbol{\varepsilon}}) = \sigma^{2}(\mathbf{I} - \mathbf{P})$ and plays a role in the analysis of leverage values.
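The identities above, fitted values via $\mathbf{P}$ and residuals via $\mathbf{I} - \mathbf{P}$, can be verified numerically. A sketch with arbitrary made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(5), np.arange(5.0)])  # intercept + one regressor
y = rng.normal(size=5)                             # arbitrary response values

P = X @ np.linalg.inv(X.T @ X) @ X.T
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y        # least squares estimator

y_hat = P @ y                  # predicted values via the prediction matrix
resid = (np.eye(5) - P) @ y    # residual vector via the residual matrix I - P

assert np.allclose(y_hat, X @ beta_hat)  # P y equals X beta-hat
assert np.allclose(resid, y - y_hat)     # (I - P) y equals y minus y-hat
```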

## Properties

### Idempotence

The prediction matrix is idempotent. This can be interpreted as meaning that "applying the regression twice leads to the same result". The idempotence of the prediction matrix can be shown as follows:

$$\mathbf{P}^{2} = \mathbf{P}\cdot\mathbf{P} = \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top} = \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{I}\,\mathbf{X}^{\top} = \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top} = \mathbf{P},$$

where $\mathbf{I}$ is the identity matrix.
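Idempotence is easy to confirm numerically (a minimal sketch with an arbitrary full-rank design matrix):

```python
import numpy as np

X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 2.5]])  # arbitrary full-rank design matrix (n = 3, p = 2)
P = X @ np.linalg.inv(X.T @ X) @ X.T

# Applying the projection twice changes nothing: P @ P equals P (up to rounding)
assert np.allclose(P @ P, P)
# Rank equals trace for a projection matrix: here trace(P) = p = 2
assert np.isclose(np.trace(P), 2.0)
```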

### Symmetry

The prediction matrix is symmetric. The symmetry property of the prediction matrix can be shown as follows:

$$\mathbf{P}^{\top} = \left(\mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\right)^{\top} = \left(\mathbf{X}^{\top}\right)^{\top}\left(\mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\right)^{\top} = \mathbf{X}\left(\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\right)^{\top}\mathbf{X}^{\top} = \mathbf{X}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top} = \mathbf{P}.$$
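Symmetry, together with the orthogonality of the projection directions mentioned in the definition section, can be checked the same way (arbitrary made-up data):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])  # arbitrary full-rank design matrix
P = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(P.T, P)                    # symmetry: P^T = P
assert np.allclose(P @ (np.eye(3) - P), 0.0)  # columns of I - P are orthogonal to S(X)
```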

## Leverage values

The diagonal elements $p_{ii}$ of the prediction matrix $\mathbf{P}$ can be interpreted as leverage values and play a major role in regression diagnostics. They are given by

$$p_{ii} = \mathbf{x}_{i}^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_{i}.$$

These leverage values are used in calculating Cook's distance and can be used to identify influential observations. It holds that $\tfrac{1}{n} \leq p_{ii} \leq \tfrac{1}{r}$, where $r$ is the number of rows of the design matrix $\mathbf{X}$ that are identical to the $i$-th row. If all rows are different, then $\tfrac{1}{n} \leq p_{ii} \leq 1$ applies.
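The bounds on the leverage values can be illustrated with a small example in which one regressor value lies far from the others (made-up data; the outlying point receives the largest leverage):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])  # last point is far from the rest
X = np.column_stack([np.ones_like(x), x])      # intercept + one regressor
P = X @ np.linalg.inv(X.T @ X) @ X.T

leverage = np.diag(P)  # p_ii = x_i^T (X^T X)^{-1} x_i
n = X.shape[0]

# All rows are distinct, so 1/n <= p_ii <= 1 holds
assert np.all(leverage >= 1.0 / n - 1e-12)
assert np.all(leverage <= 1.0 + 1e-12)
assert np.argmax(leverage) == 5  # the outlying observation has the largest leverage
```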

## References

1. David C. Hoaglin & Roy E. Welsch: The Hat Matrix in Regression and ANOVA. In: The American Statistician, 32 (1), 1978, pp. 17–22, doi:10.1080/00031305.1978.10479237, JSTOR 2683469.
2. Samprit Chatterjee & Ali S. Hadi: Influential observations, high leverage points, and outliers in linear regression. In: Statistical Science, 1 (3), 1986, pp. 379–393, doi:10.1214/ss/1177013622, JSTOR 2245477.
3. Wilhelm Caspary: Error-tolerant evaluation of measurement data, p. 124.
4. Rainer Schlittgen: Regression analyses with R, ISBN 978-3-486-73967-1, p. 27 (accessed via De Gruyter Online).
5. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: models, methods and applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 122.
6. Ludwig Fahrmeir, Thomas Kneib, Stefan Lang, Brian Marx: Regression: models, methods and applications. Springer Science & Business Media, 2013, ISBN 978-3-642-34332-2, p. 108.