# Two-stage least squares estimation

In statistics and econometrics, two-stage least squares estimation (TSLS or 2SLS) is a limited-information estimation method developed by the econometrician Henri Theil. In this two-step procedure, the endogenous regressors (i.e., the regressors correlated with the disturbance term) are first regressed on all exogenous variables of the equation and on all instruments. Second, the fitted values obtained in this way for the endogenous regressors, which as linear combinations of exogenous variables are uncorrelated with the disturbance term, are substituted into the original model, and the resulting new model is estimated. The two-stage least squares estimator can be interpreted as an instrumental variables estimator. After ordinary least squares, 2SLS is the second most common method for estimating linear equations in applied econometrics.
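The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `two_stage_least_squares`, the simulated data, and the chosen coefficients (true slope 2.0, true intercept 1.0) are all hypothetical and serve only to show the mechanics.

```python
import numpy as np


def two_stage_least_squares(y, endog, exog, instruments):
    """Illustrative 2SLS via two explicit least squares regressions.

    y: (n,) dependent variable
    endog: (n, k1) endogenous regressors
    exog: (n, k2) exogenous regressors of the equation (including a constant)
    instruments: (n, q) additional instruments
    Returns coefficients ordered as [endog, exog].
    """
    # Stage 1: regress the endogenous regressors on all exogenous
    # variables of the equation and all instruments.
    first_stage = np.column_stack([exog, instruments])
    coef1, *_ = np.linalg.lstsq(first_stage, endog, rcond=None)
    endog_hat = first_stage @ coef1

    # Stage 2: replace the endogenous regressors by their fitted values
    # and estimate the resulting model by least squares.
    second_stage = np.column_stack([endog_hat, exog])
    coef2, *_ = np.linalg.lstsq(second_stage, y, rcond=None)
    return coef2


# Simulated example (all numbers are assumptions for illustration).
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                      # instrument
u = rng.normal(size=n)                      # disturbance
x = 0.8 * z + 0.5 * u + rng.normal(size=n)  # endogenous: correlated with u
y = 1.0 + 2.0 * x + u
const = np.ones((n, 1))

beta_2sls = two_stage_least_squares(y, x[:, None], const, z[:, None])
beta_ols, *_ = np.linalg.lstsq(np.column_stack([x, const]), y, rcond=None)

print("2SLS [slope, intercept]:", beta_2sls)
print("OLS  [slope, intercept]:", beta_ols)
```

In this simulated setting the ordinary least squares slope is pulled away from the true value because `x` is correlated with `u`, while the two-stage estimate should lie close to it. Note that standard errors taken naively from the second-stage regression are not valid for 2SLS; the sketch covers point estimates only.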

## The procedure

Consider a typical multiple linear regression model $\mathbf{y}=\mathbf{X}{\boldsymbol{\beta}}+{\boldsymbol{\varepsilon}}$ (the true model), with ${\boldsymbol{\beta}}$ the $(p\times 1)$ vector of unknown regression parameters, the $(n\times p)$ design matrix $\mathbf{X}$, the $(n\times 1)$ vector of the dependent variable $\mathbf{y}$, and the $(n\times 1)$ vector of disturbances ${\boldsymbol{\varepsilon}}$. The generalized least squares (GLS) estimator ${\tilde{\boldsymbol{\delta}}}_{i}$ can be expressed in different ways, and each of these expressions has its own interpretation. A well-known form is the so-called two-stage least squares estimator, which was developed by Henri Theil. To derive the two-stage least squares estimator, the generalized least squares estimator is written as follows, where $\mathbf{Z}_{i}=[\mathbf{Y}_{i}\;\;\mathbf{X}_{i}]$ collects the regressors of the $i$-th equation:

${\displaystyle {\tilde{\boldsymbol{\delta}}}_{i}=(\mathbf{Z}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Z}_{i})^{-1}\mathbf{Z}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}_{i}={\begin{pmatrix}\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}&\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{X}_{i}\\\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}&\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{X}_{i}\end{pmatrix}}^{-1}\cdot {\begin{pmatrix}\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}_{i}\\\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}_{i}\end{pmatrix}}}$

The reduced form is $\mathbf{Y}=\mathbf{X}\mathbf{\Pi}+\mathbf{V}$. The $i$-th equation of the reduced form can be partitioned as follows:

${\displaystyle [\mathbf{y}_{i}\;\;\mathbf{Y}_{i}\;\;\mathbf{Y}_{i}^{*}]=\mathbf{X}[\pi_{i}\;\;\mathbf{\Pi}_{i}\;\;\mathbf{\Pi}_{i}^{*}]+[\mathbf{v}_{i}\;\;\mathbf{V}_{i}\;\;\mathbf{V}_{i}^{*}]}$,

where $\mathbf{y}_{i}$ is the $(T\times 1)$ vector of the $i$-th jointly dependent variable, $\mathbf{Y}_{i}$ contains the other jointly dependent variables included in the $i$-th equation, $\mathbf{Y}_{i}^{*}$ is the $(T\times m_{i}^{*})$ matrix of the jointly dependent variables not appearing in the $i$-th equation, and $[\pi_{i}\;\;\mathbf{\Pi}_{i}\;\;\mathbf{\Pi}_{i}^{*}]$ is the partitioned matrix of reduced-form coefficients. The least squares estimator of $\mathbf{\Pi}_{i}$ is ${\hat{\mathbf{\Pi}}}_{i}=(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}$, and therefore, with the aid of the prediction matrix $\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}$, the relation $\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}=\mathbf{X}{\hat{\mathbf{\Pi}}}_{i}={\hat{\mathbf{Y}}}_{i}=\mathbf{Y}_{i}-{\hat{\mathbf{V}}}_{i}$ holds, where ${\hat{\mathbf{Y}}}_{i}$ is the $(T\times (m_{i}-1))$ matrix of the predicted values of $\mathbf{Y}_{i}$. Since $(\mathbf{X}^{\top}\mathbf{X})^{-1}(\mathbf{X}^{\top}\mathbf{X})=\mathbf{I}$, one can also write:

${\displaystyle {\tilde{\boldsymbol{\delta}}}_{i}={\begin{pmatrix}\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}(\mathbf{X}^{\top}\mathbf{X})(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}&\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{X}_{i}\\\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}_{i}&\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{X}_{i}\end{pmatrix}}^{-1}\cdot {\begin{pmatrix}\mathbf{Y}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}_{i}\\\mathbf{X}_{i}^{\top}\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}_{i}\end{pmatrix}}}$

or, equivalently,

${\displaystyle {\tilde{\boldsymbol{\delta}}}_{i}={\begin{pmatrix}{\hat{\mathbf{Y}}}_{i}^{\top}{\hat{\mathbf{Y}}}_{i}&{\hat{\mathbf{Y}}}_{i}^{\top}\mathbf{X}_{i}\\\mathbf{X}_{i}^{\top}{\hat{\mathbf{Y}}}_{i}&\mathbf{X}_{i}^{\top}\mathbf{X}_{i}\end{pmatrix}}^{-1}\cdot {\begin{pmatrix}{\hat{\mathbf{Y}}}_{i}^{\top}\mathbf{y}_{i}\\\mathbf{X}_{i}^{\top}\mathbf{y}_{i}\end{pmatrix}}}$
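The step from the long expression to this compact form rests on properties that the text uses implicitly. As a sketch, write $\mathbf{P}=\mathbf{X}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}$ for the prediction matrix (the symbol $\mathbf{P}$ is introduced here only for illustration) and assume, as is usual in this setup, that the columns of $\mathbf{X}_{i}$ are among the columns of $\mathbf{X}$. Then $\mathbf{P}$ is symmetric and idempotent ($\mathbf{P}^{\top}=\mathbf{P}$, $\mathbf{P}\mathbf{P}=\mathbf{P}$), ${\hat{\mathbf{Y}}}_{i}=\mathbf{P}\mathbf{Y}_{i}$, and $\mathbf{P}\mathbf{X}_{i}=\mathbf{X}_{i}$, so that each block simplifies as follows:

${\displaystyle {\begin{aligned}\mathbf{Y}_{i}^{\top}\mathbf{P}\mathbf{Y}_{i}&=(\mathbf{P}\mathbf{Y}_{i})^{\top}(\mathbf{P}\mathbf{Y}_{i})={\hat{\mathbf{Y}}}_{i}^{\top}{\hat{\mathbf{Y}}}_{i},\\\mathbf{Y}_{i}^{\top}\mathbf{P}\mathbf{X}_{i}&=(\mathbf{P}\mathbf{Y}_{i})^{\top}\mathbf{X}_{i}={\hat{\mathbf{Y}}}_{i}^{\top}\mathbf{X}_{i},\\\mathbf{X}_{i}^{\top}\mathbf{P}\mathbf{X}_{i}&=\mathbf{X}_{i}^{\top}\mathbf{X}_{i},\\\mathbf{Y}_{i}^{\top}\mathbf{P}\mathbf{y}_{i}&={\hat{\mathbf{Y}}}_{i}^{\top}\mathbf{y}_{i},\\\mathbf{X}_{i}^{\top}\mathbf{P}\mathbf{y}_{i}&=(\mathbf{P}\mathbf{X}_{i})^{\top}\mathbf{y}_{i}=\mathbf{X}_{i}^{\top}\mathbf{y}_{i}.\end{aligned}}}$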

Once ${\hat{\mathbf{Z}}}_{i}=[{\hat{\mathbf{Y}}}_{i}\;\;\mathbf{X}_{i}]$ is defined, the two-stage least squares estimator can be written as

${\displaystyle {\tilde{\boldsymbol{\delta}}}_{i}=({\hat{\mathbf{Z}}}_{i}^{\top}{\hat{\mathbf{Z}}}_{i})^{-1}{\hat{\mathbf{Z}}}_{i}^{\top}\mathbf{y}_{i}}$.
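The equivalence between the projection form of the estimator and the two-stage form can be checked numerically. The following is a minimal sketch on simulated data; the dimensions, coefficient values, and variable names (`P`, `delta_proj`, `delta_two_stage`) are assumptions chosen for illustration, with `X_i` taken to be the first two columns of `X`.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# X: all exogenous variables of the system; X_i: those included in equation i.
X = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])
X_i = X[:, :2]

# One endogenous regressor Y_i, generated from a hypothetical reduced form.
v = rng.normal(size=T)
Y_i = (X @ np.array([0.5, 1.0, -1.0, 2.0]) + v)[:, None]

# Structural disturbance correlated with v, which makes Y_i endogenous.
e_i = 0.7 * v + rng.normal(size=T)
y_i = 1.5 * Y_i[:, 0] + X_i @ np.array([1.0, -0.5]) + e_i

Z_i = np.column_stack([Y_i, X_i])

# Prediction matrix P = X (X'X)^{-1} X'.
P = X @ np.linalg.solve(X.T @ X, X.T)

# Projection form: (Z_i' P Z_i)^{-1} Z_i' P y_i.
delta_proj = np.linalg.solve(Z_i.T @ P @ Z_i, Z_i.T @ P @ y_i)

# Two-stage form: first stage gives Pi_hat and Y_hat = X Pi_hat = P Y_i ...
Pi_hat = np.linalg.solve(X.T @ X, X.T @ Y_i)
Y_hat = X @ Pi_hat
# ... second stage regresses y_i on Z_hat = [Y_hat, X_i].
Z_hat = np.column_stack([Y_hat, X_i])
delta_two_stage = np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y_i)

print(np.allclose(delta_proj, delta_two_stage))
```

Forming the full $T\times T$ prediction matrix is convenient for a small demonstration but wasteful for large samples; in practice one would work with the fitted values directly.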

## Literature

• George G. Judge, R. Carter Hill, W. Griffiths, Helmut Lütkepohl, T. C. Lee: Introduction to the Theory and Practice of Econometrics. 2nd edition. John Wiley & Sons, New York / Chichester / Brisbane / Toronto / Singapore 1988, ISBN 0-471-62414-4.

## References

1. George G. Judge, R. Carter Hill, W. Griffiths, Helmut Lütkepohl, T. C. Lee: Introduction to the Theory and Practice of Econometrics. 2nd edition. John Wiley & Sons, New York / Chichester / Brisbane / Toronto / Singapore 1988, ISBN 0-471-62414-4, p. 645.
2. George G. Judge, R. Carter Hill, W. Griffiths, Helmut Lütkepohl, T. C. Lee: Introduction to the Theory and Practice of Econometrics. 2nd edition. John Wiley & Sons, New York / Chichester / Brisbane / Toronto / Singapore 1988, ISBN 0-471-62414-4, p. 645.