# Number of degrees of freedom (statistics)

In statistics, the number of degrees of freedom (abbreviated df or dof) indicates how many values in the calculation of a statistic may vary freely.

Estimates of statistical parameters can be based on different amounts of information or data. The number of independent pieces of information that go into estimating a parameter is called the number of degrees of freedom. In general, the degrees of freedom of an estimate of a parameter equal the number of independent pieces of information that flow into the estimate, minus the number of parameters that are estimated as intermediate steps on the way to the parameter itself. For example, $n$ values enter the calculation of the sample variance. Nevertheless, the number of degrees of freedom is $n-1$, because the mean is estimated as an intermediate step and one degree of freedom is thereby lost.

## Definition

The number $n$ of independent observations minus the number $p$ of estimable parameters is called the number of degrees of freedom $fg$. Since a multiple linear regression model contains $p = (k+1)$ parameters, namely $k$ slope parameters $\beta_1, \beta_2, \ldots, \beta_k$ and one intercept parameter $\beta_0$, one can write

$fg = n - p = n - (k+1) = (\mathrm{number\;of\;observations}) - (\mathrm{number\;of\;estimated\;parameters})$.

The degrees of freedom can also be interpreted as the number of "superfluous" measurements that are not required to determine the $p$ parameters.

The degrees of freedom are needed when estimating variances. In addition, various probability distributions used for hypothesis tests on the sample depend on the degrees of freedom.
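As a minimal sketch of the counting rule above (with hypothetical numbers and a hypothetical helper name), the degrees-of-freedom formula can be written out directly:

```python
# Degrees of freedom in a multiple linear regression model:
# fg = n - p = n - (k + 1), where k is the number of slope
# parameters and the +1 accounts for the intercept beta_0.

def degrees_of_freedom(n_observations: int, k_slopes: int) -> int:
    """Number of observations minus number of estimated parameters."""
    p = k_slopes + 1  # k slope parameters plus one intercept
    return n_observations - p

# Example: 50 observations, 3 regressors -> fg = 50 - (3 + 1) = 46
fg = degrees_of_freedom(50, 3)
print(fg)  # 46
```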

## Examples

### Expected value of the residual sum of squares

The residual sum of squares

$\mathrm{RSS} = \sum_{i=1}^{n} \hat{\varepsilon}_i^{2} = \hat{\boldsymbol{\varepsilon}}^\top \hat{\boldsymbol{\varepsilon}}$

is needed to estimate the variance of the disturbances. In the multiple linear regression model, the unbiased estimator of the disturbance variance is

$\hat{\sigma}^2 = \frac{\left(\mathbf{y} - \mathbf{X}\mathbf{b}\right)^\top \left(\mathbf{y} - \mathbf{X}\mathbf{b}\right)}{n-p} = \frac{\hat{\boldsymbol{\varepsilon}}^\top \hat{\boldsymbol{\varepsilon}}}{n-p}$,

since $\operatorname{E}(\hat{\sigma}^2) = \sigma^2$. The residual sum of squares has $(n-p)$ degrees of freedom, corresponding to the number of independent residuals. Based on the formula for the unbiased estimator of the disturbance variance, the expected value of the residual sum of squares is given by

$\operatorname{E}(\hat{\sigma}^2) = \sigma^2 \Longleftrightarrow \operatorname{E}(\hat{\boldsymbol{\varepsilon}}^\top \hat{\boldsymbol{\varepsilon}}) = (n-p)\,\sigma^2$.

To see intuitively why this adjustment for the degrees of freedom is necessary, one can consider the first-order conditions of the least-squares (OLS) estimator. These can be expressed as

$\textstyle \sum_{i=1}^{n} \hat{\varepsilon}_i = 0$

and

$\textstyle \sum_{i=1}^{n} x_{ij}\,\hat{\varepsilon}_i = 0, \; j = 1, \ldots, k$

be expressed. When the least-squares estimator is obtained, $(k+1)$ restrictions are thus imposed on the residuals. This means that once $n-(k+1)$ residuals are given, the remaining $(k+1)$ residuals are known: the residuals consequently carry only $n-(k+1)$ degrees of freedom (in contrast, the true disturbances $\varepsilon_i$ contribute $n$ degrees of freedom in the sample).
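The two first-order conditions can be checked numerically. The following sketch (hypothetical data, simple regression with one regressor, so $k+1 = 2$ restrictions) fits the line in closed form and verifies that the residuals satisfy both conditions:

```python
# Check the OLS first-order conditions on a small hypothetical data set:
# the fitted residuals must satisfy sum(e_i) = 0 and sum(x_i * e_i) = 0,
# which removes k + 1 = 2 degrees of freedom from the n residuals.

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS for a single regressor with intercept
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Both restrictions hold up to floating-point error
assert abs(sum(residuals)) < 1e-9
assert abs(sum(xi * e for xi, e in zip(x, residuals))) < 1e-9
print(n - 2)  # 2 freely varying residuals remain
```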

A biased estimator that does not take the number of degrees of freedom into account is the quantity

$\tilde{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} \hat{\varepsilon}_i^{2} = \frac{\hat{\boldsymbol{\varepsilon}}^\top \hat{\boldsymbol{\varepsilon}}}{n}$.

This estimator results from maximum likelihood estimation.
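To make the difference concrete, here is a small sketch (hypothetical residuals from a regression with $n = 4$ and $p = 2$) that compares the unbiased estimator, which divides by the degrees of freedom, with the maximum-likelihood version, which divides by $n$:

```python
# Unbiased vs. maximum-likelihood (biased) estimate of the error
# variance, using hypothetical residuals with n = 4 and p = 2.
residuals = [0.2, -0.1, -0.4, 0.3]
n, p = len(residuals), 2

rss = sum(e ** 2 for e in residuals)   # RSS = 0.3
sigma2_unbiased = rss / (n - p)        # divides by degrees of freedom
sigma2_ml = rss / n                    # maximum-likelihood version

# The ML estimator is always smaller, i.e. it underestimates sigma^2
print(round(sigma2_unbiased, 4), round(sigma2_ml, 4))  # 0.15 0.075
```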

### The empirical variance

For an unbiased estimate of the population variance of $X$, the sum of squares is divided by the number of degrees of freedom $(n-1)$, which yields the sample variance (estimator)

$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2$.

Since this estimator is unbiased, $\operatorname{E}(S^2) = \sigma^2$ holds. Its empirical counterpart is the empirical variance

$s^2 := \frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2$.

In the case of the empirical variance, averaging by $(n-1)$ instead of by $n$ can be explained intuitively as follows: because of the centering property of the empirical mean, $\sum_{i=1}^{n} \left(x_i - \bar{x}\right) = 0$, the last deviation $\left(x_n - \overline{x}\right)$ is already determined by the first $(n-1)$. Consequently, only $(n-1)$ deviations vary freely, and one therefore averages by dividing by the number of degrees of freedom $(n-1)$.
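The centering argument can be verified on a small hypothetical sample: the deviations from the mean sum to zero, so the last one is fully determined by the others, and only $n-1$ of them vary freely:

```python
# The deviations from the mean always sum to zero, so only n - 1 of
# them can vary freely; the last one is determined by the others.
x = [2.0, 4.0, 7.0, 11.0]   # hypothetical sample
n = len(x)
x_bar = sum(x) / n          # 6.0

deviations = [xi - x_bar for xi in x]
assert abs(sum(deviations)) < 1e-12          # centering property

# The last deviation equals minus the sum of the first n - 1:
last_from_others = -sum(deviations[:-1])
assert abs(deviations[-1] - last_from_others) < 1e-12

# The sample variance therefore divides by the n - 1 free deviations
s2 = sum(d ** 2 for d in deviations) / (n - 1)
print(round(s2, 4))  # 15.3333
```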

### Degrees of freedom of important sums of squares

The following analysis-of-variance table shows the degrees of freedom of some important sums of squares in the multiple linear regression model $y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \dotsb + x_{ik}\beta_k + \varepsilon_i, \quad i = 1, \ldots, n$:

| Source of variation | Sum of squared deviations | Degrees of freedom | Mean squared deviation |
|---|---|---|---|
| Regression | $\sum_{i=1}^{n} (\hat{y}_i - \overline{\hat{y}})^2$ | $k$ | |
| Residual | $\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$ | $(n-p)$ | $\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 / (n-p) = \hat{\sigma}^2$ |
| Total | $\sum_{i=1}^{n} \left(y_i - \overline{y}\right)^2$ | $(n-1)$ | $\sum_{i=1}^{n} \left(y_i - \overline{y}\right)^2 / (n-1) = s_y^2$ |

These sums of squares play a major role in calculating the coefficient of determination.
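The decomposition in the table, including the way the degrees of freedom add up, can be checked on hypothetical data from a simple regression ($k = 1$, $p = 2$); note that the mean of the fitted values equals $\overline{y}$ because the residuals sum to zero:

```python
# Decomposition of the total sum of squares for a simple regression:
# SQT = SQE + SQR, and the degrees of freedom add up: (n-1) = k + (n-p).
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
n, k = len(x), 1
p = k + 1

x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sqt = sum((yi - y_bar) ** 2 for yi in y)               # total, n - 1 df
sqe = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression, k df
sqr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual, n - p df

assert abs(sqt - (sqe + sqr)) < 1e-9
assert (n - 1) == k + (n - p)

r_squared = sqe / sqt   # coefficient of determination
print(round(r_squared, 4))  # 0.9657
```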

## Degrees of freedom as parameters of distributions

The number of degrees of freedom is also a parameter of several distributions. If the observations are normally distributed, then the quotient of the residual sum of squares $\text{RSS}$ and the disturbance variance $\sigma^2$ follows a chi-square distribution with $n-p$ degrees of freedom:

$\frac{\text{RSS}}{\sigma^2} = \frac{\hat{\boldsymbol{\varepsilon}}^\top \hat{\boldsymbol{\varepsilon}}}{\sigma^2} = \frac{{\boldsymbol{\varepsilon}}^\top \left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right) {\boldsymbol{\varepsilon}}}{\sigma^2} \sim \chi^2(n-p)$.

The quantity $\text{RSS}/\sigma^2$ follows a chi-square distribution with $n-p$ degrees of freedom because the number of degrees of freedom of the chi-square distribution corresponds to the trace of the projection matrix $\left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right)$, i.e.

$\frac{{\boldsymbol{\varepsilon}}^\top \left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right) {\boldsymbol{\varepsilon}}}{\sigma^2} \sim \chi^2\left(\operatorname{trace}\left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right)\right)$.

For the trace of $\left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right)$, it holds that $\operatorname{trace}\left(\mathbf{I}_n - \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top\right) = n-p$. Further distributions that depend on the number of degrees of freedom are the t-distribution and the F-distribution. These distributions are required for estimating confidence intervals for the parameters and for hypothesis tests.
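The trace identity can be illustrated without matrix libraries. For a design matrix with an intercept and one regressor ($p = 2$), the diagonal of the hat matrix $\mathbf{H} = \mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top$ has the well-known closed form $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$, so the trace can be summed directly (hypothetical data):

```python
# trace(H) = p for the hat matrix of a simple regression, hence
# trace(I - H) = n - p, the degrees of freedom of the chi-square
# distribution above. Uses the closed form of the leverages h_ii.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
n, p = len(x), 2

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
trace_h = sum(leverage)          # equals p = 2
trace_i_minus_h = n - trace_h    # equals n - p = 3

print(round(trace_i_minus_h, 10))  # 3.0
```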

Another important quantity that is required for statistical inference, and whose distribution depends on degrees of freedom, is the t statistic. One can show that the quantity

$\frac{\frac{\mathbf{R}_1 \hat{\boldsymbol{\beta}} - \mathbf{R}_1 {\boldsymbol{\beta}}}{\sqrt{\sigma^2\, \mathbf{R}_1 (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{R}_1^\top}}}{\sqrt{\frac{(n-p)\,\hat{\sigma}^2}{\sigma^2 (n-p)}}} = \frac{\mathcal{N}(0;1)}{\sqrt{\frac{\chi^2_{n-p}}{n-p}}} \;\; \stackrel{H_0}{\sim} \;\; t(n-p)$

follows a t-distribution with $(n-p)$ degrees of freedom (see testing general linear hypotheses).
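As a concrete special case, here is a sketch (hypothetical data) of the t statistic for the slope in a simple regression, testing $H_0\colon \beta_1 = 0$; under $H_0$ it is to be compared against a t distribution with $n - p = n - 2$ degrees of freedom:

```python
# t statistic for the slope in a simple regression, testing
# H0: beta_1 = 0. Under H0 it follows t(n - p) with n - p = n - 2.
import math

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
n, p = len(x), 2

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = rss / (n - p)            # unbiased error-variance estimate
se_b1 = math.sqrt(sigma2_hat / sxx)   # standard error of the slope

t_stat = b1 / se_b1                   # compare against t(n - 2)
df = n - p
print(df, round(t_stat, 3))  # 2 7.506
```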
