Explained sum of squares

Figure: the sum of squares decomposition, i.e. the decomposition of the total sum of squares into the explained sum of squares and the residual sum of squares. The sum of the green squares is the explained sum of squares.

In statistics, the sum of squares explained by the regression, also called the explained (deviation) sum of squares (German abbreviation SQE, for Summe der Quadrate der erklärten Abweichungen; English explained sum of squares, ESS, or sum of squared explained deviations, SSE), and often the model sum of squares or regression sum of squares, is the sum of squares of the fitted (regression) values. It is calculated as the sum of the squares of the centered fitted values and can be interpreted as the "total variation of the fitted values" ("explained variation"). There is no international agreement on the exact name and its abbreviations; in German-language literature, the German name is sometimes used together with the English abbreviations.

Definition

The explained (deviation) sum of squares, or regression sum of squares, is defined as the sum of squares of the deviations explained by the regression function:

    SQE := \sum_{i=1}^n \left( \hat{y}_i - \overline{\hat{y}} \right)^2 .

Occasionally the abbreviation SQR or SSR is also found. These abbreviations, however, are easily confused with the "residual sum of squares" (English: sum of squared residuals), which is likewise abbreviated SSR.

If the underlying linear model contains an intercept term different from zero, the empirical mean of the fitted values agrees with that of the observed values, i.e. \overline{\hat{y}} = \overline{y} (for a proof in the multiple regression case, see coefficient of determination#Matrix notation). The explained sum of squares then measures the spread of the fitted values around their mean. The ratio of the sum of squares explained by the regression to the total sum of squares, R^2 = SQE / SQT, is called the coefficient of determination of the regression.
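
As a numerical illustration (not from the cited literature; the data values here are made up), the following Python sketch fits a least squares line with NumPy and computes the explained, residual, and total sums of squares along with the coefficient of determination:

    import numpy as np

    # Hypothetical example data
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

    # Least squares fit of y = beta0 + beta1 * x
    # (np.polyfit returns the coefficients highest power first)
    beta1, beta0 = np.polyfit(x, y, deg=1)
    y_hat = beta0 + beta1 * x

    sqe = np.sum((y_hat - y_hat.mean()) ** 2)  # explained sum of squares
    sqr = np.sum((y - y_hat) ** 2)             # residual sum of squares
    sqt = np.sum((y - y.mean()) ** 2)          # total sum of squares

    print(np.isclose(sqt, sqe + sqr))  # sum of squares decomposition holds
    print(sqe / sqt)                   # coefficient of determination R^2

Because the fitted model contains an intercept, y_hat.mean() coincides with y.mean(), so centering the fitted values about either mean gives the same result.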

Simple linear regression

In simple linear regression (a model with only one explanatory variable), the explained sum of squares can also be expressed as

    SQE = \sum_{i=1}^n \left( \hat{y}_i - \overline{y} \right)^2 = \hat{\beta}_1^2 \sum_{i=1}^n \left( x_i - \overline{x} \right)^2 .

Here the \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i are the fitted values, and \hat{\beta}_0 and \hat{\beta}_1 are the estimates of the intercept and of the slope parameter. From this notation it can be seen that the explained sum of squares can also be represented as the product of the square of the Bravais-Pearson correlation coefficient r and the total sum of squares SQT:

    SQE = r^2 \cdot SQT \quad \text{with} \quad SQT = \sum_{i=1}^n \left( y_i - \overline{y} \right)^2 ,

where the least squares estimator for the slope, \hat{\beta}_1 = SP_{xy} / SQ_x, is the quotient of the sum of products of x and y and the sum of squares of x. To show this, one first verifies that if the underlying linear model contains an intercept different from zero, the empirical mean of the fitted values agrees with that of the observed values. This holds because of

    \overline{\hat{y}} = \frac{1}{n} \sum_{i=1}^n \hat{y}_i = \frac{1}{n} \sum_{i=1}^n \left( \hat{\beta}_0 + \hat{\beta}_1 x_i \right) = \hat{\beta}_0 + \hat{\beta}_1 \overline{x} = \overline{y} ,

using \hat{\beta}_0 = \overline{y} - \hat{\beta}_1 \overline{x}, and therefore

    SQE = \sum_{i=1}^n \left( \hat{y}_i - \overline{y} \right)^2 = \hat{\beta}_1^2 \sum_{i=1}^n \left( x_i - \overline{x} \right)^2 = \frac{SP_{xy}^2}{SQ_x^2} \, SQ_x = \frac{SP_{xy}^2}{SQ_x \, SQ_y} \, SQ_y = r^2 \cdot SQT ,

where the last step follows from the fact that r^2 can also be written as

    r^2 = \frac{SP_{xy}^2}{SQ_x \cdot SQ_y} .

By the sum of squares decomposition SQT = SQE + SQR, i.e. SQR = SQT - SQE, replacing SQE by r^2 \cdot SQT also yields the following representation for the residual sum of squares:

    SQR = SQT - r^2 \cdot SQT = \left( 1 - r^2 \right) SQT .
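
Continuing the Python sketch from above, these identities can be checked numerically (np.corrcoef gives the Bravais-Pearson correlation coefficient):

    # Bravais-Pearson correlation coefficient between x and y
    r = np.corrcoef(x, y)[0, 1]

    print(np.isclose(sqe, r**2 * sqt))        # SQE = r^2 * SQT
    print(np.isclose(sqr, (1 - r**2) * sqt))  # SQR = (1 - r^2) * SQT

    # slope estimator as quotient of sum of products and sum of squares of x
    sp_xy = np.sum((x - x.mean()) * (y - y.mean()))
    sq_x = np.sum((x - x.mean()) ** 2)
    print(np.isclose(beta1, sp_xy / sq_x))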

Matrix notation

In matrix notation, the explained sum of squares can be expressed as

    SQE = \sum_{i=1}^n \left( \hat{y}_i - \overline{y} \right)^2 = \left( \hat{\mathbf{y}} - \overline{\mathbf{y}} \right)^\top \left( \hat{\mathbf{y}} - \overline{\mathbf{y}} \right) .

Here \overline{\mathbf{y}} is a vector with the elements \overline{y}, and \hat{\mathbf{y}} is defined by \hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol{\beta}}, where \hat{\boldsymbol{\beta}} represents the least squares estimation vector and \mathbf{X} the data matrix.
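
A short continuation of the sketch above in this matrix form, assuming a data matrix whose first column is ones (so the model contains an intercept):

    # Data matrix with a column of ones for the intercept
    X = np.column_stack([np.ones_like(x), x])

    # Least squares estimation vector
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat_mat = X @ beta_hat

    y_bar = np.full_like(y, y.mean())                    # vector with elements y-bar
    sqe_mat = (y_hat_mat - y_bar) @ (y_hat_mat - y_bar)  # explained sum of squares
    print(np.isclose(sqe_mat, sqe))                      # matches the scalar computation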

References

  1. Jeffrey Marc Wooldridge: Introductory Econometrics: A Modern Approach. 4th edition. Nelson Education, 2015, p. 39.
  2. Gertrud Moosmüller: Methods of Empirical Economic Research. Pearson Deutschland GmbH, 2008, p. 239.
  3. Werner Timischl: Applied Statistics. An Introduction for Biologists and Medical Professionals. 3rd edition. 2013, p. 315.
  4. Ludwig Fahrmeir, Rita Künstler, Iris Pigeot, Gerhard Tutz: Statistics. The Way to Data Analysis. 8th, revised and extended edition. Springer Spektrum, Berlin/Heidelberg 2016, ISBN 978-3-662-50371-3, p. 151.