Orthogonal regression


Orthogonal regression. The red lines represent the distances of the measured value pairs from the best-fit straight line.

In statistics, orthogonal regression (more precisely, orthogonal linear regression) is used to compute a best-fit straight line for a finite set of metrically scaled data pairs $(x_i, y_i)$ by the method of least squares. As in other regression models, the sum of the squared distances of the data points from the straight line is minimized. In contrast to other forms of linear regression, orthogonal regression does not use the distances in the $x$ or $y$ direction, but the orthogonal (perpendicular) distances. The procedure therefore does not distinguish between an independent and a dependent variable, so, unlike ordinary linear regression, it can handle applications in which both variables $x$ and $y$ are subject to measurement errors.

Orthogonal regression is an important special case of Deming regression. It was first used in 1840 by Julius Weisbach in connection with a geodetic problem, introduced into statistics in 1878 by Robert James Adcock, and made known in a more general context by W. E. Deming in 1943 for technical and economic applications.

Calculation

A straight line

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

is sought that minimizes the sum of the squared distances $d_i^2$ of the measured points $(x_i, y_i)$ from the associated foot points $(\hat{x}_i, \hat{y}_i)$ on the straight line. Because of $d_i^2 = (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2$, these squared distances, whose sum is to be minimized, are

$d_i^2 = \frac{\left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2}{1 + \hat{\beta}_1^2} .$

The following auxiliary values are required for the further calculation:

    $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ (arithmetic mean of the $x_i$)
    $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ (arithmetic mean of the $y_i$)
    $s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2$ (sample variance of the $x_i$)
    $s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2$ (sample variance of the $y_i$)
    $s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right) \left( y_i - \bar{y} \right)$ (sample covariance of the $x_i$ and $y_i$)

This gives the parameters that solve the minimization problem:

$\hat{\beta}_1 = \frac{s_y^2 - s_x^2 + \sqrt{\left( s_y^2 - s_x^2 \right)^2 + 4 s_{xy}^2}}{2 s_{xy}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} .$

The coordinates of the foot points are calculated as

$\hat{x}_i = x_i + \frac{\hat{\beta}_1 \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)}{1 + \hat{\beta}_1^2}, \qquad \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 \hat{x}_i .$
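The calculation path above can be written out in a few lines of code. The following is a minimal sketch, assuming Python with NumPy; the function name orthogonal_regression is illustrative and not part of the article:

    import numpy as np

    def orthogonal_regression(x, y):
        # Minimal sketch of the calculation path above (illustrative).
        # Returns intercept beta0, slope beta1 and the foot points (x_hat, y_hat).
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        n = len(x)

        # Auxiliary values: arithmetic means, sample variances, sample covariance
        x_bar, y_bar = x.mean(), y.mean()
        s_xx = np.sum((x - x_bar) ** 2) / (n - 1)
        s_yy = np.sum((y - y_bar) ** 2) / (n - 1)
        s_xy = np.sum((x - x_bar) * (y - y_bar)) / (n - 1)  # assumed nonzero

        # Parameters of the orthogonal best-fit line y = beta0 + beta1 * x
        beta1 = (s_yy - s_xx + np.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
        beta0 = y_bar - beta1 * x_bar

        # Foot points: perpendicular projections of the data points onto the line
        x_hat = x + beta1 * (y - beta0 - beta1 * x) / (1 + beta1 ** 2)
        y_hat = beta0 + beta1 * x_hat
        return beta0, beta1, x_hat, y_hat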

Alternative calculation method

Distance $d_i$ of a point $P_i(x_i; y_i)$ from the straight line $y = mx + t$

The geometric (perpendicular) distance $d_i$ between a measuring point $P_i(x_i; y_i)$ and a best-fit straight line

$y = m x + t$

can be calculated as

$d_i = \frac{\left| m x_i + t - y_i \right|}{\sqrt{m^2 + 1}} .$

We now look for the coefficients $m$ and $t$ with the smallest sum of squared errors

$E(m, t) = \sum_{i=1}^{n} d_i^2 = \frac{1}{m^2 + 1} \sum_{i=1}^{n} \left( m x_i + t - y_i \right)^2 .$

Setting the partial derivative with respect to $t$ to zero,

$\frac{\partial E}{\partial t} = \frac{2}{m^2 + 1} \sum_{i=1}^{n} \left( m x_i + t - y_i \right) = 0 ,$

yields the solution

$t = \bar{y} - m \bar{x} .$

Here $\bar{x}$ denotes the mean value of the $x$ coordinates of the measuring points; analogously, $\bar{y}$ is the mean value of their $y$ coordinates. This solution also means that the point $(\bar{x}, \bar{y})$ always lies on the best-fit straight line.

Setting the partial derivative with respect to $m$ to zero,

$\frac{\partial E}{\partial m} = 0 ,$

and substituting $t = \bar{y} - m \bar{x}$ gives the following quadratic equation in $m$:

$Q_{xy}\, m^2 + \left( Q_x - Q_y \right) m - Q_{xy} = 0 .$

Here

$Q_x = \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2$ and $Q_y = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2$

are the sums of squared deviations of the measured values of $x$ and $y$ from their means, and

$Q_{xy} = \sum_{i=1}^{n} \left( x_i - \bar{x} \right) \left( y_i - \bar{y} \right)$

is the sum of products of these deviations. The left-hand side of the quadratic equation is a parabola in $m$; from the sign behaviour of $\partial E / \partial m$ it follows that exactly one of its two roots yields the minimum:

$m = \frac{Q_y - Q_x + \sqrt{\left( Q_y - Q_x \right)^2 + 4\, Q_{xy}^2}}{2\, Q_{xy}} .$

The equation of the geometric best-fit line is thus

$y = m \left( x - \bar{x} \right) + \bar{y} = m x + t .$
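The alternative calculation path translates just as directly into code. Again a minimal sketch under the same assumptions as above (Python with NumPy; the function name fit_line_deviation_sums is illustrative):

    import numpy as np

    def fit_line_deviation_sums(x, y):
        # Minimal sketch of the alternative calculation path (illustrative).
        # Returns slope m and intercept t of the geometric best-fit line y = m*x + t.
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        x_bar, y_bar = x.mean(), y.mean()

        # Sums of squared deviations Q_x, Q_y and sum of products of deviations Q_xy
        q_x = np.sum((x - x_bar) ** 2)
        q_y = np.sum((y - y_bar) ** 2)
        q_xy = np.sum((x - x_bar) * (y - y_bar))  # assumed nonzero

        # Root of the quadratic equation that corresponds to the minimum
        m = (q_y - q_x + np.sqrt((q_y - q_x) ** 2 + 4 * q_xy ** 2)) / (2 * q_xy)
        t = y_bar - m * x_bar
        return m, t

Both sketches describe the same line; they differ only in whether the sample (co)variances or the raw deviation sums are used.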

Example

Plot of the measuring points and the best-fit line f(x) = 0.8 (x − 3.3) + 4.1

The geometric best-fit line is to be determined for the five measuring points in the following table:

    Point   x_i    y_i    x_i − x̄   y_i − ȳ   (x_i − x̄)²   (x_i − x̄)(y_i − ȳ)   (y_i − ȳ)²
    P1      1.0    2.0    −2.3       −2.1       5.29          4.83                   4.41
    P2      2.0    3.5    −1.3       −0.6       1.69          0.78                   0.36
    P3      4.0    5.0     0.7        0.9       0.49          0.63                   0.81
    P4      4.5    4.5     1.2        0.4       1.44          0.48                   0.16
    P5      5.0    5.5     1.7        1.4       2.89          2.38                   1.96
    Total  16.5   20.5     0.0        0.0      11.80          9.10                   7.70
    Mean    3.3    4.1

This gives $Q_x = 11.80$, $Q_y = 7.70$ and $Q_{xy} = 9.10$, hence

$m = \frac{7.70 - 11.80 + \sqrt{\left( 7.70 - 11.80 \right)^2 + 4 \cdot 9.10^2}}{2 \cdot 9.10} \approx 0.8$

and $t = \bar{y} - m \bar{x} \approx 4.1 - 0.8 \cdot 3.3 = 1.46$. The geometric best-fit line is therefore

$y \approx 0.8 \left( x - 3.3 \right) + 4.1 = 0.8\, x + 1.46 .$
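As a quick plausibility check, the example data can be fed into the sketches above (an illustrative snippet assuming the two functions from the previous sections are available):

    x = [1.0, 2.0, 4.0, 4.5, 5.0]
    y = [2.0, 3.5, 5.0, 4.5, 5.5]

    m, t = fit_line_deviation_sums(x, y)
    print(m, t)  # approx. 0.80 and 1.46, matching the table above

    beta0, beta1, _, _ = orthogonal_regression(x, y)
    print(beta1, beta0)  # same slope and intercept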

References

  1. J. Weisbach: Determination of the main strike and main fall of deposits. In: Archives for Mineralogy, Geognosy, Mining and Metallurgy. 14, 1840, pp. 159–174.
  2. D. Stoyan, T. Morel: Julius Weisbach's pioneering contribution to orthogonal linear regression. In: Historia Mathematica. 45, 2018, pp. 75–84.
  3. R. J. Adcock: A problem in least squares. In: The Analyst. 5, No. 2, 1878, pp. 53–54. JSTOR 2635758, doi:10.2307/2635758.
  4. W. E. Deming: Statistical adjustment of data. Wiley, New York 1943 (Dover Publications edition, 1985), ISBN 0-486-64685-8.
  5. P. Glaister: Least squares revisited. In: The Mathematical Gazette. Vol. 85, 2001, pp. 104–107.
  6. G. Casella, R. L. Berger: Statistical Inference. 2nd edition. Cengage Learning, Boston 2008, ISBN 978-0495391876.
  7. J. Hedderich, L. Sachs: Applied Statistics. Toolkit with R. 15th edition. Springer, Berlin/Heidelberg 2015, ISBN 978-3662456903.