# Mahalanobis distance

The Mahalanobis distance (after Prasanta Chandra Mahalanobis) is a distance measure between points in a multidimensional vector space. Intuitively, the Mahalanobis distance between two points indicates their distance in standard deviations. It is used particularly in statistics, for example in connection with multivariate methods.

## Definition

With multivariate distributions, the coordinates of a point are represented as an $m$-dimensional column vector, understood as a realization of a random vector $\mathbf{X}$ with covariance matrix $\mathbf{\Sigma}$. The distance between two points $\mathbf{x}$ and $\mathbf{y}$ distributed in this way is then given by the Mahalanobis distance in the population,

$\Delta(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})^{\top} \mathbf{\Sigma}^{-1} (\mathbf{x} - \mathbf{y})}.$

The Mahalanobis distance is scale- and translation-invariant.
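The invariance can be checked numerically. The following sketch (illustrative data and matrices, not from the source) verifies that the population Mahalanobis distance is unchanged under an invertible linear map $\mathbf{x} \mapsto A\mathbf{x}$, under which the covariance transforms as $\mathbf{\Sigma} \mapsto A\mathbf{\Sigma}A^{\top}$:

```python
import numpy as np

def mahalanobis(x, y, Sigma):
    """Population Mahalanobis distance with known covariance Sigma."""
    d = x - y
    return float(np.sqrt(d @ np.linalg.inv(Sigma) @ d))

# Illustrative covariance and points (assumed for this example).
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Any invertible linear transformation (covers scaling as a special case).
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

d_orig = mahalanobis(x, y, Sigma)
d_mapped = mahalanobis(A @ x, A @ y, A @ Sigma @ A.T)

# The distance is preserved under the transformation.
assert np.isclose(d_orig, d_mapped)
```

Translation invariance follows the same way: shifting both points by the same vector leaves the difference $\mathbf{x} - \mathbf{y}$, and hence the distance, unchanged.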

Analogously, the Mahalanobis distance in the sample is given by

$D(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})^{\top} \mathbf{S}^{-1} (\mathbf{x} - \mathbf{y})},$

where $\mathbf{S}^{-1}$ denotes the inverse of the sample covariance matrix.
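A minimal sketch of the sample version, assuming NumPy is available. The data set here is synthetic, used only to estimate a sample covariance matrix $\mathbf{S}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data for illustration: n = 500 observations in m = 2 dimensions.
data = rng.multivariate_normal(mean=[0.0, 0.0],
                               cov=[[2.0, 0.8], [0.8, 1.0]],
                               size=500)

S = np.cov(data, rowvar=False)   # sample covariance matrix S
S_inv = np.linalg.inv(S)         # its inverse, as in the formula above

x = np.array([1.0, 0.5])
y = np.array([-0.5, 1.0])
diff = x - y

# Sample Mahalanobis distance D(x, y).
D = float(np.sqrt(diff @ S_inv @ diff))
print(D)
```

In practice one would solve the linear system `np.linalg.solve(S, diff)` instead of explicitly inverting `S`; the explicit inverse is kept here to mirror the formula.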

In two dimensions, the points with the same Mahalanobis distance from a center form an ellipse (whose axes do not necessarily point in the direction of the coordinate axes), whereas for the Euclidean distance they form a circle. If the covariance matrix is the identity matrix (which is the case when the individual components of the random vector $\mathbf{X}$ are pairwise uncorrelated and each have variance 1), the Mahalanobis distance coincides with the Euclidean distance. For the Mahalanobis distance, the surfaces of constant distance from a point are ellipsoids.
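The special case of an identity covariance matrix can be verified directly (illustrative check, points chosen arbitrarily):

```python
import numpy as np

# With the identity matrix as covariance, the Mahalanobis distance
# reduces to the ordinary Euclidean distance.
x = np.array([1.0, 4.0, 2.0])
y = np.array([3.0, 0.0, 1.0])
I = np.eye(3)

d_mahal = float(np.sqrt((x - y) @ np.linalg.inv(I) @ (x - y)))
d_eucl = float(np.linalg.norm(x - y))

assert np.isclose(d_mahal, d_eucl)
```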

Mathematically, the Mahalanobis distance arises from the $m$-dimensional normal distribution with expected value vector $\boldsymbol{\mu}$ and covariance matrix $\mathbf{\Sigma}$, where $\det(\mathbf{\Sigma}) \neq 0$. This distribution has the density

$f_{X}(\mathbf{x}) = \frac{1}{(2\pi)^{\frac{m}{2}} \sqrt{|\det(\mathbf{\Sigma})|}} \cdot \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \mathbf{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right).$

Taking the logarithm of this expression gives the log density

$\log f_{X}(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \mathbf{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) - c$

with a constant $c$. Apart from the missing square root, the prefactor $-\tfrac{1}{2}$, and the additive constant $c$, this expression corresponds to the Mahalanobis distance $\Delta(\mathbf{x}, \boldsymbol{\mu})$.
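This relation between log-density and squared Mahalanobis distance can be checked numerically. The sketch below assumes SciPy is available and uses its independent normal log-pdf implementation; the parameter values are illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative parameters of an m = 2 dimensional normal distribution.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
x = np.array([0.5, 0.0])
m = len(mu)

# Normalizing constant c = log((2 pi)^{m/2} * sqrt(det Sigma)).
c = 0.5 * m * np.log(2 * np.pi) + 0.5 * np.log(np.linalg.det(Sigma))

# Squared Mahalanobis distance Delta(x, mu)^2.
delta_sq = float((x - mu) @ np.linalg.inv(Sigma) @ (x - mu))

# log f(x) = -1/2 * Delta^2 - c, checked against SciPy's log-pdf.
log_f = multivariate_normal(mean=mu, cov=Sigma).logpdf(x)
assert np.isclose(log_f, -0.5 * delta_sq - c)
```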

## Applications

In discriminant analysis, the assignment of a point to a given population is determined, among other things, with the Mahalanobis distance. Another area of application is the detection of outliers, where the point $\mathbf{y}$ is replaced by a (robust) location parameter. A critical caveat is that both the covariance matrix and the location parameter can themselves be distorted by outliers. They are therefore usually estimated with robust methods, e.g. with the MCD estimator (Minimum Covariance Determinant, i.e. an estimator with the smallest possible determinant of the covariance matrix). Furthermore, when using the Mahalanobis distance as a distance classifier, two cases can be distinguished:
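A sketch of MCD-based outlier detection, assuming scikit-learn and SciPy are available (`sklearn.covariance.MinCovDet` implements the Minimum Covariance Determinant estimator mentioned above). The data set and cutoff rule are illustrative; note that `.mahalanobis()` returns squared distances:

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(2)

# Synthetic data: 200 inliers plus 5 planted outliers far from the bulk.
inliers = rng.multivariate_normal([0, 0], [[1.0, 0.4], [0.4, 1.0]], size=200)
outliers = rng.uniform(low=6.0, high=9.0, size=(5, 2))
X = np.vstack([inliers, outliers])

# Robust location and covariance via the MCD estimator.
mcd = MinCovDet(random_state=0).fit(X)
d2 = mcd.mahalanobis(X)  # squared robust Mahalanobis distances

# Common rule of thumb: flag points beyond the 97.5% quantile of the
# chi-square distribution with m = 2 degrees of freedom.
threshold = chi2.ppf(0.975, df=2)
flagged = np.where(d2 > threshold)[0]
print(flagged)  # the planted outliers (indices 200-204) should appear here
```

Because the location and scatter are estimated robustly, the planted outliers do not mask themselves by inflating the covariance estimate, which is exactly the distortion the paragraph above warns about.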

1. The covariance matrix is the same (pooled or averaged) for all classes.
2. Different covariance matrices are used for the individual classes.

Which of the two alternatives to choose should be justified by empirical analyses.
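The two cases above can be sketched as a minimal distance classifier. Everything here is illustrative (synthetic classes, arbitrary test point); a point is assigned to the class whose mean is closest in (squared) Mahalanobis distance:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two synthetic classes with different covariance structures.
class_a = rng.multivariate_normal([0, 0], [[1.0, 0.2], [0.2, 1.0]], size=100)
class_b = rng.multivariate_normal([4, 4], [[2.0, -0.5], [-0.5, 1.5]], size=100)

def sq_mahalanobis(x, mean, cov):
    d = x - mean
    return float(d @ np.linalg.inv(cov) @ d)

x = np.array([3.0, 3.5])  # test point to classify

# Case 1: one pooled covariance matrix, shared by all classes
# (estimated from the class-centered data).
pooled = np.cov(np.vstack([class_a - class_a.mean(axis=0),
                           class_b - class_b.mean(axis=0)]), rowvar=False)
d_pooled = [sq_mahalanobis(x, c.mean(axis=0), pooled)
            for c in (class_a, class_b)]

# Case 2: each class uses its own covariance matrix.
d_individual = [sq_mahalanobis(x, c.mean(axis=0), np.cov(c, rowvar=False))
                for c in (class_a, class_b)]

# Assign to the class with the smallest squared distance (0 = a, 1 = b).
print(int(np.argmin(d_pooled)), int(np.argmin(d_individual)))
```

Case 1 corresponds to a linear decision boundary between the classes, case 2 to a quadratic one; which assumption is appropriate depends on the data, hence the empirical justification called for above.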