# Autocorrelation

The autocorrelation (also cross-autocorrelation) is a term from stochastics and signal processing that describes the correlation of a function or signal $x(t)$ with a copy of itself shifted to an earlier point in time, $x(t-\tau)$. Correlation functions are calculated for sequences of random variables that depend on time $t$. These functions indicate how similar the shifted sequence is to the original sequence. Since the unshifted sequence is most similar to itself, the autocorrelation attains its highest value for the unshifted sequence ($\tau = 0$). If the relationship between the members of the sequence is more than random, the correlation of the original sequence with the shifted sequence usually also deviates significantly from zero. One then says that the terms of the sequence are autocorrelated.

## General

Since the sequence is compared with a shifted version of itself, one speaks of an autocorrelation. If, on the other hand, two different sequences $x(t)$ and $y(t-\tau)$ are compared, one speaks of a cross-correlation. The autocorrelation makes it possible to detect relationships between the observed results at different observation times of a measurement series, whereas the cross-correlation describes the relationship between different features as a function of time.

In signal processing, continuous measurement data are often assumed. One speaks of autocorrelation when the continuous or discrete-time function $x(t)$ (e.g. a one- or multi-dimensional function of time or space) is correlated with itself, for example with $x(t+\tau)$. With the Durbin–Watson test, a sample can be used to check whether a time series or spatial data exhibit autocorrelation.

Autocorrelation is defined differently in different disciplines. In statistics it is calculated for stochastic processes $X_t$ as a normalized form of the autocovariance; in signal processing it is defined as the convolution of the time-dependent signal $x(t)$ with itself. In some fields the terms autocorrelation and autocovariance are also used synonymously.

The autocorrelation can be displayed graphically in a correlogram.
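As a minimal sketch of the Durbin–Watson idea mentioned above (only the statistic itself, with illustrative data; the actual test compares $d$ against tabulated critical bounds): the statistic $d = \sum_{i=2}^{T}(e_i - e_{i-1})^2 / \sum_{i=1}^{T} e_i^2$ is close to 2 for uncorrelated residuals and approaches 0 under strong positive autocorrelation.

```python
import numpy as np

# Sketch of the Durbin-Watson statistic for residuals e_1..e_T.
# d ~ 2 for uncorrelated residuals, d -> 0 under strong positive
# autocorrelation, d -> 4 under strong negative autocorrelation.
def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(0)
white = rng.standard_normal(5000)    # uncorrelated residuals
trended = np.cumsum(white)           # strongly positively autocorrelated

assert abs(durbin_watson(white) - 2.0) < 0.1
assert durbin_watson(trended) < 0.5
```

The actual test then compares $d$ against lower and upper critical bounds that depend on the sample size and the number of regressors.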

## Autocovariance and Autocorrelation in Stochastics

The autocovariance function describes the covariance between the values of a stochastic process at different times. For a real-valued stochastic process $(X_t)_{t \in T}$ it is defined as:

$$\gamma(t_1, t_2) = \operatorname{Cov}(X_{t_1}, X_{t_2}) = \operatorname{E}\left[(X_{t_1} - \mu_{t_1})(X_{t_2} - \mu_{t_2})\right], \qquad \gamma(t_1, t_2) \in \mathbb{R}$$

Here $\operatorname{E}[\cdot]$ denotes the expected value and $\mu_t$ the expected value of $X_t$; the existence of these expected values is assumed. For a time difference of $\tau = 0$, the autocovariance is identical to the variance.

For a weakly stationary process, the expected value, standard deviation and variance of the random variable $X$ are no longer time-dependent. The autocovariance then depends not on the positions of the points in time $t_1$ and $t_2$ but only on the time difference $\tau$ between them:

$$\gamma_\tau = \operatorname{E}\left[(X_t - \mu)(X_{t+\tau} - \mu)\right].$$

The autocorrelation function of the stochastic process is defined as the normalized autocovariance function:

$$\rho(t_1, t_2) = \frac{\gamma(t_1, t_2)}{\sigma_{t_1}\,\sigma_{t_2}} \qquad \text{with} \quad -1 \leq \rho(t_1, t_2) \leq +1$$

Here:

- $\sigma_{t_1}$: standard deviation of $X_{t_1}$
- $\sigma_{t_2}$: standard deviation of $X_{t_2}$
- $\rho(t_1, t_2)$: autocorrelation with respect to the times $t_1$ and $t_2$

In this form the autocorrelation function is dimensionless and normalized to the range between −1 and 1.

For a stationary process, the autocovariance depends only on the time difference $\tau$ between $t_1$ and $t_2$. The standard deviation is then independent of the point in time $t$, and the product of the standard deviations in the denominator equals the time-independent variance $\sigma_X^2 = \operatorname{Var}(X_t) = \operatorname{Var}(X_0)$. The autocorrelation function for a stationary process thus simplifies to

$$\rho(t_1, t_2) = \rho_\tau = \frac{\gamma_\tau}{\sigma_X^2} = \frac{\gamma_\tau}{\gamma_0},$$

since $\gamma_0 = \sigma_X^2$.

## Autocorrelation in signal processing

The autocorrelation function (ACF) is used here to describe the correlation of a signal with itself for different time shifts $\tau$ between the considered function values. The ACF of the signal $x(t)$ can be defined symmetrically about the zero point:

$$\Psi_{xx}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t)\, x(t+\tau)\, \mathrm{d}t,$$

or asymmetrically:

$$\Psi_{xx}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} x(t)\, x(t+\tau)\, \mathrm{d}t.$$

In the latter case the result can differ, e.g. for a Dirac function at $t = 0$, on account of its symmetry.

In shorthand, the operator symbol $\star$ is used for the autocorrelation:

$$(x \star x)(\tau) = \int_{-\infty}^{\infty} x^{*}(t)\, x(t+\tau)\, \mathrm{d}t = x^{*}(-\tau) * x(\tau)$$

with $x^{*}$ the complex conjugate of $x$ and $*$ the convolution operator.

For zero-mean, stationary signals the ACF coincides with the autocovariance function. In practice, the autocorrelation function of such signals is therefore usually calculated via the autocovariance function.

For discrete-time signals, the sum is used instead of the integral. With a discrete shift $j$ one obtains:

$$\Psi_{xx}(j) = \sum_{n} x_{n}\, x_{n-j}.$$

In digital signal analysis, the autocorrelation function is usually calculated via the inverse Fourier transform of the auto-power spectrum $S_{XX}(f)$:

$$\Psi_{xx}(\tau) = \int_{-\infty}^{\infty} S_{XX}(f) \cdot e^{\mathrm{i} 2\pi f \tau}\, \mathrm{d}f$$

The theoretical basis of this calculation is the Wiener–Khinchin theorem.
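As a sketch of the discrete-time relations above (the array length and the NumPy helpers are illustrative assumptions, not part of the article), the direct lag sum and the FFT route via the Wiener–Khinchin theorem can be compared numerically:

```python
import numpy as np

def acf_direct(x):
    """Discrete ACF Psi_xx(j) = sum_n x_n x_{n-j} for j = 0..len(x)-1."""
    n = len(x)
    return np.array([np.dot(x[j:], x[:n - j]) for j in range(n)])

def acf_fft(x):
    """Same quantity via Wiener-Khinchin: inverse FFT of the power spectrum.
    Zero-padding to length 2n avoids the cyclic wrap-around of the plain FFT."""
    n = len(x)
    X = np.fft.rfft(x, 2 * n)           # zero-padded transform
    psi = np.fft.irfft(X * np.conj(X))  # inverse transform of |X|^2
    return psi[:n].real

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
assert np.allclose(acf_direct(x), acf_fft(x))
```

For long signals the FFT route is much faster than the direct sum (O(n log n) instead of O(n²)).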

### Impulse ACF

For signals with finite energy content, so-called energy signals, the following definition is useful:

$$\Psi_{xx}^{E}(\tau) = \int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, \mathrm{d}t.$$
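As an illustrative sketch (the rectangular pulse and the grid spacing are assumptions, not from the article): the energy ACF of a unit-height rectangular pulse of width $T$ is the triangle $\Psi_{xx}^{E}(\tau) = T - |\tau|$ for $|\tau| \leq T$, which a discrete approximation of the integral reproduces:

```python
import numpy as np

# Energy ACF of a unit rectangular pulse of width T, approximated on a grid.
dt = 0.001
T = 1.0
t = np.arange(0, 5.0, dt)
x = (t < T).astype(float)                 # rectangular pulse on [0, T)

# Psi^E(tau) = integral x(t) x(t+tau) dt, as a Riemann sum for tau >= 0
psi = np.array([np.dot(x[:len(x) - j], x[j:]) * dt for j in range(len(x))])

tau = np.arange(len(x)) * dt
expected = np.clip(T - tau, 0.0, None)    # analytic triangle T - |tau|
assert np.allclose(psi, expected, atol=2 * dt)
```

Note that $\Psi_{xx}^{E}(0)$ equals the pulse energy $T$, in line with the maximum property discussed below.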

## Properties

### Evenness

The ACF is an even function:

$$\Psi_{xx}(\tau) = \Psi_{xx}(-\tau).$$

### ACF and periodicities

The function $x(t)$ underlying a periodic ACF ($\Psi_{xx}(\tau) = \Psi_{xx}(\tau + nT)$) is itself periodic, as the following argument shows:

$$\Psi_{xx}(nT) = \int_{-\infty}^{\infty} x(t)\, x(t+nT)\, \mathrm{d}t \;=\; \Psi_{xx}(0) = \int_{-\infty}^{\infty} x(t)\, x(t)\, \mathrm{d}t$$

$$\Rightarrow \quad x(t) = x(t+nT).$$

Conversely, it is also true for periodic functions $x(t) = x(t+nT)$ that their ACF $\Psi_{xx}(\tau)$ is periodic:

$$\Psi_{xx}(\tau) = \int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, \mathrm{d}t = \int_{-\infty}^{\infty} x(t)\, x(t+nT+\tau)\, \mathrm{d}t$$

$$\Rightarrow \quad \Psi_{xx}(\tau) = \Psi_{xx}(\tau + nT).$$

It can thus be concluded that a function and its ACF always have the same periodicity:

$$x(t) = x(t+nT) \quad \Leftrightarrow \quad \Psi_{xx}(\tau) = \Psi_{xx}(\tau + nT).$$

If there are repetitions in the signal, maxima of the autocorrelation function occur at the time shifts that correspond to the repetition duration of phenomena in the signal. In this way, for example, hidden periodic components and echo phenomena in signals can be detected.
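A sketch of this use (signal shape, noise level and sampling rate are illustrative assumptions): the repetition duration of a sinusoidal component buried in noise can be read off from the location of the first strong off-zero ACF maximum:

```python
import numpy as np

# Recover a hidden period from a noisy signal by locating a strong
# off-zero maximum of the autocorrelation function.
rng = np.random.default_rng(1)
fs = 1000                        # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)
period = 0.05                    # 50 ms -> 50-sample period
x = np.sin(2 * np.pi * t / period) + 0.5 * rng.standard_normal(t.size)

x = x - x.mean()
acf = np.correlate(x, x, mode="full")[x.size - 1:]   # lags 0, 1, 2, ...

# Ignore the peak at lag 0 (it reflects total signal power); search beyond it.
min_lag = 10
peak_lag = min_lag + np.argmax(acf[min_lag:fs // 2])

# The detected lag lands on (a multiple of) the 50-sample repetition period.
assert min(peak_lag % 50, 50 - peak_lag % 50) <= 5
```

The region around lag 0 is deliberately excluded, in line with the remark on noise strength in the "Finding signal periods" section below.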

### Maximum

Regardless of the definition used, the ACF attains its maximum at $\tau = 0$:

$$|\Psi_{xx}(\tau)| \leq \Psi_{xx}(0)$$

For the ACF of power signals this value is the mean square value; for the impulse (energy) ACF it is the signal energy.
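This bound (a consequence of the Cauchy–Schwarz inequality) can be checked numerically; the random test signal below is an illustrative assumption:

```python
import numpy as np

# For any finite-energy sequence, |Psi(tau)| <= Psi(0).
rng = np.random.default_rng(2)
x = rng.standard_normal(1000)

acf = np.correlate(x, x, mode="full")   # lags -(N-1) .. N-1
lag0 = acf[x.size - 1]                  # Psi(0) = sum x_n^2 = signal energy

assert np.isclose(lag0, np.dot(x, x))
assert np.all(np.abs(acf) <= lag0 + 1e-9)
```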

Often the autocorrelation function is also given normalized to its maximum value at $\tau = 0$:

$$\rho_{xx}(\tau) = \frac{\Psi_{xx}(\tau)}{\Psi_{xx}(0)}$$

The magnitude of this normalized autocorrelation function can take values between 0 and 1. One also speaks of the temporal autocorrelation coefficient of a random variable $X_t$ with its time-shifted counterpart $X_{t+\tau}$.

### Decay behavior

For large times $\tau \to \infty$ and aperiodic functions $x$ we have:

$$\lim_{\tau \to \infty} \Psi_{xx}(\tau) = 0.$$

## Examples

### Example 1

The functions in the adjacent figure are composed of sinusoidal sections of uniform frequency, with phase jumps at the joints. To calculate the correlation, both signal values are multiplied point by point and the products are summed over a longer period. With the shift Δs shown, all individual products in the regions marked red are positive or zero, while in the regions in between they are mostly negative. Only for Δs = 0 are all individual products positive, so the correlation function reaches its maximum value there.

Side note: If you add both signals, piecewise constructive or destructive interference can occur.

### Example 2

In optical coherence tomography, light with a particularly short coherence length is used, because the autocorrelation only delivers a result noticeably different from zero when the lengths of the measuring arm and reference arm match well. If the deviation is larger, the autocorrelation values fluctuate around zero (white-light interferometry).

## Estimation

Analogous to the sample covariance and sample correlation, the sample autocovariance and the sample autocorrelation can also be determined. If the data $x_1, x_2, \ldots, x_T$ of a stationary time series are available, the uncorrected acyclic sample autocovariance is usually estimated as

$$\hat{\gamma}_\tau = \frac{1}{T} \sum_{i=1}^{T-\tau} (x_{i+\tau} - \bar{x})(x_i - \bar{x}), \quad \tau = 0, 1, \ldots$$

where $\textstyle \bar{x} = \frac{1}{T} \sum_{i=1}^{T} x_i$. Note the convention of dividing the sum by $T$ instead of by $T - \tau$, which guarantees that the sequence of sample autocovariances is positive semidefinite. For $\tau = 0$ one obtains the uncorrected sample variance of the data. The sample autocorrelation is then given by

$$\hat{\rho}_\tau = \frac{\hat{\gamma}_\tau}{\hat{\gamma}_0} = \frac{\sum_{i=1}^{T-\tau} (x_{i+\tau} - \bar{x})(x_i - \bar{x})}{\sum_{i=1}^{T} (x_i - \bar{x})^2}, \quad \tau = 0, 1, \ldots$$

with $\hat{\rho}_0 = 1$. The standard errors of sample autocorrelations are usually calculated using Bartlett's formula (see also: correlogram).

To obtain the unbiased acyclic sample autocovariance instead, one divides by $T - \tau$:

$$\hat{\gamma}_\tau = \frac{1}{T - \tau} \sum_{i=1}^{T-\tau} (x_{i+\tau} - \bar{x})(x_i - \bar{x}), \quad \tau = 0, 1, \ldots$$

On modern computers, the unbiased acyclic sample correlation can be calculated more quickly in Fourier space (see also the Wiener–Khinchin theorem) by padding the signal with zeros ("zero padding"). The appended zeros ensure that it is not the cyclic sample correlation that is computed (which would assume a periodic signal) but the acyclic one:

$$\hat{\gamma}_\tau = \frac{1}{T - \tau}\, \mathrm{IDFT}_\tau\!\left(\mathrm{DFT}(\mathrm{Zeropad}(x - \bar{x})) \cdot \overline{\mathrm{DFT}(\mathrm{Zeropad}(x - \bar{x}))}\right)$$

## Applications

The autocorrelation is used, among other things, in regression analysis, time series analysis and image processing. For example, in regression analysis the disturbance variables, i.e. the deviations of the observed values from the true regression line, are interpreted as a sequence of identically distributed random variables. For the regression analysis to deliver meaningful results, the disturbance variables must be uncorrelated. In time series analysis, the autocorrelation function is often used together with the partial autocorrelation function to identify ARMA models.
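For example, the model-identification idea can be sketched for an AR(1) process, whose theoretical autocorrelation function is $\rho_\tau = \varphi^{\tau}$ (the simulation parameters below are illustrative assumptions):

```python
import numpy as np

# Sample ACF of a simulated AR(1) process x_t = phi * x_{t-1} + e_t;
# its decay rho_tau ~ phi**tau is what ACF-based identification exploits.
rng = np.random.default_rng(3)
phi = 0.8
T = 20000
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + rng.standard_normal()

xc = x - x.mean()

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(tau) (sums divided by the lag-0 sum)."""
    denom = np.dot(x, x)
    return np.array([np.dot(x[tau:], x[:len(x) - tau]) / denom
                     for tau in range(max_lag + 1)])

rho = sample_acf(xc, 5)
assert rho[0] == 1.0
assert abs(rho[1] - phi) < 0.05       # theoretical ACF of AR(1): phi**tau
assert abs(rho[2] - phi ** 2) < 0.05
```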

### Finding signal periods

A common application of the autocorrelation function is to find periodicities that are not readily apparent in heavily noise-corrupted signals:

• The autocorrelation function of a periodic signal is again a periodic signal with the same period. For example, the autocorrelation function of a cosine signal

  $$x(t) = \hat{x} \cos(\omega t + \varphi)$$

  is again a cosine function with the same angular frequency $\omega$ (preservation of the signal period):

  $$R_{xx}(\tau) = \frac{\hat{x}^{2}}{2} \cos(\omega \tau).$$

  However, the phase information $\varphi$ is lost here.
  An equivalent way of finding the signal period is to examine the Fourier spectrum of the signal for a dominant frequency. Since the autocorrelation function is related to the power spectral density by Fourier transformation (Wiener–Khinchin theorem), the two approaches are equivalent.
• Since white noise at one point in time is completely independent of white noise at another point in time, the autocorrelation function of white noise is a Dirac pulse at $\tau = 0$. For white noise with power spectral density $S_0$ over the frequencies $\omega = -\infty \ldots +\infty$:

  $$R_{xx}(\tau) = S_0\, \delta(\tau).$$

  For colored noise, which in technical systems occurs mostly instead of white noise, there is likewise an absolute maximum of the autocorrelation function at $\tau = 0$ and a decay of the autocorrelation function for shifts $|\tau| > 0$. The width of this maximum is determined by the "color" of the noise.

When analyzing periodicities, only the autocorrelation function for large values of $\tau$ is considered, and the region around $\tau = 0$ is ignored, since it primarily contains information about the strength of the noise signal.

### Signal-to-noise ratio

Since the value of the autocorrelation function at $\tau = 0$ corresponds to the mean square value (for power signals) or the signal energy (for energy signals), the autocorrelation function offers a relatively simple way to estimate the signal-to-noise ratio.

To do this, one divides the value $\lim_{\tau \to 0} R_{xx}(\tau)$, i.e. the value the autocorrelation function would have at position 0 without noise, by the height of the "noise peak". When converting the signal-to-noise ratio $S_x / N_x$ to decibels, one must use $10 \cdot \log\left(\tfrac{S_x}{N_x}\right)$ and not $20 \cdot \log\left(\tfrac{S_x}{N_x}\right)$, because the autocorrelation function at position 0 represents a power or energy quantity (a squared quantity), not a field quantity.
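A minimal numerical sketch of this estimate (the sine frequency, the noise level, and the choice of reading the ACF one lag past zero are assumptions for illustration):

```python
import numpy as np

# Estimate the SNR of a noisy sine from its ACF: R(0) contains signal power
# plus noise power; for white noise the noise contribution vanishes for
# |tau| > 0, so the ACF just past lag 0 approximates the signal power alone.
rng = np.random.default_rng(4)
fs = 10000
t = np.arange(0, 1.0, 1 / fs)
amp, sigma = 1.0, 0.5
x = amp * np.sin(2 * np.pi * 50 * t) + sigma * rng.standard_normal(t.size)

n = x.size
acf = np.correlate(x, x, mode="full")[n - 1:] / n  # biased power-ACF estimate

total_power = acf[0]                  # S + N (height including the noise peak)
signal_power = acf[1]                 # just past the noise spike: approx S
noise_power = total_power - signal_power
snr_db = 10 * np.log10(signal_power / noise_power)  # power quantity: 10 log

# True values here: S = amp**2 / 2 = 0.5, N = sigma**2 = 0.25 -> about 3 dB
assert abs(signal_power - 0.5) < 0.05
assert abs(snr_db - 3.01) < 1.0
```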