# Fast folding

The fast convolution is an algorithm for the calculation of the discrete, aperiodic convolution operation with the help of the fast Fourier transform (FFT). The computationally intensive aperiodic convolution operation in the time domain is replaced by a much simpler, functionally equivalent, multiplication in the frequency domain . Fast folding is used in the field of digital signal processing and the like. a. with non-recursive digital filters ( FIR filters ) to reduce the calculation effort.

## From discrete convolution to fast discrete convolution

In the time domain , the discrete convolution is defined as:

${\ displaystyle h (n) = (f * g) (n) = \ sum _ {k} f (k) \ cdot g (nk)}$

If the discrete convolution is discretely Fourier-transformed into the frequency domain , the following term results:

${\ displaystyle H (N) = {\ mathcal {F}} (h (n)) = {\ mathcal {F}} (f (n)) \ cdot {\ mathcal {F}} (g (n)) }$

${\ displaystyle H (N)}$is calculated and then transformed back into the time domain , resulting in the function you are looking for:

${\ displaystyle h (n) = {\ mathcal {F}} ^ {- 1} (H (N)) = (f * g) (n)}$.

## application

The areas of application of fast convolution primarily include filter applications in the area of non-recursive filters (FIR filters). The methods used here are called the overlap save method and overlap add method . Since the fast convolution using FFT and its grouping of the data in blocks is a cyclic convolution , but on the other hand FIR filters work with the aperiodic convolution in the time domain, procedures are necessary to equate the periodic with the aperiodic convolution to the data to "correct" in the overlapping areas between the data blocks. This is where the name overlap in the fast folding process is derived .

## Overlap failure

In the following, the effect of the overlap in fast (cyclic) convolution is discussed, and in which cases the result is identical to the discrete linear convolution . ${\ displaystyle h [n]}$

If the input sequence of length  K is folded with the impulse response of length  G , the linear convolution results in output values . ${\ displaystyle f [n]}$ ${\ displaystyle g [n]}$${\ displaystyle N = K + G-1}$${\ displaystyle h [n]}$

In order to be able to carry out the cyclic convolution via the DFT at all, the input sequence and impulse response must be of the same length. If this is not the case, the shorter vector must be balanced by adding zeros ( zero-padding ).

To simplify the following consideration, we assume that the impulse response is shorter than the input sequence ( ). ${\ displaystyle g [n]}$${\ displaystyle f [n]}$${\ displaystyle G

Since the inverse transformation using IFFT from the spectrum in this case delivers as a convolution product instead of also just K samples (see properties of the DFT), there is an overlap error at the first positions in the output sequence . In addition, it is too short, as the missing values add to the first. The error can be corrected by adding at least zeros to and zeros to (additional zeros at the end do not disturb the DFT algorithm , if the length is a power of two, some FFT implementations work much more efficiently). ${\ displaystyle N = K + G-1}$${\ displaystyle G-1}$${\ displaystyle G-1}$ ${\ displaystyle G-1}$${\ displaystyle f [n]}$${\ displaystyle K-1}$${\ displaystyle g [n]}$

With the help of zero padding , both vectors now have elements; the result of the fast convolution also provides values. As with linear convolution, there is no longer any overlap error. ${\ displaystyle N = K + G-1}$${\ displaystyle h [n]}$${\ displaystyle N = K + G-1}$

The use of fast convolution with the help of the FFT leads to a reduction in the computational effort , so that it no longer depends proportionally on the length (number of values) K of the function for each value (each sample) , but only logarithmically on the selected block size  N. , with the constraint that N must be greater than K , with which the FFT is carried out. ${\ displaystyle g (k)}$

For a set of N values ​​(samples) the following complexities result (measured as the number of basic arithmetic operations such as additions and multiplications):

• discrete folding
${\ displaystyle O (N \ cdot K)}$
• fast discrete folding
${\ displaystyle O (N \ cdot \ mathrm {log} (N))}$if .${\ displaystyle N> K}$

FIR filters implemented in practice can usually be implemented more efficiently from an order of approx. 20 to 40 using fast convolution than in the direct form.