# Fast Fourier transform

Time-based representation (top) and frequency-based representation (bottom) of the same signal, whereby the lower representation can be obtained from the upper representation by Fourier transformation.

The fast Fourier transform ( English fast Fourier transform , therefore mostly abbreviated to FFT ) is an algorithm for the efficient calculation of the discrete Fourier transform (DFT). With it, a time-discrete signal can be broken down into its frequency components and thus analyzed.

Similarly, there is the inverse fast Fourier transformation (IFFT) for the discrete inverse Fourier transformation . The same algorithms are used in IFFT, but with conjugate coefficients.

The FFT has numerous applications (see also below ) in engineering , natural sciences, and applied mathematics . It is also used in mobile radio technologies such as UMTS and LTE and in wireless data transmission, for example in WLAN radio network technology.

The FFT belongs to the divide-and-conquer method , so that - in contrast to direct calculation - previously calculated intermediate results can be reused and arithmetic operations can be saved. The most famous method is attributed to James Cooley and John W. Tukey , who published it in 1965. Strictly speaking, a form of the algorithm was designed by Carl Friedrich Gauß as early as 1805 , who used it to calculate the trajectories of the asteroids (2) Pallas and (3) Juno . A variant of the algorithm by Carl Runge was published for the first time in 1903 and 1905. In addition, restricted forms of the algorithm were developed several times before Cooley and Tukey; B. by Irving John Good (1960). After Cooley and Tukey, there have also been numerous suggestions for improvement and variations, for example from Georg Bruun , CM Rader and Leo I. Bluestein .

## Informal description of the algorithm (Cooley and Tukey)

Cooley and Tukey's algorithm is a classic divide-and-conquer method. The prerequisite for its use is that the number of support points or sampling points is a power of two.

The algorithm is based on the observation that the calculation of a DFT of size 2n can be broken down into two calculations of a DFT of size n (via the vector with the entries of the even and the odd indices), with the two partial results reappearing after the transformation a Fourier transformation of size 2n are to be summarized.

Since the calculation of a DFT of half the length only requires a quarter of the complex multiplications and additions of the original DFT and, depending on the length of the output vector, this rule can be applied several times in a row, the recursive application of this basic idea finally allows a calculation in time; To save on trigonometric arithmetic operations, the properties of the unit roots from the Fourier matrix can also be used with the FFT . ${\ displaystyle {\ mathcal {O}} (n \ log n)}$

## Cooley and Tukey's algorithm

The discrete Fourier transform (DFT) of a vector of dimension is: ${\ displaystyle (x_ {0}, \ dotsc, x_ {2n-1})}$${\ displaystyle 2n}$

${\ displaystyle f_ {m} = \ sum _ {k = 0} ^ {2n-1} x_ {k} \, e ^ {- {\ frac {2 \ pi i} {2n}} mk} \ qquad m = 0, \ dotsc, 2n-1}$.

The entries with even indices are noted as

${\ displaystyle x '_ {k} = x_ {2k}, \ quad k = 0, \ dotsc, n-1}$

and their DFT of dimension as . ${\ displaystyle n}$${\ displaystyle (f '_ {k})}$

Accordingly, the entries with odd indices are noted as

${\ displaystyle x '' _ {k} = x_ {2k + 1}, \ quad k = 0, \ dotsc, n-1}$

with DFT . ${\ displaystyle (f '' _ {k})}$

Then follows:

{\ displaystyle {\ begin {aligned} f_ {m} & = \ sum _ {k = 0} ^ {n-1} x_ {2k} e ^ {- {\ frac {2 \ pi i} {2n}} m (2k)} + \ sum _ {k = 0} ^ {n-1} x_ {2k + 1} e ^ {- {\ frac {2 \ pi i} {2n}} m (2k + 1)} \\ [0.5em] & = \ sum _ {k = 0} ^ {n-1} x '_ {k} e ^ {- {\ frac {2 \ pi i} {n}} mk} + e ^ {- {\ frac {\ pi i} {n}} m} \ sum _ {k = 0} ^ {n-1} x '' _ {k} e ^ {- {\ frac {2 \ pi i} {n}} mk} \\ [0.5em] & = {\ begin {cases} f '_ {m} + e ^ {- {\ frac {\ pi i} {n}} m} f' '_ { m} & {\ text {if}} m

With the calculation of and ( ), both , and are determined. This decomposition has practically halved the computational effort. ${\ displaystyle f '_ {m}}$${\ displaystyle f '' _ {m}}$${\ displaystyle m = 0, \ dotsc, n-1}$${\ displaystyle f_ {m}}$${\ displaystyle f_ {m + n}}$

## Mathematical description (general case)

In mathematics, the fast discrete Fourier transform is treated in a much more general context:

Let be a commutative unitary ring . In let the number be a unit (i.e. invertible); further be a -th root of unity with . For example is in the remainder class ring${\ displaystyle R}$ ${\ displaystyle R}$${\ displaystyle 2 = 1 + 1}$${\ displaystyle w \ in R}$${\ displaystyle 2 ^ {n}}$${\ displaystyle w ^ {2 ^ {n-1}} = - 1}$

${\ displaystyle R = \ mathbb {Z} / N \ mathbb {Z}}$with , , d is odd (which is equivalent to the requirement "relatively prime to "),${\ displaystyle N = 2 ^ {d2 ^ {n}} + 1}$${\ displaystyle d, n \ in \ mathbb {N}}$${\ displaystyle 2 ^ {n}}$

the element is such a unit root, the corresponding FFT is used in the Schönhage-Strassen algorithm . ${\ displaystyle w = 2 ^ {2d}}$

Then be in the module to the discrete Fourier transform with ${\ displaystyle R ^ {2 ^ {n}}}$${\ displaystyle a \ in R ^ {2 ^ {n}}}$${\ displaystyle {\ hat {a}}}$

${\ displaystyle {\ hat {a}} _ {k} = \ sum _ {j = 0} ^ {2 ^ {n} -1} a_ {j} \ cdot w ^ {j \ cdot k} \ qquad ( k = 0, \ ldots, 2 ^ {n} -1)}$

calculate optimized as follows:

First, we represent the indices and in two ways as follows: ${\ displaystyle j}$${\ displaystyle k}$

${\ displaystyle k = \ sum _ {\ nu = 0} ^ {n-1} k _ {\ nu} 2 ^ {\ nu}, \ quad j = \ sum _ {\ nu = 0} ^ {n-1 } j _ {\ nu} 2 ^ {n-1- \ nu}}$.

So we have the following recursion:

${\ displaystyle A_ {0} (j_ {0}, \ ldots, j_ {n-1}) = a_ {j} \ qquad (j = 0, \ ldots, 2 ^ {n} -1)}$,
${\ displaystyle A_ {r + 1} (k_ {0}, \ ldots, k_ {r}, j_ {r + 1}, \ ldots, j_ {n-1}) = \ sum _ {j_ {r} = 0} ^ {1} A_ {r} (k_ {0}, \ ldots, k_ {r-1}, j_ {r}, \ ldots, j_ {n-1}) \ times w ^ {j_ {r} \ cdot 2 ^ {n-1-r} \ cdot (k_ {r} 2 ^ {r} + \ ldots + k_ {0} 2 ^ {0})}}$.

Because of

${\ displaystyle A_ {n} (k_ {0}, \ ldots, k_ {n-1}) = {\ hat {a}} _ {k}}$

we get the discrete Fourier transform from this . ${\ displaystyle {\ hat {a}} \ in R ^ {2 ^ {n}}}$

## complexity

In contrast to DFT, this classic variant of FFT according to Cooley and Tukey can only be carried out if the length of the input vector corresponds to a power of two. The number of sampling points can therefore be 1, 2, 4, 8, 16, 32, etc., for example. One speaks here of a Radix-2-FFT. Other lengths are possible using the alternative algorithms listed below.

The following recursion equation for the runtime of the FFT results from the above recursion:

${\ displaystyle T \ left (N \ right) = 2T \ left (N / 2 \ right) + f (N)}$

The term describes the effort to multiply the results by a power of the unit root and to add the results. There are N pairs of numbers and adds N / 2 multiplied numbers with unit roots. Overall, f (N) is linearly bounded: ${\ displaystyle f (N)}$

${\ displaystyle T \ left (N \ right) = 2T \ left (N / 2 \ right) + {\ mathcal {O}} \ left (N \ right)}$

With the master's theorem the running time is:

${\ displaystyle T \ left (N \ right) = {\ mathcal {O}} (N \ cdot \ log (N))}$

The structure of the data flow can be described by a butterfly graph , which defines the order of the calculation.

## Implementation as a recursive algorithm

The direct implementation of the FFT in pseudocode according to the above rule takes the form of a recursive algorithm :

• The field with the input values ​​is transferred as a parameter to a function that divides it into two half as long fields (one with the values ​​with an even index and one with the values ​​with an odd index).
• These two fields are now transferred to new instances of this function.
• At the end, each function returns the FFT of the field passed as a parameter. Before an instance of the function is terminated, these two FFTs are combined into a single FFT according to the formula shown above - and the result is returned to the caller.

This is now continued until the argument of a call to the function only consists of a single element (recursion termination): The FFT of a single value is (it has itself as a constant component, and no other frequencies) itself. The function that only receives a single value as a parameter, can therefore return the FFT of this value without any calculation - the function that called it combines the two FFTs, each 1 point long, that it receives back, the function that called it in turn, the two 2-point FFTs, and so on.

${\ displaystyle \ mathrm {function} \ \ operatorname {fft} (n, {\ vec {f}}):}$

${\ displaystyle \ mathrm {if} \ (n = 1)}$
${\ displaystyle \ mathrm {return} \ {\ vec {f}}}$
${\ displaystyle \ mathrm {else} \}$
${\ displaystyle {\ vec {g}} = \ operatorname {fft} \ left ({\ tfrac {n} {2}}, (f_ {0}, f_ {2}, \ ldots, f_ {n-2} ) \ right)}$
${\ displaystyle {\ vec {u}} = \ operatorname {fft} \ left ({\ tfrac {n} {2}}, (f_ {1}, f_ {3}, \ ldots, f_ {n-1} ) \ right)}$
${\ displaystyle \ mathrm {for} \ k = 0 \ \ mathrm {to} \ {\ tfrac {n} {2}} - 1}$
{\ displaystyle {\ begin {aligned} c_ {k} = g_ {k} + u_ {k} \ cdot e ^ {- 2 \ pi ik / n} \\ c_ {k + n / 2} = g_ {k } -u_ {k} \ cdot e ^ {- 2 \ pi ik / n} \ end {aligned}}}
${\ displaystyle \ mathrm {return} \ {\ vec {c}}}$

The speed advantage of the FFT compared to the DFT can be estimated well using this algorithm:

• In order to calculate the FFT of an element-long vector, recursion levels are required when using this algorithm . The number of vectors to be calculated doubles in each level - while their length is halved, so that in the end precisely complex multiplications and additions are necessary in every level except for the last recursion level . So the total number of additions and multiplications is${\ displaystyle 2 ^ {N}}$${\ displaystyle N}$${\ displaystyle n = 2 ^ {N}}$${\ displaystyle N \ cdot 2 ^ {N}}$
• In contrast, the DFT requires complex multiplications and additions for the same input vector .${\ displaystyle (2 ^ {N}) ^ {2}}$

## Implementation as a non-recursive algorithm

The implementation of a recursive algorithm is usually not ideal in terms of resource consumption, since the many function calls required for this require computing time and memory for memorizing the return addresses. In practice, a non-recursive algorithm is therefore usually used, which can be optimized depending on the application compared to the form shown here, which is optimized for easy understanding:

• In the above algorithm, if first the two halves of the field are swapped with each other, and then the two halves of these halves, etc. - then the result in the end is the same, as if all elements of the field were numbered in ascending order from 0 and then the order of the bits of the Field numbers reversed.
• After the input values ​​have been re-sorted in this way, the only task left is to combine the individual short FFTs from the last recursion level to the outside into longer FFTs, e.g. B. in the form of three nested loops:
• The outermost loop counts the recursion level through (from 0 to N-1 ).${\ displaystyle {\ text {level}}}$
• The next loop counts through the FFT sections in which the FFT is still divided in this recursion level. The counter of this loop is referred to below as .${\ displaystyle 2 ^ {N - {\ text {level}} - 1}}$${\ displaystyle {\ text {section}}}$
• The innermost loop counts the element within an FFT section (referred to below ) through (from 0 to )${\ displaystyle {\ text {Element of the section}}}$${\ displaystyle 2 ^ {\ text {level}} - 1}$
• In the innermost of these loops there are always the two samples with the following two indices:
${\ displaystyle {\ text {left}} = 2 ^ {{\ text {level}} + 1} \ cdot {\ text {section}} + {\ text {element of the section}}}$
${\ displaystyle {\ text {right}} = {\ text {left}} + 2 ^ {\ text {level}}}$
combined via a butterfly graph:
${\ displaystyle {\ begin {array} {rcl} x _ {\ text {left, new}} & = & x _ {\ text {left}} + e ^ {- i \ pi {\ frac {\ text {Element of the section }} {2 ^ {\ text {level}}}}} \ cdot x _ {\ text {right}} \\ x _ {\ text {right, new}} & = & x _ {\ text {left}} - e ^ {-i \ pi {\ frac {\ text {Element of the section}} {2 ^ {\ mathrm {level}}}}} \ cdot x _ {\ text {right}} \ end {array}}}$

## Alternative forms of FFT

In addition to the FFT algorithm by Cooley and Tukey, also called the Radix-2 algorithm, shown above, there are a number of other algorithms for fast Fourier transformation. The variants differ in how certain parts of the “naive” algorithm are transformed in such a way that fewer (high-precision) multiplications are necessary. It usually applies here that the reduction in the number of multiplications causes an increased number of additions as well as intermediate results to be kept in the memory at the same time.

Some further algorithms are shown below in an overview. Details and exact mathematical descriptions including derivations can be found in the literature given below.

The radix-4 algorithm is, analogous to it, the radix-8 algorithm or, in general, the radix-2 N algorithm, a further development of the radix-2 algorithm above. The main difference is that the number of data to be processed points is a power of 4 and 2 of N must represent. The processing structure remains the same, except that in the butterfly graph, instead of two data paths, four or eight and generally 2 N data paths have to be linked to one another per element . The advantage consists in a further reduced computational effort and thus a speed advantage. Compared with the above algorithm by Cooley and Tukey, the Radix-4 algorithm requires approx. 25% fewer multiplications. With the Radix-8 algorithm, the number of multiplications is reduced by approx. 40%.

Disadvantages of this method are the coarser structure and a complex program code. With the Radix-4 algorithm, only blocks of lengths 4, 16, 64, 256, 1024, 4096, ... can be processed. With the Radix-8 algorithm, the restrictions are to be seen analogously.

With this algorithm, only a certain, finite number of supporting points of the number is possible, namely: ${\ displaystyle N}$

${\ displaystyle N = \ prod _ {j = 1} ^ {m} N_ {j} \ qquad N_ {j} \ in \ {2,3,4,5,7,8,9,16 \}}$

where all combinations of N j are permissible, in which the N j used are relatively prime . This means that a maximum block length of only 5040 is possible. The possible values ​​for are in the range up to 5040 closer to the number line than the powers of two. This enables better fine-tuning of the block length. The algorithm is built up from basic blocks of the DFT, the lengths of which correspond to N j . With this method, the number of multiplications is reduced compared to the Radix-2 algorithm, but at the same time the number of additions required increases. In addition, a complex permutation of the data is necessary at the input and output of each DFT, which is formed according to the rules of the Chinese remainder of the law . ${\ displaystyle N}$

In practical implementations, this type of fast Fourier transformation has advantages over the Radix-2 method when the microcontroller used for the FFT does not have its own multiplication unit and a great deal of computing time has to be used for the multiplications. In today's signal processors with their own multiplier units, this algorithm is no longer of major importance.

### Prime factor algorithm

This FFT algorithm is based on similar ideas as the Winograd algorithm, but the structure is simpler and thus the multiplication effort is higher than with the Winograd algorithm. The main advantage of the implementation lies in the efficient use of the available memory through optimal adaptation of the block length. If a fast multiplying unit is available in a particular application and memory is tight at the same time, this algorithm can be optimal. With a similar block length, the execution time is comparable to that of the Cooley and Tukey algorithm.

### Goertzel algorithm

The Goertzel algorithm is a special form of efficient calculation of individual spectral components and is more efficient at calculating only a few spectral components ( bins ) than all block-based FFT algorithms, which always calculate the complete discrete spectrum.

### Chirp z transform

Bluestein FFT algorithm for data volumes of any size (including prime numbers ).

## The inverse FFT

The inverse of the discrete Fourier transform (DFT) agrees with the DFT except for the normalization factor and a sign. Since the fast Fourier transformation is an algorithm for calculating the DFT, this naturally also applies to the IFFT.

## Applications

### Computer algebra

Fast polynomial multiplication in sub-square running time.

Classic applications of the fast Fourier transformation can be found, for example, in computer algebra in connection with the implementation of fast polynomial- processing algorithms. As illustrated in the chart on the right can be as rapid multiplication of two polynomials and to realize subquadratic term. First of all, the coefficient sequences corresponding to the two polynomials and corresponding to the two polynomials are transformed by fast Fourier transformation in runtime , so that the Fourier-transformed coefficient sequences corresponding to the polynomial are obtained by component-wise multiplication in runtime . This is ultimately transformed back in runtime using a fast inverse Fourier transformation. The total running time is in and is therefore asymptotically more efficient compared to the classic polynomial multiplication with running time . ${\ displaystyle p_ {a} (x) = \ sum \ nolimits _ {i = 0} ^ {n} a_ {i} x ^ {i}}$${\ displaystyle p_ {b} (x) = \ sum \ nolimits _ {i = 0} ^ {n} b_ {i} x ^ {i}}$${\ displaystyle p_ {c} (x): = p_ {a} (x) \ cdot p_ {b} (x) = \ sum \ nolimits _ {i = 0} ^ {n} c_ {i} x ^ { i}}$${\ displaystyle p_ {a}}$${\ displaystyle p_ {b}}$${\ displaystyle {\ mathcal {O}} (n \ log n)}$${\ displaystyle p_ {c}}$${\ displaystyle {\ mathcal {O}} (n)}$${\ displaystyle {\ mathcal {O}} (n \ log n)}$${\ displaystyle {\ mathcal {O}} (n \ log n)}$${\ displaystyle {\ mathcal {O}} (n ^ {2})}$

### Further areas of application

The other areas of application of the FFT are so varied that only a selection can be shown here:

• Financial math
• Signal analysis
• Acoustics (audio measurements). A relatively trivial application is many guitar tuners or similar programs that benefit from the high speed of the FFT.
• Calculation of spectrograms (diagrams showing the amplitudes of the respective frequency components)
• Reconstruction of the image with the magnetic resonance tomograph or the analysis of crystal structures by means of X-rays , in each of which the Fourier transform of the desired image or the square of this Fourier transform is created.
• Measurement technology / general
• Digital network analyzers that try to determine the behavior of a circuit , a component or a line on a conductor path when operated with any frequency mixture.
• digital signal processing
• Synthesis of audio signals from individual frequencies via the inverse FFT
• To reduce the computational effort for circular convolution in the time domain of FIR filters and replace it with the fast Fourier transform and simple multiplications in the frequency domain . (see also fast folding ). The fast folding offers z. B. the possibility to transport any audio or similar signals with little computing effort through very complex filters ( equalizer, etc.).
• Compression algorithms often use FFT or, like the well-known MP3 format, the discrete cosine transformation that is related to it . The FFT of images or sounds often results in only relatively few frequency components with high amplitudes. This is advantageous when using a method of storing results that requires fewer bits to represent lower numbers, such as B. the Huffman coding . In other cases it is exploited that some of the frequencies can be left out without greatly impairing the result, so that the data flow can be reduced.
• telecommunications
• Long wave reception with the PC
• Broadband data transmission via OFDM , the basis for ADSL and WLAN (Internet), the various DVB transmission standards for digital television e.g. B. via antenna, cable and TV satellite , DRM , DAB (radio) and LTE (4th generation mobile communications). The high speed of data transmission is achieved here by the fact that many relatively slow data transmissions are operated simultaneously on many carrier frequencies . The complex signal, which is created by superimposing the individual signals, is then broken down into individual signal carriers by the remote station using the FFT.

## literature

### Magazine articles

• James W. Cooley, John W. Tukey: An algorithm for the machine calculation of complex Fourier series. In: Math. Comput. 19, 1965, pp. 297-301.
• CM Rader: Discrete Fourier transforms when the number of data samples is prime. In: Proc. IEEE. 56, 1968, pp. 1107-1108.
• Leo I. Bluestein: A linear filtering approach to the computation of the discrete Fourier transform. In: Northeast Electronics Research and Engineering Meeting Record. 10, 1968, pp. 218-219.
• Georg Bruun: z-Transform DFT filters and FFTs. In: IEEE Trans. On Acoustics, Speech and Signal Processing (ASSP). 26, No. 1, 1978, pp. 56-63.
• MT Heideman, DH Johnson, CS Burrus: Gauss and the History of the Fast Fourier Transform. In: Arch. Hist. Sc. 34, No. 3, 1985.

### Books

• Alan V. Oppenheim, Ronald W. Schafer: Discrete-time signal processing. 3. Edition. R. Oldenbourg Verlag, Munich / Vienna 1999, ISBN 3-486-24145-1 .
• E. Oran Brigham: FFT. Fast Fourier transform. R. Oldenbourg Verlag, Munich / Vienna 1995, ISBN 3-486-23177-4 .
• Steven W. Smith: The Scientist and Engineer's Guide to Digital Signal Processing . 1st edition. Elsevier Ltd, Oxford, 2002, ISBN 978-0-7506-7444-7 , chap. 18 (English, dspguide.com ).