Parametric audio coding

from Wikipedia, the free encyclopedia

Parametric audio coding methods are mostly used for audio data compression at low and very low bit rates.

Technology

The signal is analyzed and decomposed into objects. Each object is described by parameters from which the decoder can synthesize a similar-sounding audio signal.

The basic assumption behind a parametric audio encoder is that most audio signals, and speech in particular, can be synthesized from sine tones and noise. From the input signal, the encoder extracts parameters for individual sinusoidal tones (amplitude and frequency), for harmonic tones (fundamental frequency, amplitude and spectral characteristics of the partials) and for noise (amplitude and spectral characteristics). An encoder of this type can code audio with a typical sample rate of 8 kHz at 6 to 16 kbit/s.
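The sines-plus-noise assumption can be illustrated from the decoder's point of view. The following is a minimal sketch, not the method of any particular codec: the function name `synthesize`, the parameter layout (amplitude, frequency in Hz, phase in radians) and the example values are assumptions made for illustration only.

```python
import math
import random

def synthesize(sines, noise_rms, n_samples, sample_rate=8000, seed=0):
    """Sum sinusoids given as (amplitude, frequency_hz, phase_rad) tuples
    and add white Gaussian noise with the given RMS level."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        s = sum(a * math.cos(2.0 * math.pi * f * t + p) for a, f, p in sines)
        s += rng.gauss(0.0, noise_rms)  # noise component
        out.append(s)
    return out

# Hypothetical parameter set: a 200 Hz fundamental with two partials.
params = [(0.5, 200.0, 0.0), (0.25, 400.0, 0.0), (0.125, 600.0, 0.0)]

# One 20 ms frame at the 8 kHz sample rate mentioned in the text.
frame = synthesize(params, noise_rms=0.01, n_samples=160)
```

Only the handful of numbers in `params` plus a noise level would have to be transmitted for this frame, rather than 160 sample values.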

A typical codec extracts the sinusoid information by applying a short-time Fourier transform to the sample values in order to identify the important harmonic content of a frame. By comparing the sine tones across frames, it can group them and separate harmonic lines from individual sine tones. The matching can take amplitude, frequency and phase differences into account. Matched tones can be described with fewer bits than independent individual tones would require. The longer a detected run of matching tones lasts, the more bit rate is saved overall.
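The analysis step can be sketched as peak picking on a windowed spectrum followed by frame-to-frame matching. This is a simplified illustration under stated assumptions: the helper names (`tone_frame`, `magnitude_spectrum`, `pick_peaks`, `match_tracks`), the fixed magnitude threshold and the greedy nearest-frequency matching are all hypothetical choices, and the matching uses frequency only, whereas the text notes that amplitude and phase can also be considered.

```python
import cmath
import math

def tone_frame(freqs, n_len, sample_rate):
    """Test signal: a sum of unit-amplitude cosines."""
    return [sum(math.cos(2.0 * math.pi * f * n / sample_rate) for f in freqs)
            for n in range(n_len)]

def magnitude_spectrum(frame):
    """Hann-windowed DFT magnitudes for bins 0 .. N/2 (direct O(N^2) DFT)."""
    n_len = len(frame)
    x = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len))
         for n, s in enumerate(frame)]
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_len)
                    for n in range(n_len)))
            for k in range(n_len // 2 + 1)]

def pick_peaks(mags, sample_rate, n_len, threshold):
    """Frequencies (Hz) of local maxima above a magnitude threshold."""
    return [k * sample_rate / n_len
            for k in range(1, len(mags) - 1)
            if mags[k] > threshold and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]

def match_tracks(prev_freqs, cur_freqs, max_jump_hz):
    """Greedily pair each previous peak with the nearest unused current peak."""
    pairs, unused = [], list(cur_freqs)
    for f_prev in prev_freqs:
        if not unused:
            break
        best = min(unused, key=lambda f: abs(f - f_prev))
        if abs(best - f_prev) <= max_jump_hz:
            pairs.append((f_prev, best))
            unused.remove(best)
    return pairs

FS, N = 8000, 256  # bin spacing is 8000 / 256 = 31.25 Hz
peaks_a = pick_peaks(magnitude_spectrum(tone_frame([500.0, 1000.0], N, FS)), FS, N, 10.0)
peaks_b = pick_peaks(magnitude_spectrum(tone_frame([531.25, 1000.0], N, FS)), FS, N, 10.0)
tracks = match_tracks(peaks_a, peaks_b, max_jump_hz=50.0)
```

In this example the lower tone moves by one bin between the two frames and is still assigned to the same track, so an encoder could transmit a small frequency difference instead of a new, independent tone.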

The decoder then superimposes the synthesized parts. Weighting the synthesized parts with a Hann window (often called a Hanning window) gives a smooth transition between them. The same applies to the encoder, since the short-time Fourier transform gives better results if the data is first weighted with a Hann window.
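Superimposing windowed parts is commonly done by overlap-add. The sketch below assumes a periodic Hann (Hanning) window and a hop of half the frame length, a combination for which the overlapping windows sum to one; the function names and the frame sizes are illustrative assumptions, not taken from a specific codec.

```python
import math

def hann(n_len):
    """Periodic Hann window of length n_len."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def overlap_add(frames, hop):
    """Weight each frame with a Hann window and sum the frames at multiples of hop."""
    n_len = len(frames[0])
    w = hann(n_len)
    out = [0.0] * (hop * (len(frames) - 1) + n_len)
    for i, frame in enumerate(frames):
        start = i * hop
        for n in range(n_len):
            out[start + n] += w[n] * frame[n]
    return out

# Three constant frames of length 8 with 50 % overlap (hop = 4).
frames = [[1.0] * 8 for _ in range(3)]
out = overlap_add(frames, hop=4)
```

Away from the first and last half frame, the windows add up to a constant gain, so consecutive parts cross-fade into each other without a step at the frame boundary.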

Synthesizing the sine tones alone sounds artificial and metallic. To mask this, the encoder subtracts the synthesized sinusoidal tones from the input signal, fits a linear filter to the residual signal, and replaces the residual with white noise shaped by that filter. The resulting parameters are then quantized, coded and multiplexed into a bit stream.
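One common way to describe a residual with a linear filter is linear prediction. The sketch below assumes that approach (autocorrelation plus the Levinson-Durbin recursion, with white noise driving the resulting all-pole filter); the text does not say which filter design is used, and all function names here are hypothetical.

```python
import math
import random

def subtract(signal, synthesized):
    """Residual = input minus the synthesized sinusoids."""
    return [s - y for s, y in zip(signal, synthesized)]

def autocorr(x, max_lag):
    """Unnormalized autocorrelation for lags 0 .. max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err) with a[0] = 1 for the error filter A(z) = sum a[j] z^-j
    and err the remaining prediction error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable residual
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def shaped_noise(a, gain, n_samples, seed=0):
    """Decoder side: drive the all-pole filter 1 / A(z) with white Gaussian noise."""
    rng = random.Random(seed)
    y = []
    for n in range(n_samples):
        v = gain * rng.gauss(0.0, 1.0)
        for j in range(1, len(a)):
            if n - j >= 0:
                v -= a[j] * y[n - j]
        y.append(v)
    return y

# Autocorrelation of a first-order process with correlation 0.5 (illustrative values).
a, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
noise = shaped_noise(a, gain=math.sqrt(err), n_samples=160)
```

In a complete encoder the autocorrelation would be computed from `subtract(input_frame, synthesized_frame)` via `autocorr`, and only the filter coefficients and a gain would be transmitted; the decoder regenerates noise with a similar spectral envelope rather than the original residual waveform.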

Application

Techniques such as spectral band replication (SBR) and parametric stereo follow this principle. The common speech codecs of the CELP family also use such approaches. Harmonic and Individual Lines plus Noise (HILN), also known as MPEG-4 Parametric Audio Coding, is a method standardized by MPEG that works purely on this principle.

Literature

  • Thomas Görne: Tontechnik. Fachbuchverlag Leipzig im Carl Hanser Verlag, Munich et al. 2006, ISBN 3-446-40198-9.