Replay Gain

from Wikipedia, the free encyclopedia

ReplayGain (literally "playback gain") is a standard that describes how digital audio files can be brought to a common perceived loudness without altering the actual audio data stored in the file. The proposal for this standard was published on July 12, 2001 by David Robinson.

Basics

When tracks from different albums, especially albums produced in different years, are played one after the other, their perceived volume can differ considerably. One reason can be that the producer chose a particular volume for a track in the context of its album. In most cases, however, the cause lies in the different mastering of the albums or, above all, in the "target volume" that has shifted over the years. (For background information, see the article Loudness War.)

The peak value of the volume, which sometimes lasts only a few milliseconds, has very little influence on the perceived volume, but it matters for setting the level of the piece as a whole. Traditionally, the listener compensates by changing the volume setting by hand. With self-assembled playlists and random playback across albums, the desire for automated volume normalization comes to the fore.

Although the term was written as Replay Gain in the original publication, the spellings Replaygain and ReplayGain are increasingly used.

Technology

ReplayGain works in two stages: first, the required loudness information is determined once from the audio data and stored alongside it as metadata. The volume is then adjusted each time the file is played back.

First, the files in question are fully decoded and analyzed. A value intended to approximate the perceived average volume is calculated (via the root mean square, RMS), and the actual peak value is recorded. The difference between the measured perceived average volume and a uniform level of 89 dB is then written to the file as a correction value in additional metadata; the rest of the file remains untouched.
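
The analysis step can be sketched roughly as follows. This is a minimal illustration, not the analysis from the proposal: a real analyzer does more than a plain RMS (for instance perceptual weighting), and the calibration constant `full_scale_db`, which assigns a dB level to an RMS of 1.0, is a hypothetical value chosen only for this example.

```python
import math

REFERENCE_DB = 89.0  # uniform target level named in the text


def rms(samples):
    """Root mean square (effective value) of float samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def measured_level_db(samples, full_scale_db=100.0):
    """Level in dB; full_scale_db is a hypothetical calibration constant.

    Raises ValueError for pure silence, because log10(0) is undefined.
    """
    return full_scale_db + 20.0 * math.log10(rms(samples))


def analyze(samples, full_scale_db=100.0):
    """Return (correction in dB, peak value) for one decoded track."""
    peak = max(abs(s) for s in samples)
    gain_db = REFERENCE_DB - measured_level_db(samples, full_scale_db)
    return gain_db, peak


# One second of a 1 kHz sine at half amplitude, sampled at 44.1 kHz.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(44100)]
gain_db, peak = analyze(tone)
print(f"correction: {gain_db:.2f} dB, peak: {peak:.4f}")
```

Only the two returned numbers would be stored as metadata; the samples themselves are not modified.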

Only at playback does a decoder that supports the standard read these values and use them, at the moment of decoding, to correct the actual audio signal.
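
A playback-side correction might look like the sketch below. The function names are illustrative, and using the stored peak value to limit the gain so that the result does not exceed full scale is an assumption of this example, not a rule stated above.

```python
def db_to_factor(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db, peak=None):
    """Scale decoded float samples by the stored correction value.

    If a peak value is given and the gain would push it above 1.0,
    the factor is reduced so the loudest sample lands exactly at 1.0
    (an assumed clipping-prevention policy for this sketch).
    """
    factor = db_to_factor(gain_db)
    if peak is not None and peak * factor > 1.0:
        factor = 1.0 / peak
    return [s * factor for s in samples]


decoded = [0.5, -0.25, 0.1]
quieter = apply_gain(decoded, -6.0)            # plain attenuation
limited = apply_gain(decoded, 12.0, peak=0.5)  # gain capped by the peak
```

Because the scaling happens on the decoded signal, the file on disk stays exactly as it was.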

So that a single piece of music does not fall out of the overall concept of an album, the average volume of the album as a whole can also be calculated and saved in each audio file. If this correction value is used during playback, the (intended) relative volume differences between the individual tracks on the album are retained.
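
The difference between the two correction values can be illustrated with a small calculation. The track levels and lengths are made-up numbers, and pooling the tracks by length-weighted mean power is an assumption of this sketch rather than the method prescribed by the standard.

```python
import math

REFERENCE_DB = 89.0  # uniform target level named in the text


def album_level_db(levels_db, lengths_s):
    """Combine per-track levels into one album level (assumed pooling rule)."""
    total = sum(lengths_s)
    mean_power = sum(
        n * 10.0 ** (level / 10.0) for level, n in zip(levels_db, lengths_s)
    ) / total
    return 10.0 * math.log10(mean_power)


# Hypothetical album: a quiet intro followed by a loud main track.
levels = [80.0, 92.0]   # measured level of each track in dB
lengths = [60.0, 240.0]  # track lengths in seconds

track_gains = [REFERENCE_DB - level for level in levels]
album_gain = REFERENCE_DB - album_level_db(levels, lengths)

after_track_gain = [l + g for l, g in zip(levels, track_gains)]
after_album_gain = [l + album_gain for l in levels]

print(after_track_gain)  # both tracks end up at the reference level
print(after_album_gain)  # the 12 dB difference between the tracks is kept
```

With per-track values every track lands on the same level, whereas the single album value shifts all tracks by the same amount and so preserves the intended quiet intro.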

Since the adjustment takes place during decoding and the stored information is only a tag, the audio data remain unaffected. The tags can easily be removed again, and a decoder that does not support them simply ignores them. Ideally, the correction is applied before the decoded output of a lossy compressed file is quantized to the desired final bit depth, so that the full dynamic range offered by that bit depth can be used if necessary.

This allows ReplayGain-compatible audio players to compensate for the differences and to play such files at roughly the same average (perceived) volume. It avoids having to adjust the volume manually every time pieces mastered at different levels are played one after the other. (This adjustment is not to be confused with conventional peak normalization, in which the peak levels of the individual pieces, rather than the average perceived volume, are brought to a uniform value.)

The ReplayGain standard specifies an 8-byte area in the file header that is meant to be the same for all audio formats, but many formats such as Vorbis or FLAC have their own tags for this information. For MP3 files, programs such as foobar2000 write ID3v2 tags of the TXXX type to the file. The ID3v2 standard also provides an "RVA" (Relative Volume Adjustment) field that can be used for ReplayGain purposes [1].
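
As an illustration of the tag-based storage, the sketch below formats and parses text fields such as those used in Vorbis comments or ID3v2 TXXX frames. The key names `REPLAYGAIN_TRACK_GAIN` and so on are the commonly seen convention and are not taken from the text above; the helper functions are hypothetical, and no real tagging library is involved.

```python
def format_tags(track_gain_db, track_peak, album_gain_db=None, album_peak=None):
    """Build a dict of key/value text fields for the correction values."""
    tags = {
        "REPLAYGAIN_TRACK_GAIN": f"{track_gain_db:.2f} dB",
        "REPLAYGAIN_TRACK_PEAK": f"{track_peak:.6f}",
    }
    if album_gain_db is not None:
        tags["REPLAYGAIN_ALBUM_GAIN"] = f"{album_gain_db:.2f} dB"
    if album_peak is not None:
        tags["REPLAYGAIN_ALBUM_PEAK"] = f"{album_peak:.6f}"
    return tags


def parse_gain(value):
    """Read a gain field such as '-6.48 dB' back into a float."""
    return float(value.lower().replace("db", "").strip())


tags = format_tags(-6.48, 0.987654, album_gain_db=-5.1, album_peak=1.0)
track_gain = parse_gain(tags["REPLAYGAIN_TRACK_GAIN"])
```

A player that does not know these keys would simply skip them, which is why such tags are harmless for incompatible decoders.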

Alternatives

Changing the audio data, re-encoding

If adding metadata is not desired or not possible (e.g. because decoders or burning programs lack support), the audio data themselves can be changed instead in order to bring the perceived volume to the specified level. This is not only laborious but also entails some loss of sound quality due to the arithmetic operations (increased noise and distortion, at least at 16 bits or less). When the volume is reduced, the usable dynamic range is reduced as well. Conversely, it is not always possible to increase the volume without interfering with the dynamics (generation losses). For some codecs, however, a scaling factor can be changed reversibly, though not in arbitrarily fine steps.
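
Why rescaling integer samples loses information can be shown with a small experiment. The following sketch attenuates 16-bit sample values by an arbitrary example gain of 6.3 dB, rounds them, and then tries to undo the change; the helper name and the test signal are illustrative only.

```python
def scale_int16(samples, gain_db):
    """Scale integer samples by gain_db, rounding and clamping to 16 bits."""
    factor = 10.0 ** (gain_db / 20.0)
    out = []
    for s in samples:
        v = round(s * factor)
        out.append(max(-32768, min(32767, v)))
    return out


original = list(range(-1000, 1001))      # 2001 distinct sample values
quieter = scale_int16(original, -6.3)    # attenuate and round
restored = scale_int16(quieter, 6.3)     # try to undo the attenuation

changed = sum(1 for a, b in zip(original, restored) if a != b)
print(f"{changed} of {len(original)} samples differ after the round trip")
```

After attenuation fewer distinct values remain than before, so several original values collapse onto the same number and cannot all be recovered; the rounding error appears as added noise.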

MP3, AAC and Global Gain

The MP3Gain program can do this for MP3 files losslessly and reversibly (though only in steps of 1.5 decibels, which is usually sufficient in practice). To do this, it manipulates the global gain field of each frame, which determines the overall level of that MP3 frame. The operation is performed directly on the MP3 structure; since no re-encoding takes place, there are no generation losses. In most cases the change is reversible. In addition, a tag can optionally be added to the file that records the correction made; with its help the operation can be undone later if necessary.
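
The effect of the 1.5 dB step size can be sketched as a simple rounding calculation. The function below is a hypothetical illustration of the arithmetic only; it does not read or write MP3 frames and is not MP3Gain's actual code.

```python
STEP_DB = 1.5  # step size named in the text


def global_gain_steps(desired_db, step_db=STEP_DB):
    """Round a desired correction to whole steps.

    Returns (number of steps, gain actually applied, remaining error).
    """
    steps = round(desired_db / step_db)
    applied_db = steps * step_db
    return steps, applied_db, desired_db - applied_db


steps, applied_db, residual_db = global_gain_steps(-6.2)
print(steps, applied_db, round(residual_db, 2))

# Undoing the change means applying the opposite number of steps.
undo_steps = -steps
```

The remaining error is at most half a step (0.75 dB), which is why the coarse resolution is usually acceptable in practice.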

The same applies to AACgain for Advanced Audio Coding and to Vorbisgain for Ogg Vorbis files.

Literature

  • Thomas Görne: Sound engineering. Fachbuchverlag Leipzig in Carl Hanser Verlag, Munich et al. 2006, ISBN 3-446-40198-9.
  • Roland Enders: The home recording manual. The way to optimal recordings. 3rd, revised edition, revised by Andreas Schulz. Carstensen, Munich 2003, ISBN 3-910098-25-8.

References

  1. http://www.id3.org/id3v2-00?highlight=%28rva%29