Audio bit depth

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by EliasGwinn (talk | contribs) at 15:25, 12 November 2007 (→‎Digital Audio). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In digital audio, bit depth is the number of bits used to represent each sample. Bit depth directly corresponds to the resolution of each sample in a set of digital audio data. Common examples of bit depth include CD audio, which is recorded at 16 bits, and DVD-Audio, which can support up to 24-bit audio.

Digital Audio

A set of digital audio samples contains data that, when converted into an analog signal, provides the necessary information to reproduce the sound wave. In pulse-code modulation (PCM) sampling, the bit depth will limit qualities such as dynamic range and signal-to-noise ratio. The bit depth will not limit frequency range, which is limited by the sample rate.

By increasing the sampling bit depth, smaller fluctuations of the audio signal can be resolved, which increases the dynamic range. As a rule of thumb, each 1-bit increase in bit depth increases the dynamic range by about 6 dB. 24-bit digital audio has a theoretical maximum dynamic range of 144 dB, compared to 96 dB for 16-bit. However, current digital audio converter technology is limited to dynamic ranges of ~115 dB because of 'real world' limitations in integrated circuit design (see data sheet for AD1853).
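The rule of thumb above can be checked with a short calculation. The following is a minimal sketch, assuming dynamic range is taken as the ratio between full scale and a single quantization step, expressed in decibels; the function name `dynamic_range_db` is illustrative and not part of any library.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of linear PCM in dB.

    Assumption: dynamic range is the ratio of the full-scale range
    (2 ** bit_depth steps) to one quantization step.
    """
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```

Each additional bit doubles the number of steps, and 20 × log10(2) is roughly 6.02, which is where the 6 dB per bit figure comes from.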

Technically speaking, bit depth is only meaningful when applied to pure PCM devices. Non-PCM formats, such as DSD or lossy compression systems like MP3, have bit depths that are not defined in the same sense as PCM. This is particularly true for lossy audio compression, where bits are allocated on a per-tone basis, and the bits actually allocated to individual samples are allowed to fluctuate almost randomly within the constraints imposed by the allocation algorithm. Recently, many lossy formats such as DTS and WMA Pro have been promoted as 24-bit. This is misleading: a lossy file does not contain 24 bits' worth of information per sample; rather, it was originally mastered at 24 bits and then compressed.

Signal To Noise Ratio (SNR)

The importance of bit depth in PCM audio is that it determines the maximum possible signal-to-noise ratio (SNR) of the signal. For a typical PCM recording, in which no noise shaping is employed and the frequency range extends most of the way to the Nyquist limit, the SNR in dB for a full-scale sine wave is approximately 1.76 + 6.02 × bits. This formula is often simplified to 6 dB per bit, which yields the common value of 96 dB for 16-bit CD audio. Note, however, that this value covers only the simplest case, and higher or lower SNRs are possible under special conditions or with post-processing.

It should be restated that this is only valid for plain PCM audio without post-processing. Systems such as DSD use a different modulation technique where the SNR is not determined exclusively by the sample size and the audio band does not extend to the Nyquist limit. Other schemes, such as lossy encoding, use adaptive sample sizes and frequency-domain transforms that aim to allow reproduction of good-sounding audio with very low bit depths (averages of 4-6 bits per sample often give good results with modern compression).
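For plain PCM, the roughly 6 dB per bit relationship can also be observed numerically. The sketch below quantizes a full-scale sine wave by simple rounding and measures the ratio of signal power to rounding-error power. The function name `measured_snr_db`, the sample count and the test frequency are arbitrary choices made for illustration; no dither or noise shaping is applied.

```python
import math

def measured_snr_db(bit_depth: int, num_samples: int = 50_000) -> float:
    """Quantize a full-scale sine wave and return signal/error power in dB."""
    step = 2.0 / (2 ** bit_depth)  # step size for a range of -1.0 to 1.0
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        # Arbitrary frequency (an assumption) that is not a simple fraction
        # of the sample rate, so the rounding error behaves like noise.
        x = math.sin(2 * math.pi * 0.1234567 * n)
        q = round(x / step) * step  # nearest quantization level
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)

snr_8 = measured_snr_db(8)
snr_16 = measured_snr_db(16)
print(f"8-bit: {snr_8:.1f} dB, 16-bit: {snr_16:.1f} dB")
print(f"gain per extra bit: {(snr_16 - snr_8) / 8:.2f} dB")
```

The measured figures depend slightly on the test signal, which is one reason the simplified 6 dB per bit value is usually quoted instead.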

What is a 'bit' of data?

See also: bit

In computing parlance, bit is a contraction of 'binary digit': a single digit represented by a 0 or a 1. Within the computer, this represents an electronic switch in an 'on' or 'off' state. 16-bit means there are sixteen digits, all ones or zeroes, e.g. 1001011011001010. Binary is base-2; thus, each digit can only be one or zero. The number of distinct values is therefore 2 raised to the power of the bit depth. Using 16 bits as the example:
2^16 = 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 = 65,536

This means that each sample can contain any one of 65,536 unique values, made up of sixteen ones and zeroes.
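The count of possible values, and the reading of a 16-digit binary word as a number, can be illustrated briefly. This is a minimal sketch that interprets the example word as an unsigned integer, which is an assumption made for illustration; real PCM formats commonly store samples as signed values.

```python
bit_depth = 16

# Number of distinct values a sample of this bit depth can take.
levels = 2 ** bit_depth
print(levels)        # 65536

# The sixteen-digit example word, read as an unsigned base-2 number.
word = "1001011011001010"
print(len(word))     # 16
print(int(word, 2))  # 38602
```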

Bit Rate

Bit rate refers to the amount of data, specifically bits, transmitted or received per second.

One of the most common bit rates given is that for compressed audio files. For example, an MP3 file might be described as having a bit rate of 160 kbit/s or 160000 bits/second. This indicates the amount of compressed data needed to store one second of music.
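Because a bit rate is an amount of data per second, the storage needed for a stream follows directly from its duration. The sketch below assumes a constant bit rate and ignores any container or header overhead; `stream_size_bytes` is an illustrative name, not a library function.

```python
def stream_size_bytes(bit_rate_bps: int, seconds: float) -> float:
    """Bytes needed to store `seconds` of audio at a constant bit rate."""
    return bit_rate_bps * seconds / 8  # 8 bits per byte

print(stream_size_bytes(160_000, 1))    # 20000.0  (one second at 160 kbit/s)
print(stream_size_bytes(160_000, 180))  # 3600000.0  (a three-minute track)
```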

The standard audio CD is said to have a data rate of 44.1 kHz/16, meaning that the audio data was sampled 44,100 times per second with a bit depth of 16. CD tracks are usually stereo, using a left and a right channel, so the amount of audio data per second is double that of mono, where only a single channel is used. The bit rate is then 44,100 samples/second × 16 bits/sample × 2 channels = 1,411,200 bit/s, or about 1.4 Mbit/s.
This explains why, for example, a MiniDisc recorder, which uses ATRAC compression, can store twice as much playing time on a disc when it is switched from its default two-channel stereo mode to single-channel mono recording.

To fully define a sound file's digital audio bit rate, the sampling rate, word size, number of channels (e.g. mono, stereo, four-track), and format of the data also need to be known.

Calculating Values

There is an easy way to determine an uncompressed PCM file's bit rate when given sufficient information. In fact, as long as you know any three of the four values in the following formula, you can calculate the missing one.

Bit rate = (bit depth) x (sampling rate) x (number of channels)

For a recording with a 44.1 kHz sampling rate, 2 channels (stereo) and a 16 bit depth:
16 x 44100 x 2 = 1411200 bits per second, or 1411.2 kbit/s
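The formula and the worked example above can be sketched in code, including the case where one of the factors is the unknown. This assumes uncompressed PCM; the function names `pcm_bit_rate` and `missing_factor` are hypothetical and chosen only for this illustration.

```python
def pcm_bit_rate(bit_depth: int, sample_rate: int, channels: int) -> int:
    """Bit rate in bits per second for uncompressed PCM audio."""
    return bit_depth * sample_rate * channels

def missing_factor(bit_rate: int, known_a: int, known_b: int) -> float:
    """Recover the third factor when the bit rate and two factors are known."""
    return bit_rate / (known_a * known_b)

rate = pcm_bit_rate(16, 44_100, 2)
print(rate)                             # 1411200
print(rate / 1000)                      # 1411.2 (kbit/s)
print(missing_factor(rate, 44_100, 2))  # 16.0 (the bit depth)
```

Because the formula is a plain product, the same division recovers the sampling rate or the channel count when those are the unknown.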

Sources

Much of the information in this article can be found in Principles of Digital Audio, 4th Edition (Pohlmann, McGraw Hill) with some contributions made by one or more users knowledgeable in the area of digital audio; the book was not the specific reference for this article. However, it is one of possibly many printed sources for this information.

See also