|Waveform Audio File Format (WAVE)|
|File extension :||.wav|
|MIME type :||audio/vnd.wave, audio/wav, audio/wave, audio/x-wav|
|Developed by:||Microsoft & IBM|
The WAVE file format is a container format for the digital storage of audio data. It is based on the Resource Interchange File Format (RIFF), which Microsoft defined for Windows. At a minimum, a WAVE file contains information about its format, followed by the audio data.
It usually contains raw PCM data, that is, a discrete-time, discrete-value representation of a signal over time. The quality of the recorded sound then depends on two values: the sampling rate (number of samples per unit of time) and the resolution (bit depth). For compressed data, it also depends on the compression method, e.g. ADPCM or MP3.
A RIFF file consists of several sections (chunks), which are structured like those of the IFF format except for the byte order: the low-order byte (LSB) comes first. The WAVE specification defines three sections as mandatory. The RIFF section identifies the file as a .wav file and acts as a container for the other sections. The format section contains parameters such as the sampling rate. The data section contains the waveform.
Because development was uncoordinated, an unmanageable number of further section types emerged, some with redundant content. One example is the “Label” and “Note” sections, both of which attach text to cue point entries in the “Cue” section: a “Label” holds the title of a cue point, a “Note” a comment. They are stored as subsections (subchunks) in the higher-level Associated Data List section. There are also a large number of compressed formats. For these, a “Fact” section giving the decompressed size is mandatory, but they otherwise define a wide variety of parameters, which makes full support of the WAV format even more difficult for developers.
RIFF section (also "RIFF WAVE" section)
It contains the other sections as a container. Its header consists only of the identifier "RIFF", the chunk size (uint32_t, = file length in bytes - 8) and the RIFF type "WAVE".
The format section begins with the identifier "fmt " (with a trailing space) and must be contained exactly once in the file, namely as the first subsection. However, a reader can rely on this just as little as on the data chunk being the last. Its ChunkSize is followed by the content, which consists of a general set of parameters and a subsequent format-specific part. The general part:
wFormatTag (uint16_t, identification of the format used, e.g. 0x0001 stands for PCM, the canonical, uncompressed format)
wChannels (uint16_t, number of channels)
dwSamplesPerSec (uint32_t, sampling rate in Hz, e.g. 0x0000AC44 stands for 44100)
dwAvgBytesPerSec (uint32_t, necessary transmission bandwidth in bytes per second)
wBlockAlign (uint16_t, size of the frames in bytes)
For PCM data, the format-specific part has only this one field:
wBitsPerSample (uint16_t, quantization resolution, identical for all channels)
If compression is not used, dwAvgBytesPerSec is the product of the sampling rate and the frame size. The frame size follows from the requirement that every value in the data section is encoded as an integer of just sufficient size in whole bytes (any necessary filler bits are at the low-order end and have the value 0, zero padding). The following applies to the PCM format:
wBlockAlign = wChannels * ((wBitsPerSample + 7) / 8) (integer division, remainder discarded),
so the frame size for 12-bit stereo is four bytes, not three. With two channels (stereo), the left channel is stored first, then the right.
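The frame size rule can be checked with a few lines of Python. This is only an illustrative sketch: the function names `block_align` and `avg_bytes_per_sec` are made up for this example and are not part of any WAVE library.

```python
def block_align(channels: int, bits_per_sample: int) -> int:
    """Frame size in bytes: every sample is rounded up to whole bytes."""
    return channels * ((bits_per_sample + 7) // 8)


def avg_bytes_per_sec(sample_rate: int, channels: int, bits_per_sample: int) -> int:
    """Data rate of uncompressed PCM: sampling rate times frame size."""
    return sample_rate * block_align(channels, bits_per_sample)


print(block_align(2, 12))               # 12-bit stereo: 4 bytes per frame
print(avg_bytes_per_sec(44100, 2, 16))  # CD quality: 176400 bytes per second
```

Python's `//` operator performs the integer division that the formula requires.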
The data section has the identifier "data". As with all sections, its chunkSize includes neither the 8 bytes of identifier and size nor the zero byte that may be required at the end to align the section to a word boundary. Its content is a series of frames.
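Since neither the position of the format section nor that of the data section can be relied on, a reader usually walks the sections one by one. The following is a minimal sketch of such a walk, assuming a well-formed file. The names `iter_chunks` and `make_chunk` and the "test" section are invented for the demonstration.

```python
import io
import struct


def iter_chunks(stream):
    """Yield (identifier, payload) for every subchunk of a RIFF/WAVE stream."""
    riff_id, riff_size, riff_type = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or riff_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    end = 8 + riff_size  # the RIFF size field does not count its own 8-byte header
    while stream.tell() + 8 <= end:
        header = stream.read(8)
        if len(header) < 8:
            break  # truncated file
        chunk_id, size = struct.unpack("<4sI", header)
        payload = stream.read(size)  # size excludes the header and the pad byte
        if size % 2:
            stream.read(1)  # skip the zero pad byte after an odd-sized chunk
        yield chunk_id, payload


def make_chunk(chunk_id, payload):
    """Demo helper: serialize one chunk, padding it to an even length."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Tiny in-memory file: 8-bit mono PCM plus a made-up, odd-sized "test" chunk.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 8000, 1, 8)
body = (b"WAVE" + make_chunk(b"fmt ", fmt)
        + make_chunk(b"test", b"abc") + make_chunk(b"data", b"\x80\x80"))
riff = b"RIFF" + struct.pack("<I", len(body)) + body

for chunk_id, payload in iter_chunks(io.BytesIO(riff)):
    print(chunk_id, len(payload))
```

The "test" chunk has a 3-byte payload, so the walk only stays aligned if the pad byte is skipped.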
PCM data saved without a header usually has the extension .raw. Playing it back requires knowing the sampling rate, bit depth and byte order (the byte order is defined only under RIFF, not for raw PCM).
The size of the “Data” section in the PCM data format is calculated as follows:
dwSamplesPerSec · wChannels samples (sampling rate times number of channels) of one or two bytes each occur per second. CD quality (16-bit stereo = 4 bytes per frame, 2 bytes per channel, at 44,100 Hz), for example, therefore requires about 10 megabytes per minute (60 s × 44,100 Hz × 4 bytes).
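The arithmetic for one minute of CD-quality audio can be reproduced as follows. This is a sketch only, and the function name `pcm_data_size` is an assumption made for this example.

```python
def pcm_data_size(seconds: int, sample_rate: int, channels: int, bits_per_sample: int) -> int:
    """Size in bytes of the data section for uncompressed PCM audio."""
    frame_size = channels * ((bits_per_sample + 7) // 8)
    return seconds * sample_rate * frame_size


one_minute = pcm_data_size(60, 44100, 2, 16)
print(one_minute)                  # 10584000 bytes
print(f"{one_minute / 1e6:.1f} MB per minute")
```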
Example of a generally readable WAVE-PCM format
RIFF header (12 bytes):
|Offset||Type||Length (in bytes)||Content|
|0 (0x00)||char||4||'RIFF'|
|4 (0x04)||unsigned||4||<File size> - 8|
|8 (0x08)||char||4||'WAVE'|
The fmt section (24 bytes) describes the format of the individual samples:
|Offset||Length (in bytes)||Content||Description|
|12 (0x0C)||4||'fmt '||Header signature (note the trailing space)|
|16 (0x10)||4||<fmt length>||Length of the remaining fmt header (16 bytes)|
|20 (0x14)||2||<format tag>||Data format of the samples (see separate table below)|
|22 (0x16)||2||<channels>||Number of channels: 1 = mono, 2 = stereo; more than 2 channels (e.g. for surround sound) are now also possible.|
|24 (0x18)||4||<sample rate>||Samples per second per channel (e.g. 44100)|
|28 (0x1C)||4||<bytes / second>||Sample rate · frame size|
|32 (0x20)||2||<block align>||Frame size = <number of channels> · ((<bits / sample (of a channel)> + 7) / 8) (integer division, remainder discarded)|
|34 (0x22)||2||<bits / sample>||Number of data bits per sample value per channel (e.g. 12)|
The data section contains the samples:
|36 (0x24)||4||'data'||Header signature|
|40 (0x28)||4||<length>||Length of the data block, max. <File size> - 44|
|44 (0x2C)||<block align>||The first sample(s)|
|44 + <block align>||<block align>||The second sample(s)|
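The 44-byte layout in the tables above maps directly onto a single little-endian `struct` format string. The following is a minimal sketch for plain PCM, assuming an even data length so that no pad byte is needed. `HEADER` and `build_header` are hypothetical names chosen for this example.

```python
import struct

# RIFF header (12 bytes) + fmt section (24 bytes) + data section header (8 bytes)
HEADER = struct.Struct("<4sI4s" "4sIHHIIHH" "4sI")


def build_header(sample_rate, channels, bits_per_sample, data_length):
    """Return the 44-byte header of a canonical PCM WAVE file.

    Assumes data_length is even, so no pad byte follows the data section.
    """
    block_align = channels * ((bits_per_sample + 7) // 8)
    return HEADER.pack(
        b"RIFF", 36 + data_length, b"WAVE",       # <file size> - 8 = 44 + data - 8
        b"fmt ", 16, 1, channels, sample_rate,    # format tag 0x0001 = PCM
        sample_rate * block_align, block_align, bits_per_sample,
        b"data", data_length,
    )


header = build_header(44100, 2, 16, 1000)
print(len(header))                            # 44
print(struct.unpack_from("<I", header, 28))   # bytes per second at offset 28
print(struct.unpack_from("<H", header, 32))   # block align at offset 32
```

Reading a header works the same way in reverse with `HEADER.unpack`, but only for files that really use this exact layout. Files with additional sections need a section-by-section walk instead.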
Data formats (format tag)
|Format tag||Data format|
|0x0011||DVI / IMA ADPCM|
|0x0017||DIALOGIC OKI ADPCM|
|0x0034||CONTROL RES VQLPC|
|0x0035||DIGIREAL|
|0x0037||CONTROL RES CR10|
|0x0039||CS IMAADPCM (Roland RDAC)|
|0x0050||MPEG-1 Layer I, II|
|0x0055||MPEG-1 Layer III (MP3)|
|0x0300||FM TOWNS SND|
Because the file format uses 32-bit size fields, files are limited to 4 GiB, which corresponds to a playing time of about 6.75 hours with two channels of 16 bits each and 44,100 samples per second (CD quality). With a higher amplitude or time resolution, or with more channels, the achievable playing time decreases accordingly. To get around this limitation, Sonic Foundry introduced an extension of the format that lifts the file size limit. Since Sonic Foundry's desktop software division was transferred to Sony Pictures Digital, the format has been called Sony Pictures Digital Wave 64, or Wave64 for short; it is available without license fees. The suggested filename extension is .w64. The internal structure is deliberately based on the conventional WAVE format to simplify software implementation. With 64-bit fields and the assumptions made above, the maximum playing time is over 3 million years.
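Both playing times follow from dividing the largest representable size by the CD-quality data rate. The sketch below is a rough estimate that ignores header overhead. The function name is invented for this example, and a year is taken as 365.25 days.

```python
CD_BYTES_PER_SECOND = 44100 * 2 * 2  # 44,100 Hz, two channels, 2 bytes per sample


def max_playing_time_seconds(size_field_bits, bytes_per_second=CD_BYTES_PER_SECOND):
    """Upper bound on playing time when sizes are stored in a field of the given width."""
    return 2 ** size_field_bits / bytes_per_second


hours_32 = max_playing_time_seconds(32) / 3600
years_64 = max_playing_time_seconds(64) / (365.25 * 24 * 3600)
print(f"32-bit fields: about {hours_32:.2f} hours")
print(f"64-bit fields: about {years_64 / 1e6:.1f} million years")
```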