from Wikipedia, the free encyclopedia

Formants occur in human speech as well as in musical instruments.
In phonetics and acoustics, a formant (from the Latin formare, 'to form' [a vowel]) is a concentration of acoustic energy in a fixed frequency range that is independent of the frequency of the generated fundamental tone. Owing to the resonance and interference properties of the articulation space or resonating body, these frequency ranges are amplified relative to the others, which are attenuated, so that the formants remain as energy peaks.

Definition of terms

A formant is an overtone (partial), or a contiguous range of overtones, whose level is raised at characteristic natural frequencies by resonance amplification.

The frequency range that is characteristic of a vowel, by contrast, is called a formant region (also: formant stretch).

Observation and description

As phenomena, a phone (in the phonetic sense) and a “single tone” (in the musical sense) are the smallest acoustic units.

In order to avoid misunderstanding, a fundamental distinction must be made between measurable quantities and perceived quantities.

A sound source can first be broken down into three components:

  • the actual oscillator (such as a string, a membrane or the vocal folds (plicae vocales), which periodically interrupt the outflow of breathing air), which at first is merely present without sounding,
  • the excitation (by plucking, by blowing, or by the periodically interrupted air flow of the respiratory apparatus (apparatus respiratorius)) and
  • the resonating body (i.e. the body of the musical instrument or the resonance chambers of the human body).

The most important change in the resonance properties is brought about, as a variable, by changing the position of the tongue. The fundamental frequency of speech is around 100–150 Hz for men and around 200–300 Hz for women.

In the physical sense, such a “single tone” can be broken down further into different partials (overtones), i.e. into different frequency components. The lowest partial is decisive for the perceived pitch; it is known as the fundamental frequency or fundamental tone. Partials can be used to describe all sounds in music and in acoustic speech production or, more generally, any other acoustic event. Almost all tones, sounds and noises, including human speech, are made up of a whole series of partials, each of which has the form of a sine wave. A complex tone that consists of, for example, ten partials has one fundamental and nine overtones.

Which frequencies occur as overtones depends on the physical properties of the sound generator, i.e. on its "natural frequencies". Sounds with a "harmonic" overtone series are distinguished from those with a "non-harmonic" overtone series. In a harmonic overtone series, the frequencies of the overtones are integer multiples of the frequency of the fundamental (natural tone series). Among musical instruments, string and wind instruments produce such series.
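The relationship between a fundamental and its harmonic overtones can be illustrated with a short Python sketch. The function name and the 110 Hz example value are assumptions chosen for illustration; they do not come from the sources of this article.

```python
def harmonic_partials(fundamental_hz, count):
    """Return the first `count` partials of a harmonic series.

    Partial 1 is the fundamental; partials 2..count are the overtones,
    each an integer multiple of the fundamental frequency.
    """
    return [n * fundamental_hz for n in range(1, count + 1)]


# Illustrative fundamental of 110 Hz (an assumed value in the male speech range).
partials = harmonic_partials(110.0, 10)
fundamental = partials[0]   # the partial that determines the perceived pitch
overtones = partials[1:]    # ten partials = one fundamental + nine overtones

print(harmonic_partials(110.0, 3))  # → [110.0, 220.0, 330.0]
print(len(overtones))               # → 9
```

A non-harmonic series, as found in bells or drums, cannot be generated this way because its partials are not integer multiples of a common fundamental.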

In a “non-harmonic overtone series”, the frequencies of the partials stand in complicated, non-integer ratios to one another. Such sounds occur in music on instruments with noise-like tones, for example percussion instruments such as drums, or idiophones such as bells with their metallic timbres. The number of overtones and their relationship to one another describe only part of an acoustic event that is perceived as an overall sound. The level of the individual overtones is also important.

Human speech sounds different from speaker to speaker. The main reason is vocal timbre, which can differ at the same pitch; if pitch alone mattered, two people singing the same note would sound the same. Because of individual anatomy, i.e. the size and shape of the oral cavity, paranasal sinuses, throat and so on, which form the main resonance spaces in humans, some frequencies are amplified and others are weakened. The overtones carry these speech-related resonance curves, so the same vowel produces different resonances in different people. In addition to vowels, human languages also make use of consonants, sounds in which the airflow is obstructed during pronunciation and which therefore have a limited acoustic range. Vowels, by contrast, are pronounced without obstructing the flow of breath and can therefore be heard more clearly.

Explanations for the definition

In the larynx or, for example, in the mouthpiece of a wind instrument, a fundamental tone with numerous overtones is produced first. Only in the body of a musical instrument, or on the way between the larynx and the mouth opening, is part of the spectrum (partials or overtones and noise components) attenuated, while another part is amplified by resonance relative to the fundamental and the other overtones. The regions of maximum relative gain are the formants. Voices and instruments often have several formant regions that are not directly adjacent to one another.
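The two-stage process described here (a source producing harmonics, followed by resonance shaping) can be sketched in Python. Everything in this sketch is a simplifying assumption made for illustration: the bell-shaped gain curve, the 100 Hz bandwidth, the 1/n fall-off of the source harmonics and the formant centers of 1000 Hz and 1400 Hz. It is not a physical model of a vocal tract or an instrument body.

```python
import math


def resonance_gain(freq_hz, formants_hz, bandwidth_hz=100.0):
    """Sum of bell-shaped resonance peaks (an illustrative assumption)."""
    return sum(math.exp(-((freq_hz - f) / bandwidth_hz) ** 2) for f in formants_hz)


def filtered_spectrum(f0_hz, formants_hz, max_hz=4000.0):
    """Return (frequency, level) pairs for the harmonics of f0 after shaping."""
    spectrum = []
    n = 1
    while n * f0_hz <= max_hz:
        freq = n * f0_hz
        source_level = 1.0 / n  # assumed fall-off of the source harmonics
        spectrum.append((freq, source_level * resonance_gain(freq, formants_hz)))
        n += 1
    return spectrum


def strongest_partial(spectrum, low_hz, high_hz):
    """Frequency of the strongest partial inside the band [low_hz, high_hz]."""
    in_band = [(level, freq) for freq, level in spectrum if low_hz <= freq <= high_hz]
    return max(in_band)[1]


# Assumed formant centers; the fundamental changes, the resonances do not.
formants = [1000.0, 1400.0]
for f0 in (100.0, 125.0, 200.0):
    peak = strongest_partial(filtered_spectrum(f0, formants), 800.0, 1200.0)
    print(f0, peak)
```

For all three fundamentals the strongest partial between 800 Hz and 1200 Hz lies at 1000 Hz, which illustrates why a formant is described as independent of the fundamental tone. The fundamentals were chosen so that one harmonic falls exactly on the resonance; in general the peak is the harmonic closest to it.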

The position and characteristics of the formants have a significant impact on the timbre of a musical instrument or voice. They make it possible to tell voices and instruments apart, for example the voices of two women or one violin from another.

The position of the formants depends on:

  • in general, the characteristic natural frequencies of the instrument or sound generator,
  • in the case of mechanical musical instruments, the design and the materials used, in particular the design of the sound box,
  • in the human voice, the voluntarily changed shape of the vocal tract, as adjusted by muscle movements to articulate a certain sound,
  • in the case of electronic musical instruments, the band-pass and band-stop filters used.


Spectrogram of the sounds [i, u, ɑ] in American English, formants F1, F2 marked in red. Formants are the horizontal frequency bands.

Speech, and thus speech sounds, consists of air pressure waves that are emitted from the oral and nasal cavities. The breath forced through the vocal folds sets them vibrating. The vibrations produce a fundamental tone that is shaped and amplified by the oral and nasal passages and other anatomical features. The more air is forced through the vocal folds, the louder the sound. Different positions of the tongue and lips allow different sounds to be formed. The opening and closing of the vocal folds creates a periodic vibration. The length of a cycle depends on the length, mass and tension of the vocal folds, as well as on the air pressure generated by the respiratory muscles and lungs.

Vowel articulation is usually voiced; the relevant variations are changes in the size of the throat and mouth cavities. These are brought about by the tongue and lips; in addition, larynx height, constriction of the throat, tongue position and height, and lip position change the resonance properties of the vocal tract and thus the resonance frequencies of the resulting vowel. In this way, each vowel receives its typical spectral composition, with energy concentrations at the respective resonance frequencies. These energy concentrations, which can be recognized in the sonagram (spectrogram) as horizontal frequency bands, are called formants F1, F2, F3, F4 and so on. In human speech, the position of the formants characterizes the meaning of certain sounds. Vowels differ from consonants in the sonagram mainly in their clear formant structure. This is because a consonant is produced by a narrowing of the vocal tract, so that the flow of breathing air is completely or partially blocked and audible turbulence (air eddies) occurs. Consonants are sounds that overcome obstacles; they can be produced without using the voice (voiceless) or with it (voiced). As a general tendency, vowels lie in a lower frequency range and consonants in a higher one. While the vowels mainly generate the loudness of speech, the distinctions between words (syllables) are conveyed by the consonants. A vowel can be articulated at different pitches by changing the period of the vocal fold movement without changing the mouth and throat.

In acoustics and phonetics, a formant is a concentration of acoustic energy in a certain frequency range. The formants F1, F2 and F3 are vowel-specific, meaning that they assume approximately the same frequency values regardless of the speaker, whereas the formants from F4 upward are mainly responsible for the timbre and character of the speaker's voice. The latter therefore serve primarily to identify a speaker rather than a vowel. In the sonagram, vowels differ from consonants primarily in their clear formant structure.

Formants arise, for example, in the resonance spectra of musical instruments or the human voice. Owing to the resonance properties of an instrument or of the human articulation space, certain frequency ranges are amplified relative to others. Formants are those frequency ranges in which the relative gain is highest. Vowels, for example, differ in articulation by three parameters:

  • the vertical position of the highest point of the tongue,
  • the horizontal position of the highest point of the tongue and
  • the roundness of the lips

Using the first two formants in the vowel triangle or vowel trapezoid, all vowels of a sound system can be distinguished from one another. The vowel formant positions differ from person to person, especially between men, women and children. The following table gives averaged formant positions for the vowel triangle.

Table 1: Averaged formant positions from the vowel triangle (vowel formant centers)

German vowel   IPA   Formant F1   Formant F2
U              u     320 Hz       800 Hz
O              o     500 Hz       1000 Hz
å              ɑ     700 Hz       1150 Hz
A              a     1000 Hz      1400 Hz
ö              ø     500 Hz       1500 Hz
ü              y     320 Hz       1650 Hz
Ä              ɛ     700 Hz       1800 Hz
E              e     500 Hz       2300 Hz
I              i     320 Hz       3200 Hz
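How the first two formants separate the vowels can be sketched in Python as a nearest-neighbour lookup over the averaged values of Table 1. The lookup method, the logarithmic distance measure and the function name are assumptions made for illustration, and the dictionary keys use the usual IPA symbols for the nine vowels. Real vowel recognition has to cope with the speaker-dependent variation mentioned above.

```python
import math

# Averaged formant centers (F1, F2) in Hz, taken from Table 1.
# Keys are the customary IPA symbols for the listed vowels (an assumption).
VOWEL_FORMANTS = {
    "u": (320, 800),
    "o": (500, 1000),
    "ɑ": (700, 1150),
    "a": (1000, 1400),
    "ø": (500, 1500),
    "y": (320, 1650),
    "ɛ": (700, 1800),
    "e": (500, 2300),
    "i": (320, 3200),
}


def nearest_vowel(f1_hz, f2_hz):
    """Return the vowel whose (F1, F2) center is closest on a log-frequency scale."""
    def distance(item):
        ref_f1, ref_f2 = item[1]
        # Log ratios weight relative rather than absolute frequency deviations.
        return math.hypot(math.log(f1_hz / ref_f1), math.log(f2_hz / ref_f2))

    return min(VOWEL_FORMANTS.items(), key=distance)[0]


print(nearest_vowel(330, 3000))  # → i
print(nearest_vowel(950, 1350))  # → a
```

The logarithmic distance is a design choice of this sketch: it treats a deviation of 10 % in F1 the same as a deviation of 10 % in F2, although F2 values are several times larger in hertz.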

The first two formants, F1 and F2, are important for the intelligibility of vowels: their position characterizes the spoken vowel. The third and fourth formants, F3 and F4, are no longer essential for understanding speech. Instead, they characterize the anatomy of the speaker, his or her articulation peculiarities and the timbre of the voice, and they vary from speaker to speaker. The character of a voice is determined by the fundamental frequency (100 to 250 Hz) and the articulation characteristics. The mean speaking pitch is between 100 and 130 Hz for men and between 200 and 260 Hz for women.

Formants that lie between 1500 and 2000 Hz produce a nasal twang, which is why they are also called nasality formants (Näselformanten). If the velum is opened, a nasal formant occurs, often together with a second one. Various studies are available on this, and they have found different nasal formants. The first nasal formant is given with values between 200 and 250 Hz; the values given for the second nasal formant vary widely, for example 1000, 1200, 2000 or 2200 Hz.

Formant   Frequency range (Hz, male)   Assigned resonance space
F0        80–200                       Vocal folds, voice
F1        220–780                      Throat
F2        1200–2000                    Lip space
F3        2200–3000                    Oral cavity
F4        3350–5100                    Coronal space (space behind the upper jaw and zygomatic bone)

Table 2: Typical spectral boosts (a fifth to an octave wide) that are used deliberately when recording vocals and instruments (practical level increases)

High amplitude at   Sound sensation   Comment
200 to 400 Hz       sonorous          1st formant of u
400 to 600 Hz       full              1st formant of o
800 to 1200 Hz      striking          1st formant of a
1200 to 1800 Hz     nasal             2nd formant of ü
1800 to 2600 Hz     bright            2nd formant of e
2600 to 4000 Hz     brilliant         2nd formant of i
8000 Hz             pointed           diffuse "highs"
over 10000 Hz       sharp             overtone "shine"

Special features in singing

The sound and spectral analysis clarifies the vowel formants as frequency ranges with increased intensity

In principle, the same applies to singing as to speech. The formants listed above can be demonstrated particularly well on low notes, for example those sung in the creaky (vocal fry) register. In the higher range of a soprano voice, however, the fundamental frequency lies above the first-formant frequencies listed in Table 1. At a fundamental of, for example, 700 Hz, the vowels u, e and i would have to be unintelligible and, because of the strong damping between the formants, would form only weak, unstable tones. According to Sundberg, however, the formants are not independent of the fundamental. (Independent variation of the formants is practised, for example, in overtone singing.) If the fundamental falls within the range of the first formant or lies above it, the first formant rises along with the fundamental. The singer achieves this by opening her mouth wider. This adaptation of the first formant is known as formant tuning. For i, u and e it raises the first formant, so that at a fundamental frequency of 700 Hz the first formant also lies at around 700 Hz. For a, it remains largely constant. The second formant, by contrast, falls for e and i and rises for u. The rise of the first formant does not continue indefinitely, however: from around B5 (German h2) upward, opening the mouth further no longer achieves anything. On very high notes the vowels can no longer be distinguished, because the fundamental frequency is then always above the first formant and the sound impression of this formant disappears.
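The behaviour of the first formant under formant tuning can be summarized in a strongly simplified Python sketch. The piecewise rule, the function name and the ceiling of 988 Hz (the approximate frequency of B5 at concert pitch A4 = 440 Hz) are assumptions made for illustration; they reduce the description above to its simplest form and are not Sundberg's model.

```python
# Assumed upper limit of formant tuning: roughly B5 (German h2) at A4 = 440 Hz.
TUNING_CEILING_HZ = 988.0


def tuned_first_formant(resting_f1_hz, f0_hz, ceiling_hz=TUNING_CEILING_HZ):
    """Simplified first-formant position for a sung fundamental f0.

    Below the resting F1 nothing changes; above it, F1 follows the
    fundamental until the assumed ceiling is reached.
    """
    return max(resting_f1_hz, min(f0_hz, ceiling_hz))


# Resting F1 values for i and a as given in Table 1.
print(tuned_first_formant(320, 700))    # → 700: F1 of i follows the fundamental
print(tuned_first_formant(1000, 700))   # → 1000: F1 of a stays where it is
print(tuned_first_formant(320, 1200))   # → 988.0: the ceiling has been reached
```

Above the ceiling the fundamental lies above the first formant for every vowel in this sketch, which corresponds to the loss of vowel distinction on very high notes.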

Frequencies around 3 kHz play a crucial role in the carrying power of a voice, which is why this frequency range is called the singer's formant. It can be influenced, for example, by training to raise or lower the larynx while singing. A singer's formant is well developed when the frequencies of a sung tone in a broad band between 2800 and 3400 Hz have a "relative strength", regardless of the fundamental tone.
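The "relative strength" of the band between 2800 and 3400 Hz can be quantified in several ways. One simple possibility, assumed here purely for illustration, is the share of the total spectral energy that falls into this band. The function name and the example partials are hypothetical, and no threshold for a "well developed" singer's formant is implied.

```python
def band_energy_fraction(partials, low_hz=2800.0, high_hz=3400.0):
    """Share of the total energy carried by partials inside [low_hz, high_hz].

    `partials` is a list of (frequency_hz, amplitude) pairs; energy is taken
    to be proportional to the squared amplitude.
    """
    total = sum(amplitude ** 2 for _, amplitude in partials)
    if total == 0:
        return 0.0
    in_band = sum(
        amplitude ** 2
        for freq, amplitude in partials
        if low_hz <= freq <= high_hz
    )
    return in_band / total


# Hypothetical spectrum: a fundamental at 220 Hz and one equally strong
# partial at 3080 Hz (the 14th harmonic), inside the band of interest.
example = [(220.0, 1.0), (3080.0, 1.0)]
print(band_energy_fraction(example))  # → 0.5
```

Because the measure is a ratio, it is independent of the overall loudness and can be compared across different fundamentals.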


The term formant was first used in 1890 by Ludimar Hermann in his acoustic phonetics, but it was not described in technical detail until 1929, by Erich Schumann in his habilitation thesis in Berlin. Today it forms a broad field of research in analysis, communications engineering and sound synthesis.


  • Franz Brandl: The art of voice training on a physiological basis. Self-published, Munich 2001, ISBN 3-00-008593-9 .
  • Michael Dickreiter, Volker Dittel, Wolfgang Hoeg, Martin Wöhr (eds.): Manual of the recording studio technology. 8th, revised and expanded edition, Walter de Gruyter, Berlin / Boston 2014, ISBN 978-3-11-028978-7 or e-ISBN 978-3-11-031650-6 (2 volumes).
  • Ludimar Hermann: Contributions to the teaching of sound perception. In: Pflügers Arch. Volume 56, 1894, pp. 467-499.
  • Fritz Klingholz: Medical Guide for Singers. Libri Books on Demand, Seefeld 2000, ISBN 3-8311-0493-X .
  • Paul-Heinrich Mertens: Schumann's sound color laws and their meaning for the transmission of language and music. E. Bochinsky, Frankfurt / M. 1975, ISBN 3-920112-54-7 .
  • Jürgen Meyer: Acoustics and musical performance practice. E. Bochinsky, Frankfurt / M. 2004, ISBN 3-932275-95-0 .
  • Christoph Reuter: Timbre and instrumentation. Habil. Lang, Frankfurt 2002, ISBN 3-631-50272-9.
  • Erich Schumann: Physics of timbres. Habilitation thesis at the University of Berlin, 1929.
  • Erich Schumann: Physics of timbres. Breitkopf & Härtel, Leipzig 1940 (Volume II).
  • Johan Sundberg: The Science of the Singing Voice. Translated by Friedemann Pabst, Orpheus, Bonn 1997, ISBN 3-922626-86-6 .
  • Uta Konzelmann: Vocal field measurements in choir singers before and after exercise with special consideration of the singer formant. Diss., Erlangen / Nuremberg 1989.
  • Hannes Raffaseder: Audiodesign. Fachbuchverlag Leipzig, 2002.
  • Wolfgang Saus: Chorphonetics - when vowels control the intonation. VOX HUMANA 11.1, February 2015, pp. 22–26 (PDF; 170 kB).
  • Eglė Alosevičienė: Basics of Phonetics and Phonology. Vilnius University, Faculty of Humanities, Kaunas 2009, ISBN 978-9955-33-413-2 (PDF; 929 kB).


  • Bernhard Richter, Matthias Echternach, Louisa Traser, Michael Burdumy, Claudia Spahn: The Voice. Insights into the physiological processes involved in singing and speaking. 2017, Helbling, DVD-ROM.


References

  1. Articulation. Modification of the air flow. Part A. Consonants, Phonetics. (PDF; 711 kB), University of Munich, accessed on May 6, 2020.
  2. See, for example, Fabian Bross: Basics of Acoustic Phonetics. In: Helikon. A Multidisciplinary Online Journal. Vol. 1, 2010, p. 101 f. (PDF; 1.2 MB; accessed on May 6, 2020).
  3. Alexander Berghaus, Gerhard Rettinger, Gerhard Böhme: Ear, nose and throat medicine. Hippokrates, Stuttgart 1996, ISBN 3-7773-0944-3, K. Phoniatrie und Pedaudiology, 1.1.2 Investigation methods, p. 649 (PDF; accessed June 28, 2020).
  4. After Christian Lehmann: Die Sprachlaute I: Vokale. April 19, 2019, accessed May 6, 2020.
  5. Bernhard Richter, Matthias Echternach, Louisa Traser, Michael Burdumy, Claudia Spahn: The Voice. Insights into the physiological processes involved in singing and speaking. 2017, Helbling, DVD-ROM.