Duplex perception of speech

from Wikipedia, the free encyclopedia

Duplex perception of speech, also called simply duplex perception, is a phonetic or linguistic phenomenon in which an acoustic signal is perceived as a linguistic and a non-linguistic signal at the same time.

Duplex perception is essentially a hypothesis that is meant to be testable through experiments. In these experiments, listeners were played syllables in which the transitions between the sounds (certain consonants, the plosives, and a vowel) had been partially removed. These transitions, i.e. the brief moments between the articulation of the two sounds, were also recorded and played in isolation. In the final part of the experiment, the syllable without the transition was played to one ear and the isolated transition to the other, in order to determine whether the transition is perceived both as a noise (a non-linguistic signal) and as part of the syllable (and thus linguistically).

The human ear can usually perceive frequencies in the range from 20 to 20,000 hertz. A tone does not consist of a single frequency but is made up of many partials. To identify a particular vowel (i.e. a, e, i, o or u), it is necessary, among other things, that certain frequencies or frequency ranges are amplified in that vowel sound. These amplified frequency ranges are called formants in acoustics and phonetics. With the sound o, for example, the range around 500 Hz is amplified, and the frequency range around 1000 Hz also forms a so-called peak. Formants are numbered (F1, F2, F3, etc.) from the lower frequencies to the higher ones. The sound o thus has the formant F1 at 500 Hz and the formant F2 at 1000 Hz.

The first two peaks in the lower frequency ranges are decisive for recognizing vowels, while the peaks in the upper frequency ranges are less relevant for this and instead affect parameters such as timbre.
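The role of the first two formants can be illustrated with a minimal sketch that assigns a measured (F1, F2) pair to the nearest known vowel. The table holds only the two vowels for which this article gives values (o here, a in the later ba example); the function name nearest_vowel and the use of a plain Euclidean distance in hertz are assumptions made for illustration, not a description of how human vowel recognition works.

```python
import math

# Formant values in Hz as given in this article; a realistic vowel chart
# would contain more vowels and speaker-dependent ranges.
VOWEL_FORMANTS = {
    "o": (500.0, 1000.0),
    "a": (1000.0, 1400.0),
}


def nearest_vowel(f1, f2):
    """Return the vowel whose (F1, F2) pair lies closest to the measured peaks."""
    return min(
        VOWEL_FORMANTS,
        key=lambda v: math.hypot(f1 - VOWEL_FORMANTS[v][0],
                                 f2 - VOWEL_FORMANTS[v][1]),
    )


print(nearest_vowel(520, 980))   # peaks close to the values given for o
print(nearest_vowel(950, 1450))  # peaks close to the values given for a
```

Higher formants are left out on purpose, matching the statement that they matter less for vowel identity.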

A plosive (the other type of sound used in the experiments) is produced by briefly blocking the airflow and then immediately releasing the pent-up air. This results in a small explosion that produces the sound. The different plosives release the pent-up air at different places in the mouth or throat. With b and p, the air is dammed up directly behind the lips; with d and t, behind the teeth; and with g and k, at the back of the tongue or at the palate.

Experiments on duplex perception

In the experiments on duplex perception, test subjects were played syllables consisting of a plosive (i.e. one of the consonants b, p, t, d, k or g) followed by a vowel. This resulted in syllables such as ba, ta, pu, ko or da.

In such a syllable, the plosive affects the formants of the following vowel for a brief moment. In syllables like pa, ta or ga, the speech sounds flow smoothly into one another. Because the plosives differ in their place of articulation, the following vowel is approached from a different starting position in each case, which briefly influences the vowel's formants. The formants reach their usual values only after a short changeover. This changeover, known as a transition, can approach the target value from below as well as from above.

For example, the formants of a lie at 1000 Hz (F1) and 1400 Hz (F2). In the syllable ba, these values are reached from below, because for a short period of 20 to 40 ms the b affects these frequency ranges before they settle at their normal values. This adjustment is known as a glissando, because the pitch shifts slightly upwards. With the syllables da and ga, F1 is also reached from below, but F2 from above, so that an opposing glissando is created.
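A formant transition of this kind can be sketched as a frequency track that glides to the vowel's target value and then stays there. The sketch below assumes a linear glide, which is a simplification; the function name formant_track and the onset frequencies of 1100 Hz and 1700 Hz are hypothetical and chosen only to show one rising and one falling transition towards the F2 value of 1400 Hz given above.

```python
def formant_track(onset_hz, target_hz, transition_ms, total_ms, step_ms=5):
    """Formant frequency sampled every step_ms: a linear glide, then steady state."""
    track = []
    for t in range(0, total_ms + 1, step_ms):
        if t < transition_ms:
            # Still inside the transition: interpolate between onset and target.
            frac = t / transition_ms
            track.append(onset_hz + frac * (target_hz - onset_hz))
        else:
            # Transition finished: the formant sits at the vowel's value.
            track.append(float(target_hz))
    return track


# Target: F2 of "a" (1400 Hz, from the text). Onsets are hypothetical.
rising = formant_track(1100, 1400, transition_ms=40, total_ms=50, step_ms=10)
falling = formant_track(1700, 1400, transition_ms=40, total_ms=50, step_ms=10)

print(rising)   # approaches 1400 Hz from below
print(falling)  # approaches 1400 Hz from above
```

The 40 ms transition used here lies at the upper end of the 20 to 40 ms range mentioned in the text.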

A glissando is much more common in spoken language than in music. It is the transition from one pitch to another (higher or lower) without a jump or a step of the kind produced by most musical instruments. The best-known example of a spoken glissando is probably the end of a question, where the speaker's voice usually rises clearly in pitch on the last two syllables to mark the sentence as a question.

In music, distinct pitches are used much more often. The pitch changes of a melody can be imagined as a staircase (each step corresponds to a pitch) that changes direction again and again, with steps that are sometimes larger and sometimes smaller. Glissandi are less common in music because not all instruments can produce them. A glissando can be thought of as a section of a roller coaster (up or down) on which there are no jumps, because the track glides from one height to the other.
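The staircase and roller-coaster images can be made concrete by comparing two pitch contours. This is a minimal sketch: the helper names and the pitch values (200, 250 and 300 Hz) are hypothetical, and the glide is linear in hertz for simplicity, although perceived pitch is closer to logarithmic.

```python
def staircase(pitches_hz, samples_per_step):
    """Hold each pitch for a fixed number of samples, jumping between steps."""
    return [float(p) for p in pitches_hz for _ in range(samples_per_step)]


def glissando(start_hz, end_hz, n_samples):
    """Glide from start to end in equal increments, with no jumps."""
    step = (end_hz - start_hz) / (n_samples - 1)
    return [start_hz + i * step for i in range(n_samples)]


def largest_jump(contour):
    """Largest change in pitch between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(contour, contour[1:]))


stairs = staircase([200, 250, 300], samples_per_step=3)
glide = glissando(200, 300, n_samples=9)

print(largest_jump(stairs))  # the full interval between two steps
print(largest_jump(glide))   # a small, constant increment
```

Both contours cover the same range with the same number of samples; they differ only in whether the change happens in jumps or continuously.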

In the first experiment, the transitions between the sounds were altered by isolating and removing the F3 transition (i.e. the glissando). The test subjects could then no longer tell where in the mouth the plosive was formed; that is, they could not identify its place of articulation.

If the test subjects were presented with the transition in isolation, they perceived it only as a glissando and thus as a non-linguistic phenomenon.

In the last experiment, the test subjects were played the incomplete plosive-vowel syllable in one ear and the transition in the other. They not only identified the syllable clearly, including its place of articulation, but also perceived the glissando at the same time.
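The two-ear presentation can be sketched as a stereo signal whose left channel carries the syllable without its F3 transition and whose right channel carries only the transition. The code below is a crude illustration and not the stimuli used in the cited studies: plain sine waves stand in for speech, and the sample rate, durations, amplitudes and the 2200 to 2500 Hz sweep are all hypothetical. Only the 1000 Hz and 1400 Hz values for a come from the text.

```python
import math

SAMPLE_RATE = 16000  # samples per second; an assumed value


def tone(freq_hz, duration_ms, amplitude=0.3):
    """Steady sine wave standing in for one formant of the vowel."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def sweep(start_hz, end_hz, duration_ms, amplitude=0.3):
    """Sine glide; the phase is accumulated so the frequency changes smoothly."""
    n = SAMPLE_RATE * duration_ms // 1000
    samples, phase = [], 0.0
    for i in range(n):
        freq = start_hz + (end_hz - start_hz) * i / n
        samples.append(amplitude * math.sin(phase))
        phase += 2 * math.pi * freq / SAMPLE_RATE
    return samples


def dichotic(left, right):
    """Pair two signals as (left, right) frames, zero-padding the shorter one."""
    n = max(len(left), len(right))
    left = left + [0.0] * (n - len(left))
    right = right + [0.0] * (n - len(right))
    return list(zip(left, right))


# Left ear: stand-in for the syllable without its F3 transition (F1 and F2 of "a").
base = [x + y for x, y in zip(tone(1000, 250), tone(1400, 250))]
# Right ear: stand-in for the isolated F3 transition (hypothetical frequencies).
chirp = sweep(2200, 2500, 40)

frames = dichotic(base, chirp)
print(len(frames), "stereo frames;", len(chirp), "of them carry the transition")
```

The frames could be scaled to 16-bit integers and written to a stereo file with the standard-library wave module in order to listen to the result over headphones.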

Web links

  • CA Fowler, LD Rosenblum: Duplex perception: a comparison of monosyllables and slamming doors. In: Journal of Experimental Psychology: Human Perception and Performance. Volume 16, Number 4, November 1990, pp. 742-754. PMID 2148589.
  • HK Vorperian, MT Ochs, DW Grantham: Stimulus intensity and fundamental frequency effects on duplex perception. In: The Journal of the Acoustical Society of America. Volume 98, Number 2, Pt 1, August 1995, pp. 734-744. PMID 7642812.

Literature sources

  • AM Liberman, D. Isenberg, B. Rakerd: Duplex perception of cues for stop consonants: evidence for a phonetic mode. In: Perception & Psychophysics. Volume 30, Number 2, August 1981, pp. 133-143.
  • Friedrich Michael Dannenbauer: Linguistic basics. In: Manfred Grohnfeldt (Ed.): Textbook of speech therapy and speech therapy. DNB 959330410.
  • Gerhard Böhme: speech, speech, voice and swallowing disorders. DNB 951684094.