from Wikipedia, the free encyclopedia

Prosody is the entirety of those phonetic properties of language that are not tied to the sound or the phoneme as a minimal segment, but to more comprehensive phonetic units. These include the following properties:

  1. Word and sentence accent
  2. Lexical tone, carried by the syllables of a word in tonal languages
  3. Intonation (of units larger than a syllable) and sentence melody
  4. Quantity of all phonetic units, especially those larger than a segment
  5. Tempo, rhythm and pauses in speaking.

Parts of this domain are referred to colloquially as stress and tone of voice; these expressions, however, are not technical terms.

Like many terms of this type, prosody denotes both a part of the object domain (the linguistic properties listed above) and a subdiscipline of a science, in this case of phonology and phonetics. Accordingly, prosody is studied in both linguistics and phonetics.

Origin of the expression

The term prosody is a loanword from Latin prosodia, which in turn comes from Greek prosōdía (προσῳδία). The roots it contains are pros (πρός) 'in addition to' and ōd- (ᾠδ-) 'sing'; the basic meaning is something like 'song added to [the words]'. The term originally referred to the phonetically correct recitation of poetry and also covered tone as listed above. Latin accentus is a loan translation of the Greek expression. However, since Latin has no tone in this sense, the term accentus came to be narrowed to that part of prosody which is still referred to by the term "accent".


Since the properties that fall under prosody have in common that they are located on a phonetic level "above" the segment, they are also called suprasegmental features (suprasegmentals). Accordingly, a distinction is made between the segmental and the suprasegmental level. For example, the two German verbs written umfahren, one meaning 'to knock [something] down by driving into it' and the other 'to drive around [something]', are composed identically on the segmental level (and are also homographs) but differ on the suprasegmental level (and are therefore not homophones): the former has the word accent on the first syllable, the latter on the second.

The suprasegmentals have the following acoustic basis:

  1. Accent: sound intensity, i.e. primarily relative loudness and secondarily relative pitch
  2. Tone: relative pitch (fundamental frequency) and its course within the syllable
  3. Intonation and sentence melody: course of the pitch over syntactic units
  4. Quantity: the relative duration of linguistic units
  5. Tempo, rhythm and pauses: allocation of linguistic units and their accents to successive time intervals.

The terms are explained in the following section.

Prosodic, psychoacoustic, acoustic and written language features

The prosodic features (or sub-areas) intonation, speech rhythm and accent are generally described in terms of psychoacoustic features and acoustic, i.e. physically measurable, features. In addition, the prosodic features correlate with the means of emphasis available in written language.

Prosody and acoustics

In acoustics, the phenomena and properties of sound waves are examined. Since speech is transmitted as sound and prosody is a part of speech, prosodic features must also correlate with acoustic features. The object of investigation is therefore the speech signal. Acoustically measurable properties can be used in automatic prosody recognition, speaker recognition and speaker verification; the measured properties are then processed into features for pattern recognition.

Fundamental frequency

The intonation of a language can be described acoustically by the fundamental frequency of the voice (measured in hertz) or by its course over time, the so-called fundamental frequency contour.
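As an illustration, the fundamental frequency of a short, voiced stretch of signal can be estimated from the position of the strongest peak of its autocorrelation. The following is a minimal sketch under simplifying assumptions, not a production pitch tracker: the function name estimate_f0, the search range of 75 to 400 Hz and the synthetic 200 Hz test tone are illustrative choices that do not come from the article.

```python
import math

def estimate_f0(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Searches the lag range corresponding to f_max..f_min (assumed
    search range) for the largest autocorrelation value.
    Returns 0.0 if no positive peak is found (e.g. silence).
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic test signal: 50 ms of a pure 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
f0 = estimate_f0(tone, sample_rate)
print(f0)
```

Applying such an estimate to successive frames of an utterance yields a contour of the kind described above; real speech additionally requires a voicing decision and protection against octave errors, which this sketch omits.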


Prosodic duration features such as rhythm, speaking rate, pauses, lengthening, etc. can be measured by determining the duration of the corresponding signal segments or by forming mean values (e.g. mean speaking rate). For example, phoneme durations are often determined first, and syllable durations are then derived from them step by step. Since these durations differ from speaker to speaker, they must be normalized.
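The derivation of syllable durations from phoneme durations and the subsequent normalization can be sketched as follows. The article does not say which normalization is used; a z-score per speaker is assumed here as one common choice, and all duration values are invented for illustration.

```python
from statistics import mean, stdev

def normalize_durations(durations_ms):
    """Z-score normalization of one speaker's durations (assumed method)."""
    mu = mean(durations_ms)
    sigma = stdev(durations_ms)
    if sigma == 0:
        return [0.0 for _ in durations_ms]
    return [(d - mu) / sigma for d in durations_ms]

# Hypothetical phoneme durations in milliseconds, grouped by syllable.
syllables = [[60, 110], [80, 140, 70], [55, 95]]

# Syllable duration = sum of the durations of its phonemes.
syllable_durations = [sum(phonemes) for phonemes in syllables]

# Mean speaking rate in syllables per second.
speaking_rate = len(syllable_durations) / (sum(syllable_durations) / 1000.0)

# Speaker-normalized syllable durations (mean 0, standard deviation 1).
z_scores = normalize_durations(syllable_durations)
print(syllable_durations, round(speaking_rate, 2), z_scores)
```

After normalization, a positive value marks a syllable that is long relative to this speaker's own average, which makes values comparable across fast and slow speakers.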


Energy features describe the sound intensity (in dB) of a speech signal. In pattern recognition, the short-term energy is often calculated at frame level, i.e. as the energy in a small section of the speech signal. These energy features make it possible to detect, for example, whether a section of the speech signal contains speech or only silence. In Internet telephony (VoIP), sections that contain no speech are not transmitted at all in order to save bandwidth (in engineering, however, the relevant measured quantity is called amplitude).
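A frame-level energy computation with a simple threshold decision can be sketched as follows. The frame length, the threshold of -40 dB and the reference level (decibels relative to a full-scale amplitude of 1.0, not dB SPL) are assumptions chosen for the example; practical detectors usually adapt the threshold to the background noise.

```python
import math

def frame_energies_db(signal, frame_length):
    """Mean power of each non-overlapping frame in dB re full scale (1.0)."""
    energies = []
    for start in range(0, len(signal) - frame_length + 1, frame_length):
        frame = signal[start:start + frame_length]
        power = sum(x * x for x in frame) / frame_length
        # Small offset avoids log10(0) for completely silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies

def is_active(energies_db, threshold_db=-40.0):
    """Flag frames whose energy exceeds the (assumed) threshold."""
    return [e > threshold_db for e in energies_db]

# Test signal: 20 ms of silence followed by 20 ms of a 200 Hz tone at 8 kHz.
sample_rate = 8000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(160)]
energies = frame_energies_db(silence + tone, frame_length=160)
flags = is_active(energies)
print(energies, flags)
```

A transmitter that suppresses silent sections would send only the frames flagged as active.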

Prosody and psychoacoustics

In psychoacoustics, human perceptions are related to acoustic quantities in comparative experiments.


Pitch here denotes the perceived height of a tone compared with a 1 kHz signal of a given sound intensity. It is determined in listening tests. Perceived pitch is related non-linearly to the frequency of a tone. On Zwicker's scale the relationship is still linear up to 500 Hz; above that, doubling the frequency of a tone no longer doubles the perceived pitch. The unit of pitch is the mel. Changes in pitch correlate with intonation in prosody.
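The non-linear relationship can be made concrete with an analytic approximation. The sketch below uses a formula that is widespread in speech processing and is usually attributed to O'Shaughnessy; it is an assumption made for illustration and is not the Zwicker scale referred to above, which uses a different reference point.

```python
import math

def hz_to_mel(frequency_hz):
    """Approximate mel value for a frequency in Hz (common textbook formula)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)

for f in (250, 500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1))

# Doubling the frequency from 1 kHz to 2 kHz raises the mel value by a
# factor clearly smaller than 2.
ratio = hz_to_mel(2000.0) / hz_to_mel(1000.0)
print(round(ratio, 2))
```

In this approximation 1000 Hz corresponds to roughly 1000 mel, and the curve flattens progressively towards higher frequencies.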


Loudness is a perceptual quantity that is likewise determined in listening tests, because it depends not only on sound pressure but also on frequency and other factors. The unit of loudness is the sone. One sone is defined as the perceived loudness of a 1000 Hz sine tone at 40 dB SPL (sound pressure level).
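Starting from this definition, loudness in sone is commonly approximated from the loudness level in phon by the rule that every increase of 10 phon doubles the perceived loudness. The sketch below applies this rule of thumb, which is an addition for illustration and holds only for levels of about 40 phon and above; for a 1000 Hz sine tone the loudness level in phon equals the sound pressure level in dB SPL.

```python
def sone_from_phon(loudness_level_phon):
    """Approximate loudness in sone for a loudness level of 40 phon or more."""
    if loudness_level_phon < 40.0:
        raise ValueError("the doubling rule is only used from 40 phon upwards")
    return 2.0 ** ((loudness_level_phon - 40.0) / 10.0)

# For a 1000 Hz sine tone, phon and dB SPL coincide.
for level in (40, 50, 60, 70):
    print(level, sone_from_phon(level))
```

A tone of 60 phon is thus perceived as about four times as loud as the 40 phon reference tone, although its sound pressure level is only 20 dB higher.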

Differences in perceived loudness are often used for accentuation in prosody.

Prosody and written language

In written language, typographic styles (italics, bold, font size, typeface) correlate with the prosodic features accent and intonation, and punctuation correlates with speech rhythm and pauses. A pause in speech is often inserted after a full stop or a comma. Dashes that set off an inserted part of a sentence are likewise often rendered as pauses when reading aloud, and the insertion is read with a different intonation. Question marks and exclamation marks mark interrogative and exclamatory sentences, which are also marked by a special intonation at the end of the sentence.

Functions of prosody

Linguistic and paralinguistic functions

A distinction is made between linguistic functions of prosody (those belonging to the system of an individual language) and paralinguistic functions (other communicative ones). Purely linguistic functions include the following:

  • Word accents and tone distinguish words in their meaning.
  • Intonation can distinguish sentence types from one another, e.g. the declarative clause and the interrogative clause in German.
  • Pauses, rhythm and intonation divide speech into meaningful sections, including syntactic units.
  • Sentence accents, intonation and pauses encode the information structure of an utterance, especially topic and focus. The sentence accent makes an expression stand out against neighboring ones in the sentence and is used for emphasis.

These functions are located on all linguistic levels between word and text. Prosody therefore cannot be assigned to one specific grammatical level.

The paralinguistic functions of prosody can be systematized as follows:

  • The speech melody or tone of voice expresses emotions and also encodes irony.
  • Languages and varieties of a language (dialects, sociolects, registers) differ in prosodic terms. Suprasegmentals characterize the speech of members of a language community in a similar way to their sound system, their choice of words or other linguistic properties. On this basis, a person's way of speaking can be assigned to such a variety.
  • Since prosodic features are produced with the voice and the articulatory apparatus, and since these are physiological properties of a person, they can characterize that person (by gender, age, etc.) and even identify them.

One relies on prosodic features of the latter two kinds, for example, when recognizing someone on the phone. Impersonators also exploit them.

In linguistic prosody only relative differences matter, e.g. the relative pitch at the end of an interrogative clause. Paralinguistic prosody also involves absolute differences, e.g. the different fundamental frequencies with which a boy and a man speak.

Correlation of prosodic features

Prosodic properties such as changes in intonation, volume and rhythm often occur together rather than individually, so they are correlated. A word is emphasized, for example, by changing the intonation (or pitch), reducing the speed of speech (for example, by pausing before the word) and pronouncing the word at a higher volume.

Resolution of ambiguities

In the language system, suprasegmental features are just as distinctive as segmental ones. Just as two expressions, e.g. German tut 'does' and tot 'dead', can differ in only one segmental feature, they can also differ in only one suprasegmental feature, like the two German verbs mentioned above that are both written umfahren. Since writing reproduces prosody only imperfectly, certain ambiguities of written texts on different linguistic levels can be resolved by prosody when the text is spoken aloud.

Syntactic level

The German sentence (written here without punctuation)

  • Erna kommt nicht aber Erwin.

corresponds to two different syntactic constructions, namely

a) Erna kommt, nicht aber Erwin. ('Erna is coming, but Erwin is not.')

b) Erna kommt nicht, aber Erwin. ('Erna is not coming, but Erwin is.')

The two versions differ, among other things, in that (a) has a pause after kommt, whereas (b) has a pause after nicht. In this case, the punctuation reflects the prosody.

The sentence

  • The man saw the woman with the binoculars.

corresponds to two different syntactic constructions, namely

a) the man saw [the woman with the binoculars] (the woman equipped with binoculars)

b) the man saw [the woman] [with the binoculars] (he looked through binoculars)

In ordinary speech these two versions do not differ, not even prosodically. One can, however, try to make version (b) clear by a sharp break in intonation together with a pause after the woman.

Lexical level

Besides umfahren, German has further pairs of verbs that are homographs but not homophones, such as übersetzen ('to translate' vs. 'to ferry across') and unterstellen ('to insinuate' vs. 'to put under shelter'), among others. (Incidentally, they are homographs only in some inflected forms, not, for example, in the past participle: (hat) übersetzt 'translated' vs. (hat) übergesetzt 'ferried across'.) There are also homographs such as German Tenor, which means 'gist, general sense' with the accent on the first syllable but 'high male voice register' with the accent on the second.

Pragmatic level

  • But it's cold here.

Depending on how the sentence is pronounced, it can signal a mere statement about the temperature (monotonous voice), a request to close a window (disapproving tone, emphasis on the word cold), or merely a complaint about a condition that is perceived as negative but cannot be changed. With strong emphasis on the first word of the German original (Das, 'that'), the statement can also be meant ironically. Prosody thus helps to clarify the function of a speech act.

Dialogue level

At the dialogue level, sentence or phrase boundaries can be marked so that dialogues can be divided into meaningful sections. In this way, linguistic actions can be structured. Known information is de-accentuated (constant intonation), whereas important information is accentuated.

Prosody levels

According to Hans Günther Tillmann, a distinction is made between A, B and C prosody.

A prosody

A prosody can be controlled at will by the speaker. Its parameters include intonation, pauses and changes in volume. A prosody is used, for example, to convey the intention of a sentence and to place accents. It also serves to resolve syntactic and lexical ambiguities. The speaker's emotions and physical condition can likewise be conveyed through A prosody.

Speech from which A prosody has been removed is generally perceived as a computer voice.

B prosody

The B prosody is generated involuntarily and describes the syllable rhythm of the mother tongue. It regulates the sequence of voiced and unvoiced sections. Through the B prosody we recognize a signal as speech.

C prosody

The C prosody describes the intrinsic dynamic structure of speech sounds, that is, for example, the correct transitions between neighboring sounds, the sequence of pause, burst and aspiration in plosives, or the interplay of voiced excitation and friction in voiced fricatives.


Microprosody considers fluctuations in the speech signal such as jitter and shimmer. These fluctuations are found mainly in noisy speech signals. In medicine, the measurement of jitter and shimmer alone allows conclusions to be drawn about the presence of diseases of the throat or larynx (for example early-stage laryngeal cancer).
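Jitter refers to the cycle-to-cycle variation of the glottal period, shimmer to the cycle-to-cycle variation of the amplitude. One widely used pair of measures, often called local jitter and local shimmer, divides the mean absolute difference between consecutive cycles by the mean value. The sketch below assumes that the period lengths and peak amplitudes have already been extracted from the signal; the function name and the sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values divided by their mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    differences = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(differences) / len(differences)) / (sum(values) / len(values))

# Hypothetical measurements for four consecutive glottal cycles.
periods_ms = [5.0, 5.1, 4.9, 5.0]          # cycle lengths in milliseconds
peak_amplitudes = [0.80, 0.78, 0.82, 0.80]  # peak amplitude per cycle

jitter = relative_perturbation(periods_ms)        # local jitter
shimmer = relative_perturbation(peak_amplitudes)  # local shimmer
print(f"jitter {jitter:.2%}, shimmer {shimmer:.2%}")
```

Both values are dimensionless ratios and are usually reported in percent; which values count as clinically conspicuous depends on the measurement method and is not specified here.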

Disorders of prosody

Prosody disorders are common in the autism spectrum, especially in Asperger's syndrome.

Literature


  • Hans Günther Tillmann, Phil Mansell: Phonetics. Phonetic signs, speech signals and phonetic communication process. Klett-Cotta, Stuttgart 1980, ISBN 3-12-937910-X.
  • Hadumod Bußmann (Ed.): Lexicon of Linguistics. 3rd, updated and expanded edition. Alfred Kröner, Stuttgart 2002, ISBN 3-520-45203-0.
  • Wolfgang Hess: Prosody (Memento from June 28, 2010 in the Internet Archive). Slides from a lecture at the University of Bonn; archived on June 28, 2010, accessed on August 18, 2019.
  • Eberhard Zwicker, H. Fastl: Psychoacoustics. Facts and Models. 2nd updated edition. Springer, Berlin et al. 1999, ISBN 3-540-65063-6 (Springer Series in Information Sciences 22).
