Sentence length

from Wikipedia, the free encyclopedia

Sentence length is determined by how many smaller units a sentence consists of. It can be defined by the number of letters or other characters (e.g. in Chinese), sounds, phonemes, morphs, syllables, morae, words, sentence constituents or clauses. Before sentence length can be measured, however, it must be decided what exactly counts as a sentence, a problem that is by no means trivial. For example, it must be considered whether a one-word utterance, an ellipsis or a sentence fragment should also count as a sentence.

Shortest sentences - longest sentences

One question that can be asked concerns the shortest or longest sentences, whether within a specific language or in general. The question about the shortest sentences is easy to answer if one agrees that a one-word utterance also counts as a sentence. Then exclamations like “Fire!” or interjections like “Ah!” are one-word sentences. The shortest would then be the Latin imperative “I!” (the command form of the verb “ire”, “to go”). A sentence shorter than one letter or sound is not possible.

The situation is different for the longest sentence: some observations and considerations can be offered, but the question ultimately cannot be answered. The reason is that a word or phrase can still be inserted into a great many sentences, however long they already are, without violating grammatical rules. Therefore, no upper limit for grammatically correct sentences can be given. Language use, in contrast, does set limits: spoken language tends to use shorter sentences than written language, but even in written language sentences are usually limited in length. The question is easiest to answer for conspicuously long sentences in particular texts. Lang refers to a sentence by the (ancient) Greek author Solon, which is said to be 300 lines long and to contain an estimated 4500 to 4800 syllables. Meier reports on a sentence in H. Broch's "The Death of Virgil" that is said to contain 1077 words.

Average sentence length

In order to determine the average sentence length of texts or text groups, it must first be decided how sentence length is to be defined. It can be measured by choosing any smaller unit and counting how many of these units the sentences contain. As a rule, the length of a sentence is determined by the number of words or the number of clauses.
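The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the cited literature: the function name `sentence_lengths`, the rule that a sentence ends at ".", "!" or "?", and the rule that a word is any whitespace-separated token are all simplifying assumptions, and the sample text is invented.

```python
import re
from statistics import mean


def sentence_lengths(text: str) -> list[int]:
    """Return the number of words in each sentence of `text`.

    Assumptions (simplifications, not linguistic definitions):
    a sentence ends at '.', '!' or '?', and a word is any
    whitespace-separated token.
    """
    # Split at runs of sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


# Invented sample text; the one-word utterance counts as a sentence here.
sample = "Fire! The house is burning. Call the fire brigade at once."
lengths = sentence_lengths(sample)   # word counts per sentence
average = mean(lengths)              # arithmetic mean in words per sentence
print(lengths, average)
```

Changing the sentence or word rule changes every resulting value, which is why the definition has to be fixed before averages from different studies are compared.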

As an example, some average sentence lengths in German are given, measured in words per sentence; the data come from Best (2002). The average number of words per sentence in German texts was found to be as follows:

Text group                             Lower limit of sentence lengths   Upper limit of sentence lengths
Press releases                         9.62                              22.91
Prose for children and young people    6.21                              12.66
Literary prose                         7.08                              19.62
Linguistics                            25.67                             28.73

Further details on the individual texts within a text group are given in the cited work. Of course, the values depend on the selection of the evaluated texts. The table gives an impression of how much these averages can fluctuate within a text group. A similar spread of the mean values can be expected if sentence length is measured in units other than words per sentence.

Pieper (1979) gives the following overview on the same topic. It should be noted that the data in the two tables are not directly comparable, since Pieper uses the median rather than the arithmetic mean as the average:

No.   Text group                    Sentence length (median)
1     Radio play                    6.64
2     Drama                         6.49
3     Novel: dialogue               6.01
4     Discussion                    11.83
5     Novel: non-dialogue           12.98
6     Letters                       13.63
7     Scientific texts              19.22
8     General legal texts           23.04
9     Newspaper: agency reports     23.23
10    Newspaper: own reports        16.37
11    Newspaper: feature section    16.89
12    Newspaper: sports reports     15.09
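Why medians and arithmetic means should not be compared directly can be seen in a small example. The sentence lengths below are invented for illustration and are not taken from Pieper or Best; the point is only that a single very long sentence shifts the mean but not the median.

```python
from statistics import mean, median

# Hypothetical sentence lengths in words (illustrative values only).
lengths = [3, 5, 6, 7, 9, 12, 42]

arithmetic_mean = mean(lengths)   # pulled upward by the one 42-word sentence
middle_value = median(lengths)    # the fourth of the seven sorted values

print(arithmetic_mean, middle_value)
```

For these seven values the mean is 12 words and the median is 7 words, so the same text would appear far "longer" in a table of means than in a table of medians.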

Sentence length distribution and sentence length in interaction with other linguistic variables

Quantitative linguistics has dealt repeatedly and in various ways with the laws governing sentence lengths.

  • The law of the distribution of sentence lengths has been well researched. It states that the frequencies with which sentences of different lengths occur in texts follow very specific, theoretically justifiable distributions. This law has been examined and supported in several studies on different languages.
  • In texts, sentence lengths are interrelated with other language variables; these interrelationships can be integrated into a complex model.
    • There is an important regularity between the length of sentences and the length of their clauses: the longer a sentence is, that is, the more smaller units (direct constituents) it consists of, the smaller these constituents themselves are. This language law is known as Menzerath's law (also: Menzerath-Altmann law). An investigation of German started from the hypothesis “The longer the sentence, the shorter its clauses” and, based on the analysis of German texts, was able to show that this hypothesis holds up.
    • If sentence length is related not to the direct constituents of the sentence (the clauses) but to indirect constituents such as words, the relationship changes: the longer a sentence, the longer its words. This relationship was formulated mathematically and tested; it is called Arens's law in honor of Hans Arens, who was probably the first to discover it.
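The text above states Menzerath's law only in words. In the quantitative-linguistics literature the Menzerath-Altmann law is commonly written in the form below. The formula is supplied here as an assumption about the standard notation and does not come from the passage itself: y stands for the mean length of the constituents (for example, clause length), x for the length of the construct (for example, sentence length in clauses), and a, b, c are parameters fitted to data.

```latex
y = a \, x^{b} \, e^{-c x}
```

With c = 0 and b < 0 this reduces to the power function y = a x^{b}, under which the constituents become shorter as the construct becomes longer, which is the direction the hypothesis "the longer the sentence, the shorter its clauses" describes.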

Readability

Readability refers to the linguistic (grammatical and lexical) properties of a text; it is part of what makes a text understandable. Researchers have long investigated whether the readability of a text can be measured. A wealth of readability indices have been developed in which, alongside word length, sentence length is very often included as an essential component. Best (2006) develops a justification for why criteria as simple as word and sentence length can be valid indicators of the readability of texts.
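How an index can combine word length and sentence length is shown here with LIX, a well-known readability index that is not named in the text above and is used only as an illustration. LIX adds the average number of words per sentence to the percentage of long words (more than six letters). The sentence splitting and word matching below are simplifying assumptions, and the function name `lix` is chosen for this sketch.

```python
import re


def lix(text: str) -> float:
    """Compute the LIX readability index of `text`.

    LIX = words / sentences + 100 * long_words / words,
    where a long word has more than six letters.
    Assumption: sentences end at '.', '!' or '?'; words are runs of letters.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)  # runs of Unicode letters
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


# Invented sample: 7 words, 2 sentences, 4 words longer than six letters.
sample = "The cat sat. Quantitative linguistics measures sentences."
print(round(lix(sample), 2))
```

Both terms grow as sentences and words get longer, so a higher value indicates a text that such an index rates as harder to read.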

Text classification

The average sentence length determines the style of a text to a large extent. Wilhelm Fucks, an advocate of quantitative literary studies, regards word and sentence lengths as style characteristics, that is, as numerically recorded features that can be used to distinguish the style of groups of authors. In Werner Winter's Kiel project Quantitative Stylistics, which aimed to distinguish text groups on the basis of statistical characteristics of the texts, sentence length also plays a role in several respects: the number of words per sentence is taken into account, as are the numbers of main-clause and subordinate-clause verbs, whereby the number of clauses is also captured. In his attempt at a text typology, Mistrík likewise emphasizes that sentence lengths are an important criterion for this purpose and not isolated quantities.

Development of sentence lengths

Just like word lengths, sentence lengths change over time. A comparison of older authors from the German Classical period with modern authors, carried out by Hans Eggers in the Saarbrücken project “Syntax of Contemporary German Language”, indicated a tendency towards shorter sentences; the comparison suffered, however, from the fact that the older authors were literary writers while the newer ones were non-fiction authors. Studies on the development of sentence lengths in specialist journals between 1800 and 1990 and in artistic texts between 1650 and 1950 confirmed this general tendency, albeit with individual outliers. In scientific-technical texts between 1770 and 1960, however, Möslein observed that sentence lengths first increase and then decrease again from 1850 onwards, a trend that clause lengths also follow. For 1960, either an “outlier” or a trend reversal is apparent; since further data are lacking, this must be left open here. These changes in linguistic usage follow Piotrowski's law.

t     Year   Words per sentence (observed)   Words per sentence (calculated)
1     1770   24.50                           23.80
4     1800   25.54                           27.36
9     1850   32.00                           29.57
14    1900   23.58                           25.57
16    1920   22.72                           23.02
18    1940   19.60                           20.40
20    1960   19.90                           17.91

(Explanation: t numbers the time periods in decades for the calculation. Fitting the Piotrowski law in its form for reversible language change to the data observed up to 1960 yields the calculated values given. The fit has a coefficient of determination of D = 0.82, where D is considered good if it is greater than or equal to 0.80. For more detailed explanations, see the literature cited.)
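The reported coefficient of determination can be checked against the table with a short calculation. The sketch below assumes the usual definition D = 1 - SS_res / SS_tot, which the text does not spell out, and assumes that t counts decades so that 1770 corresponds to t = 1. It does not refit the Piotrowski law; it only compares the observed values with the calculated values already given.

```python
# Observed and calculated words per sentence, copied from the table above.
years = [1770, 1800, 1850, 1900, 1920, 1940, 1960]
observed = [24.50, 25.54, 32.00, 23.58, 22.72, 19.60, 19.90]
calculated = [23.80, 27.36, 29.57, 25.57, 23.02, 20.40, 17.91]

# Assumption: t numbers decades with 1770 -> 1, i.e. t = (year - 1760) / 10.
t = [(year - 1760) // 10 for year in years]


def determination_coefficient(obs, calc):
    """Return D = 1 - SS_res / SS_tot (assumed definition of D)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - c) ** 2 for o, c in zip(obs, calc))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)         # total sum of squares
    return 1 - ss_res / ss_tot


D = determination_coefficient(observed, calculated)
print(t, round(D, 2))
```

Under these assumptions the calculation reproduces the t column of the table and gives D ≈ 0.82, in agreement with the value stated in the explanation.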

Literature

  • Karl-Heinz Best: Quantitative Linguistics. An approximation. 3rd, heavily revised and supplemented edition. Peust & Gutschmidt, Göttingen 2006, ISBN 3-933043-17-4. Page 129 of the book gives a brief overview of the relationships between sentence lengths and other linguistic variables.
  • Karl-Heinz Best: Sentence length. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. An international handbook. de Gruyter, Berlin / New York 2005, ISBN 3-11-015578-8, pages 298-304.
  • Reinhard Köhler: Quantitative Syntax Analysis. Dedicated to Gabriel Altmann on the occasion of his 80th birthday. De Gruyter Mouton, Berlin et al. 2012, ISBN 978-3-11-027292-5.

Individual evidence

  1. Friedrich Gustav Lang: Writing made to measure. In: Novum Testamentum XLI, 1999, pages 40–57, here page 54.
  2. Helmut Meier: German language statistics. Olms, Hildesheim 1967, page 192.
  3. Karl-Heinz Best: Sentence lengths in German: distributions, mean values, language change. In: Göttinger Contributions to Linguistics 7, 2002, pages 7–31; only the observed values of the sentence lengths are given here. All the data compiled in the table are based on texts from the 20th century.
  4. Ursula Pieper: About the significance of statistical methods for the linguistic style analysis. Narr, Tübingen 1979, page 50. ISBN 3-87808-355-6.
  5. Gabriel Altmann: Repetitions in Texts. Brockmeyer, Bochum 1988, pages 63-67. ISBN 3-88339-663-X.
  6. An overview is given in: Karl-Heinz Best: Sentence length. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. An international handbook. de Gruyter, Berlin / New York 2005, pages 298–304. ISBN 978-3-11-015578-5.
  7. Best 2006, page 129.
  8. Gabriela Heups: Investigations into the relationship between sentence length and clause length using the example of German texts in different text classes. In: Reinhard Köhler, Joachim Boy (eds.): Glottometrika 5. Brockmeyer, Bochum 1983, pages 113-133. ISBN 3-88339-307-X.
  9. Gabriel Altmann: H. Arens' "hidden order" and the Menzerath law. In: Manfred Faust, Roland Harweg, Werner Lehfeldt, Götz Wienold (eds.): General linguistics, language typology and text linguistics. Festschrift for Peter Hartmann. Narr, Tübingen 1983, pages 31-39. ISBN 3-87808-215-0.
  10. http://www.glottopedia.de/index.php/Hans_Arens ; Karl-Heinz Best: Hans Arens (1911-2003). In: Glottometrics 13, 2006, pages 75-79.
  11. Gabriel Altmann, Michael Schwibbe: The Menzerath law in information processing systems. Olms, Hildesheim, Zurich, New York 1989, pages 46-48. ISBN 3-487-09144-5.
  12. Norbert Groeben: Reader Psychology: Text Understanding, Text Understanding. Aschendorff Verlag, Münster 2002, pages 175-183. ISBN 3-402-04298-3.
  13. Karl-Heinz Best: Are word and sentence length useful criteria for the legibility of texts? In: Sigurd Wichter, Albert Busch (eds.): Knowledge Transfer - Success Control and Feedback from Practice. Lang, Frankfurt / M. et al. 2006, pages 21-31. ISBN 3-631-53671-2.
  14. Wilhelm Fucks: According to all the rules of art. Deutsche Verlags-Anstalt, Stuttgart 1968, page 33.
  15. Ursula Pieper: About the significance of statistical methods for the linguistic style analysis. Narr, Tübingen 1979, especially page 45. ISBN 3-87808-355-6.
  16. Jozef Mistrík: Exact typology of texts. Verlag Otto Sagner in commission, Munich 1973, page 30ff.
  17. Heinz-Helmut Lüger: Press language. 2nd, revised edition. Niemeyer, Tübingen 1995, page 23. ISBN 3-484-25128-X.
  18. Kurt Möslein: Some development tendencies in the syntax of scientific-technical literature since the end of the 18th century. In: Walther von Hahn (ed.): Technical languages. Wissenschaftliche Buchgesellschaft, Darmstadt 1981, pages 276-319, on the length of sentences page 303f. ISBN 3-534-07141-7. First published in 1974.
  19. Karl-Heinz Best: Sentence lengths in German: distributions, mean values, language change. In: Göttinger Contributions to Linguistics 7, 2002, pages 7–31, on the development of sentence lengths pages 22–27, table on page 25, slightly corrected.
  20. Gabriel Altmann: The Piotrowski law and its generalizations. In: Karl-Heinz Best, Jörg Kohlhase (eds.): Exact language change research. Theoretical contributions, statistical analyzes and work reports (= Göttinger Schriften zur Sprach- und Literaturwissenschaft. Vol. 2). edition herodot, Göttingen 1983, ISBN 3-88694-024-1, pages 54-90, on the reversible language change: page 78ff.
