Frequency dictionary

Frequency dictionaries record the vocabulary of a language, an author, a text type, and so on, with the frequency with which the individual words occur in a text or text corpus as the central piece of information. If the relevant texts cannot be evaluated in their entirety, compiling a frequency dictionary requires a text corpus that is sufficiently representative of the subject area, so that the resulting frequency lists give a reliable picture of the whole.
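As an illustration of the basic counting step, the following minimal Python sketch tallies word forms in a short text. The function name word_frequencies, the sample sentence, and the tokenization rule (a word is any run of letters, compared case-insensitively) are assumptions made for this example and are not taken from any of the dictionaries discussed here.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word form occurs in a text (case-insensitive)."""
    # Assumption: a "word" is a run of letters. Real projects need a
    # documented tokenization rule (hyphens, numbers, abbreviations, ...).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tokens)

# Hypothetical sample sentence, used only for demonstration.
sample = "Der Hund sieht die Katze, und die Katze sieht den Hund."

for word, count in word_frequencies(sample).most_common(3):
    print(word, count)
```

A real frequency dictionary would apply the same counting to a corpus of many thousands or millions of running words, which is why the representativeness of the corpus matters more than the counting itself.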

The best-known frequency dictionaries are probably those that aim to represent the vocabulary of a language as a whole. They show which vocabulary is used most often and should therefore be learned first, for example in native-language or foreign-language teaching, so they serve very practical purposes. Other applications beyond the satisfaction of pure curiosity are also possible, because frequency dictionaries yield some elementary insights into language. The best known of Zipf's laws is one of them: it states that the product of a word's rank and its frequency is approximately constant. Further regularities: the more frequent words are, the shorter and the older they tend to be. Frequency dictionaries thus have both practical and theoretical uses and are an important working basis for language statistics and, beyond that, quantitative linguistics.
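The rank-frequency relation described above can be checked on any frequency list by multiplying each word's rank by its frequency. The sketch below does this in Python; the function name rank_frequency_table and the token list are hypothetical, and the toy data is deliberately constructed so that frequency equals 12 divided by rank. Real corpora follow the relation only approximately.

```python
from collections import Counter

def rank_frequency_table(tokens):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    counts = Counter(tokens)
    # Sort by descending frequency; break ties alphabetically for a stable result.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, freq, rank * freq)
        for rank, (word, freq) in enumerate(ordered, start=1)
    ]

# Toy data (an assumption for illustration), built so that frequency = 12 / rank.
tokens = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3

for row in rank_frequency_table(tokens):
    print(row)
```

In this constructed case the last column is 12 in every row. With counts from an actual corpus the products would scatter around a common value rather than match exactly.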

Frequency dictionaries of German

The individual dictionaries can be designed quite differently, even if text frequency is always the primary criterion. Kaeding (1897/98) was groundbreaking: an undertaking that recorded over 11 million words and became a model for corresponding projects in other languages. In this dictionary the words are arranged alphabetically on the one hand and by frequency on the other; it also gives information about the frequency of the word stem and the word-formation productivity of the word. Wängler (1963) contains the vocabulary of a corpus of spoken language and of a newspaper corpus, both separately and combined, likewise sorted alphabetically and by frequency, but is limited to words that occur at least 10 times. Meier (1967) gives an alphabetical list of Kaeding's vocabulary, also limited to words that occur at least 10 times. Rosengren (1972/77) lists the vocabulary of the Süddeutsche Zeitung and Die Welt for the period from November 1, 1966 to October 30, 1967 in alphabetical order, in reverse order and by frequency, and provides further information on the text coverage of the words in the various newspaper sections. Finally, Ruoff (1981) compiles 500,000 words of spoken German, in alphabetical order, in reverse order and by frequency, always separated by part of speech.
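The arrangements mentioned above (alphabetical, reverse and by frequency, optionally restricted to words above a minimum count) can be sketched in a few lines of Python. This is an illustration under assumptions, not the procedure of any of the cited dictionaries: the function name frequency_lists, the min_count parameter and the sample tokens are hypothetical, and reverse order is taken here to mean sorting words by their spelling read from the last letter to the first.

```python
from collections import Counter

def frequency_lists(tokens, min_count=1):
    """Return the word list in three arrangements, plus the counts.

    Words occurring fewer than min_count times are dropped first.
    """
    counts = Counter(tokens)
    kept = {word: n for word, n in counts.items() if n >= min_count}

    alphabetical = sorted(kept)
    # Reverse (retrograde) order: compare words from their last letter backwards.
    reverse_order = sorted(kept, key=lambda word: word[::-1])
    # Most frequent first; ties are broken alphabetically.
    by_frequency = sorted(kept, key=lambda word: (-kept[word], word))

    return alphabetical, reverse_order, by_frequency, kept

# Hypothetical sample tokens, used only for demonstration.
tokens = ["haus", "maus", "haus", "baum", "maus", "haus", "raum"]

alphabetical, reverse_order, by_frequency, counts = frequency_lists(tokens)
print(alphabetical)
print(reverse_order)
print(by_frequency)
```

A reverse-ordered list groups words with the same ending, which is useful for studying suffixes and word formation. Setting min_count=10 would mimic the cut-off that Wängler and Meier apply, though on a corpus this small nothing would remain.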

This brief overview of frequency dictionaries for the German language as a whole is incomplete and is only intended to show that the organization of the vocabulary and the evaluations are by no means always the same. It is clear that, for example, teaching geared to a subject-specific language requires different vocabulary selections. (On the types, production and use of frequency dictionaries: Alexeev 1984, 2005.)

Literature

  • P. M. Alexeev: Statistical Lexicography. Translated by Werner Lehfeldt. Brockmeyer, Bochum 1984. ISBN 3-88339-361-4
  • Pavel M. Alexeev: Frequency dictionaries. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik / Quantitative Linguistics. An International Handbook. de Gruyter, Berlin / New York 2005, pp. 312-324. ISBN 3-11-015578-8
  • Friedrich Wilhelm Kaeding (ed.): Frequency dictionary of the German language. Vol. 1, 2. Self-published, Berlin-Steglitz 1897/98. (Partial reprint in: Grundlagenstudien aus Kybernetik und Geisteswissenschaft 4/1963, supplement)
  • Snježana Kordić: Words in the border area of lexicon and grammar in Serbo-Croatian (= Lincom Studies in Slavic Linguistics. Volume 18). Lincom Europa, Munich 2001, ISBN 3-89586-954-6, LCCN 2005-530314, OCLC 47905097, DNB 963264087, p. 280.
  • Helmut Meier: German language statistics. 2nd, enlarged and improved edition. Olms, Hildesheim 1967.
  • Inger Rosengren: A frequency dictionary of the German newspaper language. Vol. 1, 2. Gleerup, Lund 1972/77. ISBN 91-40-04470-X
  • Arne Ruoff: Frequency dictionary of spoken language. Niemeyer, Tübingen 1981. 2nd unchanged edition 1990. ISBN 3-484-24008-3
  • Hans-Heinrich Wängler: Rank dictionary of high German colloquial language. Elwert, Marburg 1963.

Individual evidence

  1. L. Hoffmann, R. G. Piotrowski: Contributions to language statistics. VEB Verlag Enzyklopädie, Leipzig 1979, pp. 185-189.
  2. Lothar Hoffmann: Technical language as a means of communication. An introduction. 2nd, completely revised edition. Narr, Tübingen 1985, p. 126ff. ISBN 3-87808-771-3.

See also

Frequency class

Web links

Wiktionary: Frequency dictionary  - explanations of meanings, word origins, synonyms, translations