Readability index

from Wikipedia, the free encyclopedia

A readability index is a formula or procedure that attempts to determine the readability of a text formally. It serves as a mathematical metric.

Procedure

The first readability formulas were developed for English, but formulas also exist for other languages such as German, French, Spanish, Dutch, Danish and Swedish. In general, all readability formulas are specific to a language and a text genre. For example, the Flesch Reading Ease index cannot be applied unchanged to German-language texts. However, this and other indices can be recalibrated for other languages and then used appropriately.

More than 200 procedures exist for the English language. The following list covers only the most popular:

Flesch Reading Ease

The Flesch Reading Ease readability index, also called the Flesch score, is a numerical value for readability that can be calculated from a text. The higher the value, the easier the text is to understand. Text that is easy to understand has a value of around 60 to 70. The calculation is calibrated for the English language. It uses the following formula:

    FRE = 206.835 − (1.015 × ASL) − (84.6 × ASW)

With:

  • ASL. The average sentence length is the number of words in the text divided by the number of sentences in the text.
  • ASW. The average number of syllables per word is the number of syllables in the whole text divided by the number of words in the text.
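As a minimal sketch, the index can be computed from three raw counts. The constants 206.835, 1.015 and 84.6 are those of the commonly cited English Flesch formula; the function name and the sample counts are illustrative assumptions, not part of any particular library.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease (English calibration) from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    asl = words / sentences   # ASL: average sentence length in words
    asw = syllables / words   # ASW: average number of syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw


# Hypothetical counts, chosen only for illustration.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1))
```

Counting words, sentences and syllables is left to the caller here, because those counts are where most of the practical uncertainty lies.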

The procedure was developed by Rudolf Flesch.

Toni Amstad adapted the formula to the German language. Above all, the word factor had to be recalculated, since German words are on average longer than English words, while sentences are about the same length. His formula is defined as:

    FRE (German) = 180 − ASL − (58.5 × ASW)

The following table gives a rough classification of scores by age and education.

Flesch Reading Ease score    Readability         Understandable for
(from ... to below ...)
0-30                         Very difficult      Academics
30-50                        Difficult
50-60                        Fairly difficult
60-70                        Medium              13 to 15 year old students
70-80                        Fairly easy
80-90                        Easy
90-100                       Very easy           11 year old students
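The "from ... to below ..." convention of the table means that each band includes its lower bound and excludes its upper bound. A small lookup sketch makes this explicit. The English label wording and the function name are assumptions made for illustration.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of all bands except the last one.
_BOUNDS = [30, 50, 60, 70, 80, 90]
_LABELS = [
    "very difficult",    # below 30
    "difficult",         # 30 to below 50
    "fairly difficult",  # 50 to below 60
    "medium",            # 60 to below 70
    "fairly easy",       # 70 to below 80
    "easy",              # 80 to below 90
    "very easy",         # 90 and above
]


def fre_band(score: float) -> str:
    """Map a Flesch Reading Ease score to its rough readability band."""
    # bisect_right counts how many bounds are <= score, which is the band index.
    return _LABELS[bisect_right(_BOUNDS, score)]


print(fre_band(65))
```

Scores below 0 or above 100, which the formula can produce for extreme texts, fall into the first and last band in this sketch.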

Flesch Kincaid Grade Level

Like the Gunning Fog Index (see below), this readability index attempts to express readability in terms of the number of years of school a reader must have completed to understand the text. It is tailored to the English language and the US school system. The Flesch-Kincaid Grade Level is calculated as follows:

    FKGL = (0.39 × ASL) + (11.8 × ASW) − 15.59

ASL and ASW are defined as under Flesch Reading Ease.

As is easy to verify, sentence length has a greater influence on the index in the FKGL than in the FRE. In both indices, however, word length dominates, which also explains their limited applicability to German with its many compound words.
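One way to check this claim is to compare, for each index, the weight of the sentence term with the weight of the word term. The sketch below assumes the commonly cited coefficients (0.39, 11.8 and 15.59 for the FKGL; 1.015 and 84.6 for the FRE). The helper names are illustrative.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw text counts."""
    asl = words / sentences   # average sentence length
    asw = syllables / words   # average syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59


# Coefficient magnitudes as (sentence term, word term).
FRE_WEIGHTS = (1.015, 84.6)
FKGL_WEIGHTS = (0.39, 11.8)


def sentence_to_word_ratio(weights: tuple) -> float:
    """Weight of one extra word per sentence relative to one extra syllable per word."""
    sentence_weight, word_weight = weights
    return sentence_weight / word_weight


fre_ratio = sentence_to_word_ratio(FRE_WEIGHTS)     # roughly 0.012
fkgl_ratio = sentence_to_word_ratio(FKGL_WEIGHTS)   # roughly 0.033
print(fkgl_ratio > fre_ratio)
```

The ratio is larger for the FKGL, so sentence length counts for relatively more there, while in both formulas the word term carries the far larger coefficient.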

Example:

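The sentence "Alle meine Entchen schwimmen auf dem See, Köpfchen unter Wasser, Schwänzchen in die Höh'" ("All my ducklings swim on the lake, their heads under the water, their tails up"), which is not too difficult to understand, has 14 words and 22 syllables in the German original, so ASL is 14 and ASW is about 1.57. This yields the following readability index values:

  • Flesch Reading Ease (English formula): approximately 60
  • Flesch Reading Ease (Amstad's German formula): approximately 74
  • Flesch-Kincaid Grade Level: approximately 8.4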
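The arithmetic for this example can be checked with a short script. It is a sketch that assumes the commonly cited constants for the three formulas and takes the counts of 14 words, 22 syllables and one sentence as given. The function names are illustrative.

```python
WORDS, SYLLABLES, SENTENCES = 14, 22, 1

asl = WORDS / SENTENCES   # 14.0
asw = SYLLABLES / WORDS   # about 1.57


def fre_english(asl: float, asw: float) -> float:
    return 206.835 - 1.015 * asl - 84.6 * asw


def fre_german(asl: float, asw: float) -> float:
    # Amstad's recalibration for German
    return 180 - asl - 58.5 * asw


def fkgl(asl: float, asw: float) -> float:
    return 0.39 * asl + 11.8 * asw - 15.59


for name, func in [("FRE", fre_english), ("FRE (German)", fre_german), ("FKGL", fkgl)]:
    print(f"{name}: {func(asl, asw):.2f}")
```

Small differences in the last digit arise depending on whether ASW is rounded to 1.57 before it is inserted into the formulas.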

Gunning Fog Index

The Fog Index developed by Robert Gunning also roughly indicates the number of school years a reader must have completed in order to understand the text. It, too, is based on the English language in its calculation and on the US school system in its interpretation. It is calculated using the following formula:

    Fog index = 0.4 × ((W / S) + 100 × (D / W))

with W: the number of words in the text, S: the number of sentences in the text, D: the number of words in the text that have at least three syllables.

In words, the procedure and the exact criteria are:

  1. A text passage of at least 100 words is selected and the exact number of words it contains is determined. The entire text can also be evaluated, although this can mean considerable additional work.
  2. The average sentence length is calculated by dividing the number of words by the number of sentences in the text passage.
  3. The number of words of three or more syllables per 100 words is determined. Proper names, combinations of short words, and verbs that reach three or more syllables only through an added ending are excluded from this count.
  4. The average sentence length and the number of words with three or more syllables per 100 words are added, and the sum is multiplied by 0.4.

The result is the Gunning Fog Index.
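The four steps can be sketched as a single function. It assumes that the caller has already counted the words, the sentences, and the words of three or more syllables while applying the exclusions from step 3. The function name and the sample counts are illustrative.

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index from raw counts of a text passage.

    complex_words: words with three or more syllables, already excluding
    proper names, combinations of short words, and verbs that reach three
    syllables only through an added ending.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences       # step 2
    complex_per_hundred = 100 * complex_words / words  # step 3
    return 0.4 * (average_sentence_length + complex_per_hundred)  # step 4


# Hypothetical passage: 120 words, 6 sentences, 12 complex words.
print(round(gunning_fog(120, 6, 12), 1))
```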

Vienna factual text formula (Wiener Sachtextformel)

The Vienna factual text formula (Wiener Sachtextformel, WSTF) is used to calculate the readability of German-language texts. It indicates the school grade for which a factual text is suitable. The scale starts at grade 4 and ends at 15, although from 12 onwards the values are better understood as difficulty levels than as school grades. A value of 4 therefore stands for a very easy text, whereas 15 indicates a very difficult one.

The formula was drawn up by Richard Bamberger and Erich Vanecek.

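  • MS is the percentage of words with three or more syllables,
  • SL is the mean sentence length (number of words),
  • IW is the percentage of words with more than six letters,
  • ES is the percentage of monosyllabic words.

The first Vienna factual text formula

    WSTF1 = 0.1935 × MS + 0.1672 × SL + 0.1297 × IW − 0.0327 × ES − 0.875

The second Vienna factual text formula

    WSTF2 = 0.2007 × MS + 0.1682 × SL + 0.1373 × IW − 2.779

The third Vienna factual text formula

    WSTF3 = 0.2963 × MS + 0.1905 × SL − 1.1144

The fourth Vienna factual text formula ("with regard to the grade")

    WSTF4 = 0.2744 × MS + 0.2656 × SL − 1.693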

With the first WSTF, the duckling example yields an index of about 3.8.
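The following sketch evaluates all four variants for the German original of the duckling sentence. Syllable counts are entered by hand rather than computed, and the coefficients are the commonly cited ones for the Wiener Sachtextformel. Both should be read as assumptions of this example.

```python
# (word, syllables) for "Alle meine Entchen schwimmen auf dem See,
# Köpfchen unter Wasser, Schwänzchen in die Höh'"; syllables counted by hand.
WORDS = [
    ("Alle", 2), ("meine", 2), ("Entchen", 2), ("schwimmen", 2),
    ("auf", 1), ("dem", 1), ("See", 1), ("Köpfchen", 2),
    ("unter", 2), ("Wasser", 2), ("Schwänzchen", 2), ("in", 1),
    ("die", 1), ("Höh", 1),
]
SENTENCES = 1


def percent(count: int, total: int) -> float:
    return 100.0 * count / total


n = len(WORDS)
ms = percent(sum(1 for _, s in WORDS if s >= 3), n)      # words with 3+ syllables
sl = n / SENTENCES                                       # mean sentence length
iw = percent(sum(1 for w, _ in WORDS if len(w) > 6), n)  # words with more than 6 letters
es = percent(sum(1 for _, s in WORDS if s == 1), n)      # monosyllabic words

wstf1 = 0.1935 * ms + 0.1672 * sl + 0.1297 * iw - 0.0327 * es - 0.875
wstf2 = 0.2007 * ms + 0.1682 * sl + 0.1373 * iw - 2.779
wstf3 = 0.2963 * ms + 0.1905 * sl - 1.1144
wstf4 = 0.2744 * ms + 0.2656 * sl - 1.693

print(round(wstf1, 1))  # 3.8
```

All four results lie below the nominal start of the scale at 4, which is plausible for a nursery rhyme.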

Theoretical justification

Readability formulas are largely established in research. Nevertheless, many people who work with them ask how a formula can say anything about the readability of a text when it takes only very few criteria into account. It is easy to get the impression that word and sentence length cannot be particularly valid criteria. However, these two criteria are linked to a number of other text properties, so although only two properties are measured directly, a whole series of others are taken into account indirectly.

Automatic determination of readability indices

Most of the early readability formulas were originally designed for manual evaluation. This explains the suggestion made in some publications to take samples of 100 words. The automatic determination of the readability of texts is a field of language technology. Depending on the formula, different demands are placed on a computer program. While the recognition of sentence boundaries, a prerequisite for counting sentences, usually works reliably, the correct separation of the input text into words (tokenization) is often ambiguous. Syllables, too, can only be counted approximately by computer. Since humans also make mistakes, the original formulas are implicitly calibrated so that they more or less tolerate these mistakes. Computers, however, make different mistakes than humans, so the constants in the formulas would strictly have to be readjusted. Newer formulas that were designed from the outset for automatic evaluation are less susceptible to this problem.
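The approximate nature of automatic counting can be seen in a deliberately crude sketch for English text. The sentence splitter, the tokenizer and the vowel-group syllable heuristic below are simplifying assumptions, not an established algorithm. Abbreviations, numbers and many irregular words will be miscounted.

```python
import re


def split_sentences(text: str) -> list:
    """Split after '.', '!' or '?' followed by whitespace (fails on abbreviations)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text: str) -> list:
    """Treat runs of ASCII letters, optionally with an apostrophe part, as words."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def estimate_syllables(word: str) -> int:
    """Heuristic: count vowel groups and discount a silent final 'e'."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


text = "The cat sat on the mat. It was happy!"
sentences = split_sentences(text)
words = tokenize(text)
syllables = sum(estimate_syllables(w) for w in words)
print(len(sentences), len(words), syllables)
```

Because such heuristics err differently from a human counter, indices computed this way are best compared only with other values obtained by the same program.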

Literature

  • Richard Bamberger, Erich Vanecek: Reading - Understanding - Learning - Writing. Jugend und Volk, Vienna; Diesterweg, Frankfurt 1984.
  • Klaus Merten: Content analysis. Introduction to theory, method and practice. 2nd, improved edition. Westdeutscher Verlag, Opladen 1995, ISBN 3-531-11442-5, p. 175 ff.
  • Jaan Mikk: Textbook: Research and Writing. Lang, Frankfurt et al. 2000, ISBN 3-631-36335-4.

Web links

There are various online tools to determine the readability index of a text.

References

  1. Arend Mihm: Language-statistical criteria for the suitability of reading books. In: Linguistics and Didactics. 4, 1973, pp. 117-127.
  2. WH DuBay: The Principles of Readability. Impact Information, Costa Mesa, California 2004, impact-information.com (PDF; 959 kB).
  3. Rudolf Flesch: A New Readability Yardstick. In: Journal of Applied Psychology. 32, No. 3, 1948, pp. 221-233.
  4. Ralf Lisch, Jürgen Kriz: Basics and models of content analysis. Rowohlt, Reinbek 1978, ISBN 3-499-21117-3, p. 180 ff.
  5. Toni Amstad: How understandable are our newspapers? Dissertation, University of Zurich, 1978.
  6. Robert Gunning: The Technique of Clear Writing. Revised Edition. McGraw-Hill, London 1968, p. 38.
  7. Karl-Heinz Best: Are word and sentence length useful criteria for the legibility of texts? In: Sigurd Wichter, Albert Busch (Ed.): Knowledge transfer: Success control and feedback from practice. P. Lang, Frankfurt am Main 2006, ISBN 978-3-631-53671-1, pp. 21-31.
  8. Niels Ott: Information Retrieval for Language Learning. An Exploration of Text Difficulty Measures. Master's thesis in Computational Linguistics, submitted to the Department of Linguistics, University of Tübingen, 2009.