Spell check

from Wikipedia, the free encyclopedia
Figure: Monitor display (for manual correction)

Spell check refers to software-based procedures for correcting spelling or typing errors in electronically available texts. Typical applications are classic office programs, earlier word processors, and typewriters with an electronic display, which print only after the entire text has been completed. Databases, web editors, e-mail clients, instant messengers, search engines and numerous other programs now also come with integrated spell checkers. Linguistic errors are either highlighted in color, so that they can be corrected manually by choosing from a list of suggestions, or they are corrected automatically (autocorrection).

Spell checking in the past

Until the 1990s, spell checking was often done with simple word lists against which the written text was compared. This approach needs no complex algorithms, which would have required too much computing time in the early days of the computer. It has numerous disadvantages, however. Compound words are generally flagged as errors unless they also appear in the word list. Words with prefixes and suffixes, particles, etc. must likewise be in the word list so that they are not marked as errors. Limited memory forced compromises: common compounds were included, rare ones were not. Conversely, other words, such as compound nouns written without a hyphen, are incorrectly classified as correct.
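The word-list approach can be illustrated with a minimal Python sketch. The word list and the function name unknown_words are hypothetical and chosen only for this illustration; a real checker would load a far larger list. The sketch shows the first disadvantage: a valid compound is flagged because it is not in the list itself.

```python
# Toy word list (an assumption for illustration only).
WORD_LIST = {"the", "book", "shelf", "is", "full"}

def unknown_words(text, word_list=WORD_LIST):
    """Return the words of `text` that do not appear in the word list."""
    # Split on whitespace, strip surrounding punctuation, ignore case.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    return [w for w in words if w and w not in word_list]

# "bookshelf" is a valid compound, but it is reported as unknown
# because only its parts "book" and "shelf" are listed.
print(unknown_words("The bookshelf is full."))  # ['bookshelf']
```

Because the lookup is an exact match, every inflected or compounded form has to be stored explicitly, which is what made memory the limiting factor.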

Spell checking today

With the increasing computing power of today's computers, better spell checking is possible, often combined with checking of grammar and hyphenation.

Spelling mistakes

  • Recognition of letter sequences (strings) that do not belong to the vocabulary of the current language. Examples for the German word Fehler ("error"): Feler (omission), Fehlet (replacement), Fehlwer (insertion), Fehelr (transposition)
→ This case can be corrected with relatively simple means. The correction program compares character strings that cannot be found in the dictionary with the dictionary entries and proposes as corrections those entries that are most similar to the erroneous string (the word). The edit distance (also called Levenshtein distance) between the erroneous string and the suggested correction is minimal, which means that the erroneous word can be transformed into the suggested correction with as few changes as possible.

Another example:

  • Wrong word: Libe; in the lexicon: Liebe ("love"), Leib ("body"). Levenshtein distances: to Liebe: 1 (one omitted letter), to Leib: 2 (two edit operations) → first suggestion for correction: Liebe
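The distance comparison in these examples can be reproduced with a short Python sketch. It uses the standard dynamic-programming formulation of the Levenshtein distance; the function names levenshtein and suggest and the two-entry lexicon are assumptions made for this illustration, not part of any particular spell checker.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]

def suggest(word, lexicon, limit=3):
    """Lexicon entries ordered by increasing distance to `word`."""
    return sorted(lexicon, key=lambda e: (levenshtein(word, e), e))[:limit]

LEXICON = ["Leib", "Liebe"]  # toy lexicon (assumption)
print(levenshtein("Libe", "Liebe"))  # 1
print(levenshtein("Libe", "Leib"))   # 2
print(suggest("Libe", LEXICON))      # ['Liebe', 'Leib']
```

Note that the plain Levenshtein distance has no transposition operation, so Fehelr is at distance 2 from Fehler. Variants such as the Damerau-Levenshtein distance count a swap of two adjacent letters as a single edit.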

If the automatic spell checker suggests a wrong word and the suggestion is then adopted, this is called the Cupertino effect.

Grammatical mistakes

  • Recognition of words that exist but lead to a grammatical error in the way they are used. Example (translated from German): "Make your guest book."
→ This case cannot be found by a pure, word-based spell check, but it can be found by a grammatical check of the sentence: in the German original, a masculine form of the possessive article ("deinen") is applied to a neuter noun ("Buch", "book").
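A gender-agreement check of this kind can be sketched in Python as follows. This is a deliberately tiny illustration under strong assumptions: the mini-lexicon is hypothetical, and only the accusative singular forms of the possessive "dein" are considered, whereas a real grammar checker would have to determine case and number from the sentence.

```python
# Hypothetical mini-lexicon: grammatical gender of a few German nouns.
NOUN_GENDER = {"Hund": "m", "Buch": "n", "Katze": "f"}

# Accusative singular forms of the possessive "dein", keyed by gender.
DEIN_ACCUSATIVE = {"m": "deinen", "n": "dein", "f": "deine"}

def agrees(possessive, noun):
    """True if the possessive form matches the noun's gender
    (accusative singular only; an assumption of this sketch)."""
    gender = NOUN_GENDER[noun]
    return possessive.lower() == DEIN_ACCUSATIVE[gender]

print(agrees("deinen", "Buch"))  # False: masculine form with a neuter noun
print(agrees("dein", "Buch"))    # True
```

Both "deinen" and "Buch" are correctly spelled words, so the error only becomes visible once the two are checked against each other.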

Semantic errors

  • Words that exist but are used in the wrong context. Example: The hammer eats grass.
→ This case is not found by either of the first two kinds of check. It would require a semantic analysis of the text, a capability that conventional word processing programs do not offer.

Compounds

Compound words cause further difficulties, especially in German texts. To keep the rate of expressions wrongly flagged as errors from getting too high, modern spell checkers also accept compounds that are not in standard dictionaries (e.g. the German word for "educational misery"). A disadvantage of this method is that semantically nonsensical compounds (given here in literal translation from German, e.g. "foreground", "skin messages") are sometimes no longer reported as errors.
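One way such compound acceptance could work is to accept a word whenever it can be split completely into known words. The Python sketch below illustrates the idea; the word list, the function name accepts, the optional linking "s" between parts, and the German example words are all assumptions made for this illustration and do not describe how any specific spell checker is implemented.

```python
from functools import lru_cache

# Toy lexicon of simple words (assumption for illustration).
WORDS = frozenset({"bildung", "misere", "haut", "nachrichten"})

def accepts(word, words=WORDS):
    """True if `word` is a known word or can be split entirely into
    known words, optionally joined by a linking "s"."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def can_split(start):
        for end in range(start + 1, len(word) + 1):
            if word[start:end] not in words:
                continue
            if end == len(word):
                return True  # the last part reaches the end of the word
            if can_split(end):
                return True  # next part follows directly
            # Allow a linking "s", but only if another part follows it.
            if word[end] == "s" and end + 1 < len(word) and can_split(end + 1):
                return True
        return False

    return can_split(0)

print(accepts("Bildungsmisere"))   # True: bildung + s + misere
print(accepts("Hautnachrichten"))  # True, although the compound is nonsensical
print(accepts("Hautnachrichtn"))   # False: the remainder is not a known word
```

The second call shows the disadvantage described above: any combination of known parts passes, whether or not the result is meaningful.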

See also

  • Ispell (International spell): Standard software for spell checking under Unix
  • GNU Aspell (Ispell successor): Free spell checking software for Unix-like systems and Windows
  • Hunspell: Free and platform-independent software for spell checking