KSTEM

The KSTEM algorithm is a stemming algorithm from computational linguistics: it automatically reduces words to their word stems. Developed by Robert Krovetz, it combines morphological rules with a stem lexicon, which it uses to avoid incorrect stemming. KSTEM removes suffixes from a word until the form produced by the rules is found in the lexicon. If the reduced form is not in the lexicon, only a few suffixes are removed. Word forms that are already in the lexicon are not stemmed, because it is assumed that they cannot be reduced any further.
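
The lexicon check described above can be illustrated with a minimal Python sketch. It is not Krovetz's implementation: the lexicon `LEXICON`, the rule table `SUFFIX_RULES`, the limit `MAX_STRIPS` and the function `stem` are all hypothetical names with toy contents chosen for illustration. Returning the word unchanged when no lexicon entry is reached is also a simplifying assumption.

```python
# Toy stem lexicon (assumption: the real lexicon is far larger).
LEXICON = {"walk", "pony", "relate", "iron", "happy"}

# Hypothetical (suffix, replacement) rules, tried in order.
SUFFIX_RULES = [
    ("ies", "y"),
    ("es", ""),
    ("s", ""),
    ("ing", ""),
    ("ing", "e"),
    ("ed", ""),
    ("ed", "e"),
]

# Upper bound on stripping rounds, so only a few suffixes are ever removed.
MAX_STRIPS = 3


def stem(word, lexicon=LEXICON, rules=SUFFIX_RULES):
    """Strip suffixes until the reduced form is found in the lexicon."""
    word = word.lower()
    current = word
    for _ in range(MAX_STRIPS):
        # A form found in the lexicon is assumed not to be further derivable.
        if current in lexicon:
            return current
        stripped = None
        for suffix, replacement in rules:
            # Require at least two characters to remain before the suffix.
            if current.endswith(suffix) and len(current) - len(suffix) >= 2:
                candidate = current[: -len(suffix)] + replacement
                if candidate in lexicon:
                    return candidate
                if stripped is None:
                    # Remember the first reduction for the next round.
                    stripped = candidate
        if stripped is None:
            break  # no rule applies any more
        current = stripped
    # Simplification: without a lexicon hit, leave the word untouched.
    return word


for w in ["walking", "ponies", "related", "iron", "walkings"]:
    print(w, "->", stem(w))
```

In this sketch, "iron" is returned as-is because it is a lexicon entry, whereas a purely rule-based stemmer has no such check to stop it from stripping further.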

Literature

  • R. Krovetz: Viewing Morphology as an Inference Process. In: Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 191-203, 1993.