Intelligent Word Recognition

from Wikipedia, the free encyclopedia

Intelligent Word Recognition (IWR) is a branch of OCR technology (text recognition using pattern recognition). It is intended to recognize writing that conventional OCR cannot handle, such as cursive handwriting, signatures, and Arabic script.

Conventional OCR recognizes individual characters that are passed to the OCR engine after image preprocessing. Connected scripts, however, cannot be reliably segmented into individual parts that could serve as letter candidates. IWR therefore combines several methods and compares their results with a dictionary.

Analysis of the entire word: The letters that make up a word give it a characteristic outline. Classifiers with stored dictionaries supply possible word candidates.

Breakdown into possible letters or syllables: Words can be split at characteristic places. Here, too, classifiers supply possible syllable candidates.

Dictionary comparison: Combining the results from the characteristic outlines and the word parts reduces the number of candidates in the main dictionary.
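
The three steps above can be illustrated with a toy sketch in Python. It is not the method of any particular IWR product. It assumes that image analysis has already produced a coarse outline code for the word (A for a letter with an ascender, D for a descender, x for neither) and a list of recognized fragments. The names outline, holistic_candidates, fragment_candidates and recognize, as well as the small lexicon, are hypothetical and chosen only for illustration.

```python
# Toy lexicon (assumed): words that might appear in an amount field.
LEXICON = ["one", "two", "three", "four", "five", "six", "seven",
           "eight", "nine", "ten", "hundred", "thousand", "dollars"]

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def outline(word):
    """Coarse word shape: A = ascender, D = descender, x = neither."""
    return "".join(
        "A" if c in ASCENDERS else "D" if c in DESCENDERS else "x"
        for c in word.lower()
    )


def holistic_candidates(observed_outline, lexicon):
    """Step 1: words whose outline matches the observed outline."""
    return {w for w in lexicon if outline(w) == observed_outline}


def fragment_candidates(fragments, lexicon):
    """Step 2: words that contain all recognized fragments in order."""
    def contains_in_order(word):
        pos = 0
        for frag in fragments:
            pos = word.find(frag, pos)
            if pos < 0:
                return False
            pos += len(frag)
        return True
    return {w for w in lexicon if contains_in_order(w)}


def recognize(observed_outline, fragments, lexicon):
    """Step 3: keep only candidates supported by both analyses."""
    return (holistic_candidates(observed_outline, lexicon)
            & fragment_candidates(fragments, lexicon))


# The outline "Axxx" alone fits both "four" and "five";
# the fragment "iv" leaves only "five".
print(recognize("Axxx", ["iv"], LEXICON))
```

In this sketch the combination is a plain set intersection. A real system would more plausibly work with scored candidates from trained classifiers and merge the scores rather than make hard yes/no decisions.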

Usability and limits of the technology

The technology is usable only for clearly delimited fields. It was developed, for example, by a French manufacturer of text recognition systems for check readers: the check layout contains a field for the payment amount written out in words, which allows running text.

A restricted word list is available for this field, so that comprehensively trained classifiers can deliver reliable results.

IWR is limited by the fact that handwriting varies greatly. In addition, the larger the underlying dictionary, the more likely it is that no unambiguous result can be obtained.
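
The effect of dictionary size can be made concrete with a small Python sketch. It assumes a deliberately crude word-shape code (A for a letter with an ascender, D for a descender, x for neither) as a stand-in for whatever features a real classifier would use. The function names and both word lists are hypothetical.

```python
from collections import defaultdict

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def outline(word):
    """Coarse word shape: A = ascender, D = descender, x = neither."""
    return "".join(
        "A" if c in ASCENDERS else "D" if c in DESCENDERS else "x"
        for c in word.lower()
    )


def ambiguous_words(lexicon):
    """Words that share their outline with at least one other word."""
    by_shape = defaultdict(set)
    for word in lexicon:
        by_shape[outline(word)].add(word)
    return {w for group in by_shape.values() if len(group) > 1 for w in group}


# A restricted word list: every outline is unique.
small = ["two", "five", "eight", "hundred"]

# The same list with a few more words added: outlines start to collide.
large = small + ["ten", "four", "fine", "tan"]

print(len(ambiguous_words(small)), "of", len(small), "words are ambiguous")
print(len(ambiguous_words(large)), "of", len(large), "words are ambiguous")
```

With the restricted list, the outline alone identifies each word. After only four additions, six of the eight words can no longer be told apart by outline, which is why a narrowly scoped field with a short word list suits this kind of recognition better than free text.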

See also