Electronic text recognition


Intelligent character recognition (ICR) is an extension of optical character recognition (OCR). Roughly speaking, it is a handwriting recognition system that lets a computer learn different handwriting styles during processing. ICR software is often based on adaptive programs that use artificial neural networks. Newly learned handwriting styles are given direct counterparts in the program's own database. This specialized form of electronic text recognition extends the range of applications of existing scanning devices to the processing of handwritten documents.

Until now, OCR has handled only printed documents satisfactorily. Because recognizing handwriting is much more demanding, the accuracy achieved so far has been insufficient. With well-structured documents, ICR can reach recognition rates of more than 97%. To do so, the software makes several passes with varying methods, each of which is weighted differently in the evaluation. Separate methods are even used for recognizing digits and letters.

An important step for ICR was the development of so-called automated forms processing in 1993. It describes a three-stage process by which ICR software captures a character or an image. In the first step, the document is captured as an image. In the second step, the ICR software processes this capture, and in the last step the result is evaluated automatically.
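The multi-pass evaluation described above can be pictured as a weighted vote over the guesses of the individual passes. The following is only a minimal sketch: the text does not say how ICR products actually combine their passes, so the function name `combine_passes`, the example weights, the confidence values, and the pass descriptions in the comments are all assumptions made for illustration.

```python
from collections import defaultdict


def combine_passes(pass_results):
    """Combine the guesses of several recognition passes for one character.

    pass_results: list of (candidate, confidence, weight) tuples, one per pass.
    Returns (best_candidate, share_of_total_score); (None, 0.0) if no pass
    contributed any score.
    """
    scores = defaultdict(float)
    for candidate, confidence, weight in pass_results:
        # Each pass contributes its confidence, scaled by the pass weight.
        scores[candidate] += weight * confidence

    total = sum(scores.values())
    if total == 0:
        return None, 0.0

    best = max(scores, key=scores.get)
    return best, scores[best] / total


# Hypothetical results of three passes over a single handwritten digit.
passes = [
    ("7", 0.80, 0.5),  # assumed: neural-network pass, weighted highest
    ("1", 0.60, 0.3),  # assumed: stroke-feature pass
    ("7", 0.55, 0.2),  # assumed: template-matching pass
]
best, share = combine_passes(passes)
print(best, round(share, 3))
```

In this sketch the two passes that agree on "7" outweigh the single pass that reads "1". A real system would presumably also route digit fields and letter fields to different recognizers before combining results, as the text mentions.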

Electronic text recognition noticeably increases the efficiency of optical character recognition. In business, ICR software is used, for example, to process handwritten forms. ICR solutions are available from the following manufacturers:

Company | Product | ICR supported languages
ABBYY | ABBYY FlexiCapture, ABBYY FlexiCapture Engine, ABBYY FineReader Engine | Afrikaans, Albanian, Aymara, Azerbaijani (Latin), Basque, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Cebuano, Chamorro, Corsican, Crimean Tatar, Croatian, Crow, Czech, Dakota (Sioux), Dutch (Belgium), Dutch (Netherlands), English, Estonian, Even, Evenk, Fiji, Finnish, French, Frisian, Friulian, Galician, Ganda, German, German (Luxembourg), German (new spelling), Greek, Guarani, Hani, Hausa, Hawaiian, Hungarian, Icelandic, Indonesian, Irish, Italian, Jingpo, Karachai-Balkar, Kasub, Kawa, Kazakh, Kyrgyz, Kongo, Kpelle, Kumyk, Kurdish, Latin, Latvian, Lithuanian, Luba, Malagasy, Malinke, Maori, Maya, Miao, Minangkabau, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nivkh, Nogay, Nyanja, Ojibwa, Old French, Old German, Old Italian, Old Spanish, Papiamento, Polish, Quechua, Romansh, Romanian, Romani, Rundi, Russian, Rwanda, Sami (Lapland), Samoa, Scottish Gaelic, Selkupian, Serbian (Latin), Slovak, Slovenian, Somali, Sotho, Spanish, Swahili, Swazi, Tagalog, Tahiti, Tok Pisin, Tonga, Tswana, Tun, Turkish, Uyghur (Latin), Ukrainian, Wolof, Xhosa, Zapotec, Ido, Interlingua
Accusoft | SmartZone ICR / OCR | English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish and Swedish (all languages with .NET; English only with ActiveX)
ExperVision | TypeReader, OpenRTK | English, French, German, Italian, Spanish, Portuguese, Danish, Dutch, Swedish, Norwegian, Hungarian, Polish, Simplified Chinese, Traditional Chinese, Russian, Finnish, and Polynesian
IRIS Group | IRISCapture Pro for Forms | Latin-based languages
LEADTOOLS | LEADTOOLS ICR SDK modules | Catalan, Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Swedish, Spanish
reRecognition | Cadmos |
Recogniform | Recogniform |
CharacTell | SoftWriting |

See also