Full text indexing

from Wikipedia, the free encyclopedia

Full-text indexing is the (automatic) recording of all words of a text in an index. Stop words are usually excluded.
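The definition above can be illustrated with a minimal sketch of an inverted index in Python. The stop-word list, the tokenizer, and the sample documents are assumptions made for illustration only; real systems use much longer, language-specific stop-word lists and more careful tokenization.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, kept short for the example.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map every non-stop word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            if word not in STOP_WORDS:
                index[word].add(doc_id)
    return index

# Invented sample documents.
documents = {
    1: "The index records all words in a text.",
    2: "Stop words are excluded from the index.",
}
index = build_index(documents)
print(sorted(index["index"]))  # [1, 2]
print("the" in index)          # False, because "the" is a stop word
```

Every remaining word becomes a search key, which is why no knowledge of a classification system is needed to query such an index.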

Full-text indexing either complements intellectually assigned descriptors or serves entirely as a substitute for an intellectual classification system.

A popular misconception is that automatically generated full-text indexes are generally better than human-made (intellectual) classification systems. Full-text indexes can nevertheless be used when there is insufficient time and money for a functioning classification system. Even when paired with ranking algorithms, they are generally poorly suited as standalone retrieval tools and work better as support for intellectual classification systems. Google, for example, has developed a process in which web crawlers follow the links they find and the pages reached are included in the search index, broken down by search terms and keywords.

Advantages

Full-text indexing increases the hit rate (recall) of a retrieval system, mainly because the number of searchable keywords is normally higher.

In principle, the researcher can search without knowledge of the classification system.

The full text index can serve as a supplement to an intellectual classification system.

Disadvantages

The hit accuracy (precision) of a retrieval system decreases considerably when full-text indexing is used. Even if a term is mentioned only in passing in a text, the document is still found under it.
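The trade-off between hit rate and hit accuracy can be made concrete with a small sketch. The four documents, their descriptors, and the relevance judgement for the query "python" are all invented for illustration, and the measures are the usual precision and recall ratios.

```python
# Hypothetical mini-collection: each document has its full text and the
# descriptors a human indexer assigned to it.
documents = {
    1: {"text": "Guide to keeping a python as a pet snake",
        "descriptors": {"python", "reptiles"}},
    2: {"text": "Feeding schedules for snakes such as the python",
        "descriptors": {"reptiles"}},  # indexer chose only the broader term
    3: {"text": "Zoo travel report where we also saw a python briefly",
        "descriptors": {"travel"}},
    4: {"text": "Our holiday photos included one python",
        "descriptors": {"travel"}},
}
relevant = {1, 2}  # assumed relevance judgement for the query "python"

def full_text_hits(term):
    """Documents whose text contains the term anywhere."""
    return {i for i, d in documents.items() if term in d["text"].lower().split()}

def descriptor_hits(term):
    """Documents to which the term was assigned as a descriptor."""
    return {i for i, d in documents.items() if term in d["descriptors"]}

def precision(hits, relevant):
    """Share of the hits that are relevant (hit accuracy)."""
    return len(hits & relevant) / len(hits) if hits else 0.0

def recall(hits, relevant):
    """Share of the relevant documents that were found (hit rate)."""
    return len(hits & relevant) / len(relevant) if relevant else 0.0

ft_hits = full_text_hits("python")
desc_hits = descriptor_hits("python")
print(precision(ft_hits, relevant), recall(ft_hits, relevant))      # 0.5 1.0
print(precision(desc_hits, relevant), recall(desc_hits, relevant))  # 1.0 0.5
```

In this constructed case the full-text search finds every relevant document but also the two that mention the term only in passing, while the descriptor search returns only relevant documents but misses one.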

Searching a full-text index takes longer because full-text indexes are inherently larger than intellectually created ones. For a single search the difference may amount to only fractions of a second, but it can no longer be neglected as the number of users of a retrieval system increases.
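The size difference can be illustrated by counting index entries. The following sketch uses two invented documents and invented descriptors; it only compares the number of terms and postings and does not measure search time.

```python
from collections import defaultdict

def count_postings(term_lists):
    """Build a term -> document ids index and return (terms, postings)."""
    index = defaultdict(set)
    for doc_id, terms in term_lists.items():
        for term in terms:
            index[term].add(doc_id)
    return len(index), sum(len(ids) for ids in index.values())

# Hypothetical sample texts and manually assigned descriptors.
texts = {
    1: "full text indexing records nearly every word of a document",
    2: "an intellectual classification assigns only a few descriptors",
}
descriptors = {1: ["indexing"], 2: ["classification", "descriptors"]}

full_text = count_postings({d: t.split() for d, t in texts.items()})
manual = count_postings(descriptors)
print(full_text, manual)
```

Even for two short sentences the full-text index holds several times as many entries as the descriptor index, and the gap grows with document length.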
