Catalog enrichment supplements the records of a library catalog with additional information that goes beyond regular descriptive cataloging and subject indexing. This can range from simple images of a book cover, through tables of contents, abstracts, and the automatic indexing and translation derived from them, to the electronic full text of a book.

The additional content can include tables of contents (TOCs), summaries and abstracts, reviews, full texts, and recommendation services. Cover images can also be integrated. A further form is machine-generated descriptors, produced by computational-linguistic and/or statistical extraction and modification processes applied to digitized or imported texts such as TOCs, abstracts, full texts, or reviews.

Tagging (free descriptors added manually by users) and descriptors adopted from publishers are also common options. Another is query expansion, in which user queries are expanded with additional semantic resources (thesaurus, ontology, classification, taxonomy, translation, and other lists).
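Query expansion as described above can be illustrated with a minimal sketch. The thesaurus contents and the function name `expand_query` are hypothetical and chosen only for illustration; a real catalog would draw on a maintained thesaurus, classification, or authority file rather than a hard-coded dictionary.

```python
from typing import Dict, List, Set

# Hypothetical toy thesaurus: maps a term to related search terms.
THESAURUS: Dict[str, List[str]] = {
    "car": ["automobile", "motor vehicle"],
    "library": ["bibliotheca"],
}


def expand_query(query: str, thesaurus: Dict[str, List[str]]) -> List[str]:
    """Return the query terms, each followed by its thesaurus expansions.

    Terms are lowercased and duplicates are dropped, keeping first occurrence.
    """
    expanded: List[str] = []
    seen: Set[str] = set()
    for term in query.lower().split():
        # The original term comes first, then any related terms.
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded
```

Calling `expand_query("Car history", THESAURUS)` would pass the user's own terms plus the related terms to the catalog search, so that records indexed under "automobile" are also found.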

Implementation and history

Enrichment can be implemented either by including the information (tables of contents, summaries, machine-generated descriptors, etc.) in the catalog record itself or by linking to it; the linked source can reside on a library server or be made available by an external provider.
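The difference between the two approaches can be sketched as a small data structure. The class and field names below (`Enrichment`, `CatalogRecord`, `text`, `url`) are assumptions made for illustration and do not correspond to any particular library system or cataloging format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Enrichment:
    kind: str                   # e.g. "toc", "abstract", "cover"
    text: Optional[str] = None  # set when the content is stored in the record
    url: Optional[str] = None   # set when the content is only linked

    def is_embedded(self) -> bool:
        return self.text is not None


@dataclass
class CatalogRecord:
    record_id: str
    title: str
    enrichments: List[Enrichment] = field(default_factory=list)

    def searchable_text(self) -> str:
        """Title plus any enrichment text held in the record itself."""
        parts = [self.title]
        parts += [e.text for e in self.enrichments if e.is_embedded()]
        return " ".join(parts)

    def links(self) -> List[str]:
        """URLs of enrichments that are linked rather than embedded."""
        return [e.url for e in self.enrichments if e.url is not None]
```

In this sketch only embedded content contributes to the text the catalog can index directly; linked content stays available to the reader but would need to be fetched and indexed separately to become searchable.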

Online booksellers have used many forms of catalog enrichment for years. In the early 1990s, the University Library of Düsseldorf expanded the number of searchable words by machine indexing of titles and subtitles, adding base forms, decomposed compounds, synonyms, and subject headings from the German subject headings authority file (Schlagwortnormdatei). Catalog enrichment, in particular with tables of contents and the machine-generated descriptors derived from them, has been state of the art in German-speaking countries since 2002. The additional content can be searched in the catalog systems partly via machine-generated descriptors and partly via mostly short texts transferred into the catalog.

Some libraries deliberately exclude the additional content from search, mainly because most library systems lack relevance ranking and the number of hits for more general terms is feared to become too large. This forgoes the chance to search for very specific technical terms or, for example, to find the authors of contributions to edited volumes, which machine indexing recognizes and extracts (see also Named Entity Recognition, Information Extraction, or Text Analytics).
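The trade-off can be made concrete with a small sketch. The two records, their contents, and the `search` function are invented for illustration; the search is a plain substring match without any ranking, which is the situation the paragraph above describes.

```python
from typing import Dict, List

# Hypothetical records: "title" is regular cataloging data,
# "toc" is enrichment text taken from the table of contents.
RECORDS: List[Dict[str, str]] = [
    {"id": "r1", "title": "Handbook of Information Retrieval",
     "toc": "Indexing by A. Example; Ranking by B. Sample"},
    {"id": "r2", "title": "Library Catalogs Today",
     "toc": "Information retrieval basics; Catalog enrichment"},
]


def search(term: str, include_enrichment: bool) -> List[str]:
    """Return ids of records whose text contains the term (case-insensitive)."""
    needle = term.lower()
    hits: List[str] = []
    for rec in RECORDS:
        text = rec["title"]
        if include_enrichment:
            # Enrichment text widens the searchable vocabulary.
            text += " " + rec["toc"]
        if needle in text.lower():
            hits.append(rec["id"])
    return hits
```

With enrichment included, a general term such as "retrieval" matches both records instead of one, while a contributor name such as "Sample" is only findable at all when the table of contents is searched.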

In 2007, the German library associations agreed on standards for the digitization of tables of contents and on methods for exchanging them. The German National Library has been involved in the production of tables of contents since 2008.

In some American union catalogs, tables of contents appear as content notes whose text can be found via keyword search; for anthologies, the names of the article authors are also retrievable via author search. Google has pursued the subject even more intensively: at the end of 2008 it reached an arrangement with US publishers and author organizations regarding copyright, and it digitizes the holdings of entire libraries en masse and links the works via OCLC WorldCat to the libraries registered there.


