Retroconversion

from Wikipedia, the free encyclopedia

Retroconversion (short for "retrospective conversion") is an operation carried out in libraries and archives.

In libraries, old paper-based catalogs are digitized with the help of scanners and then made available to library users via the Internet as image files or full text. Media that are not yet listed in the new catalog (usually an OPAC) can also be entered into it from scratch; this is called retro-cataloging. Retro-cataloging is more complex and expensive, but has the advantage over retro-digitization that all media are then listed in the same catalog.

Justification of retroconversion

Library catalogs have existed on paper since the 17th and 18th centuries. This remained the case until libraries introduced electronic catalog systems that manage records electronically and without paper. This development began in the late 1980s and is still not fully complete. After the introduction of library systems, library users found a catalog landscape divided into two parts: old works are recorded in card catalogs, while works acquired since the introduction of electronic systems are recorded only electronically. Electronic library systems can be searched easily and at any time from any Internet connection via online portals (OPACs), whereas card catalogs have to be consulted in person in the library building. As a result, electronically recorded works are borrowed (and read) often, while older works are hardly ever used. After a catalog conversion has been carried out, the use of old book stocks increases by leaps and bounds. In the digital age, only what is digitally available seems to exist for users; searching through card catalogs is too cumbersome, and the works listed in them lead a shadowy existence. Accordingly, retroconversions make sense in terms of educational policy and are, for example, financially supported in Germany by the German Research Foundation (DFG).

There are roughly two methodological and two technical approaches:

Methodological approaches

  1. Text capture: The more or less complete transfer of the catalog data into a text-oriented representation. To do this, the card information is either typed in manually or processed with an OCR program.
  2. Image indexing: The capture of the individual catalog cards as image files by scanning. In addition, cards at regular intervals (for example every 50th) must be indexed manually (in alphabetical catalogs usually by author, in systematic catalogs usually by classification). Image indexing is rarely carried out nowadays, as it allows only very limited (online) searching by content. It is chosen solely because of its significantly lower cost compared to text capture.
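The periodic index used in image indexing can be thought of as a sparse lookup table: only every n-th card carries a searchable heading, and a query is narrowed down to the run of scanned images between two indexed cards, which the user then browses by eye. The following Python sketch illustrates the idea. The function names, the step of 50 and the assumption that headings are already normalized and sorted are illustrative choices, not a description of any particular catalog system.

```python
from bisect import bisect_left, bisect_right


def build_sparse_index(headings, step=50):
    """Keep only every `step`-th heading, paired with its card position.

    `headings` is assumed to be the sorted list of headings of all scanned
    cards; in a real project only the sampled cards would be keyed at all.
    """
    return [(headings[i], i) for i in range(0, len(headings), step)]


def candidate_range(index, query, total_cards):
    """Return the half-open range (start, end) of card images to browse.

    The range starts at the last indexed card filed before `query` and ends
    at the first indexed card filed after it.
    """
    keys = [key for key, _ in index]
    lo = max(bisect_left(keys, query) - 1, 0)
    hi = bisect_right(keys, query)
    start = index[lo][1]
    end = index[hi][1] if hi < len(index) else total_cards
    return start, end


if __name__ == "__main__":
    # Hypothetical catalog of 130 cards with synthetic headings.
    cards = [f"Author{i:04d}" for i in range(130)]
    sparse = build_sparse_index(cards, step=50)
    print(candidate_range(sparse, "Author0075", len(cards)))
```

With a step of 50, a user has to look through at most about 50 to 100 card images per query, which shows why this approach is cheap to produce but limited for research.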

Technical procedures

  1. Online conversion: Most large academic libraries are members of a library network. In the online conversion process, the contents of the documents to be converted (usually library card catalogs or archival finding aids) are entered directly into the library network's database via a permanent Internet connection. In recent years this online procedure has been used almost exclusively. It is slightly more expensive than the offline process described next, but has clear technical advantages.
  2. Offline conversion: In this process, the data from the documents to be converted is first entered into a separate database. The data generated in this way is later loaded into the electronic administration system of the library or archive in a separate operation. With this method, no Internet connection is necessary between the workstation of the employee performing the conversion and the electronic administration system of the library or archive. This procedure was still in use before the year 2000. Since the spread of broadband Internet connections, it has hardly been used any more.

The advantage of text capture is that it enables full-text search over the entire content and, if the data is entered in structured form, search by individual fields such as author, title or keyword. The disadvantage is the high cost: despite powerful optical character recognition (OCR), text capture has to be carried out manually, which is very labor-intensive. OCR applications still produce unacceptably high error rates and are not able to structure the data on the original documents (e.g. to recognize where the author's name begins or ends on an original document).

The advantage of image indexing is the comparatively low cost, since scanning is largely automated.

The advantage of online conversion is that each document has to be entered only once in a library network. Since numerous libraries or archives are members of the same network, libraries or archives that convert their holdings later can reuse existing data records (so-called "attaching" to existing records). There is no need to enter the same data record multiple times, which increases efficiency and reduces cost. In addition, records can be linked to existing authority data such as the Integrated Authority File (GND). Finally, hierarchical links are easy to create, for example to represent multi-volume works (main titles and volume records) or publication series in the electronic catalog.
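The "attach instead of re-enter" principle can be sketched in a few lines of Python. The in-memory dictionary, the field names and the matching key (author, title, year) are simplifying assumptions made for illustration; real library networks use their own record formats and considerably more careful matching rules.

```python
def convert_card(network_db, library_id, card):
    """Enter one catalog card into a shared (hypothetical) network database.

    If a matching record already exists, only the library's holding is
    attached to it; otherwise a new record is created first.
    Returns True if a new record had to be created.
    """
    # Hypothetical matching key; chosen only to keep the example short.
    key = (card["author"].casefold(), card["title"].casefold(), card["year"])
    record = network_db.get(key)
    created = record is None
    if created:
        record = {
            "author": card["author"],
            "title": card["title"],
            "year": card["year"],
            "holdings": set(),
        }
        network_db[key] = record
    record["holdings"].add(library_id)
    return created


if __name__ == "__main__":
    db = {}
    card = {"author": "Goethe, Johann Wolfgang von", "title": "Faust", "year": 1808}
    print(convert_card(db, "LIB-A", card))  # first library creates the record
    print(convert_card(db, "LIB-B", card))  # second library only attaches
    print(len(db), sorted(db[next(iter(db))]["holdings"]))
```

In this sketch the second library causes no new cataloging work: the bibliographic record exists once, and each participating library contributes only its holding.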

The main advantage of offline conversion is lower cost. Offline conversion also enables so-called double keying: the document content is keyed in independently by two different people, and a computer then compares the two versions. Any differences point to typing errors, which are then corrected manually by a third person. In this way, almost error-free text is created (error rates below 0.02%, i.e. fewer than two errors per 10,000 characters). For the reasons mentioned above, this method is used extremely rarely.
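The comparison step of double keying can be illustrated with Python's standard difflib module. This is a minimal sketch under the assumption that both typists key the same card as plain text; the function names and the sample card are hypothetical, and production workflows would compare structured fields rather than raw strings.

```python
from difflib import SequenceMatcher


def keying_differences(first, second):
    """Return the spans where two independently keyed versions disagree.

    Each entry is (operation, text_in_first, text_in_second); these are the
    places a third person would check against the original card.
    """
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return [
        (tag, first[i1:i2], second[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def error_rate(errors, characters):
    """Residual error rate as a fraction, e.g. 2 errors in 10,000 characters."""
    return errors / characters


if __name__ == "__main__":
    version_a = "Goethe, Johann Wolfgang von: Faust"
    version_b = "Goethe, Johann Wolfgang von: Fausl"  # typing error in last letter
    for difference in keying_differences(version_a, version_b):
        print(difference)
    print(f"{error_rate(2, 10_000):.2%}")
```

The method works because two people rarely make the same typing error in the same place, so nearly every error shows up as a disagreement between the two versions.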
