Archive for Spoken German

from Wikipedia, the free encyclopedia

The Archive for Spoken German (Archiv für Gesprochenes Deutsch, AGD), known until 2004 as the German Language Archive, belongs to the Institute for the German Language (IDS) in Mannheim and is the central documentation center for spoken German. The archive takes in corpora of spoken German created in language surveys and research projects and makes them available to researchers for further scientific analysis.

A central part of this work is digitizing the entire inventory and publishing documentation (the "inventory catalog") and transcripts online in the "Database for Spoken German", insofar as data protection and copyright law allow.

The AGD is currently building the “Research and Teaching Corpus of Spoken German” (FOLK), a national reference corpus of conversations in German. Work on FOLK also includes developing suitable corpus technology for spoken-language corpora, for example the FOLKER transcription software.


The AGD currently offers 40 corpora, i.e. individual collections, for use. The total inventory comprises over 90 corpora of audio and video recordings with a total duration of almost 10,000 hours.

The inventory includes the following types of corpora:

Variation corpora that document varieties (dialects, regional colloquial languages) in the German-speaking area, e.g.

  • the corpus "German dialects", also known as the "Zwirner corpus", which documents dialects primarily in the area of the old Federal Republic of Germany
  • the corpus "German dialects: GDR", which documents dialects in the area of the former German Democratic Republic
  • the corpus "German colloquial languages", which documents spoken colloquial language primarily in the area of the old Federal Republic of Germany
  • the corpus "Deutsch heute", which documents variation in regional standard usage at the beginning of the 21st century

Corpora that document spoken German in other areas of the world, e.g.

  • the corpus "Australian German"
  • the corpus "Russian-German dialects"
  • the corpus "German in Namibia"
  • biographical interviews with German-speaking emigrants in Israel (corpus "Emigrantendeutsch in Israel" from 1989)

Conversation corpora, e.g.

  • the Research and Teaching Corpus of Spoken German (FOLK)
  • the corpus "Spoken Scientific Language Contrastive" (GEWISS)
  • the "Berliner Wendekorpus" from 1992
  • counseling and arbitration conversations (corpora BG and SG from 1979/1983)

Evaluation options

Parts of the AGD holdings (approx. 19,000 recordings from 34 corpora) are made available online to the academic community via the Database for Spoken German (DGD2). The DGD2 lets users browse recordings, transcripts and metadata and run targeted searches across them.


The archive was founded by Eberhard Zwirner in Berlin in 1932 under the name “Deutsches Spracharchiv”. The 5,857 voice recordings that Zwirner made in around 1,000 German-speaking locations in the 1960s remain one of the archive's core collections. In 1979 the archive was moved to the IDS in Mannheim. To emphasize its focus on spoken language, it was renamed the Archive for Spoken German in 2004.

Phonai and Phonotek book series

The archive first published the book series Lautbibliothek der Deutschen Mundarten and, since 1969, has published the Phonai series, which appears with Max-Niemeyer-Verlag in Tübingen. The books are accompanied by sound recordings on CD from the archive's Phonotek.

See also


  • Stift, Ulf-Michael and Schmidt, Thomas: Oral corpora at the IDS: From the German language archive to the database for spoken German. In: Institute for German Language (Ed.): Views and Insights. 50 years of the Institute for the German Language. Editing: Melanie Steine, Franz Josef Berens. Pp. 360–375, Institute for the German Language, Mannheim 2014.
