Tatoeba

from Wikipedia, the free encyclopedia
Tatoeba
Collection of example sentences
Languages: 345 (as of October 2019)
Entries: over 7.9 million (as of October 2019)
Users: over 44,500 (as of October 2019)
Registration: not required for use, only for contributing
Online: since 2006 (currently active)
http://tatoeba.org/deu

Tatoeba is a project whose name comes from Japanese and means “for example”.

Tatoeba consists of a large collection of example sentences that have been translated into almost all available languages. It works as a multilingual translation dictionary in which one looks up not the translation of a single word, but complete, authentic sentences in which the word occurs. Every registered user can add sentences as well as translate them. The text corpus is not free of errors, since every user can translate sentences in any language, regardless of whether they speak the language or not. The sentence entries are gradually being supplemented with audio files.

The text collection of Tatoeba is based on the Tanaka Corpus, a large collection of parallel sentences in Japanese and English. Since 2006, many other languages have been added under the direction of Trang Ho.

Structure

Figure: Graph structure. Each node represents a sentence, and each arrow represents the connection between two sentences.

The sentence collection is structured like a graph with nodes and arrows: each node represents a sentence, and each arrow represents the connection between two sentences. When two sentences are directly linked, they have the same meaning.
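The graph described above can be sketched in a few lines of Python. This is only an illustration, not Tatoeba's actual data model: the class names, the numeric identifiers, the language codes and the choice to treat links as symmetric are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    sentence_id: int  # hypothetical identifier, chosen for this example
    lang: str         # language code, chosen for this example
    text: str


class SentenceGraph:
    """Sentences are nodes; a direct link marks two sentences with the same meaning."""

    def __init__(self):
        self.sentences = {}
        self.links = defaultdict(set)

    def add_sentence(self, sentence):
        self.sentences[sentence.sentence_id] = sentence

    def link(self, id_a, id_b):
        # Assumption: a link is stored in both directions.
        self.links[id_a].add(id_b)
        self.links[id_b].add(id_a)

    def direct_translations(self, sentence_id):
        # Only directly linked sentences are returned, ordered by identifier.
        linked_ids = self.links.get(sentence_id, set())
        return [self.sentences[i] for i in sorted(linked_ids)]


graph = SentenceGraph()
graph.add_sentence(Sentence(1, "eng", "For example."))
graph.add_sentence(Sentence(2, "jpn", "例えば。"))
graph.add_sentence(Sentence(3, "deu", "Zum Beispiel."))
graph.link(1, 2)
graph.link(1, 3)

for sentence in graph.direct_translations(1):
    print(sentence.lang, sentence.text)
```

In this sketch the Japanese and German sentences are both linked to the English one but not to each other, so each of them has only the English sentence as a direct translation.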

Network

The network offers a number of ways to search for and edit sentences. Every registered user can add new sentences, translate them, comment on them, tag them with keywords and, if necessary, edit them. The example sentences are arranged one below the other in all available languages.

Awards

Tatoeba received a grant from Mozilla Drumbeat in December 2010.

Some work on the Tatoeba infrastructure was funded by the Google Summer of Code in 2014.

In May 2018, the project received a $25,000 grant from the Mozilla Open Source Support (MOSS) program.

In August 2019, the project received a $15,000 grant from the Mozilla Open Source Support (MOSS) program.

Statistics

At the end of October 2019, 345 languages were represented. Of a total of over 7.9 million sentences, around 1,236,000 were in English and 312,000 in Spanish. German was in sixth place with 481,000 sentences.

Offline use

Tatoeba data can be downloaded as tab-separated files, which can be imported into Anki and similar software.
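As a rough illustration of working with such a file, the following Python sketch reads tab-separated sentence data with the standard library. The three-column layout (identifier, language code, text) and the sample rows are assumptions made for this example; the columns of an actual export should be checked before relying on it.

```python
import csv
import io

# Hypothetical sample in an assumed layout: id <TAB> language code <TAB> text.
SAMPLE = "1\teng\tFor example.\n2\tjpn\t例えば。\n3\tdeu\tZum Beispiel.\n"


def read_sentences(handle):
    """Parse tab-separated rows into dictionaries, skipping malformed rows."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {"id": int(row[0]), "lang": row[1], "text": row[2]}
        for row in reader
        if len(row) == 3
    ]


def sentences_in(rows, lang):
    """Return the texts of all sentences in one language."""
    return [row["text"] for row in rows if row["lang"] == lang]


rows = read_sentences(io.StringIO(SAMPLE))
print(sentences_in(rows, "deu"))
```

For a downloaded file, the `io.StringIO` object would be replaced by a file opened with `open(path, encoding="utf-8", newline="")`.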

References

  1. yoyodyne - Where the future begins tomorrow: Best Drumbeat Projects: Tatoeba - a free and open database of sentences. January 2, 2011. Retrieved October 31, 2019.
  2. Google Summer of Code 2014 Organization Association Tatoeba. Retrieved October 31, 2019.
  3. Trang: MOSS award for Tatoeba. Retrieved October 31, 2019.
  4. Trang: A second MOSS award. Retrieved November 1, 2019.