Tatoeba
Tatoeba | |
---|---|
Description | Collection of example sentences |
Languages | 345 (as of October 2019) |
Sentences | over 7.9 million (as of October 2019) |
Users | over 44,500 (as of October 2019) |
Registration | not required for use; required only to contribute |
Online | since 2006 (currently active) |
Website | http://tatoeba.org/deu |
Tatoeba is a project whose name comes from Japanese and means “for example”.
Tatoeba consists of a large collection of example sentences that have been translated into almost all of the available languages. It works as a multilingual translation dictionary in which one looks up not the translation of a single word but complete, authentic sentences in the respective language that contain the word being looked up. Every registered user can add sentences and translate existing ones. The corpus is not free of errors, because any user can translate sentences into any language, regardless of whether they speak it. Audio recordings are gradually being added to the sentence entries.
The Tatoeba text collection is based on the Tanaka Corpus, a large collection of parallel sentences in Japanese and English. Since 2006, many other languages have been added under the direction of Trang Ho.
Structure
The sentence collection is structured like a graph with nodes and arrows: each node represents a sentence, and each arrow represents the connection between two sentences. When two sentences are directly linked, they have the same meaning.
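The graph model described above can be illustrated with a minimal Python sketch. It is not Tatoeba's actual implementation: the class name `SentenceGraph`, its methods, the sample sentences, and the choice to store every link in both directions are assumptions made for illustration only.

```python
from collections import deque


class SentenceGraph:
    """Toy model: sentences are nodes, translation links are edges."""

    def __init__(self):
        self.sentences = {}  # sentence id -> (language code, text)
        self.links = {}      # sentence id -> set of directly linked ids

    def add_sentence(self, sid, lang, text):
        self.sentences[sid] = (lang, text)
        self.links.setdefault(sid, set())

    def link(self, a, b):
        # Assumption: a link is symmetric, so it is stored in both directions.
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def direct_translations(self, sid):
        """Ids of sentences linked directly to sid."""
        return sorted(self.links.get(sid, set()))

    def reachable(self, sid):
        """Ids of all sentences connected to sid through a chain of links."""
        seen = {sid}
        queue = deque([sid])
        while queue:
            current = queue.popleft()
            for neighbour in self.links.get(current, set()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        seen.discard(sid)
        return sorted(seen)


# Hypothetical sample data, not taken from the real corpus.
graph = SentenceGraph()
graph.add_sentence(1, "eng", "For example.")
graph.add_sentence(2, "deu", "Zum Beispiel.")
graph.add_sentence(3, "jpn", "例えば。")
graph.link(1, 2)
graph.link(2, 3)

print(graph.direct_translations(1))
print(graph.reachable(1))
```

In this sketch, sentence 3 is reachable from sentence 1 only through sentence 2. Since only directly linked sentences are stated to have the same meaning, such indirect connections are best treated as candidate translations rather than confirmed ones.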
Network
The network offers a number of ways to search for and edit sentences. Every registered user can add new sentences, translate and comment on existing ones, tag them with keywords and, where necessary, edit them. Each example sentence is displayed together with its translations in all available languages, listed one below the other.
Awards
Tatoeba received a grant from Mozilla Drumbeat in December 2010.
Some work on the Tatoeba infrastructure was funded by the Google Summer of Code in 2014.
In May 2018, the project received a $25,000 grant from the Mozilla Open Source Support (MOSS) program.
In August 2019, the project received a second MOSS grant, this time for $15,000.
Statistics
At the end of October 2019, 345 languages were represented. Of a total of over 7.9 million sentences, around 1,236,000 were in English and 312,000 in Spanish. German was in sixth place with 481,000 sentences.
Offline use
Tab-separated data from Tatoeba, suitable for import into Anki and similar software, is available for download.
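As a rough illustration of how such an export might be turned into flashcards, the following Python sketch pairs sentences in two languages and writes them as tab-separated front/back rows. The file layouts are assumptions, not something stated in this article: a sentences file with the columns id, language code, and text, and a links file with pairs of sentence ids. The sample data and all function names are hypothetical.

```python
import csv
import io

# Hypothetical sample data in the assumed layouts.
SENTENCES_TSV = "1\teng\tFor example.\n2\tdeu\tZum Beispiel.\n3\tjpn\t例えば。\n"
LINKS_TSV = "1\t2\n2\t1\n2\t3\n3\t2\n"


def read_sentences(stream):
    """Parse rows of (id, language code, text) into a dict keyed by id."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {int(sid): (lang, text) for sid, lang, text in reader}


def read_links(stream):
    """Parse rows of (sentence id, translation id) into a list of pairs."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(int(a), int(b)) for a, b in reader]


def flashcard_rows(sentences, links, front_lang, back_lang):
    """Collect (front, back) text pairs for links from front_lang to back_lang."""
    rows = []
    for a, b in links:
        if a in sentences and b in sentences:
            lang_a, text_a = sentences[a]
            lang_b, text_b = sentences[b]
            if lang_a == front_lang and lang_b == back_lang:
                rows.append((text_a, text_b))
    return rows


def to_tsv(rows):
    """Render one card per line, front and back separated by a tab."""
    return "".join(f"{front}\t{back}\n" for front, back in rows)


sentences = read_sentences(io.StringIO(SENTENCES_TSV))
links = read_links(io.StringIO(LINKS_TSV))
cards = flashcard_rows(sentences, links, "eng", "deu")
print(to_tsv(cards), end="")
```

With real downloads, the `io.StringIO` objects would be replaced by files opened with `encoding="utf-8"`, and the resulting text could be saved to a file for import into flashcard software that accepts tab-separated fields.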