Tatoeba is a project that aims to collect lots of sentences translated into several languages. In this blog you will find, among other things, news and documentation about it.

Thursday, October 14, 2010

Some stats

I normally tweet whenever a language reaches an important milestone, but I was a bit absent from Tatoeba for the past 4 weeks and didn't really keep track of the progress of each language. So I'm going to sum up everything in this blog post and, while I'm at it, give more general stats about Tatoeba.

New languages

We've added several new languages since September. Tatoeba now supports a total of 71 languages. The new languages are:

Bosnian

Croatian

Old East Slavic

Chamorro

Tagalog

Quechua

Mongolian

Lithuanian

Sentences stats

Top 5

English - 156,000+ sentences. English took first place back in September and still holds it.

Japanese - 153,000+ sentences.

French - 50,000+ sentences. Around 10,000 sentences were added within 2 months. That's progress :) It had taken 3 months to go from 30,000 to 40,000.