
DARPA produces software that translates thousands of Arabic documents every day using their own state-of-the-art machine translation technology. But as communication shifts from written documents (which tend to follow formal Arabic grammar conventions) to social media such as Twitter, instant messages, and blog posts (which are less formal, carry less context, and follow looser grammar conventions because they are written in regional dialects), they needed to adapt their translation system. That meant generating a library of thousands of translations from "social" Arabic to English, and they turned to Mechanical Turk for help.

DARPA compiled a library of messages from social media sources in Arabic. They asked Workers to translate these from Arabic to English. They opted for Mechanical Turk because using professional translators would cost too much and take too long.

Mechanical Turk Workers translated 1.5 million words of Arabic, which allowed DARPA to build their social media translation database in 8 weeks for about one-tenth the cost of using professional translators.
