TAUS Blog

Translation automation has recently experienced a major shakeup: the emergence of neural MT. This marks the start of a new journey of exploration into the opportunities and limitations of machine learning (ML) in translation and language technology more generally.

TAUS has recently staged two successful webinars on datafication (Datafication in Europe Webinar Recording and Datafication in China Webinar Recording), one focused on data collection projects in Europe, the other on business developments around language data in China. Taken together, they revealed a striking contrast between the central role of public administration concerns in Europe and the rise of a private translation data marketplace in China.

With two TAUS Datafication webinars ahead in October (one covering Europe, the other China), followed by the TAUS Data Summit in early November, we think it’s worth quickly reviewing the state of play for data-sharing opportunities in these two regions.

What does a Quality Manager in a translation company do? Check the quality of the translations, of course. That’s what one would immediately think. But like many other jobs, the job of the Quality Manager is changing, and as with those other jobs, the reason is data. Here are five reasons why modern Quality Managers should be obsessed with data.

Tracking Translation Productivity: Yes or No?

Translation productivity tells you how fast a translation was completed. It’s used to profile translators and post-editors, set prices, compare vendors, categorize content and evaluate MT engine output.

However, not everybody agrees that productivity is the best way to track a vendor’s performance, whether that vendor is an LSP or an individual translator.
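As a rough illustration of what tracking productivity can look like, here is a minimal sketch. It assumes that productivity is expressed as words per hour and that each job record carries a word count and the time spent; the field names and the sample figures are hypothetical and are not taken from any TAUS tool.

```python
def words_per_hour(word_count, seconds_spent):
    """Throughput of a single job, expressed as words per hour."""
    if seconds_spent <= 0:
        raise ValueError("seconds_spent must be positive")
    return word_count * 3600 / seconds_spent


# Hypothetical job records; the numbers are made up for illustration only.
jobs = [
    {"vendor": "A", "words": 1200, "seconds": 7200},
    {"vendor": "B", "words": 900, "seconds": 3600},
]

# One throughput figure per vendor.
throughput = {
    job["vendor"]: words_per_hour(job["words"], job["seconds"])
    for job in jobs
}
```

A figure like this captures speed only. It says nothing about the quality of the output, which is one reason the metric is contested.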

During the TAUS Annual Conference 2016 in October, Eric Bailey, Group Engineering Manager for the Global Service and Experiences team within Office at Microsoft, will host the session 'To Share Or Not To Share?'. This blog post is written in preparation for that session.

Data entered the field of machine translation in the late eighties and early nineties when researchers at IBM’s Thomas J. Watson Research Center reported successes with their statistical approach to machine translation.

Until then, machine translation worked more or less the way human translators do, with grammars, dictionaries and transfer rules as the main tools. The syntactic, rule-based Machine Translation (MT) engines appealed much more to the imagination of linguistically trained translators, whereas the new, purely data-driven MT engines with probabilistic models made translation technology seem more like an alien threat to many translators. This was not only because the quality of the output improved as more data were fed into the engines, but also because translators could not reproduce or even conceive of what really happened inside these machines.

I was recently lucky enough to take part in the BabelNet Workshop organized by the European Commission, the Publication Office and the European Parliament in Luxembourg on the 2nd and 3rd of March.

In part I, we defined the pivot language approach, briefly discussed its major drawbacks, looked at the factors that affect the selection of the pivot language, and explored two areas where pivoting can be deployed: relay interpretation (oral) and human translation (written), including translation from audio recordings with or without a script. In part II of this blog article, we discuss more areas where pivot languages can be deployed, namely building and enhancing bilingual lexicons, translation memories, machine translation systems and machine transliteration systems.

A pivot language is a third, intermediate language that bridges the gap between the two languages of a pair. For example, if English-to-French and English-to-Spanish translations of the same English text are available, translations between French and Spanish can be generated with English as the pivot.
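The idea can be sketched for the simplest case, a bilingual lexicon. The example below is a minimal illustration and not a description of any particular system: the function name `pivot_lexicon` and the toy English-French and English-Spanish entries are assumptions made for the example. It pairs every French word with every Spanish word that shares the same English entry.

```python
from collections import defaultdict


def pivot_lexicon(pivot_to_a, pivot_to_b):
    """Pair words of language A with words of language B that share a pivot entry."""
    a_to_b = defaultdict(set)
    for pivot_word, a_words in pivot_to_a.items():
        # Pivot words missing from the second lexicon contribute nothing.
        b_words = pivot_to_b.get(pivot_word, ())
        for a_word in a_words:
            a_to_b[a_word].update(b_words)
    # Drop A words for which no B candidate was found.
    return {a: sorted(bs) for a, bs in a_to_b.items() if bs}


# Toy lexicons keyed by the pivot language (English); hypothetical data.
en_fr = {"house": ["maison"], "bank": ["banque", "rive"], "cat": ["chat"]}
en_es = {"house": ["casa"], "bank": ["banco", "orilla"], "dog": ["perro"]}

fr_es = pivot_lexicon(en_fr, en_es)
```

In this toy data, "maison" maps cleanly to "casa", but the ambiguous pivot word "bank" pairs "rive" (river bank) with "banco" (financial institution) as well as with "orilla". This is one way ambiguity in the pivot language can introduce wrong candidates, so real pipelines usually add a filtering or scoring step on top of this naive join.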