Google Translate and the future of voice

Matt Warman examines the new 'Conversation Mode' for Google Translate
for Android, and asks what's next for the search giant

On Monday Google announced that users of a new version of one of its mobile phone apps will now have access to the solution to every Sudoku puzzle in the world. Last night, the company revealed that it will offer mobile voice translation on Android smartphones. So, in short, Google has now solved Sudoku and Spanish.

Readers of Douglas Adams’s Hitchhiker’s Guide to the Galaxy will recognise the idea of the Babel fish – one device (or fish) that sits in the ear of a listener and instantly translates any language into your mother tongue. Google is not quite there yet, but it is perfectly reasonable to see that as the logical end of the road the company has embarked upon.

Eric Schmidt, Google’s chief executive, demonstrated the experimental technology at the IFA electronics fair in Berlin in September (using German rather than Spanish). He estimates that in a year to 18 months, the process will be instantaneous.

As the company’s mobile product manager points out: “Technology becomes really exciting when it becomes invisible.” For now, Google’s Awaneesh Verma says that it is launching what he modestly calls “a new interface within Google Translate that’s optimised to allow you to communicate fluidly with a nearby person in another language”.

He continues: “In conversation mode, simply press the microphone for your language and start speaking. Google Translate will translate your speech and read the translation out loud. Your conversation partner can then respond in their language, and you’ll hear the translation spoken back to you.”

While that may sound simple, the technology behind such a service, offered free, was impossible just a few years ago. The translation itself takes Google a fraction of a second, but transmitting it to your phone takes much longer.

There are, of course, a number of caveats: features such as this are usually released to the public in “beta” form, to allow testing and the ironing out of errors by a large number of users. Translate Conversation, by contrast, is still in “alpha”. Google is saying, basically, “we know this doesn’t work yet”. “Factors like regional accents, background noise or rapid speech may make it difficult to understand what you’re saying,” says Verma.

Chewy Trehella, a new business development manager at Google, says that while the service will be “a bit patchy at best”, the early release is justified by the fact that users will be able to refine it themselves, by correcting Google’s suggestions, and then the whole service will benefit.

Currently, Translate supports 53 languages for text input and 15 for voice input. Although only Spanish is available in Conversation mode, these 15 are the obvious next steps, and there are a host of other uses for translation either in the works or already in use.

Google’s method is unlike traditional approaches: because its background is in web search, it typically approaches languages as a search problem, too. So where standard texts are available in a number of languages (from the European Parliament, for instance), Google is able to index those and augment the traditional dictionary to make translation more accurate.

This is a machine-based halfway house between the expertise of a human translator and the typical idiocy of a computer-based one. The roots of this method come from the UN, whose output is produced in six languages: Google took their corpus of work to produce a massive set of data which it can use for statistical comparison.
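To make the idea of “statistical comparison” concrete, here is a toy sketch in Python. It is not Google’s system: the tiny English/Spanish corpus is invented, the function names are hypothetical, and the scoring rule (a Dice coefficient over sentence-level co-occurrence counts) is a simple stand-in chosen for illustration. It shows only how word pairings can fall out of aligned texts without any dictionary.

```python
from collections import Counter
from itertools import product

# Invented toy corpus of aligned sentence pairs (assumption, for illustration only).
PARALLEL = [
    ("the house", "la casa"),
    ("the green house", "la casa verde"),
    ("the car", "el coche"),
    ("the green car", "el coche verde"),
    ("a house", "una casa"),
    ("the dog", "el perro"),
]

def train(pairs):
    """Count how often each word appears, and how often word pairs share a sentence pair."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return src_count, tgt_count, joint

def best_translation(word, model):
    """Return the target word that co-occurs most consistently with `word`, or None."""
    src_count, tgt_count, joint = model
    scores = {
        t: 2 * n / (src_count[word] + tgt_count[t])  # Dice coefficient
        for (s, t), n in joint.items()
        if s == word
    }
    return max(scores, key=scores.get) if scores else None

model = train(PARALLEL)
for w in ("house", "green", "car", "dog"):
    print(w, "->", best_translation(w, model))
```

With only six sentence pairs the counts are enough to pair “house” with “casa” and “green” with “verde”, although nothing in the program knows what either word means. That is the strength of the approach, and also the weakness described next.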

The downside is that it is, in some ways, unrelated to meaning: where a human translator may get the gist but not the nuances, Google’s new version, pioneered by Franz Josef Och, may sometimes produce nonsense. When it works, however, translation services on this scale can be life-changing. It’s particularly useful for countries where there is very little material in their language (so giving the Arab world, for instance, access to Wikipedia).

The basic translation of maps, too, means that Chinese who don’t read Roman text can still get around London. Captions on YouTube videos are now added automatically in a range of languages, and users of the Google browser Chrome can automatically have web pages translated out of their original languages.

This development has major economic implications. While language students with an eye on jobs as translators might initially want to consider working as software engineers, in fact the refinement of Google’s machine translation method depends on a constantly produced body of work by human beings too. The requirement for this, with the exception of new words, is, however, likely to diminish over time.

The consensus is that Google, or computers in general, will not be able to do to languages what they have done to chess.

When it comes to simply understanding a document or a sentence, machines will often be faster and more accessible than a highly paid human expert. But a polished translation of a novel – idioms and all – seems a long way off. Computers will not have taste for quite some time yet.

As Trehella says: “I wouldn’t use this for a business conversation – yet.”