Posted by ScuttleMonkey on Monday February 08, 2010 @05:12PM
from the ford-why-is-this-fish-in-my-ear dept.

nikki4 writes to tell us that in giving some major improvement tweaks to its existing voice recognition tool for the Smartphone, Google is aiming for new translator software that will provide instant translation of foreign languages. "The company has already created an automatic system for translating text on computers, which is being honed by scanning millions of multi-lingual websites and documents. So far it covers 52 languages, adding Haitian Creole last week. Google also has a voice recognition system that enables phone users to conduct web searches by speaking commands into their phones rather than typing them in. Now it is working on combining the two technologies to produce software capable of understanding a caller’s voice and translating it into a synthetic equivalent in a foreign language."

The problem is primarily things like diction. You can "train" someone sitting in front of a computer to speak slowly and clearly with good diction. Fine.

The problem is that the most useful use case for a cell phone translator would be getting a cab or walking into a store. You talk into your phone and it says something to the other person in their language - wonderful, because you have "trained" yourself to speak clearly and slowly with good diction.

Then the other person mumbles something back at you in their language that neither you nor the cell phone can make heads or tails of. You can't "train" them, so it will never work for that.

From my limited experience, English has its share of strange accents and such, but in large measure people can speak with good diction and pronunciation. Lots of non-English languages seem to promote far less clarity, and human-to-human it doesn't really impair communication that much. Human-to-machine is a whole different story, and we are very far away from being able to do speech recognition with poor pronunciation and poor diction.

Google is developing software, the first foreign language translation of a phone almost immediately - Hitchhiker Guide's may sound like a fish galaxy.

Building on existing technology, speech recognition and automatic translation by Google is expected to have a basic system ready in a few years time. If successful, it's finally over 6000 languages in the world can be translated into the interaction between.

The company has set up an automated system, more than 1 million text translation of multilingual websites and computer scanning of documents are silent. So far in the 52 languages, along with last week's cover, Haitian Creole.

Google also has a voice recognition system that mobile phone users to order their mobile phones to talk, rather than type them in. Steering allows Web searches

LinksFear, Google and a coalition Spiveyr MainVillage mob obstructed Google Street View CarNow, these two software for the caller's voice is to understand the joint production technology, and a foreign language into a synthetic equivalent. Like a professional human translation, cell-phone speech "package" analyze, listen to lectures, the words and phrases until it understands the full meaning, and then try to translate.,, Translation service, Google's head, Franz Och "We think speech, voice translation as a few years time as possible to work should be appropriate 'said.

"Obviously, it's easy work, you need to combine high precision machine translation and speech recognition accuracy, which is currently what we are doing.

"If you see progress, and machine translation, speech recognition, the same advances recently there has been significant progress."

While automatic text translation, it is very effective, voice recognition to prove more challenging.

, Och "everyone a different voice, accent and tone is" said. "While recognizing that the mobile phone, as they should be effective by nature you personally. Phone should feel your voice last a voice search query, for example."

Translation software may be more accurate and use it. Translation system using crude Though some regulations - based on language syntax, Google their vast database, website, and translation of documents for use to improve the accuracy of your system.

"We have more data entry, quality, good," Och said. There is no shortage of help. "Many are language enthusiasts," he said.

However, some experts believe that life is still high barriers are translated. , Honorary professor of linguistics, David Crystal, Bangor University, said: "The problem with voice recognition is a difference of accent. System currently can not handle.

"Maybe Google will quickly than others to access, but I think this is not possible, in the next few years we will have a speech tool can handle a high speed cannot Glasgow.

"In the future, but it looks very interesting. If you have a noisy fish, learning a foreign language should be deleted."

Milky Way galaxy, the small, yellow for any type of sound fish language translation capabilities, the Travel Guide in Cannes kept. It started a bloody war, because everyone other person can understand speech.

"Hi, Stephen, it’s Natasha from BBC Newsnight in London. Just to say I’ve sent you two texts. One is to say that we could do it at eleven am your time after the launch, or any time sooner after the launch, or we could do it at midday as we suggested earlier. I, er, if you could text me back about that, and I’ve sent you the details of Skype that you need to do too. If you could give me a call back. Enjoy the launch and I’ll speak to you after that. Thank you Bye."

I’ve transcribed it from the voicemail sound file that resides online in my inbox on the Google Voice site. All fine. I have also ticked the option for Google Voice to send me a text transcript of any voicemail. Below is their interpretation of Natasha’s message; it’s rather endearing how hopelessly wrong the largest company on earth gets it.

"Hi Stephen. It’s Jeff from BBC needs in nuns. And just to say I sent 80 tax, one, if to say we could do it. I left in i a m your time off to go into any time soon, or the court and full we could grab me today as we suggested at. A. F. I. If you could text me back byebye. I’ve sent you the details of skylights that you need to 3 T if you could give me a call. Bye. Enjoy the loans. I’ll speak to you after that. Thank you. Bye"

On a more serious note, such transcripts at least allow you to get an idea of the rough content and tone of a message without having to stop and listen to it, a much more concentration-intensive task.

Yes, but it doesn't actually work yet. Not even close. I can't even call the local taco place with Google's voice search on my Blackberry. It's a joke, and correcting it takes MUCH longer than keying in "baja fresh" with the chiclet keyboard.

It is really easy to make fun of translate.google.com based on how it translates Chinese to English. This is quite silly IMHO, as Chinese is possibly the hardest language in the world. (Travel around China and you'll find semi-literate taxi drivers, even in the major cities.[*]) This is a good article on why Chinese is hard: http://www.pinyin.info/readings/texts/moser.html [pinyin.info].

A better example would be, say, Dutch. Translate the OP from English to Dutch and back to English (i.e. a worst case scenario), and you end up with this:

"The company has an automatic system for translating texts on computers, sweetened by scanning millions of multilingual websites and documents. Until now includes 52 languages, adding Haitian Creole last week. Google has a system telephone speech recognition that allows users to query websites by speaking commands into their phones instead of typing them in. Now it is working on combining the two technologies to software to understand voice of a caller and translating it into a synthetic equivalent in a foreign language to produce. "

This is perfectly legible to me, and vastly better than what you got when babelfish was introduced 11 years ago. There is a good TechTalk about the topic at http://www.youtube.com/watch?v=y_PzPDRPwlA [youtube.com] which should be required viewing before making fun of google's machine translation efforts.

Voice recognition is harder, but for continuous untrained speech recognition google voice is pretty cool - I've gotten some barely intelligible voice messages on my google voice number, and where google voice is sure (i.e. black text) it is 95%+ correct; where it is not sure it is maybe 30% correct, and for another 30% it is not possible to figure out what was said, except when taking context into consideration. Google Voice transcribing a call from a mobile phone is better than what you got with Dragon Dictate 5 years ago even with a good microphone, so it is not unlikely that in a few years it will be better than naive human transcription. Humans will be better at guessing based on context, though.
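To see what those rough figures imply for a whole transcript, here is a back-of-the-envelope sketch. It takes the 95% and 30% accuracy estimates from the comment above at face value; the share of words the system marks as "sure" is not given anywhere in the thread, so the values tried below are pure assumptions, and the function name is made up for illustration.

```python
def expected_word_accuracy(confident_fraction: float,
                           acc_confident: float = 0.95,
                           acc_unsure: float = 0.30) -> float:
    """Weighted average of the per-word accuracy in the two confidence bands.

    confident_fraction is the (assumed) share of words shown as "sure".
    The default accuracies are the commenter's rough estimates, not measurements.
    """
    if not 0.0 <= confident_fraction <= 1.0:
        raise ValueError("confident_fraction must be between 0 and 1")
    unsure_fraction = 1.0 - confident_fraction
    return confident_fraction * acc_confident + unsure_fraction * acc_unsure


# Try a few hypothetical shares of "sure" words.
for share in (0.5, 0.7, 0.9):
    overall = expected_word_accuracy(share)
    print(f"{share:.0%} of words marked sure -> {overall:.1%} correct overall")
```

The point of the arithmetic is only that overall quality depends heavily on how much of a message lands in the confident band, which is why a mumbled voicemail can come out mostly wrong even when the confident words are nearly always right.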

I tried to translate your sentence multiple times, then back to English so I could post the ridiculous result.

Except google's translation was actually pretty good.

Try a more complicated example. For instance, starting here:

"It's probably pretty good at translating translations it produces back into the same source text. If you figure that a phrase structure in one language corresponds to a certain data structure in Google Translate, then it makes sense that this data structure would survive multiple passes through the same restructuring algorithm..."

translating to Japanese and back to English yields this:

"It is translated to produce translated text back to the very same source is probably a good thing. Cases, one single phrase structure of language specific data structures in the Google translation, it is this data structure makes sense and survival of multiple paths through the same algorithm structure corresponding figures..."

Here you've got badly handled idiomatic phrases all around... Like the Google translation to Japanese used "seiseisuru honyaku no honyaku dewa ii koto da" at the end of the first sentence ("created-translation's translation is good" or something like that). On the translation back the connection between "good" and "translation" was lost - Google slapped on a fairly generic "is probably a good thing" - picking the bit of uncertainty out of the start of the Japanese sentence and combining that with the "dewa ii koto da" - but dropping the whole idea of what it is that's good... Which is something that can be kind of vague in the structure of Japanese... Meanwhile, the phrase "source text" was transliterated into katakana, but it got broken up in the translation back to English and wound up in two different locations in the sentence...

The whole conditional clause in the second sentence got kind of mangled. In the Japanese translation it starts with "baai wa": baai means "case" or "situation" - the structure of the sentence establishes this "case" being described as a possibility... Google lost all that, and just said "cases". Then, at the end of the sentence, after the ellipsis, "figure", from "if you figure" in the English original, was tacked on as "taiousuru zu" - "interacting drawing" or "interacting figures". In the return-to-English version this somehow wound up back before the ellipsis again.

The rest of the second sentence in Japanese is something like "if this data structure uses the same intermediary algorithm, several passes of the algorithm should be survived and it should make sense." The apparent problem there is something analogous to operator precedence in arithmetic. The "and" is meant to mean that the surviving translation should still make sense - but this clause apparently got broken up... like the reverse translation assumed that "uses the same intermediary algorithm... should be survived" was all one stand-alone clause - and so it assumed that clause had nothing to do with "this data structure", switched the order of the "and" around, etc...

My hobby is building Gundam models - one of the most comprehensive review sites for new Gundam kits is in Korean. Believe me, we all try using Google translate or Babelfish on Dalong's site from time to time, but the result is rarely worth the effort.

Google will get better and better at parroting good translating and interpreting decisions, but software will never be able to make those decisions, because, in the final analysis, they are subjective decisions.

Think about how successful google has been with search. Prior to the web, we would have idealized search as speaking with an expert who has all the knowledge that exists on the web. Various efforts still strive for that vision today (askjeeves, wolphramalpha, etc). But clearly it is unreachable for the foreseeable future. Yet, search is very useful.

Similarly, this universal translator may well reach a point that it is possible to visit a place, buy things, have a meal, ask where the toilets are, and get back home, particularly when both parties in the conversation are familiar with the limitations of translation. That would be extremely useful, even if it's only 1/100 of all a native bilingual speaker understands, or what you would need for nuanced treaty negotiations or to author a respectable translation of War and Peace.

In Chinese, then back to English:

Think about how the success has been with Google search. Prior to site, we will work with specialists who have all the knowledge and presence on the network to speak idealized search. However, efforts to fight the idea, (it is by virtue of, wolphramalpha, etc.). But obviously can not access the foreseeable future. However, the search is very useful.

Similarly, the universal translator is likely to reach a point of view, is that we can visit places, buy things, eat a meal, and asked where the toilets and get back home, especially when the parties are familiar with the limitations of dialogue and translation. This will be very useful, even if only 1 / 100 of the machine for all those who understand the bilingual, or you need to nuanced negotiation of a treaty, or the author's respect for the translation of war and peace.

A less 19th century European perspective might be that the Chinese mandated the continuity of their literary tradition, and thus words used 2000 years before still needed to be mastered. Of course this was difficult, but this was also in a culture where scholars memorized the Confucian classics as children. The scholar class had the job of studying and passing on literature, just as the Brahmans in India had the difficult task of memorizing the Vedas precisely. Or how Buddhist monks memorized massive sutras in recitation. In cultures such as this that have extremely old languages, the method of learning language and their use was utterly different. I don't think it should be looked at with a Marxist upper class vs. lower class dichotomy, which ignores all the practical matters involved with transmitting culture.

(Disclaimer: I happen to work for Google, but not on anything related to machine translation.)

You are demonstrating his very point. Translation will not get nuanced stuff, but it could greatly help everyday interactions for travelers or recent immigrants.

Let's do English->Chinese->English on his actual examples:

buy things, "What is the cost of this umbrella?" -> "What is the cost of this umbrella?" (Note I didn't say "how much for" because I am familiar with the limitations of translation and know that phrase is a colloquialism.)

have a meal, "I would like to order the beef soup." -> "I would like to order beef soup."

ask where the toilets are, "Where is your bathroom?" -> "Where is the bathroom?"

get back home "How do I get to the Hilton hotel?" -> "How do I get in the Hilton Hotel?"

I'd say that is pretty passable. Now, it would be better if folks could learn the local language, but anyone who travels a lot realizes that it is not practical to learn a new language for every single trip. Something like this might also help more folks travel with a little less fear, and experience places they otherwise wouldn't. Tools such as this could also allow older immigrants more access to the country they now live in.
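One rough way to put a number on "passable" for round trips like the four above is a word-level similarity score between the original sentence and what came back. This is only a sketch: it uses Python's standard difflib on the sentence pairs quoted above, it does not call any translation service, and word overlap is a crude stand-in for whether the meaning survived ("get to" versus "get in" differs by one word but changes the question). The function name is made up for illustration.

```python
from difflib import SequenceMatcher


def word_similarity(original: str, round_trip: str) -> float:
    """Return a 0..1 similarity ratio over lowercased, whitespace-split words."""
    a = original.lower().split()
    b = round_trip.lower().split()
    return SequenceMatcher(None, a, b).ratio()


# (original, English -> Chinese -> English result) pairs copied from the comment.
PAIRS = [
    ("What is the cost of this umbrella?", "What is the cost of this umbrella?"),
    ("I would like to order the beef soup.", "I would like to order beef soup."),
    ("Where is your bathroom?", "Where is the bathroom?"),
    ("How do I get to the Hilton hotel?", "How do I get in the Hilton Hotel?"),
]

for original, round_trip in PAIRS:
    score = word_similarity(original, round_trip)
    print(f"{score:.2f}  {original!r} -> {round_trip!r}")
```

A score like this is only useful for comparing round trips against each other; it says nothing about whether the intermediate Chinese was any good.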

Machine translation has a long, long way to go to even be considered "good", but having something close to the state of the art, working on improving it, and making it free for all to use seems like a good thing to me.