Funny Google Translate Errors

The other day, John and Daniel were talking about the power of Google Translate, see this. So I got the idea of trying to 'fool' the translator by entering non-English idiomatic expressions. Google got them all wrong. Here's the funniest error:
French: Il prit ses jambes à son cou (= he took off in a hurry) --> English: He took her legs around his neck.
I wonder what Google's thoughts were dwelling on when it translated the French sentence.

(The literal meaning of the French sentence is 'he put his legs (feet) in his neck'. That's what you do when you are running very fast: you put the 'back' of your feet in your neck).

Google Translate is entirely based on distributional statistics and data, with no linguistic "logic" behind it. This results in a significant frequency bias: frequent words and phrases translate well (into their frequent counterparts in another language) while infrequent things don't. In the case you mentioned that's 1) infrequent at a phrasal level (though maybe not "1 in a million") and 2) very infrequent for the general occurrence of those words within it (the collocation isn't obvious to Google).
So for what it does, Google is doing a good job. Note that the translation you got is a perfectly good translation, in the right context. (Another way of looking at it is that the French sentence would be a good translation of the English one!)
The "ses" is apparently not differentiated for masc/fem (right?) so that makes sense. (I know that's true in Spanish and think it's true in French.)
Google has no "context" of any kind!
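To make the frequency point concrete, here is a toy sketch of a purely count-driven translator. It is not Google's actual algorithm (real systems work on phrases and use far richer models); the table `ALIGNMENT_COUNTS`, the numbers in it, and the function `translate_by_frequency` are all invented for illustration. Each French word is simply replaced by whichever English word it was "seen" with most often, so the idiomatic meaning never has a chance to surface.

```python
from collections import Counter

# Hypothetical alignment counts (invented numbers): how often each French
# word was paired with each English word in some imaginary parallel corpus.
ALIGNMENT_COUNTS = {
    "il": Counter({"he": 900, "it": 400}),
    "prit": Counter({"took": 500}),
    "ses": Counter({"his": 450, "her": 430, "its": 100}),
    "jambes": Counter({"legs": 300}),
    "à": Counter({"to": 800, "at": 500, "around": 20}),
    "son": Counter({"his": 500, "her": 480}),
    "cou": Counter({"neck": 200}),
}


def translate_by_frequency(sentence):
    """Replace each word with its most frequent counterpart, nothing more."""
    output = []
    for word in sentence.lower().split():
        counts = ALIGNMENT_COUNTS.get(word)
        # Unknown words are passed through untouched.
        output.append(counts.most_common(1)[0][0] if counts else word)
    return " ".join(output)


print(translate_by_frequency("Il prit ses jambes à son cou"))
```

Note how close "his" and "her" are for "ses" in the invented counts: with no context to go on, a small shift in the statistics is enough to flip the pronoun.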

This is why it's a terrible idea to use Google to translate something blindly. With the right kind of preparation, it can work out ok. Like with John's translations, keeping to simple, frequent phrasing works pretty well. It also tends to be reasonable at getting the meaning out of what someone wrote, but not at writing it well. Then again, it will miss the idioms, as you pointed out. Personally I love using Google Translate to check basic things that I know are frequent. But I proofread the results and don't do large sections of text at a time, usually just one word. It's a good way to remind yourself and check your instincts. It's a bad tool to use without any idea what it's doing.

It also does much better with syntax than morphology. So in a language like Swahili where "nitaokulana" is something like "I will eat it", the translator will often mess up significantly, giving something like word-for-word "mimi kula yeye" [me to.eat him], skipping things like tense. I don't really know why this is a problem, because a relatively simple frequency analysis of Swahili, given word-internal statistics, should be able to pick up on the internal structure of such words. It's probably just due to a bias of the algorithm for English, which can be fixed in the future.

Prendre ses jambes à son cou is a normal idiomatic expression of French. Google knows the expression, since if you enter it in its infinitive (non-finite) form, Google gives the correct translation (take to one's heels). This seems to suggest that Google treats it as a 'complex word' and takes it as a 'whole' rather than analyzing it. This would explain why changing something inside the infinitive expression (finite tense instead of infinitive) confuses Google.
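The 'complex word' guess can be sketched as follows. This is only my hypothesis turned into a few lines of Python, not a description of how Google really works; the tables `IDIOMS` and `WORDS` and the function `translate_with_idiom_table` are made up for the illustration. The idiom is stored as one unanalyzed string, so only an exact match on the infinitive form finds it, and any inflected variant falls back to word-by-word substitution.

```python
# Hypothetical lookup tables (invented for this sketch).
IDIOMS = {
    "prendre ses jambes à son cou": "take to one's heels",
}
WORDS = {
    "il": "he", "prit": "took", "prendre": "take", "ses": "his",
    "jambes": "legs", "à": "to", "son": "his", "cou": "neck",
}


def translate_with_idiom_table(text):
    """Try the whole string as one 'complex word' first, then go word by word."""
    key = text.lower().strip()
    if key in IDIOMS:
        # Exact match on the stored (infinitive) form of the idiom.
        return IDIOMS[key]
    # No match: each word is translated in isolation; unknown words pass through.
    return " ".join(WORDS.get(word, word) for word in key.split())


print(translate_with_idiom_table("Prendre ses jambes à son cou"))
print(translate_with_idiom_table("Il prit ses jambes à son cou"))
```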

But this does not explain why adding an agent to the infinitive expression gives an almost correct result if the agent is referred to by a noun (John prendre ses jambes à son cou --> John take to one's heels, should be: HIS heels) but something strange if the agent is referred to by an 'incorrect' pronoun (nous prendre ses jambes à son cou --> we take his legs around his neck).

So Google Translate has a lot to learn yet. Using Google Translate is a bit like using a calculator. If you don't understand arithmetic, you should not trust the calculator's results (typos!). If you don't know the language, you should be careful about Google's output. For the time being.

And forever, I guess, since the interpretation (by humans) of more complex utterances is infinitely more complicated than their literal meaning. The reason: interpreting linguistic utterances largely depends on knowledge of the world (pragmatics).

It's a little more subtle than that. In itself it really doesn't mean that. But it does in both of my invented contexts. In the first one 'cow' happens to be a slang term for a self-indulgent woman. In the second, it's a literal cow. It's the word 'drop', though, that means both "to let fall" and "bring an end to". Literally, "let go of". You can drop out of school (leave completely) or simply drop a class if you're just feeling a little overburdened.

And this is how idioms get started, haha
Just like that one day back in 1503, when a guy walked into the street and saw a bunch of cats and dogs falling from the sky...

That happened in Yorkshire, England.

A similar strange event took place a couple of years earlier, in the province of Holland (Netherlands). Not only cats and dogs, but also cows, wolves, ravens and even fish fell from the sky. And icy winds tortured the plains...

There still exists an idiomatic expression in Dutch referring to that horror: beestenweer (literally: beast weather = very bad weather).

Of course, Google Translate is unaware of all of this. It translates beestenweer into "beast again".