Google Translate is easily abused

Google Translate is an awesome tool. I’m particularly impressed by how well it works for translating to and from Finnish, because Finnish has generally been considered a rather difficult language for machine translators to handle. Google’s translations are definitely not perfect, but they’re fairly legible.

But no matter how good it is, it’s still definitely not a substitute for real translators. One odd thing I’ve seen is people translating some English rubbish to Finnish and then back, and thinking they got a pretty good result. (This happened somewhere in this thread on Reddit and also in the comments of this blog post on Good Show Sir - comments 9 and 12.)

Here’s the thing to consider: If you are unsure of the results you get from some sort of tool, don’t use the same tool to verify the results. Google Translate can produce garbage using its complicated set of rules. Then it can use the exact same rules to turn that garbage back into the original text you fed to it. If the translation back looks good to you, it’s just because that is what the text looked like when you fed it in. This two-directional translation doesn’t prove that Google Translate translated the text correctly in the first place; it just proves that the rules it has for translating Finnish happen to work both ways.
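To make the round-trip problem concrete, here is a minimal sketch with a toy word-for-word "translator". Everything in it is an assumption made up for illustration: the two-word vocabulary, the `translate` helper and the "human reference" string have nothing to do with how Google Translate actually works. The only point is that a mistake baked into the rules survives the trip there and back unnoticed.

```python
# Toy word-for-word "translator". The vocabulary is invented for illustration
# and is not a model of how Google Translate works internally.
EN_TO_FI = {
    "dog": "kissa",      # deliberately WRONG: "kissa" means "cat"
    "barks": "haukkuu",  # correct
}
# The reverse table is derived from the same rules, so it inherits the mistake.
FI_TO_EN = {fi: en for en, fi in EN_TO_FI.items()}


def translate(text, table):
    """Translate word by word, leaving unknown words untouched."""
    return " ".join(table.get(word, word) for word in text.split())


original = "dog barks"
forward = translate(original, EN_TO_FI)  # "kissa haukkuu", i.e. "cat barks"
back = translate(forward, FI_TO_EN)      # "dog barks" again

# The round trip looks perfect even though the Finnish text is wrong.
round_trip_ok = back == original

# An independent check (here, a hypothetical human translation) catches it.
human_reference = "koira haukkuu"
matches_reference = forward == human_reference

print(round_trip_ok, matches_reference)  # True False
```

The round trip succeeds precisely because the same faulty table is used in both directions; only the comparison against something outside the tool reveals the error.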

Automatic translation is a tool that is best used for trying to comprehend text that is in a language you don’t speak. For this purpose, Google Translate works excellently. It does not, however, work at all in situations where you need to communicate the text to someone else.

Take a look at this automatically translated article from Finnish Wikipedia. Wikipedia is a good example, because it’s a well-known site. Google Translate does a fairly admirable job with this article. English speakers can easily deduce that according to the Finnish Wikipedia, guard dogs can bark at intruders or even attack them.

Written Finnish doesn’t have grammatical articles as such, so the translation has to make a few guesses - and English speakers have to make a few guesses of their own when the thing doesn’t work as expected. For example, the article includes the phrase “Some breeds are strong enough to drive up to the wolf away”, which is translated from “Jotkut rodut ovat riittävän voimakkaita ajamaan jopa suden pois.” Now, English speakers may be a bit puzzled by the definite article in an expression like “drive the wolf away”. Which wolf is “the wolf”? It was not mentioned in the preceding text, so do English speakers have to assume that there’s a Wolf in the Finnish wilderness - a specific mighty feral creature of the wild, against which the strength of all dog breeds is measured? No, that would be silly. The actual phrase just says “Some breeds are even strong enough to drive a wolf away.” Also note the clumsy part, “drive up to”; it’s fairly clear that Google Translate is more geared toward translating driving instructions than describing canine dynamics.

But all in all: Phrases like this show that the translation is legible for normal people. You need to interpret the text a little bit, but you can figure it out. Mostly.

The crucial question is: Would you trust everyone else to do the same? Your brain may automatically make the clumsy parts fall into place, but will everyone else’s? Imagine being someone in Finland who doesn’t speak English at all - would you be comfortable presenting this translated article as an official translation?

No, presenting that article as an official translation would be silly. The English Wikipedia article is not a machine translation from, say, the German Wikipedia. It’s edited by people who actually understand English to some degree.

Yet, at the same time, people are using Google Translate to bring their online stuff to the Finnish market. People who don’t speak Finnish are trying to make themselves sound like they speak Finnish. And they fail, because they’re fairly obviously using machine translation.

Don’t do this. Google Translate is good, but we can see through the ruse after reading the stuff for more than a few moments.

Incidentally, this auto-translation thing is mostly being pushed by online advertisers. I see Google-translated advertisements for online and cell phone/SMS-based services. I haven’t investigated how these operations work, but I suspect that the folks just set up shop with some Finnish cell phone service and foist their stuff online - usually some sort of seemingly fun stuff like IQ/love/life expectancy/whatever tests. Gullible teenagers then sign up without reading the fine print that says “15€/month”. Their parents then have fun cancelling the thing.

And here’s the thing: The more of this auto-translated stuff I see, the more my brain says “this is a scam”. It tickles the same part of my brain that triggers banner-blindness. By using Google Translate as your Finnish-speaking “liaison”, you’re associating yourself with shady enterprises. I mean, a legitimate business could afford the services of a human translator, right? You don’t want to appear shady, do you?

And it seems to me that one of the shady enterprises that use Google Translate is… um… Google.

I don’t get it. Many parts of Google’s user interfaces are obviously human-translated. Google operates in Finland: they have a local branch and are planning to set up big-ass data centres here (or already have them - I don’t know how things have developed).

So why the hell is YouTube so crappily translated?

One of Google Translate’s good sides is that it usually gets the “interface” right. The interface links in the Wikipedia auto-translation are almost identical to the interface links in English Wikipedia. You may notice small differences from the UI of the English version, like “Debate” instead of “Discussion” and “Participation” instead of “Interaction” - but the UI is relatively comprehensible. YouTube, however, suffers from some odd problems. On individual channel pages, activity boxes read “Uutta toimintaa YouTubessa” (“New activity in YouTube”), which sounds fairly odd, because they report the individual user’s new activity on YouTube rather than new activity on YouTube as a whole. User profiles have entries like “Kanavan näyttökerrat” and “Latausten näyttökerrat yhteensä”; these may sound succinct in English (“Channel view amount” and “Upload view amount total”), but in Finnish they’re overly verbose and they sound clumsy. Something like “Kävijöitä kanavasivulla” (“visitors on channel page”) and “Videoiden katselukertoja” (“[Number of] video views”) would sound more natural.

Comment edit boxes are particularly brainfarty: “500 merkkejä jäljellä” for “500 characters remaining” is just grammatically incorrect (“500 merkkiä jäljellä” would be right). The comment submission button, inexplicably, reads “Teksti” (“Text”) instead of something informative, like, oh, “Lähetä” (“Submit”). Comment ratings are odd: “Äänestä paremmaksi” and “Äänestä huonommaksi” are comprehensible as “vote as better” and “vote as worse”, but something more succinct and less clumsy like “Hyvä kommentti” (“good comment”) and “Huono kommentti” (“bad comment”) might be preferable. The flag button says “Merkitse roskapostiksi”, which is a clearly flawed translation of “Mark as spam”; the problem is that “roskaposti” means “spam email” or “junk mail”. YouTube comments are obviously not email. I don’t know if there’s a succinct term for comment spam; maybe something like “Merkitse roskaksi” (“Mark as trash/junk”) would be closer to right.
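The character counter is the kind of string that a simple English-style plural rule gets wrong. In Finnish, the noun after a numeral other than one goes in the partitive singular (“merkkiä”), not the partitive plural (“merkkejä”). Here is a minimal sketch of how such a label could be built; the function name `characters_remaining_fi` is hypothetical, and I have no idea how YouTube actually generates its strings.

```python
def characters_remaining_fi(count):
    """Return a Finnish "N characters remaining" label for a non-negative integer."""
    # Exactly one takes the nominative singular ("merkki"); every other
    # count takes the partitive singular ("merkkiä").
    noun = "merkki" if count == 1 else "merkkiä"
    return f"{count} {noun} jäljellä"


print(characters_remaining_fi(500))  # 500 merkkiä jäljellä
print(characters_remaining_fi(1))    # 1 merkki jäljellä
```

Real localisation frameworks usually handle this with per-language plural categories rather than hand-written conditions, but the sketch shows why the form has to be chosen by someone who knows the grammar.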

In summary: If you want to make an appearance of running a legitimate business on the Web, don’t use Google Translate as your official translation. It’s not a substitute for a human translator.