Translation - Man vs Machine

EVERYBODY has his own tale of terrible translation to tell — an incomprehensible restaurant menu in Croatia, a comically illiterate warning sign on a French beach. “Human-engineered” translation is just
as inadequate in more important domains. In our courts and hospitals,
in the military and security services, underpaid and overworked
translators make muddles out of millions of vital interactions. Machine
translation can certainly help in these cases. Its legendary bloopers
are often no worse than the errors made by hard-pressed humans.

Machine translation has proved helpful in more urgent situations as well. When Haiti was devastated by an earthquake in January, aid teams poured in
to the shattered island, speaking dozens of languages — but not Haitian
Creole. How could a trapped survivor with a cellphone get usable
information to rescuers? If he had to wait for a Chinese, Turkish or
English interpreter to turn up he might be dead before being
understood. Carnegie Mellon University instantly released its Haitian
Creole spoken and text data, and a network of volunteer developers
produced a rough-and-ready machine translation system for Haitian
Creole in little more than a long weekend. It didn’t produce prose of
great beauty. But it worked.

The advantages and disadvantages of machine translation have been the subject of increasing debate among human translators lately because of the growing strides made in the
last year by the newest major entrant in the field, Google Translate.
But this debate actually began with the birth of machine translation
itself.

The need for crude machine translation goes back to the start of the cold war. The United States decided it had to scan every scrap of Russian coming out of the Soviet Union, and there just weren’t
enough translators to keep up (just as there aren’t enough now to
translate all the languages that the United States wants to monitor).
The cold war coincided with the invention of computers, and “cracking
Russian” was one of the first tasks these machines were set.

The father of machine translation, Warren Weaver, chose to regard Russian as a “code” obscuring the real meaning of the text. His team and its
successors here and in Europe proceeded in a commonsensical way: a
natural language, they reckoned, is made of a lexicon (a set of words)
and a grammar (a set of rules). If you could get the lexicons of two
languages inside the machine (fairly easy) and also give it the whole
set of rules by which humans construct meaningful combinations of words
in the two languages (a more dubious proposition), then the machine
would be able to translate from one “code” into another.
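
To make the lexicon-plus-grammar idea concrete, here is a deliberately tiny sketch of it. Everything in it is an assumption made up for illustration: the four-word English-to-French lexicon, the part-of-speech tags, and the single reordering rule. It is not a reconstruction of any historical system, only a picture of the approach.

```python
# Toy "lexicon": English word -> (French word, part of speech).
# These entries are hypothetical and chosen only for the example.
LEXICON = {
    "the": ("le", "DET"),
    "black": ("noir", "ADJ"),
    "cat": ("chat", "NOUN"),
    "sleeps": ("dort", "VERB"),
}


def translate(sentence):
    """Translate word by word, then apply one hand-written grammar rule."""
    # Look up each word; unknown words pass through unchanged, tagged UNK.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]

    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            # Toy "grammar": English puts the adjective before the noun,
            # and this rule moves it after the noun for French.
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate("the black cat sleeps"))
```

One rule copes with one pattern. A real language needs rules for gender, agreement, idiom and exceptions to all of them, which is why handing the machine "the whole set of rules" was the dubious half of the proposition.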

Academic linguists of the era, Noam Chomsky chief among them, also viewed a language as a lexicon and a grammar, able to generate infinitely many
different sentences out of a finite set of rules. But as the
anti-Chomsky linguists at Oxford commented at the time, there are also
infinitely many motor cars that can come out of a British auto plant,
each one having something different wrong with it. Over the next four
decades, machine translation achieved many useful results, but, like
the British auto industry, it fell far short of the hopes of the 1950s.

Now we have a beast of a different kind. Google Translate is a statistical machine translation system, which means that it doesn’t try to unpick
or understand anything. Instead of taking a sentence to pieces and then
rebuilding it in the “target” tongue as the older machine translators
do, Google Translate looks for similar sentences in already translated
texts somewhere out there on the Web. Having found the most likely
existing match through an incredibly clever and speedy statistical
reckoning device, Google Translate coughs it up, raw or, if necessary,
lightly cooked. That’s how it simulates — but only simulates — what we
suppose goes on in a translator’s head.
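
The matching idea can be sketched in a few lines, with loud caveats. The three-sentence "corpus" below is invented, the helper names are hypothetical, and the word-overlap score from Python's standard difflib module is a crude stand-in for the statistical reckoning described above. Nothing here reflects how Google Translate is actually built.

```python
from difflib import SequenceMatcher

# A miniature store of already translated sentence pairs (English, French).
# Hypothetical data, standing in for the translated texts found on the Web.
PARALLEL_CORPUS = [
    ("where is the hospital", "où est l'hôpital"),
    ("where is the station", "où est la gare"),
    ("i need a doctor", "j'ai besoin d'un médecin"),
]


def similarity(a, b):
    """Score two sentences from 0.0 to 1.0 by the words they share in order."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def translate_by_match(sentence):
    """Return the stored translation of the most similar known sentence."""
    query = sentence.lower().strip()
    best_source, best_target = max(
        PARALLEL_CORPUS, key=lambda pair: similarity(query, pair[0])
    )
    return best_target, similarity(query, best_source)


translation, score = translate_by_match("where is the nearest hospital")
print(translation, round(score, 2))
```

Notice that the program never analyzes the sentence. It finds the closest thing somebody has already translated and hands that back, so the word "nearest" is silently dropped from the result.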

Google Translate, which can so far handle 52 languages, sidesteps the linguists’ theoretical question of what language is and how it works in the human brain. In
practice, languages are used to say the same things over and over
again. For maybe 95 percent of all utterances, Google’s electronic
magpie is a fabulous tool. But there are two important limitations that
users of this or any other statistical machine translation system need
to understand.