tag:theconversation.com,2011:/id/topics/google-translate-31887/articlesGoogle Translate – The Conversation2019-06-10T12:55:11Ztag:theconversation.com,2011:article/1168742019-06-10T12:55:11Z2019-06-10T12:55:11ZWhy people will beat machines in recognising speech for a long time yet<figure><img src="https://images.theconversation.com/files/278371/original/file-20190606-98033-nax9u2.jpg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=496&amp;fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/woman-talking-alphabet-letters-coming-out-332951597?src=JmJwhI9-qaIbaT-ymkdDmQ-1-41">Pathdoc/Shutterstock.com</a></span></figcaption></figure><p>Imagine a world in which Siri always understands you, Google Translate works perfectly, and the two of them create something akin to a Doctor Who-style translation circuit. Imagine being able to communicate freely wherever you go (no more muttering in school French to your Parisian waiter). It’s an attractive but still distant prospect. One of the bottlenecks standing in the way is variation in language, especially spoken language. Technology cannot quite cope with it. </p>
<p>Humans, on the other hand, are amazingly good at dealing with variation in language. We are so good, in fact, that we really only take note when things occasionally break down. When I visited New Zealand, I thought for a while that people were calling me “pet”, a Newcastle-like term of endearment. They were, in fact, just saying my name, Pat. My aha moment happened in a coffee shop (“Flat white for pet!” gave me pause). </p>
<p>This story illustrates how different accents of English have slightly different vowels – a well-known fact. But let’s try to understand what happened when I misheard the Kiwi pronunciation of Pat as pet. There is a certain range of sounds that we associate with vowels, like <em>a</em> or <em>e</em>. These ranges are not absolute. Rather, their boundaries vary, for instance between different accents. When listeners fail to adjust for this, as I did in this case, the mapping of sound to meaning can be distorted. </p>
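To make the Pat/pet confusion concrete: vowels are often characterised by their first two formant frequencies (F1 and F2), and a listener effectively assigns an incoming sound to the nearest vowel category in that space. The sketch below is purely illustrative – the category means, the “New Zealand” formant values and the nearest-mean rule are all simplifying assumptions, not measurements:

```python
# Toy illustration (not data from the article): classifying a vowel by its
# first two formant frequencies (F1/F2, in Hz). The category means below are
# rough, invented values chosen only to show the effect.
import math

# A listener's vowel categories: mean (F1, F2) per vowel, for one accent.
UK_LISTENER = {"a (as in Pat)": (750, 1750), "e (as in pet)": (550, 1850)}

def classify(f1, f2, categories):
    """Assign the vowel whose category mean is nearest in F1/F2 space."""
    return min(categories, key=lambda v: math.dist((f1, f2), categories[v]))

# A (hypothetical) New Zealand pronunciation of "Pat" has a raised vowel,
# so its F1 is lower -- closer to this listener's "pet" category.
nz_pat = (570, 1820)
print(classify(*nz_pat, UK_LISTENER))  # lands in the "pet" region
```

Because the category boundaries sit in different places for different accents, the same acoustic signal can map to different vowels for different listeners – which is exactly the adjustment a speech recogniser must make.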
<p>One could, laboriously, teach different accents to a speech recognition system, but accent variation is just the tip of the iceberg. Vowel sounds can also vary depending on our age, gender, social class, ethnicity, sexual orientation, level of intoxication, how fast we are talking, whom we are talking to, whether or not we are in a noisy environment … the list just goes on, and on. </p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/orD-e_W6Pic?wmode=transparent&amp;start=0" frameborder="0" allowfullscreen></iframe>
</figure>
<h2>The crux/crooks of the matter</h2>
<p>Consider that a <a href="https://www.research.manchester.ac.uk/portal/files/98762868/strut_foot_paper_final.pdf">recent study</a> I was involved in showed that even moving house (or not) can affect one’s vowels. Specifically, there is a correlation between how speakers of Northern English pronounce the vowel in words like <em>crux</em>, and how many times they have moved in the last decade. People who have not moved at all are more likely to pronounce <em>crux</em> the same as <em>crooks</em>, which is the traditional Northern English pronunciation. But those who have moved four times or more are more likely to have different vowels in the two words, as speakers in the south of England do. </p>
<p>There is, of course, nothing about the act of moving that causes this. But moving house multiple times is correlated with other lifestyle factors, for instance interacting with more people, including people with different accents, which might influence the way we speak.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip" srcset="https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=480&amp;fit=crop&amp;dpr=1 600w, https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=480&amp;fit=crop&amp;dpr=2 1200w, https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=480&amp;fit=crop&amp;dpr=3 1800w, https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=603&amp;fit=crop&amp;dpr=1 754w, https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=603&amp;fit=crop&amp;dpr=2 1508w, https://images.theconversation.com/files/277383/original/file-20190531-69075-1o7hhtr.jpeg?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=603&amp;fit=crop&amp;dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Level of overlap between ‘crooks’ and ‘crux’ vowel categories, depending on the number of house moves in the last decade. Data from 143 speakers from the north of England.</span>
</figcaption>
</figure>
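One simple way to quantify the kind of “overlap” plotted in the figure is to model a speaker’s <em>crux</em> and <em>crooks</em> vowels as distributions over a formant frequency and measure how much those distributions coincide. The sketch below uses the Bhattacharyya coefficient between two 1-D Gaussians (1 = identical, 0 = fully separate); the study used its own measure, and the formant values here are invented for illustration:

```python
# Sketch: overlap between two vowel categories, each modelled as a 1-D
# Gaussian over a formant frequency. The Bhattacharyya coefficient is one
# standard overlap measure; the numbers below are invented.
import math

def bhattacharyya(mu1, sd1, mu2, sd2):
    """Overlap between two 1-D Gaussians, in [0, 1]."""
    var1, var2 = sd1**2, sd2**2
    bd = (0.25 * (mu1 - mu2)**2 / (var1 + var2)
          + 0.5 * math.log((var1 + var2) / (2 * math.sqrt(var1 * var2))))
    return math.exp(-bd)

# Invented F1 means/SDs (Hz): a "non-mover" keeps crux and crooks nearly
# identical; a frequent mover separates the two categories.
non_mover = bhattacharyya(420, 40, 425, 40)   # near-total overlap
mover     = bhattacharyya(600, 40, 420, 40)   # largely separate categories
assert non_mover > 0.9 > mover
```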
<p>Other sources of variation may have to do with linguistic factors, such as word structure. A striking example comes from pairs of words such as <em>ruler</em>, meaning “measuring device” and <em>ruler</em>, meaning “leader”.</p>
<p>These two words are superficially identical, but they differ at a deeper structural level. A <em>rul-er</em> is someone who rules, just like a <em>sing-er</em> is someone who sings, so we can analyse these words as consisting of two meaningful units. In contrast, <em>ruler</em> meaning “measuring device” cannot be decomposed further. </p>
<p>It turns out that the two meanings of <em>ruler</em> are associated with a different vowel for many speakers of Southern British English, and the difference between the two words has increased in recent years: it is larger for younger speakers than it is for older speakers. So both hidden linguistic structure and speaker age can affect the way we pronounce certain vowels.</p>
<h2>End never in sight</h2>
<p>This illustrates another important property of language variation: it keeps changing. Language researchers therefore constantly have to review their understanding of variation, which in turn requires continuing to acquire new data, and updating the analysis. The way we do this in linguistics is being revolutionised by new technologies, advances in instrumental data analysis, and the ubiquity of recording equipment (in 2018, <a href="https://en.wikipedia.org/wiki/List_of_countries_by_smartphone_penetration">82%</a> of the UK adult population owned a recording device, otherwise known as a smartphone). </p>
<p>Modern day linguistic projects can profit from the technological advancement in various ways. For instance, the <a href="http://englishdialectapp.com/">English Dialects App</a> collects recordings remotely via smartphones, to build a large and constantly updating corpus of modern day English accents. That corpus is the source of the finding concerning the vowel in <em>crux</em> in Northern English, for example. Accumulating information from this and many other projects allows us to track variation with increased coverage, and to build ever more accurate models predicting the realisation of individual sounds. </p>
<p>Can this newly refined linguistic understanding also improve speech recognition technology? Perhaps, but in order to improve, the technology needs to know a lot more about you.</p><img src="https://counter.theconversation.com/content/116874/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Patrycja Strycharczuk received funding from The British Academy.
The research discussed in this article features in an exhibit at the British Academy&#39;s 2019 Summer Showcase.</span></em></p>Having problems with Siri and Google Translate? Here's why.Patrycja Strycharczuk, Lecturer in Linguistics & Quantitative Methods, University of ManchesterLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/871362017-11-14T17:08:20Z2017-11-14T17:08:20ZExplainer: how the latest earphones translate languages<figure><img src="https://images.theconversation.com/files/194638/original/file-20171114-26470-u1yic9.jpg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=496&amp;fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/woman-holds-her-hand-near-ear-454016317?src=gnBoj8EPd-JVtMfNYi_MOA-1-2">Shutterstock</a></span></figcaption></figure><p>In the <a href="http://www.bbc.co.uk/programmes/b03v379k">Hitchhiker’s Guide to The Galaxy</a>, Douglas Adams’s seminal 1978 BBC broadcast (then book, feature film and now cultural icon), one of the many technology predictions was the <a href="http://www.bbc.co.uk/cult/hitchhikers/guide/babelfish.shtml">Babel Fish</a>. This tiny yellow life-form, inserted into the human ear and fed by brain energy, was able to translate to and from any language.</p>
<p>Web giant Google have now seemingly <a href="http://www.telegraph.co.uk/technology/2017/10/04/googles-new-headphones-can-translate-foreign-languages-real/">developed their own version</a> of the Babel Fish, called Pixel Buds. These wireless earbuds make use of <a href="https://assistant.google.com/">Google Assistant</a>, a smart application which can speak to, understand and assist the wearer. One of the headline abilities is support for Google Translate, which is said to handle up to 40 different languages. Impressive technology for under US$200.</p>
<p>So how does it work?</p>
<p>Real-time speech translation consists of a chain of several distinct technologies – each of which has seen rapid improvement over recent years. The chain, from input to output, goes like this:</p>
<ol>
<li><p><strong>Input conditioning</strong>: the earbuds pick up background noise and interference, effectively recording a mixture of the user’s voice and other sounds. “<a href="http://acousticsresearchcentre.no/speech-enhancement-with-deep-learning">Denoising</a>” removes background sounds while a <a href="https://link.springer.com/article/10.1186/s13634-015-0277-z#Sec5">voice activity detector</a> (VAD) is used to turn the system on only when the correct person is speaking (and not someone standing behind you in a queue saying “OK Google” very loudly). Touch control is used to improve the VAD accuracy.</p></li>
<li><p><strong>Language identification (LID)</strong>: this system uses machine learning to identify what <a href="https://doi.org/10.1109/TASLP.2017.2766023">language is being spoken</a> within a couple of seconds. This is important because everything that follows is language specific. For language identification, phonetic characteristics alone are insufficient to distinguish languages (languages pairs like Ukrainian and Russian, Urdu and Hindi are virtually identical in their units of sound, or “phonemes”), so completely new acoustic representations <a href="https://pdfs.semanticscholar.org/8665/8be322dfb3d2a0fa5262b095ba6c5a6c31a2.pdf">had to be developed</a>.</p></li>
<li><p><strong>Automatic speech recognition (ASR)</strong>: <a href="http://www.cs.columbia.edu/%7Emcollins/6864/slides/asr.pdf">ASR</a> uses an acoustic model to convert the recorded speech into a string of phonemes and then language modelling is used to convert the phonetic information into words. By using the rules of spoken grammar, context, probability and a pronunciation dictionary, ASR systems fill in gaps of missing information and correct mistakenly recognised phonemes to infer a textual representation of what the speaker said.</p></li>
<li><p><strong>Natural language processing</strong>: <a href="https://blog.algorithmia.com/introduction-natural-language-processing-nlp">NLP</a> performs machine translation from one language to another. This is not as simple as substituting nouns and verbs, but includes <a href="https://codeburst.io/a-guide-to-nlp-a-confluence-of-ai-and-linguistics-2786c56c0749">decoding the <em>meaning</em> of the input speech</a>, and then re-encoding that meaning as output speech in a different language - with all the nuances and complexities that make second languages so hard for us to learn.</p></li>
<li><p><strong>Speech synthesis</strong> or text-to-speech (TTS): almost the opposite of ASR, this synthesises natural sounding speech from a string of words (or phonetic information). Older systems used additive synthesis, which effectively meant joining together lots of short recordings of someone speaking different phonemes into the correct sequence. More modern systems use <a href="http://www.cstr.ed.ac.uk/downloads/publications/2010/king_hmm_tutorial.pdf">complex statistical speech models</a> to recreate a natural sounding voice.</p></li>
</ol>
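The five stages above can be sketched as a chain of functions. Everything here is schematic – the function names and the toy stand-ins for each model are invented for illustration (strings play the role of audio), and none of it is Google’s actual API:

```python
# Schematic sketch of the five-stage real-time translation chain, with toy
# stand-ins for each model. The interfaces, not the internals, are the point.

def condition_input(audio: str) -> str:
    # 1. "Denoising" + VAD: strip background-noise markers and silence.
    return audio.replace("[noise]", "").strip()

def identify_language(speech: str) -> str:
    # 2. LID: a real system classifies acoustics; this toy checks vocabulary.
    return "fr" if "bonjour" in speech else "en"

def recognise(speech: str, language: str) -> str:
    # 3. ASR: audio -> text. Here the "audio" already is text.
    return speech

LEXICON = {("fr", "en"): {"bonjour": "hello", "monde": "world"}}

def translate(text: str, source: str, target: str) -> str:
    # 4. MT: a real system models meaning; this toy substitutes words.
    if source == target:
        return text
    table = LEXICON[(source, target)]
    return " ".join(table.get(w, w) for w in text.split())

def synthesise(text: str, language: str) -> str:
    # 5. TTS: text -> audio. Here, just tag the output.
    return f"<spoken:{language}> {text}"

def translate_speech(audio: str, target: str = "en") -> str:
    speech = condition_input(audio)
    source = identify_language(speech)
    text = recognise(speech, source)
    return synthesise(translate(text, source, target), target)

print(translate_speech("[noise] bonjour monde"))  # <spoken:en> hello world
```

The value of the chained design is that each stage can be improved, or swapped out, independently of the others.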
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/dZojo2yxzVA?wmode=transparent&amp;start=0" frameborder="0" allowfullscreen></iframe>
</figure>
<h2>Putting it all together</h2>
<p>So now we have the five blocks of technology in the chain, let’s see how the system would work in practice to translate between languages such as Chinese and English.</p>
<p>Once ready to translate, the earbuds first record an utterance, using a VAD to identify when the speech starts and ends. Background noise can be partially removed within the earbuds themselves, or once the recording has been transferred by Bluetooth to a smartphone. It is then compressed to occupy a much smaller amount of data, then conveyed over WiFi, 3G or 4G to Google’s speech servers. </p>
<p>Google’s servers, operating as a cloud, will accept the recording, decompress it, and use LID technology to determine whether the speech is in Chinese or in English.</p>
<p>The speech will then be passed to an ASR system for Chinese, then to an NLP machine translator setup to map from Chinese to English. The output of this will finally be sent to TTS software for English, producing a compressed recording of the output. This is sent back in the reverse direction to be replayed through the earbuds.</p>
<p>This might seem like a lot of stages of communication, but it takes <a href="https://www.youtube.com/watch?v=dZojo2yxzVA">just seconds to happen</a>. And it is necessary – firstly, because the processor in the earbuds is not powerful enough to do translation by itself, and secondly because their memory storage is insufficient to contain the language and acoustics models. Even if a powerful enough processor with enough memory could be squeezed into the earbuds, the complex computer processing would deplete the earbud batteries in a couple of seconds. </p>
<p>Furthermore, companies with these kinds of products (Google, <a href="http://www.iflytek.com/en">iFlytek</a> and <a href="https://www.ibm.com/watson/services/language-translator">IBM</a>) rely on continuous improvement to correct, refine and improve their translation models. Updating a model is easy on their own cloud servers. It is much more difficult to do when installed in an earbud.</p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/6i5hho2aD-E?wmode=transparent&amp;start=0" frameborder="0" allowfullscreen></iframe>
</figure>
<p>The late Douglas Adams would surely have found the technology behind these real life translating machines amazing – which it is. But computer scientists and engineers will not stop here. The next wave of speech-enabled computing could even be inspired by another fictional device, such as Iron Man’s smart computer, <a href="https://futurism.com/this-new-ai-is-like-having-iron-mans-jarvis-living-on-your-wall">J.A.R.V.I.S</a> (Just Another Rather Very Intelligent System) from the Marvel series. This system would go way beyond translation, would be able to converse with us, understand what we are feeling and thinking, and anticipate our needs.</p><img src="https://counter.theconversation.com/content/87136/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Ian McLoughlin does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Time to chuck out the phrase books?Ian McLoughlin, Professor of Computing, Head of School (Medway), University of KentLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/853842017-10-15T19:20:25Z2017-10-15T19:20:25ZTranslation technology is useful, but should not replace learning languages<figure><img src="https://images.theconversation.com/files/189683/original/file-20171010-19989-v705qh.jpg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=496&amp;fit=clip" /><figcaption><span class="caption">The benefits of language-learning go far beyond being able to translate.</span> <span class="attribution"><span class="source">9to5google.com</span></span></figcaption></figure><p>For many years now, there have been calls for Australians to learn languages, particularly Asian languages, as the world economy pivots to the Asia-Pacific. But the number of students learning languages in Australia has <a href="https://docs.education.gov.au/system/files/doc/other/senior_secondary_languages_education_research_project_final.pdf">remained stubbornly low</a>.</p>
<p>Rapid improvements in machine translation and speech recognition technologies in recent years appear to offer an easy way out. While problems still arise, the <a href="https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html">use of AI</a> has led to remarkable improvements in the quality of Google translations, and increasingly accurate speech recognition technologies are now widely available.</p>
<p>Drawing these two technologies together, <a href="http://www.telegraph.co.uk/technology/2017/10/04/googles-new-headphones-can-translate-foreign-languages-real/">Google has announced</a> the upcoming launch of wireless headphones that feature real-time language translation. With the advent of these technologies, do we still need to learn other languages?</p>
<p>These are exciting times. Technological advancements are enabling us to communicate with people all around the world without needing to have a common language. So I can now use such devices to communicate with speakers of Chinese, Hungarian or Hindi. They will hear my speech translated into their language; they can speak back to me in their language; and I will have this translated into English for me in real-time. But the ability to communicate in one’s own language with speakers of other languages, without having any knowledge of how languages work or cross-cultural differences in the ways we communicate, opens up a can of worms.</p>
<p>For a start, linguists tell us that word meanings don’t always match across different languages. What goes in is sometimes not the same as what comes out. </p>
<p>US President Donald Trump’s attempts to communicate with the wider world are instructive in that respect. Earlier in the year, it was reported that Trump told a European Commission meeting that <a href="https://www.washingtonpost.com/world/trumps-alleged-slight-against-germans-generates-confusion-backlash/2017/05/26/0325255a-4219-11e7-b29f-f40ffced2ddb_story.html?utm_term=.488cb11fa6ec">“the Germans are bad, very bad”</a>. This caused much consternation in the German media, as they debated whether Trump meant the <a href="http://www.spiegel.de/politik/ausland/donald-trump-bei-der-eu-die-deutschen-sind-boese-sehr-boese-a-1149282.html">Germans are “böse”</a> (which has connotations of evil, malicious intent) or “schlecht” (meaning they are not doing the right thing). </p>
<p>This is not simply a problem of mistranslation. The point is that word meanings don’t match up precisely across languages. This has important implications for international business and relations, but it’s something that is masked if we take translations at face value. </p>
<p>Another basic feature of communication is that we generally mean much more than we say. Although language plays an important role in communication, very often what is implied or left unsaid is more important than what is said. These inferences are not easily managed in machine translation, because they differ across speakers and cultures. </p>
<p>A good example of this is that it’s common among speakers of (Mandarin) Chinese to first refuse an offer of food when visiting someone’s house, especially if one isn’t that close. Such refusals are a way of testing the waters as to whether the offer is genuine. Accepting an offer too quickly may also be regarded as impolite. Offers and refusals are therefore often repeated before guests finally accept.</p>
<p>The point is that different cultures prefer different ways of speaking, and that means we do things through languages in different ways. These different ways of speaking give rise to different inferences depending on the language in question. </p>
<p>New technologies will no doubt change how we approach the learning of languages in exciting ways, just as the way we learn maths changed when calculators became readily available. But we can’t outsource deep cross-linguistic and cross-cultural knowledge to apps, and the need to learn languages hasn’t changed.</p>
<p>Indeed, it seems to be on the rise as we enter an increasingly globalised economy. According to a report on “<a href="https://www.fya.org.au/wp-content/uploads/2016/04/The-New-Basics_Update_Web.pdf">The New Work Order</a>” by The Foundation for Young Australians, advertisements for jobs requiring bilingual skills grew by 181% from 2012 to 2015. And a report by the Institute for the Future in Palo Alto, California, on “<a href="http://www.iftf.org/uploads/media/SR-1382A_UPRI_future_work_skills_sm.pdf">Future Work Skills 2020</a>” identifies cross-cultural competency as one of ten key skills for the future workforce. </p>
<p>Learning languages allows us to experience different ways of thinking. It enables us to develop the ability to change our perspective on what is going on in any particular interaction, and to adapt ourselves to the mindsets of others. It also helps us to understand ourselves better and our own mindsets. Real cross-cultural understanding helps us build deeper relationships. </p>
<p>No matter what advances we make in machine translation or speech recognition, technology cannot change the fundamental nature of human languages and their role in communication. While such technologies are an increasingly useful tool, they can no more replace the deep cross-cultural knowledge that comes with learning languages than the advent of calculators meant we no longer needed to learn maths.</p>
<p>However, such technologies are now widely accessible. The upshot is that every Australian will have to develop an awareness of differences between languages and the ways in which they underpin key cross-cultural differences. Rather than making the need for learning languages redundant, we are in fact entering a world in which awareness of differences across languages and cross-cultural competence is a must for all.</p>
<p class="fine-print"><em><span>Michael Haugh has previously received funding from the Australian Research Council and the Chiang Ching-kuo Foundation. He is affiliated with the School of Languages and Cultures at The University of Queensland. </span></em></p>Does the upcoming launch of wireless headphones by Google that feature real-time language translation mean we don't need to study other languages anymore?Michael Haugh, Professor of Linguistics and Head of School of Languages and Cultures, The University of QueenslandLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/663702016-10-04T04:19:37Z2016-10-04T04:19:37ZHas auto-translation software finally stopped being so useless?<figure><img src="https://images.theconversation.com/files/140214/original/image-20161004-20196-1etm04p.jpg?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=496&amp;fit=clip" /><figcaption><span class="caption">Will our digital phrasebook finally be able to handle more than just simple snippets?</span> <span class="attribution"><span class="source">cybrain/Shutterstock.com</span></span></figcaption></figure><p>If you’ve ever put a phrase into an online translator and then laughed at the garbled results, your fun might be coming to an end. Google claimed last week to have <a href="http://qz.com/792621/googles-new-ai-powered-translation-tool-is-nearly-as-good-as-a-human-translator">eradicated 80% of the errors made by its translation software</a>.</p>
<p>Translating text from one language into another is a simple proposition but a fiendishly complicated problem. Of course, it has traditionally been a job for human translators, but over the past half-century or so automated machine translation has become an important sub-field of artificial intelligence. </p>
<p>Auto-translation systems, including <a href="https://translate.google.com/">Google Translate</a>, were already pretty good at translating single words or even short sentences. But people are well aware of the limitations of this technology when it comes to translating longer, more complex passages, and hence are cautious about relying on them for important tasks. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip"><img alt="" src="https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip" srcset="https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=167&amp;fit=crop&amp;dpr=1 600w, https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=167&amp;fit=crop&amp;dpr=2 1200w, https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=167&amp;fit=crop&amp;dpr=3 1800w, https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=210&amp;fit=crop&amp;dpr=1 754w, https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=210&amp;fit=crop&amp;dpr=2 1508w, https://images.theconversation.com/files/140222/original/image-20161004-27269-g84dxs.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=210&amp;fit=crop&amp;dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Google Translate: good for short phrases, and about to get better at long ones too.</span>
<span class="attribution"><span class="source">Google Translate</span></span>
</figcaption>
</figure>
<p>Machine translation systems work by analysing an input text from one language and creating an equivalent representation in the target language. This can be as simple as word substitution, but such a system cannot guarantee high-quality output. That’s because it is difficult to program a computer to understand the text as humans do, and then to translate it to another language while keeping the meaning and semantics intact. </p>
<p>This is partly because different languages originated at different times and have different evolutionary histories. This gives each language a set of unique subtleties that can be difficult for humans to learn, let alone a computer attempting to go from simple word substitutions to intelligible sentences.</p>
<p>Not all words in one language have a direct equivalent in another, so several words might be needed to convey the meaning of a single word in the original language (a classic example being the German <em>schadenfreude</em>). The grammatical structure can also be different. Not all languages use the same subject-verb-object format found in most English phrases. </p>
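Both problems show up immediately if you actually try word-for-word substitution, the simplest possible “machine translation”. The tiny German-English dictionary below is invented for illustration:

```python
# Word-for-word substitution, to show why it fails. The glossary is a
# tiny, invented German-English dictionary.
GLOSSARY = {
    "weil": "because", "ich": "I", "wasser": "water", "trinke": "drink",
    "das": "the", "ist": "is",
}

def substitute(sentence: str) -> str:
    # Replace each word independently; keep anything with no equivalent.
    return " ".join(GLOSSARY.get(w.lower(), w) for w in sentence.split())

# German subordinate clauses put the verb last, so the foreign word order
# survives untranslated:
print(substitute("weil ich Wasser trinke"))  # because I water drink

# Words with no one-word equivalent simply pass through:
print(substitute("das ist Schadenfreude"))   # the is Schadenfreude
```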
<p>Auto-translation software can also struggle with words that have different definitions depending on their context. This means that the program will need to analyse the entire sentence or paragraph as a whole to deduce what it means. </p>
<p>Clearly, understanding the broader meaning is critical for producing useful translations. But teaching a machine to derive the subtle meanings of language is no easy task, as the often comical results of older translation software make clear. </p>
<p>Human translators rely on knowledge, experience and common sense, but we don’t really know precisely what is going on when the brain synthesises language. If we don’t know how it really works, how do we go about teaching a computer to do it?</p>
<h2>The machine learning approach</h2>
<p>As described above, the real challenge lies in moving beyond individual words or short phrases to translating large pieces of text such as entire websites or novels.</p>
<p>At the simpler end of the spectrum the technology already does a pretty good job. If you’re travelling in a foreign country you can use an augmented reality app such as <a href="http://www.forbes.com/sites/amitchowdhry/2015/07/30/google-translates-word-lens-feature-now-supports-27-languages/#5e893afe750f">Word Lens</a> to decipher street signs in real time. Simple tourist phrases are easily conjured up by programs that have simple language rules hard-coded into them.</p>
<p>But say you want to read a novel, or browse a foreign-language website, or translate a PowerPoint presentation in real time at a conference. This needs a new approach – one that recognises and reproduces the flow and meaning of the whole.</p>
<p>Google’s new approach involves what it calls “Neural Machine Translation (NMT)”. It relies on an artificial neural network which attempts to simulate the human brain’s approach to translation. Crucially, it can “learn” as it becomes more experienced, gradually improving its accuracy as it translates more text. </p>
<p>As NMT algorithms do not rely on human logic (that is, hand-coded algorithms), they can modify themselves as they go. In theory, they should be able to find ways to translate text that the human coders might not have conceived when designing the system.</p>
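The “learning from data rather than hand-coded rules” idea can be shown at toy scale. The sketch below trains a single softmax layer to map source words to target words by gradient descent – real NMT uses deep sequence-to-sequence networks over whole sentences, so this is only meant to illustrate that the mapping is learned from example pairs, with nothing hand-coded:

```python
# A heavily simplified illustration of learned translation: a single softmax
# layer that learns word-to-word mappings from example pairs by gradient
# descent. The vocabulary and training pairs are invented.
import numpy as np

src_vocab = ["chien", "chat", "maison"]
tgt_vocab = ["dog", "cat", "house"]
pairs = [(0, 0), (1, 1), (2, 2)]  # (source index, target index) examples

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(len(src_vocab), len(tgt_vocab)))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(500):            # training loop
    for s, t in pairs:
        p = softmax(W[s])       # predicted distribution over target words
        grad = p.copy()
        grad[t] -= 1.0          # gradient of the cross-entropy loss
        W[s] -= 0.5 * grad      # update: the model modifies itself

def translate(word: str) -> str:
    return tgt_vocab[int(np.argmax(W[src_vocab.index(word)]))]

print(translate("chat"))  # cat
```

Nothing in the final weights was written by a human; retraining on more (or different) data changes the behaviour, which is exactly why cloud-hosted models are so much easier to keep improving.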
<h2>The future</h2>
<p>Reaching 100% accuracy will not be easy, but we can expect tech companies like Google to devote a lot of energy to trying. It is likely to be an evolutionary process, not a one-off breakthrough, and it will take huge amounts of time, data and processing power to improve the results until they are effectively flawless. </p>
<p>The latest development nevertheless represents a huge step forward – finally propelling machine translation to a standard that is acceptable for most tasks. For now, if you need 100% accuracy you will still need to hire a human translator, but with every day that passes computers are honing their skills. </p>
<p>This raises the question of how seamlessly auto-translation will become a part of our everyday experience in the future. In time, we may browse websites that automatically open up in our preferred language based on our profile, or listen to lectures in whatever language we choose, or engage in real-time discussions with people speaking a different language without having a human translator listening in. The opportunities are limitless. </p>
<p>If we can improve the accuracy to almost 100%, language barriers will begin to disappear. We would belong to one global village, where anyone can share their knowledge and expertise with anyone else. </p>
<p>In a world where computers are multilingual, will anyone need to bother learning another language? It’s too early to say. But just as mapping software has all but eradicated the feeling of being lost in a strange place, we’re heading for a world where you can be anywhere on the planet and never be lost for words.</p><img src="https://counter.theconversation.com/content/66370/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Vidyasagar Potdar does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Auto-translation software has been pretty frustrating to use. But news of vast improvements to Google's translation software raises the prospect that websites will soon be browsable in any language.Vidyasagar Potdar, Senior Research Fellow, School of Information Systems, Curtin UniversityLicensed as Creative Commons – attribution, no derivatives.