Today’s rapidly evolving marketplace has created an enormous spike in the demand for Machine Translation (MT) across a number of industries. According to a new study by Grand View Research, the global Machine Translation market is expected to reach USD 983.3 million by 2022. This is a huge leap from 2014, when the MT market was valued at USD 331.7 million, and the projection reflects a broader trend. Thanks to globalization, there is an increased demand for cost efficiency in translation, but the linguistic knowledge and time required to translate all the content for a particular business exceed the capacity of human translation alone.
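As a rough check on these figures, the implied compound annual growth rate can be worked out from the two valuations. The snippet below is only an illustrative calculation: the function name `cagr` is our own, and it assumes growth compounds once a year over the eight years from 2014 to 2022.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market valuations in USD million, taken from the figures quoted above.
rate = cagr(331.7, 983.3, 2022 - 2014)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the market would need to grow by roughly 14.5% a year to reach the projected value.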

Key Insights into the Machine Translation Market

Some of the key insights about Machine Translation that the study discusses are summed up here:

Statistical Machine Translation (SMT) is a clear winner over Rule-Based Machine Translation (RBMT) when it comes to present market requirements.

Globalization and the need to address diverse cultural groups have led to the popularity of translation technology in Asia Pacific, thus opening up new potential markets for MT providers.

Machine translation as a service (MTSaaS) makes use of SMT and is accessible via the web. This allows users to customise their MT engines with their own Translation Memories (TMs).

This means that deploying an integrated MT solution will become a critical success factor for gaining market share in the future.

Potential Challenges

The study reveals major challenges for the MT industry, which include a lack of quality translations and Quality Estimation (QE), and competition from free translation service providers. Needless to say, a well-rounded back-end knowledge base, efficient NLP (Natural Language Processing) capabilities and a scalable model are critical to gaining competitive advantage in the market. MT providers need to go beyond simply providing machine translation services; they need to become solution providers.

How is KantanMT contributing to the MT market?

The KantanMT platform offers a significant competitive advantage, not only because we were one of the first entrants in the MT market, but also because our strategic market insights have already allowed us to identify most of these challenges and develop solutions to address them. As solution providers, we use an intuitive approach that can be summed up in a few words: speed, scalability, simplicity, and security.

Speed

In a market where new products and innumerable variants of those products are being developed almost every day, it is important to have on-demand translated content ready to be deployed. KantanMT helps its clients gain a first-mover advantage over their competitors by translating content on the fly.

KantanMT engines have the capacity to translate 114 million words in a single day. As of 7 September 2015, we have exceeded 2 billion translated words, with 1 billion of those translated in the last two months alone.

KantanMT Platform statistics as of 8th September 2015

Scalability

As a business trying to make its mark in the global MT market, it is extremely important to have a solution that can scale without limit. KantanMT engines, with scaling technologies such as KantanAutoScale, are designed to ensure that no matter how sudden the spike in content is, the quality and volume of translated content do not suffer.

The power of KantanMT’s engine was summed up by Tony O’Dowd, Founder and Chief Architect of KantanMT.com:

“We are only starting to see the potential growth of the Machine Translation market, and I doubt any other player can operate at this scale as flawlessly.”

Simplicity

Simplicity is at the very core of KantanMT. The company name itself is derived from the Japanese word for simplicity, 簡単 (かんたん). KantanMT strives to take the complexity out of the user interface, while powerful MT engines do all the hard work in the back end. Easy-to-understand analytics generated by the KantanMT engines provide insights into improving engine quality and maintaining translation quality.

Security

Cloud-based MT solutions have become the industry norm. However, security concerns are high, especially if you are in the eCommerce industry or deal with legal information. KantanMT’s multilayered security approach protects and monitors translations, ensuring all industry secrets are safe. Unlike with a number of open source translation tools, you own both the source and the translated words.

Final words

One of the key findings of the Grand View Research review is that “strategic joint ventures, coupled with mergers and acquisitions, have been among the key strategies adopted” by major players in the Machine Translation industry. KantanMT recognises the importance of both industry and academic relationships in building a complete MT ecosystem.

As a team of people with an unbridled passion for innovations in the Machine Translation industry, we were not greatly surprised by Monday’s news that Reverie Technologies, a Bengaluru-based startup, had bagged a $4M investment. This brilliant news serves to highlight once again that in the ever-changing world of retail marketing and globalization, any business with plans to accelerate its products into global markets needs to localize its content for an enhanced user experience. This in turn drives global revenues and increases brand equity in existing and new markets.

Translation Machines in Sci-fi

In science fiction, translation of the potentially infinite number of languages spoken by alien species presents a dilemma. How to deal with communication between interplanetary species without resorting to contrivance, or spending the first twenty minutes of each episode’s dialogue clumsily showing characters learning one another’s diphthongs?

The notion of a ‘universal translator’ originated in Murray Leinster’s novella First Contact, published in 1945 (and clearly that isn’t the only debt Gene Roddenberry owes to Leinster). It’s a greatly helpful – borderline miraculous, in fact – convention of sci-fi: a technological solution to the language barrier, leaving more time for the actual narrative to unfold in one language, typically English.

With the incredible advancements in technology we’re witnessing at the moment, such as Microsoft’s pilots of a Skype Translator and the industry-leading work KantanMT is achieving in this area, are we seeing the beginnings of live translation – well ahead of Star Trek’s 22nd century deadline? In the meantime, let’s take a look at five of sci-fi’s finest translation machines, which beat anything real-life technology can offer – for now.

1. Star Trek: Universal Translator

An important part of Star Trek’s near-utopian vision of the future is the Universal Translator. Translating any language into another even while a person is speaking, this exceptionally handy tool means Starfleet craft in any quadrant of the galaxy can speak to new life and new civilizations without confusion.

Voiced by Star Trek creator Roddenberry’s widow Majel Barrett until her death in 2008, the development of a universal translator was, in the Trek universe, a portent of Earth’s cultures achieving universal peace. It’s difficult to imagine Google Translate having the same impact.

This convenient concept has often been copied, and occasionally parodied: in Futurama, everyone in the universe speaks English, rendering Professor Farnsworth’s one successful invention – a translation device – useless, as it merely translates English into the dead language, French!

2. The Hitchhiker’s Guide to the Galaxy: the Babel Fish

Some sci-fi plays with the concept in less serious ways. In Douglas Adams’ H2G2, a Babel Fish is inserted into Arthur Dent’s ear to help him deal in some small way with anything that goes on around him. It is memorably described by the Guide as “small, yellow, leechlike and probably the oddest thing in the universe.”

The science (such as it is) behind the Babel Fish is that it can absorb the frequencies of outside speakers, and a translation is secreted by the fish into the hearer’s brain via his or her ear canal. In a witty reversal of Star Trek’s idealistic Federation, Adams reveals that, by allowing everyone to understand one another, the Babel Fish has actually caused more war than anything else in the universe.

3. Farscape: Translator microbes

In science fiction, as in reality, it is the individual idiosyncrasies of languages which are trickiest to master. When people in the UK from a hundred miles apart may speak different languages, not to mention a range of different dialects and accents, can auditory translation really be so smooth?

One series to acknowledge this is Farscape, where astronaut John Crichton is injected with bacteria-sized ‘translator microbes’, which colonise his brain. The microbes work to make their host understand any spoken information in any language – except that idioms are translated literally. This leads to a great deal of confusion for John, and opportunities for humour for the audience (all jokes are language, after all) – and also perhaps renders these microbes a more realistically limited translator technology.

4. Doctor Who: The TARDIS’ Translation Circuit

As well as being telepathically linked with the Doctor, and granting the ability to travel to any time or place in history and the future, the TARDIS’ telepathic field is used to automatically translate what the Doctor and any companions hear or read into a language which they can understand.

While wonderfully convenient, the mind-meld involved does mean that the translation circuits won’t actually work when the Doctor is unconscious, which is not an outright impossibility. Also, because translations are time-specific, ancient civilizations won’t understand neologisms: neatly, the Romans have never heard the word ‘volcano’ because they’ve not lived to see an eruption.

5. Star Wars: C-3PO

Luke Skywalker is the ultimate sci-fi everyman: he is every bit as much in need of a guide to the universe he finds himself in as the viewing audience are. Reinforcing this are his guides, C-3PO and R2-D2, whom Luke needs with him – despite their obvious drawbacks as travelling companions – because C-3PO is programmed with millions of languages, everything from Ewok to R2’s bleeps and whistles.

When the franchise returns with The Force Awakens later this year (which most fans will rightly consider the fourth, rather than seventh, Star Wars movie), C-3PO’s translation abilities are sure to make him at least partially useful to have around.

The KantanMT team say a big Thank You to Richard for a very savvy post on translation machines in science fiction.

It’s a fact: entering new markets is the key to increasing profits, and the first item on any company’s internationalization checklist should be to make sure it communicates product information in a way its target customers can understand.

Leading on from the 2006 research, CSA’s updated survey in 2014 was based on a sample of three thousand global respondents, and it reinforced earlier results by showing that 55% only buy from websites in their native language. This jumped dramatically to 80% in cases where the buyers’ English language ability is limited.

When it comes to selling internationally, tapping into new revenue streams demands translated content. But what happens when you have thousands of product descriptions that need to be localized into a plethora of languages?

This is where the fun begins for localization teams with well-established traditional translation workflows in place. Their existing method seems fine…but when it’s time to scale up, cracks in the process begin to appear.

The translation workflow works best when it matches the scale and velocity of the content created, whether it is product descriptions, manuals or online help documentation.

The challenging part –

How to translate product descriptions with velocity and to scale?

We have heard a great many arguments for and against machine translation, and one of the best-known arguments against is “the quality is rubbish, sentences translated by machine translation are garbled and incomprehensible”. We in the language technology field hear this frequently and often shudder in disbelief at how these conclusions have been reached.

Generic or free machine translation systems in most cases do not produce great results. Expecting such a system to produce publishable-quality output, or using it as a benchmark for all MT systems, is akin to getting blood from a stone. Achieving good MT output takes time, care and the ability to customise the MT system properly.

Any company that is serious about breaking into international markets should also be serious about its MT strategy. It should be considering a customised MT solution that is tailored to its needs, rather than settling for a cheap or supposedly free option.

Why is MT customisation so important?

Statistical machine translation is based on machine learning and pattern recognition. Segments are broken into multi-word phrases, known as n-grams, and probability algorithms select the most probable translation match for each. Generic or free MT systems have typically been built on a broad mix of content styles and types, which makes it much harder for the system to identify the most likely, or even relevant, matches.

When the MT system is customised for content from a single domain, such as product descriptions for a specific category (e.g. home and garden, fashion or electronic devices), the consistent syntax, style and phraseology mean there is a higher probability that a generated match will be close to the desired output, resulting in a much more accurate translation.
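To make the effect of domain-specific training data concrete, here is a deliberately simplified sketch. It is not how KantanMT or Moses works internally: real SMT systems also use word alignment, language models and weighted feature scores. The helper names (`build_phrase_table`, `best_translation`) and the tiny English-French phrase pairs are hypothetical, invented purely for illustration.

```python
from collections import Counter, defaultdict


def build_phrase_table(aligned_pairs):
    """Estimate P(target | source) by relative frequency of aligned phrase pairs."""
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def best_translation(table, source):
    """Return the (translation, probability) pair with the highest probability."""
    options = table.get(source)
    if not options:
        return None
    return max(options.items(), key=lambda item: item[1])


# Hypothetical training data: a broad content mix versus hardware product descriptions.
generic_pairs = [("spring", "printemps")] * 3 + [("spring", "ressort")] * 2 + [("spring", "source")]
domain_pairs = [("spring", "ressort")] * 4 + [("spring", "printemps")]

generic_table = build_phrase_table(generic_pairs)
domain_table = build_phrase_table(domain_pairs)

print(best_translation(generic_table, "spring"))  # the season wins in the generic mix
print(best_translation(domain_table, "spring"))   # the mechanical part wins in the domain data
```

In this toy data the generic table prefers "printemps" with a probability of only 0.5, while the domain table prefers "ressort" with 0.8: the narrower the training content, the more probability mass concentrates on the translation that is right for that domain.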

How important is saving costs?

Of course Machine Translation can save costs; if done properly, significant savings can be made. But saving costs is often not the end goal of implementing a serious MT strategy. The real gains come from increasing productivity without compromising quality. Why translate 2000 words a day when you can machine translate and post-edit 8000 words with no loss of quality? It really can be done! See an example first hand (Netthandelen’s case study PDF download).

When it comes to eCommerce and selling hundreds of products online, the words to be translated are counted in billions, not thousands. Without MT, traditional localization budgets would become more and more expensive, so MT is really the only practical solution. But if MT is treated as a way to save money by cutting corners, it is doomed to fail from the outset.

It will fail because it’s not sustainable: the effort and costs required to fix bad-quality MT output are too great, and if the content is published as is without fixing, it will result in angry customers who shop elsewhere. And they will, as the choice available now is greater than ever before!

Key takeaways

Generic free MT will not generate the same quality as customised MT

Investing in a robust MT strategy will save time, costs and headaches in the long run

Keep the focus on communicating with the customer in their language, and your eCommerce business will thrive

Email louisei@kantanmt.com if you have questions or want to learn more about how Machine Translation works for product descriptions.

In today’s world, the path to profits comes from global expansion, and everyone in business wants profits. With the goal of increased profits in mind, it is logical that business professionals constantly keep an eye out for new ways to expand their customer base and increase their bottom line.

In many cases, the most effective way to reach new customers is to speak their language, and what better way to do this than by translating and localizing product content into the languages spoken, understood and used by the target audience?

When content is static and only needs a one-off translation, traditional translation workflows do the job just fine, but when the content is a continuous stream of product descriptions or online help/chat content, a real-time scalable translation solution is the only feasible option.

Machine Translation (MT) is that real-time scalable solution and the key to opening up new markets, reaching new customers and increasing profits. It is a productivity tool in the content production workflow with the potential to boost a company’s economic performance. However, a word of caution: there are some criteria that should be carefully considered before including MT in content production and reaping its economic benefits.

Join Tony O’Dowd, Founder and Chief Architect of KantanMT, and Alan Houser, Co-Founder and President of Group Wellesley, Inc., as they discuss the economic arguments in favour of including Machine Translation in content production workflows.

Webinar Date: Thursday July 16th at 5PM IST (Dublin), 9AM US West Coast and 12PM US East Coast. The webinar will last approximately one hour, including a Q&A session.

January 2015 marks the last month of the Moses Core project. The project started three years ago in 2012, as a collaborative effort by its members to improve translation processes and to create a competitive translation environment. Over those three years, the translation and MT landscape has changed significantly. This change and the project’s success are in no small part due to the hard work and diligence of the Moses Core project coordinator, TAUS. With TAUS’s kind permission, KantanMT is republishing the MT use case for the KantanMT Community.

COMPANY NAME

KantanMT.com is a registered trademark of Xcelerator Machine Translations Ltd.

TIME IN MT BUSINESS

The platform was launched commercially in Q4 2013; however, we have been rigorously testing KantanMT.com in academic and commercial settings since 2012. In the beginning, the product was offered as a free trial to the KantanMT Community, and their feedback was instrumental in shaping and improving the platform to what it is today.

MOSES EXPERIENCE

The Moses technology has improved immensely over the past 12-18 months. Developer documentation and support materials, while initially very basic, have matured into a more structured, comprehensive and helpful resource. Additionally, the management of software distributions has made it easier to work with, understand and deploy. These are key elements in maintaining and supporting any open-source technology and have made Moses a key technology for the localization industry.

WHY MOSES?

The rise of the global economy and the growing demand for multilingual translation created a gap in the market for a sustainable translation method that could automatically scale to accommodate fluctuating translation needs. The KantanMT Development team was able to utilize the open source Moses decoder to develop a cloud-based Statistical Machine Translation (SMT) platform, where clients could build and manage their own customized MT engines without compromising on the ownership of their data. The flexibility, scalability and security of the Moses toolkit made this possible.

The Moses toolkit offers the most flexibility in implementing an SMT solution for commercial purposes, as it allows the system’s training and decoding process to be modified. This has enabled the KantanMT team to create a high-value product that is dynamic and commercially relevant.

To ensure the product could scale and adapt to user needs the KantanMT team needed a decoder that could be built and managed on the cloud. The Moses system enabled this functionality.

Parallel language data is required to train an SMT engine. This data is an important resource for companies, and current generic SMT engines do not guarantee the security or safeguard the ownership of these assets. In using the Moses decoder, the KantanMT team created a product that could ensure its clients’ data was kept private, and not repurposed or reused in any way.

Many global companies have large repositories of bilingual data; however, they often do not wish to deploy and maintain their own version of the Moses decoder. The KantanMT Development team was able to develop the sophisticated Moses SMT technology into a package that is easily accessible to companies wishing to translate their content and, over time, achieve localization cost savings.

MT STAFF

The current machine translation development team consists of four people, who maintain the platform and build machine translation engines for clients. Due to significant growth in the company over the past year, KantanMT.com will be hiring more staff over the course of the next few months to build engines for clients.

MT SYSTEM INFRASTRUCTURE

Insource or Outsource Moses/Implementation

Based on research and on the demands of the language services industry and enterprise machine translation buyers, KantanMT has implemented and customized the Moses decoder in-house to create a robust and commercially viable machine translation product that can scale and adapt to our clients’ needs. The original/base KantanAnalytics™ technology was co-developed with the CNGL Centre for Global Intelligent Content, an academic-industry research centre based in Dublin City University, Ireland. However, all other KantanMT.com technologies have been developed in-house by our expert development team.

Number of Engines

As of January 2015, the total number of MT engines built on KantanMT.com by the KantanMT community is 6,777.

Volumes

As of January 2015, the total number of training words uploaded to the platform by the KantanMT Community has surpassed 50 billion, and the number of translated words on the platform is now more than 600 million.

USE SCENARIO

KantanMT.com Preferred MT Supplier

bmmt GmbH is a German language service provider with a strong focus on machine translation. It needed a Machine Translation provider that would give the bmmt team full control of their Machine Translation training data and MT engine customization process at a low investment point. They also required a system that could correctly handle format-specific tagging and transparent transfer of mark-up information.

In early 2013, bmmt joined the KantanMT Community and began testing different customization processes using client-specific training data. The team initially experienced minor problems with their SDLXLIFF files. However, the KantanMT development team was able to solve these problems quickly by restructuring some of its tokenizers.

The company began deploying production engines in mid-2013. These were showing particularly high Quality Evaluation (QE) scores due to the quality of their training data and resulted in a considerable increase in translation productivity. bmmt MT technicians found that domain specificity is a better basis for predictable output than sheer input size.

bmmt is currently using approximately 20 KantanMT engines in production across technical and automotive domains. These production ready engines are experiencing high quality metric scores for each language combination.

MARKET POSITIONING

KantanMT.com is one of the market leaders in cloud-based machine translation services. It provides cloud-based SMT services to major global enterprises and software companies wishing to translate large volumes of data. It works directly with companies to develop and implement a long-term machine translation strategy, or it works with a select number of language service providers (preferred MT supplier partner program) to supply MT services to large enterprises.

VIEWS ON CURRENT STATE OF MT

Machine translation is now much more widely accepted in the industry than it was just a few years ago. Since KantanMT.com entered the market in its testing phase in 2012, we have seen an enormous change in the attitudes to and perception of MT in the language community. Access to technology such as smartphones and tablets in non-English speaking nations has driven the global marketplace, and this in turn has increased the need for on-demand translation services – driving demand for MT services. The MosesCore Project has facilitated this demand with an open source solution that made it possible for smaller companies and startups like us to compete against bigger MT providers and to solve the problem of language.

“The KantanMT platform sets a new industry benchmark in terms of analytics and development tools used to build and measure the quality of Statistical MT Engines. The KantanMT expert development team has introduced some of the industry’s most exciting and valuable technologies built on the Moses decoder, which are helping language and enterprise clients to translate more efficiently and reduce costs.” KantanMT.com founder and Chief Architect, Tony O’Dowd.

As a member of the KantanMT preferred partner program, bmmt works closely with KantanMT to provide MT services to its clients, which include major players in the automotive industry. KantanMT was able to catch up with Maxim Khalilov, technical lead and ‘MT guru’ to find out more about his take on the industry and what advice he could give to translation buyers planning to invest in MT.

KantanMT: Can you tell me a little about yourself and how you got involved in the industry?

Maxim Khalilov: It was a long and exciting journey. Many years ago, I graduated from the Technical University in Russia with a major in computer science and economics. After graduating, I worked as a researcher for a couple of years in the sustainable energy field. But, even then I knew I still wanted to come back to the IT industry.

In 2005, I started a PhD at Universitat Politècnica de Catalunya (UPC) with a focus on Statistical Machine Translation, which was a very new topic back then. By 2009, after successfully defending my thesis, I moved to Amsterdam where I worked as a post-doctoral researcher at the University of Amsterdam and later as an R&D manager at TAUS.

Since February 2014, I’ve been a team lead at bmmt GmbH, which is a German LSP with a strong focus on machine translation.

I think my previous experience helped me to develop a deep understanding of the MT industry from both academic and technical perspectives. It also gave me a combination of research and management experience in industry and academia, which I am applying by building a successful MT business at bmmt.

KMT: As a successful entrepreneur, what were the three greatest industry challenges you faced this year?

MK: This year has been a challenging one for us from both technical and management perspectives. We started to build an MT infrastructure around MOSES practically from scratch. MOSES was developed by academia and for academic use, and because of this we immediately noticed that many industrial challenges had not yet been addressed by MOSES developers.

The first challenge we faced was that the standard solution does not offer a solid tag processing mechanism – we had to invest in customizing the MOSES code to make it compatible with what we wanted to achieve.

The second challenge we faced was that many players in the MT market are constantly talking about the lack of reliable, quick and cheap quality evaluation metrics. BLEU-like scores unfortunately are not always applicable for real world projects. Even if they are useful when comparing different iterations of the same engines, they are not useful for cross language or cross client comparison.
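For readers unfamiliar with BLEU-like scores, the core idea is clipped (or "modified") n-gram precision: how many of the n-grams in the MT output also appear in a human reference translation. The sketch below is a simplified, single-reference illustration with hypothetical helper names (`ngrams`, `modified_precision`). It omits the brevity penalty and the averaging across n-gram orders that full BLEU uses, so it is not a drop-in metric.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-gram in a token list as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference translation."""
    candidate_counts = Counter(ngrams(candidate, n))
    reference_counts = Counter(ngrams(reference, n))
    total = sum(candidate_counts.values())
    if total == 0:
        return 0.0
    # A candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return clipped / total


reference = "the cat is on the mat".split()
candidate = "the the the cat".split()

p1 = modified_precision(candidate, reference, 1)  # 3 of 4 unigrams credited
p2 = modified_precision(candidate, reference, 2)  # 1 of 3 bigrams credited
print(p1, p2)
```

Because the score depends entirely on the chosen reference text, its tokenization and the language's word structure, it is meaningful for comparing iterations of one engine on one test set, but not for comparing engines across languages or clients.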

Interestingly, the third problem is psychological in nature: post-editors are not always happy to post-edit MT output for many reasons, including of course the quality of MT. However, in many situations the problem is that MT post-editing requires a different skillset from ‘normal’ translation, and it will take time before translators adapt fully to post-editing tasks.

KMT: Do you believe MT has a say in the future, and what is your view on its development in global markets?

MK: Of course, MT will have a big say in the future of language services. We can see now that the MT market is expanding quickly as more and more companies are adopting a combined TM-MT-PE framework as their primary localization solution.

“At the same time, users should not forget that MT has its clear niche”

I don’t think a machine will ever be able to translate poetry, for example, but at the same time it does not need to – MT has proved to be more than useful for the translation of technical documentation, marketing material and other content, which represents more than 90% of translators’ daily workload worldwide.

Looking at the near future I see that the integration of MT and other cross language technologies with Big Data technologies will open new horizons for Big Data making it a really global technology.

KMT: How has MT affected or changed your business models?

MK: Our business model is built around MT; it allows us to deliver translations to our customers quicker and cheaper than without MT, while at the same time preserving the same level of quality and guaranteeing data security. We position MT not only as a competitive advantage when it comes to translation, but also as a base technology for future services. My personal belief, which is shared by other bmmt employees, is that MT is a key technology that will make our world different – where translation is available on demand, when and where consumers need it, at a fair price and at its expected quality.

KMT: What advice can you give to translation buyers, interested in machine translation?

MK: MT is still a relatively new technology, but at the same time there are already a number of best practices available for new and existing players in the MT market. In my opinion, the four key points for translation buyers to remember when thinking about adopting machine translation are:

Don’t mix it up with TM – While TMs mostly support human translators by storing previously translated segments, MT translates complete sentences automatically. The main difference lies in new words and phrases, which are not stored in a TM database.

There is more than one way to use MT – MT is flexible. It can be a productivity tool that enables translators to deliver translations faster with the same quality as in the standard translation framework. Or MT can be used for ‘gisting’ without any post-editing, something that many translation buyers forget about but which can be useful in many business scenarios. A good example is the integration of MT into chat widgets for real-time translation.

Don’t worry about quality – Quality Assurance is always included in the translation pipeline and we, like many other LSPs, guarantee a desired level of quality for all translations, regardless of how they were produced.

Think about time and cost – MT enables quicker and cheaper translation delivery than is possible without it.

A big ‘thank you’ to Maxim for taking time out of his busy schedule to take part in this interview, and we look forward to hearing more from Maxim during the KantanMT/bmmt joint webinar ‘5 Challenges of Scaling Localization Workflows for the 21st Century’ on Thursday November 20th (4pm GMT, 5pm CET and 8am PST).

Register here for the webinar or to receive a copy of the recording. If you have any questions about the services offered from either bmmt or KantanMT please contact: