How Will Google’s SyntaxNet Release Impact Machine Translation?

The Internet was abuzz on May 12, 2016 with news that Google now offers its language parsing tool free to the public. The newly open-sourced tool, SyntaxNet, is Natural Language Understanding (NLU) software that takes the first step in understanding a sentence: parsing it into a syntactic tree.
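For readers unfamiliar with what "parsing into trees" means, the following is a minimal sketch in Python. It does not call SyntaxNet and does not reproduce its output format; the sentence, the `Token` class, the relation labels, and the helper functions are all hypothetical and hand-written purely to illustrate the idea that each word points to the word that governs it.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int      # 1-based position of the word in the sentence
    word: str
    head: int       # index of the governing word; 0 marks the root
    relation: str   # illustrative grammatical label, chosen by hand


def children(tokens, head_index):
    """Return the tokens whose governing word is head_index."""
    return [t for t in tokens if t.head == head_index]


def render(tokens, head_index=0, depth=0):
    """Return the tree as indented lines, starting from the root."""
    lines = []
    for t in children(tokens, head_index):
        lines.append("  " * depth + f"{t.word} ({t.relation})")
        lines.extend(render(tokens, t.index, depth + 1))
    return lines


# Hand-annotated example sentence, not produced by any parser.
SENTENCE = [
    Token(1, "Alice", 2, "nsubj"),
    Token(2, "saw", 0, "root"),
    Token(3, "Bob", 2, "dobj"),
]

print("\n".join(render(SENTENCE)))
```

In this toy example the verb "saw" is the root, with "Alice" attached as its subject and "Bob" as its object. A parser's job is to produce the head and relation for each word automatically, which the sketch above simply assumes as given.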

Even mainstream media picked up the story, with The Daily Telegraph marvelling at the possibility of creating a “bolloxometer” from SyntaxNet: “No more lying politicians…No more advertisers hoodwinking us into buying their products. Just switch on the bolloxometer and watch the truth unfold.”

Google explains what SyntaxNet does with diagrams accompanying the news in the company blog, which says the release “includes all the code needed to train new SyntaxNet models on your own data” along with the fancifully named Parsey McParseface, “an English parser that we have trained for you and that you can use to analyze English text.”

SyntaxNet runs atop Google’s machine-learning software library called TensorFlow, the code for which the company open-sourced in November 2015. One article says all this open-sourcing is Google’s way of accelerating natural language research.

So will it accelerate natural language processing (NLP) as it applies to translation and localization? Could it impact machine translation technology and could MT applications leverage SyntaxNet?

What the Experts Say

Slator reached out to industry experts who, although they admit that the SyntaxNet release is good news and could be useful, caution against expecting too much.

“The release of SyntaxNet enables a larger pool of application developers to gain access to a state-of-the-art English parser,” says Daniel Marcu, co-Founder of Language Weaver, former SDL Chief Science Officer and CTO, and founder of the company behind FairTradeTranslation.com.

Marcu calls the SyntaxNet release “great news for natural language processing engineers, who know how to take advantage of such a tool.” He cautions, however, that “although the release has generated tremendous excitement in the media, I doubt it will impact the state of machine translation technology.”

“SyntaxNet is already used in Google Translate for the language directions, where a syntactic analysis can contribute to improved outputs,” Marcu explains, adding that “the public release of SyntaxNet will not lead to better Google Translate outputs.”

Other experts Slator spoke to concur. While parsers can certainly be used in MT systems, one said, developments in syntactic parsing do not necessarily imply translation quality gains. The same source pointed out that SyntaxNet has actually had “a rather disappointing track record in MT research.”

Marcu says other MT engine providers, such as Microsoft Research and SDL, have been using SyntaxNet-like component technologies for over a decade. NLP industry analyst Seth Grimes agrees, pointing out that even Stanford has a widely used parser.

“In translation, one has to convey both syntax and meaning. An understanding of grammatical elements such as voice, tense, case, declension, can help one map syntax correctly―but it won’t help with semantics or idiom,” Grimes says.

Meanwhile, Marcu says that since “the accuracy gains afforded by SyntaxNet do not translate into user-observable gains in MT quality,” its release is unlikely to significantly impact MT quality elsewhere.

“All in all, a great day for application developers,” who fully understand parsing technology’s strengths and limits, “but not a game-changer for the field of machine translation,” Marcu says. Besides, he adds, “there is too much AI-related hype nowadays,” which could, ultimately, “hurt the industry.”

A source at Google with whom Slator communicated admitted that, although SyntaxNet may not immediately improve MT products per se, the company’s goal was simply to share the best parser it has with the public. The source added that the parser’s applications extend well beyond MT.

The same source also said Google uses many other NLP tools internally, of which this parser is just one.