Introduction to MT October 28, Morning

This tutorial is for people who are beginning their journey with machine translation and want an overview of what it is, how it works, how it can be used, and whether it can fulfil their needs. No previous knowledge of machine translation is assumed, and all levels of skepticism are welcome. The focus will be on providing background knowledge that will help you get more out of the rest of the AMTA conference and make more informed decisions about how to use or invest in machine translation. Past participants have ranged from translation professionals who want to understand changes in their field to corporate executives who are evaluating technology strategies for their organizations. The main topics for discussion are:

— Common questions about MT: What is MT and how does it differ from other translation technologies? How well can machines really translate? What are the latest trends in MT research and development?

— The quality of the translations it produces: Why is the output sometimes so bad? How can the quality be improved? Can translation quality be measured objectively?

— Its applications: What is MT good for? Have we reached the point of Star Trek’s universal translator? Will we? Can MT improve a translator’s efficiency? What are the implications of this technology for translators?

You will leave this tutorial with the tools you need to take part in an informed discussion of MT.

Presenter: Jay Marciano

Syntax-based translation models learn translation patterns from recursive structures over sentences. Compared with phrase-based models (Koehn et al., 2003), they are better at long-distance reordering and generalization, especially for MT between distant languages. Constituent structures have been widely used in statistical machine translation (SMT), but translation models built on them are computationally complex and inefficient once multi-level rules are taken into consideration. By contrast, dependency structures have attracted less attention in SMT, although dependency grammar is regarded as very helpful because it directly encodes semantic information and has the best inter-lingual phrasal cohesion properties (Fox, 2002).
In this tutorial, we will introduce representative work on dependency-based SMT, including:

— Translation models based on segmentation:

o Dependency treelet models

o Dependency graph segmentation models

— Translation models based on synchronous grammars:

o String-to-dependency models

o Dependency-to-string models

o Dependency-graph-to-string models

— Dependency-based evaluation

— Lab session with our open source tools
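To make the treelet idea concrete: a dependency treelet is a connected subgraph of the source dependency tree, paired with its target-side translation via word alignment. The following is a minimal sketch under invented data (the toy sentence, parse, and alignment are illustrative, not from any of the systems above), enumerating treelets of up to two words and reading off rule pairs:

```python
# Toy sketch of dependency treelet extraction (treelet = connected
# subgraph of the source dependency tree). Sentence, parse, and
# alignment are hypothetical; real systems use parsed, word-aligned
# bitext, and order target words by the alignment, not naively.

# Source: "she saw him" (head index per word, -1 = root)
words = ["she", "saw", "him"]
heads = [1, -1, 1]          # "she" and "him" depend on "saw"

# Word alignment to a target sentence (source index -> target words)
alignment = {0: ["elle"], 1: ["a", "vu"], 2: ["le"]}

def treelets(heads, max_size=2):
    """Enumerate connected subgraphs as frozensets of word indices."""
    n = len(heads)
    edges = {(i, heads[i]) for i in range(n) if heads[i] >= 0}
    found = {frozenset([i]) for i in range(n)}
    frontier = set(found)
    for _ in range(max_size - 1):
        new = set()
        for t in frontier:
            for i in t:
                for a, b in edges:      # grow along tree edges only
                    if a == i and b not in t:
                        new.add(t | {b})
                    if b == i and a not in t:
                        new.add(t | {a})
        frontier = new - found
        found |= frontier
    return found

rules = []
for t in sorted(treelets(heads), key=lambda s: (len(s), sorted(s))):
    src = " ".join(words[i] for i in sorted(t))
    tgt = " ".join(w for i in sorted(t) for w in alignment[i])
    rules.append((src, tgt))

for src, tgt in rules:
    print(f"{src!r} -> {tgt!r}")
```

Note that "she him" is never extracted: the two words are not connected in the tree, which is exactly the cohesion property that makes dependency structures attractive for rule extraction.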

Presenters: Qun Liu and Liangyou Li

Moving beyond post-editing of machine translation output, a number of recent research efforts have advanced computer-aided translation methods that allow for more interactivity, richer information such as confidence scores, and a complete feedback loop in which machine translation models instantly adapt to user translations. This tutorial will explain the main techniques for several aspects of computer-aided translation:

For each of these, the state of the art and open challenges are presented.

The tutorial will also look under the hood of the open source CASMACAT toolkit, which is based on MATECAT and is available as a “Home Edition” that can be installed on a desktop machine. The target audience of this tutorial is researchers interested in computer-aided machine translation and practitioners who want to use or deploy advanced CAT technology.
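One of the interactive techniques in this line of work is translation prediction: the system suggests a completion of whatever prefix the translator has typed. A minimal sketch of the idea, using an invented n-best list (production systems instead constrain the decoder's search to the typed prefix):

```python
# Sketch of interactive translation prediction: pick the best-scoring
# MT hypothesis consistent with the prefix the translator has typed,
# and suggest the remainder. The n-best list and scores are invented
# for illustration.

nbest = [
    ("the house is small", -1.2),
    ("the house is little", -1.5),
    ("the home is small", -2.0),
]

def complete(prefix, nbest):
    """Return the suffix of the best hypothesis starting with `prefix`."""
    matches = [(hyp, score) for hyp, score in nbest
               if hyp.startswith(prefix)]
    if not matches:
        return None          # a real system would re-decode instead
    best, _ = max(matches, key=lambda hs: hs[1])
    return best[len(prefix):]

print(complete("the ho", nbest))          # best match wins
print(complete("the house is l", nbest))  # prefix forces 2nd-best
```

When the translator's prefix rules out the top hypothesis, the suggestion falls back to the best hypothesis that is still consistent, which is the essence of the interactive loop.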

Presenter: Philipp Koehn

Advances in Neural Machine Translation November 1, Afternoon

Neural machine translation is a recent arrival in the fields of natural language processing and machine translation. Unlike existing approaches to machine translation, neural machine translation tackles the problem by directly modelling the conditional probability of a translation given a source sentence, without any factorization assumptions. In only two years, neural machine translation has proven competitive with existing translation approaches in many language pairs, which has excited many researchers in the field. In the first part of this tutorial, I will give an introduction to neural machine translation together with the basics of connectionist natural language processing. I will then describe new opportunities in machine translation that the introduction of deep learning has made possible, including sub-word/character-level translation, multilingual translation, and simultaneous machine translation.
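"Directly modelling the conditional probability" means scoring p(y | x) by the chain rule, one target token at a time, each step conditioned on the full source sentence and all previously generated tokens. The sketch below illustrates just that factorization with random weights and a deliberately tiny model (mean-pooled source encoding, no recurrence); it is not any particular NMT architecture:

```python
import numpy as np

# Toy illustration of NMT's chain-rule scoring:
#   log p(y | x) = sum_t log p(y_t | y_<t, x)
# Each step applies a softmax over the target vocabulary, conditioned
# on a source context vector and the previous target token. Weights
# are random -- the point is the factorization, not translation quality.

rng = np.random.default_rng(0)
src_vocab = {"das": 0, "haus": 1}
tgt_vocab = {"<s>": 0, "the": 1, "house": 2, "</s>": 3}
d = 8

E_src = rng.normal(size=(len(src_vocab), d))   # source embeddings
E_tgt = rng.normal(size=(len(tgt_vocab), d))   # target embeddings
W = rng.normal(size=(2 * d, len(tgt_vocab)))   # output projection

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def log_prob(src_words, tgt_words):
    """log p(tgt | src), summed token by token (incl. end-of-sentence)."""
    ctx = E_src[[src_vocab[w] for w in src_words]].mean(axis=0)
    logp, prev = 0.0, "<s>"
    for w in tgt_words + ["</s>"]:
        h = np.concatenate([ctx, E_tgt[tgt_vocab[prev]]])
        p = softmax(h @ W)                 # distribution over next token
        logp += np.log(p[tgt_vocab[w]])
        prev = w
    return logp

print(log_prob(["das", "haus"], ["the", "house"]))
```

Nothing in this scoring scheme assumes word alignments, phrase segmentations, or any other latent factorization of the sentence pair, which is the contrast with the SMT approaches described earlier in the program.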

ModernMT November 1, Afternoon

The ModernMT project aims to contribute to the evolution of machine translation. Our goal is to consolidate the current state-of-the-art technology into a single easy-to-use product, evolving it and keeping it open to integrate the next great opportunities in machine intelligence, such as deep learning.

In particular, we will introduce and demonstrate the ModernMT system architecture, whose distinguishing features make it particularly well suited to integration in a computer-assisted translation framework, namely its ability to (i) adapt in real time to the document being translated and (ii) quickly learn from data provided by users, such as translation memories and post-edited data. The MMT architecture builds on open source software (Moses, Lucene, etc.) and comes as a ready-to-install application that requires no initial training phase and scales with data and users.
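The retrieval step behind such real-time adaptation can be sketched simply: score each translation-memory segment against the document being translated and keep the closest matches to bias the models toward them. The sketch below uses bag-of-words cosine similarity over invented TM entries; ModernMT itself performs this retrieval with Lucene indexes:

```python
import math
from collections import Counter

# Sketch of the context-retrieval idea behind real-time adaptation:
# rank translation-memory segments by similarity to the input document.
# TM entries are invented; the scoring (bag-of-words cosine) is a
# simplification of what a Lucene-backed system would compute.

tm = [
    "click the save button",
    "the engine requires regular maintenance",
    "select save from the file menu",
]

def bow(text):
    return Counter(text.split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)      # Counter returns 0 if absent
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_matches(document, tm, k=2):
    doc_vec = bow(document)
    scored = [(cosine(doc_vec, bow(seg)), seg) for seg in tm]
    return [seg for score, seg in sorted(scored, reverse=True)[:k]]

doc = "save the file before closing the menu"
print(top_matches(doc, tm))
```

The retrieved segments would then be used to adapt the translation models on the fly, so a software-manual document pulls software-manual translations rather than, say, automotive ones.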

Our tutorial will cover the following aspects of the ModernMT project:

— Introduction: main features and development roadmap

— Development: software architecture and components

— Field testing: ongoing testing activities with the industry and results

— Integration: deployment and integration of ModernMT in CAT tools

— Hands-on session: participants will learn how to install, start, and use ModernMT