Structured Prediction for Natural Language Processing

Slides

Goals

This tutorial will discuss the use of structured prediction
methods from machine learning in natural language processing. The
field of NLP has, in the past two decades, come to simultaneously rely on
and challenge the field of machine learning. Statistical methods now
dominate NLP, and have moved the field forward substantially, opening
up new possibilities for the exploitation of data in developing NLP
components and applications. However, formulations of NLP problems
are often simplified for computational or practical convenience, at
the expense of system performance. This tutorial aims to introduce
several structured prediction problems from NLP, current solutions,
and challenges that lie ahead. Applications in NLP are a mainstay at
ICML conferences; many ML researchers view NLP as a primary or
secondary application area of interest. This tutorial will help the
broader ML community understand this important application area, how
progress is measured, and the trade-offs that make it a challenge.

Topics

The tutorial will be broken into three parts. The outline below is ambitious; some topics may be covered only briefly. We intend to give extensive references to important papers, so that participants can follow the leads that interest them most.

Representations and Data

We will discuss NLP tasks that can be seen as structured prediction
problems. These include sequence segmentation and labeling, syntactic
parsing, and translation discovery. We focus on the representation of
these problems, with some discussion of the data that might be
required for each.
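
As an illustration of how a segmentation-and-labeling problem can be represented, the sketch below encodes labeled spans as one tag per token using the common BIO scheme (B = begins a span, I = inside a span, O = outside any span). The function names, the example sentence, and the PER/LOC labels are our own illustrative choices, not part of the tutorial materials, and the decoder assumes well-formed tag sequences.

```python
def spans_to_bio(n_tokens, spans):
    """Encode labeled spans as BIO tags.

    spans is a list of (start, end, label) with end exclusive;
    spans are assumed not to overlap.
    """
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


def bio_to_spans(tags):
    """Recover (start, end, label) spans from a well-formed BIO sequence."""
    spans = []
    start = label = None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, label))
            start = label = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


tokens = ["Ada", "Lovelace", "visited", "New", "York"]
gold_spans = [(0, 2, "PER"), (3, 5, "LOC")]
bio_tags = spans_to_bio(len(tokens), gold_spans)
print(bio_tags)  # ['B-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC']
```

Casting segmentation as per-token tagging is what lets sequence models be applied to it directly; the round trip through bio_to_spans recovers the original spans.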

Decoding

We consider a key abstract inference problem
that turns up frequently in NLP: decoding (also known as maximum a posteriori
inference). We discuss common techniques for decoding.
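
One common decoding technique for sequence models is the Viterbi algorithm. The sketch below applies it to a hidden Markov model over a toy two-tag problem. All probabilities, the tag names, and the example words are invented for illustration, and the code assumes a non-empty input and strictly positive probabilities so that logarithms are defined.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for obs under an HMM."""
    # score[t][s]: best log-probability of any path ending in state s at position t
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: best predecessor of state s at position t
    for t in range(1, len(obs)):
        score.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: score[t - 1][p] + math.log(trans_p[p][s]))
            score[t][s] = (score[t - 1][prev]
                           + math.log(trans_p[prev][s])
                           + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Follow back-pointers from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy model; every number here is made up for the example.
states = ["N", "V"]
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.7, "V": 0.3}, "V": {"N": 0.4, "V": 0.6}}
emit_p = {
    "N": {"salmon": 0.5, "fish": 0.4, "swim": 0.1},
    "V": {"salmon": 0.1, "fish": 0.3, "swim": 0.6},
}
best_tags = viterbi(["salmon", "fish", "swim"], states, start_p, trans_p, emit_p)
print(best_tags)  # ['N', 'N', 'V']
```

The dynamic program takes time linear in the sentence length and quadratic in the number of states, rather than enumerating every possible tag sequence.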

Supervised structured NLP

We consider the case where training data are available for structured
learning. We discuss the relationship of grammars and automata to
structured prediction and the widespread use of dynamic programming, with some
specific examples. We then discuss a variety of approaches to supervised
learning of structured prediction models.
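
As one concrete instance of supervised structured learning, the sketch below trains a structured perceptron for sequence labeling, using dynamic programming to find the best tag sequence under the current weights. The perceptron is our choice of example and is only one of many possible approaches; the feature templates, function names, and two-sentence training set are hypothetical.

```python
from collections import defaultdict


def features(words, tags):
    """Count emission and first-order transition features of a tagged sentence."""
    feats = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("emit", tag, word)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats


def decode(words, tagset, weights):
    """Dynamic-programming search for the highest-scoring tag sequence."""
    # best[tag] = (score, path) of the best partial sequence ending in tag
    best = {"<s>": (0.0, [])}
    for word in words:
        new_best = {}
        for tag in tagset:
            new_best[tag] = max(
                ((score + weights[("trans", prev, tag)] + weights[("emit", tag, word)],
                  path + [tag])
                 for prev, (score, path) in best.items()),
                key=lambda item: item[0],
            )
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


def train_perceptron(data, tagset, epochs=5):
    """On each mistake, add the gold features and subtract the predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = decode(words, tagset, weights)
            if pred != gold:
                for feat, count in features(words, gold).items():
                    weights[feat] += count
                for feat, count in features(words, pred).items():
                    weights[feat] -= count
    return weights


# Hypothetical toy training set.
tagset = ["D", "N", "V"]
train_data = [
    (["the", "dog", "barks"], ["D", "N", "V"]),
    (["the", "cat", "sleeps"], ["D", "N", "V"]),
]
weights = train_perceptron(train_data, tagset)
print(decode(["the", "dog", "barks"], tagset, weights))
```

Note that learning calls the decoder as a subroutine; this pattern recurs across many supervised structured learners.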

Unsupervised structured NLP

We turn to some trends in unsupervised NLP, where we seek to learn
to predict structure that is not visible in the available data.
We consider the EM algorithm, some successful models, and variations
on EM, including latent variables, contrastive estimation, and more
Bayesian approaches.
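
To make the EM algorithm concrete, the sketch below estimates word translation probabilities from sentence pairs whose word alignments are unobserved. It follows the style of IBM Model 1, simplified by omitting the NULL source word; the function name, the three-sentence corpus, and the iteration count are illustrative assumptions rather than anything prescribed by the tutorial.

```python
from collections import defaultdict


def ibm_model1(pairs, iters=10):
    """EM for t[(f, e)] = P(target word f | source word e); alignments are latent."""
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    # Uniform initialization over all co-occurring word pairs.
    t = {(f, e): 1.0 / len(tgt_vocab)
         for src, tgt in pairs for f in tgt for e in src}
    for _ in range(iters):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything
        # E-step: distribute each target word over the source words,
        # in proportion to the current translation probabilities.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # M-step: renormalize expected counts into probabilities.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


# Hypothetical toy corpus of (source, target) sentence pairs.
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = ibm_model1(corpus)
for (f, e), p in sorted(t.items(), key=lambda kv: -kv[1])[:4]:
    print(f"P({f} | {e}) = {p:.3f}")
```

No alignment is ever given as supervision; co-occurrence statistics alone push probability mass toward consistent word pairings over successive iterations.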