Hal Daumé III is an associate professor with appointments in Computer Science and Language Science at UMD, where he and his wonderful advisees study questions related to how to get machines to become more adept at human language, by developing models and algorithms that allow them to learn from data. (Keywords: natural language processing and machine learning.)
The two major questions that really drive their research these days are:

(1) how can we get computers to learn language
through natural interaction with people/users?

and (2) how can we do this in a way that promotes fairness,
transparency and explainability in the learned models?

He's discussed interactive learning informally in a recent Talking Machines podcast and more technically in recent talks, and has discussed fairness/bias in broad terms in a recent blog post.
Hal is committed to promoting an inclusive
scientific environment; if you are thinking of inviting him for a talk
or to participate in an event, please ensure that the event is
consistent with this goal (see the first question on the FAQ).

We create a new online reduction of multiclass classification to binary classification for which training and prediction time scale logarithmically with the number of classes. We show that several simple techniques give rise to an algorithm that can compete with one-against-all in both space and predictive power while offering exponential improvements in speed when the number of classes is large.
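As a rough illustration of how such a reduction achieves logarithmic scaling, here is a minimal batch sketch of the underlying idea: arrange the K classes at the leaves of a balanced tree and train one binary classifier per internal node, so prediction costs O(log K) binary decisions instead of K scores. The fixed tree structure, sklearn classifiers, and toy data are illustrative assumptions; the paper's actual algorithm is online and learns its structure.

```python
# Sketch: multiclass prediction via a balanced tree of binary classifiers.
# Illustrates the general reduction idea only, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression

class TreeNode:
    def __init__(self, classes):
        self.classes = classes            # class labels under this subtree
        self.clf = None
        self.left = self.right = None

def build(X, y, classes):
    node = TreeNode(classes)
    if len(classes) == 1:
        return node
    mid = len(classes) // 2
    left_set = set(classes[:mid])
    mask = np.isin(y, classes)            # examples routed to this node
    Xn, yn = X[mask], y[mask]
    target = np.array([1 if c in left_set else 0 for c in yn])
    node.clf = LogisticRegression(max_iter=1000).fit(Xn, target)
    node.left = build(X, y, classes[:mid])
    node.right = build(X, y, classes[mid:])
    return node

def predict(node, x):
    # Root-to-leaf walk: O(log K) binary decisions.
    while len(node.classes) > 1:
        go_left = node.clf.predict(x.reshape(1, -1))[0] == 1
        node = node.left if go_left else node.right
    return node.classes[0]

# Toy usage: 8 well-separated Gaussian classes in 5 dimensions.
rng = np.random.default_rng(0)
K, d = 8, 5
means = rng.normal(scale=3.0, size=(K, d))
y = np.repeat(np.arange(K), 50)
X = means[y] + rng.normal(size=(len(y), d))
root = build(X, y, list(range(K)))
print(predict(root, X[0]))
```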

Understanding inter-character relationships is fundamental for understanding character intentions and goals in a narrative. This paper addresses unsupervised modeling of relationships between characters. We model relationships as dynamic phenomena, represented as evolving sequences of latent states empirically learned from data. Unlike most previous work, our approach is completely unsupervised. This enables data-driven inference of inter-character relationship types beyond simple sentiment polarities, by incorporating lexical and semantic representations, and leveraging large quantities of raw text. We present three models based on rich sets of linguistic features that capture various cues about relationships. We compare these models with existing techniques and also demonstrate that relationship categories learned by our model are semantically coherent.
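The core modeling idea, a relationship as an evolving sequence of latent states learned without supervision, follows the general hidden Markov template. Below is a minimal sketch in that spirit, assuming the hmmlearn library; the coarse lexical cue coding, the two-state setting, and the toy "stories" are all illustrative placeholders rather than the paper's features or models.

```python
# Sketch: an inter-character relationship as a latent state sequence,
# learned unsupervised. All data and coding choices here are placeholders.
import numpy as np
from hmmlearn.hmm import CategoricalHMM

# Each narrative segment involving a character pair is reduced to one
# coarse lexical cue: 0 = hostile word, 1 = neutral, 2 = affectionate.
story_1 = [2, 2, 1, 0, 0, 0]   # relationship sours over the story
story_2 = [0, 1, 1, 2, 2, 2]   # relationship warms up
X = np.concatenate([story_1, story_2]).reshape(-1, 1)
lengths = [len(story_1), len(story_2)]

# Fit an unsupervised 2-state HMM; the states are learned relationship
# phases, not predefined sentiment polarities.
model = CategoricalHMM(n_components=2, n_iter=100, random_state=0)
model.fit(X, lengths)

# Decode the latent relationship trajectory for each story.
for seq in (story_1, story_2):
    print(model.predict(np.array(seq).reshape(-1, 1)))
```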

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs. Our algorithm, COAL, makes predictions by regressing on each label's cost and predicting the smallest. On a new example, it uses a set of regressors that perform well on past data to estimate possible costs for each label. It queries only the labels that could be the best, ignoring the sure losers. We prove COAL can be efficiently implemented for any regression family that admits squared loss optimization; it also enjoys strong guarantees with respect to predictive performance and labeling effort. We empirically compare COAL to passive learning, showing significant improvements in labeling effort and test cost.
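A toy sketch of the querying rule may help: maintain an ensemble of cost regressors that fit past data well (here approximated by bootstrap resampling, an illustrative stand-in for the paper's version-space construction), derive a plausible cost interval per label, and query only labels whose optimistic cost could still beat every other label's pessimistic cost.

```python
# Sketch of COAL's "query only the labels that could be the best" idea.
# The bootstrap ensemble and toy data are illustrative, not the paper's
# exact construction or guarantees.
import numpy as np
from sklearn.linear_model import Ridge

def cost_ranges(ensemble, x, n_labels):
    """Per-label [low, high] predicted-cost intervals across the ensemble."""
    preds = np.array([[regs[k].predict(x.reshape(1, -1))[0]
                       for k in range(n_labels)]
                      for regs in ensemble])
    return preds.min(axis=0), preds.max(axis=0)

def labels_to_query(ensemble, x, n_labels):
    low, high = cost_ranges(ensemble, x, n_labels)
    # Query label k iff it could still be the cheapest: its optimistic
    # cost must beat the best pessimistic cost among all labels.
    return [k for k in range(n_labels) if low[k] <= high.min()]

# Toy usage: 3 labels whose costs are noisy linear functions of x.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
true_W = rng.normal(size=(3, 4))
C = X @ true_W.T + 0.1 * rng.normal(size=(200, 3))   # per-label costs

ensemble = []
for _ in range(10):                         # bootstrap "plausible" regressors
    idx = rng.integers(0, len(X), size=len(X))
    ensemble.append([Ridge().fit(X[idx], C[idx, k]) for k in range(3)])

x_new = rng.normal(size=4)
print(labels_to_query(ensemble, x_new, 3))  # sure losers are omitted
```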

What is the story of an image? What is the relationship between pictures, language, and information we can extract using state-of-the-art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual descriptions (captions) to both sound like a person wrote them, and also remain true to the image content. To do this we develop data-driven approaches for image description generation, using retrieval-based techniques to gather either: (a) whole captions associated with a visually similar image, or (b) relevant bits of text (phrases).
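Route (a), transferring a whole caption from a visually similar image, reduces to nearest-neighbor search in an image feature space. A minimal sketch, with made-up features standing in for whatever global image descriptor one might use:

```python
# Sketch: caption transfer by nearest-neighbor retrieval. The 4-d "image
# features" and captions are placeholders for real descriptors and data.
import numpy as np

def retrieve_caption(query_feat, train_feats, train_captions):
    # Cosine similarity between the query image and every training image.
    q = query_feat / np.linalg.norm(query_feat)
    T = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    best = int(np.argmax(T @ q))
    return train_captions[best]

# Toy usage.
train_feats = np.array([[0.9, 0.1, 0.0, 0.0],
                        [0.0, 0.8, 0.2, 0.0],
                        [0.1, 0.0, 0.0, 0.9]])
train_captions = ["a dog runs on the beach",
                  "a bowl of fruit on a table",
                  "a plane takes off at sunset"]
print(retrieve_caption(np.array([0.85, 0.2, 0.0, 0.1]),
                       train_feats, train_captions))
```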

Training discriminative rule selection models is usually expensive because of the very large size of the hierarchical grammar. Previous approaches reduced the training costs either by (i) using models that are local to the source side of the rules or (ii) by heavily pruning out negative samples. Moreover, all previous evaluations were performed on small-scale translation tasks, containing at most 250,000 sentence pairs. We propose two contributions to discriminative rule selection. First, we test previous approaches on two French-English translation tasks in domains for which only limited resources are available and show that they fail to improve translation quality. To improve on such tasks, we propose a rule selection model that is (i) global with rich label-dependent features and (ii) trained with all available negative samples. Our global model yields significant improvements, up to 1 BLEU point, over previously proposed rule selection models. Second, we successfully scale rule selection models to large translation tasks but have so far failed to produce significant improvements in BLEU on these tasks.
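To make "global with rich label-dependent features, trained with all negative samples" concrete, here is a toy sketch of one way such a model can look: a single weight vector scores every candidate rule through a joint feature map, and training normalizes over the full candidate set rather than a pruned one. The hashed feature map and the tiny French-English example are hypothetical, not the paper's feature set.

```python
# Sketch: a global discriminative rule selection model trained with all
# negative samples (softmax over every candidate rule for a source span).
import numpy as np

def phi(context, rule, dim=16):
    # Hypothetical joint, label-dependent feature map: hash
    # (context word, rule) pairs into a shared weight space.
    v = np.zeros(dim)
    for w in context:
        v[hash((w, rule)) % dim] += 1.0
    return v

def train(examples, dim=16, lr=0.1, epochs=20):
    # examples: (context words, candidate rules, index of the gold rule)
    w = np.zeros(dim)
    for _ in range(epochs):
        for context, rules, gold in examples:
            feats = np.array([phi(context, r, dim) for r in rules])
            scores = feats @ w
            p = np.exp(scores - scores.max())
            p /= p.sum()                        # softmax over ALL candidates
            w += lr * (feats[gold] - p @ feats)  # log-likelihood gradient
    return w

# Toy usage: pick between two translations of an ambiguous source word.
examples = [(["bank", "river"], ["rive", "banque"], 0),
            (["bank", "money"], ["rive", "banque"], 1)]
w = train(examples)
ctx, rules = ["money", "bank"], ["rive", "banque"]
scores = [phi(ctx, r) @ w for r in rules]
print(rules[int(np.argmax(scores))])
```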