Computational Pragmatics (CompPrag2016)

Computational pragmatics can be understood in two different senses. First, it can be seen as a subfield of computational linguistics, in which it has a longer tradition. Example phenomena addressed in this tradition are: computational models of implicature, dialogue act planning, discourse structuring, coreference resolution (Bunt & Black 2000, and others). Second, it can refer to a rapidly growing field at the interface between linguistics, cognitive science and artificial intelligence. An example is the rational speech act model (Frank & Goodman 2012) which uses Bayesian methods for modeling cognitive aspects of the interpretation of sentence fragments and implicatures. Computational pragmatics is of growing interest to linguistic pragmatics, first, due to the availability of theories that are precise enough to form the basis of NLP systems (e.g. game theoretic pragmatics, SDRT, RST), and second, due to the additional opportunities which computational pragmatics provides for advanced experimental testing of pragmatic theories. As such, it enhances theoretical, experimental and corpus-based approaches to pragmatics.

In this workshop, we want to bring together researchers from both branches of computational linguistics, as well as linguists with an interest in formal approaches to pragmatics. Topics of the workshop include, but are not limited to, the following issues:

Program

Wednesday, 24.2.

14:00 - 15:00

As a sequel to Bunt and Black (2000), which presented a characterization of the field of computational pragmatics and a survey of its main issues, this paper discusses some of the most interesting developments in the field in the last 15 years. Current research is dependent on large-scale annotated corpora. The paper includes an overview of such corpora and accompanying software tools. Of the pragmatic phenomena that have received attention in such corpora, the use of dialogue acts in spoken interaction stands out. Dialogue acts, which have become popular for modeling the use of language as the performance of actions in context, are realized by ‘functional segments’ of communicative behavior; these may be discontinuous, may overlap, and may contain parts contributed by different speakers.
Based on the DIT++ taxonomy of dialogue acts, the ISO 24617-2 standard for dialogue act annotation has been defined, including the Dialogue Act Markup Language DiAML, which supports the annotation of functional segments with multiple communicative functions, type of semantic content, speaker and addressee(s), functional and feedback dependences, pragmatic qualifiers, and rhetorical relations. The context-update semantics of DiAML accounts for inference relations among dialogue acts.
Computational pragmatics contributes to dealing with the fundamental challenge of pragmatics, namely to understand how language interacts with context, by providing computational models of interpretation, generation, inferencing and learning. What is still missing, however, is the use of powerful context models. Much of the work that takes context information into account considers only the linguistic context, i.e. the preceding discourse. This is virtually the only kind of context information that is available in corpora, and therefore for applying machine learning techniques. As a result, only a fraction of the relevant context information is taken into consideration. Ideally, dialogue and discourse corpora should include information from richer context models, including e.g. speaker and hearer beliefs, mutual beliefs, communicative goals, multimodal perceptual information, and social relations. Manual addition of this information to corpora hardly seems feasible in view of its complexity; a challenge for computational pragmatics is therefore the development of new computational tools that make this feasible.

15:00 - 16:00

Mixed motives represent a mixture of congruent, i.e., joint motives and incongruent, partially conflicting motives of interlocutors in dialogues. Motives refer to objectives or situations that interlocutors would like to accomplish, in the sense of a motivational state. As mixed-motive dialogues we describe all gradations between collaborative dialogues with exclusively congruent interlocutor motives, e.g., when solving a PC problem together, and non-collaborative dialogues with purely incongruent motives of the dialogue participants, e.g., in a pro/contra debate. Adopting the idea of mixed-motive games by Schelling, we consider these dialogues as situations in which participants face a conflict between their motives to cooperate and to compete with each other, e.g., in a sales conversation, where bargainers have to make concessions to establish a compromise agreement, but at the same time must compete to achieve a good bargain. In everyday life, interlocutors are able to resolve this conflict between cooperation and competition through trade-offs between selfishness and fair play, creating dialogues perceived as fair.
Despite the ubiquity of mixed-motive dialogues in everyday life, little attention has been paid to this topic in dialogue planning, in contrast to the well-studied collaborative and non-collaborative dialogue types. Supporting this rarely considered dialogue type in dialogue systems for real-world environments therefore remains a challenge.
Our objective is the investigation of dialogue systems that support mixed-motive dialogues between users and indirect, absent interlocutors, for instance customers and retailers in sales conversations. Motives adopted from the indirect interlocutors as well as motives anticipated for the users constitute the mixed motives that the dialogue system processes when generating answers to posed questions. Since complete satisfaction of all motives of all interlocutors at every point in a mixed-motive dialogue is not possible, we draw on Herbert Simon's (1956) concept of satisficing to capture the idea of finding the best alternative available, in the sense of a sufficient satisfaction of the motives of all interlocutors. Accordingly, satisficing answers are planned that lead to mixed-motive dialogues perceived as fair by all interlocutors with regard to the absolute and relative satisfaction of their motives. Restricted to question-answering settings, our contribution is an approach to satisficing answer planning in mixed-motive QA dialogues by means of a game-theoretic equilibrium approach. Based on the proposed approach, we implemented a text-based QA system that provides a sales assistant in an online shopping scenario. The validity of the approach was evaluated in an empirical end-user study (n=120) with promising results.
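As a loose illustration of the satisficing idea (not the authors' equilibrium-based planner), the following Python sketch picks, from hypothetical candidate answers scored by how well they satisfy each party's motives, one that is "good enough" for everyone; all answer names, scores, and the aspiration threshold are invented:

```python
# Hypothetical sketch of satisficing answer selection in a mixed-motive
# QA dialogue. Scores say how well each answer satisfies each party's
# motives (0.0 = not at all, 1.0 = fully); all values are invented.

candidate_answers = {
    "recommend premium model":   {"customer": 0.4, "retailer": 0.9},
    "recommend budget model":    {"customer": 0.9, "retailer": 0.3},
    "recommend mid-range model": {"customer": 0.7, "retailer": 0.6},
}

ASPIRATION = 0.5  # satisficing threshold: every party should reach it

def satisficing_answer(answers, threshold):
    """Prefer answers that satisfice for all parties; among those,
    pick the one with the highest minimum satisfaction (fairness)."""
    ok = {a: s for a, s in answers.items()
          if all(v >= threshold for v in s.values())}
    pool = ok or answers  # fall back if nothing satisfices
    return max(pool, key=lambda a: min(pool[a].values()))

print(satisficing_answer(candidate_answers, ASPIRATION))
# → recommend mid-range model
```

This simple maximin rule only caricatures the intuition of "sufficient satisfaction for all"; the actual contribution formulates the choice game-theoretically.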

16:00 - 16:30

Coffee break

16:30 - 17:30

In this talk, we will report on our research on interactive natural language generation (NLG) in the context of situated dialogue systems. In situated communication, the meaning of a sentence is always relative to the environment in which it is uttered, requiring us to model both in parallel. Within this task, we are particularly interested in generating referring expressions (REs), i.e. noun phrases that identify a given object effectively within the scene.
More specifically, our research on situated NLG focuses on generating instructions that help a human user solve a given task in a virtual 3D environment. This domain has the advantage of greatly reduced technical complexity and increased reliability compared to situated communication in real-life environments. Furthermore, data collection and evaluation can be done with experimental subjects recruited over the Internet. We will report on the GIVE Challenge, an NLG evaluation challenge organized by our group which is built on top of this idea. Since 2009 we have developed a set of tools capable of recording, analyzing, and modeling user behavior around this challenge scenario, and have collected hundreds of hours of interactions between NLG systems and experimental subjects, which we can use to train and evaluate our systems. Crowdsourcing, the practice of collecting data from participants all over the world, allows us to test new hypotheses cheaply and efficiently.
Next to this training and evaluation setting, we will also report on our work on the interactive, situated generation of REs. Generating REs, reacting to misunderstandings and establishing common ground are some of the pragmatic phenomena that we must take into account. We developed a data-driven approach that allows us to generate the "best" RE for any given situation. Unlike some earlier research, we take "best" to mean the RE that maximizes the chance that the listener will understand the RE correctly. We then exploit the interactivity of the environment by tracking the listener's behavior in the virtual environment in real time. We have implemented a system that detects automatically whether the listener has understood the RE correctly, and generates corrective feedback if a misunderstanding occurred.
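The "best RE" criterion described above can be caricatured in a few lines of Python; the candidate expressions and comprehension probabilities below are invented, whereas the actual system estimates such probabilities from recorded interactions:

```python
# Toy sketch (our own, not the authors' system) of the "best RE"
# criterion: among candidate referring expressions for a target object,
# choose the one maximizing the estimated probability that the
# listener resolves it to the intended referent.

# Estimated P(listener resolves RE to the target) -- invented values;
# in practice these would be learned from recorded GIVE interactions.
p_correct = {
    "the button":                 0.35,  # ambiguous: many buttons
    "the red button":             0.80,
    "the red button on the left": 0.95,
}

best_re = max(p_correct, key=p_correct.get)
print(best_re)  # → the red button on the left
```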

17:30 - 18:30

In this talk I will, firstly, summarise the state of the art of the Generation of
Referring Expressions, viewed as the construction of computational models
of human reference production; in this first part of the talk, I will ask what
algorithms in this area are able to do well and what it is that they still
struggle to do. In the second part of the talk, I will argue that the most
difficult problems for the Generation of Referring Expressions arise from
situations in which reference is something other than the "simple"
identification of a referent by means of knowledge that the speaker shares
with the hearer; I will give examples of these epistemically problematic
situations and of the generation algorithms that try to address them. The
talk offers a sneak preview of my book "Computational Models of
Referring: A Study in Cognitive Science", which is soon to appear with MIT
Press.

Thursday, 25.2.

09:00 - 10:00

Consider the problem of generating and interpreting non-literal utterances
in the context of (1), where "Rewe" and "Edeka" are supermarkets.

(1) Q: Does Rewe sell turnips?

    A: a. Edeka sells turnips.
       b. ?Rewe sells carrots.
       c. #Rewe sells soap.

Intuitively, (1-a) is licensed by the presumption that the questioner/hearer
wants to buy turnips, and conveying that Edeka sells them would be
helpful in accomplishing this goal. But why wouldn't the hearer have
simply asked, "Where can I get some turnips?" A strategy for answering
that wh-question by breaking it down into yes/no sub-questions (see
Büring 2003) makes sense if two conditions are met. First, the questioner
expects the answerer to supply a single candidate store, rather than an
exhaustive list. Second, the questioner has a preferred outcome: perhaps
for reasons of convenience or price, he/she would rather go to Rewe for
turnips. Asking about Rewe first avoids an outcome where the questioner
is led to a sub-optimal supermarket. In this case, a helpful answerer does
well to supply the alternative in (1-a), but only in the case where Rewe
does not sell turnips. With this in mind, the hearer will draw the
implicature from (1-a) that Rewe does not sell turnips. A similar implicature
can be drawn from (1-b), but one gets the intuition that (1-a) is a better
answer than (1-b). And more clearly, (1-c) is downright infelicitous. This
should fall out as a direct consequence of how (un)likely it is that the
alternatives supplied help accomplish the questioner's goal. Recently,
game theory has proven to be a useful formal tool for modeling reasoning
of this kind, and has begun to be applied to problems of language
generation in a computational setting (Stevens et al., 2015). We propose a
framework for developing methods to solve generation/interpretation
tasks in parallel using iterated game-theoretic reasoning over algorithms.
A discourse situation is modeled as a cooperative Bayesian game between
two interlocutors, taking into account their conversational and
domain-level goals. The strategies are algorithms for generating and
interpreting/reacting to propositions. Starting with a principled default
speaker strategy, algorithms are iteratively refined to better achieve the
players' goals until a fixed point has been reached, à la Franke (2009).
Pragmatic inferences are made based on conditions on algorithm outputs.
We illustrate our approach by applying it to (1).
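A toy rendering of the key intuition, with made-up probabilities: rank the candidate answers in (1) by how likely their content is to further the questioner's presumed goal of buying turnips. This illustrates only the helpfulness intuition behind the felicity contrast, not the iterated game-theoretic framework itself:

```python
# Toy model (our construction, not the authors' implementation) of why
# (1-a) > (1-b) > (1-c): rank alternative answers by the probability
# that their content helps the questioner achieve the presumed goal of
# buying turnips, given that Rewe does not sell them.

# P(answer content helps accomplish "buy turnips") -- invented numbers.
helpfulness = {
    "Edeka sells turnips.": 0.95,  # satisfies the goal elsewhere
    "Rewe sells carrots.":  0.20,  # helps only if carrots substitute
    "Rewe sells soap.":     0.01,  # irrelevant to the goal
}

# A cooperative answerer prefers the alternative with the highest
# expected helpfulness; felicity degrades with that probability.
ranked = sorted(helpfulness, key=helpfulness.get, reverse=True)
for ans in ranked:
    print(f"{helpfulness[ans]:.2f}  {ans}")
```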

10:00 - 11:00

The question whether and when pragmatic enrichments, like scalar
implicatures, can occur in nonmatrix position is crucial for understanding
pragmatic inferences and processing in general. Here, we would like to
address the associated disambiguation problem (cf. Chemla and Singh,
2014): any theory of implicature-like meaning enrichments should ideally
specify, for any sentence and context pair, which candidate readings are
preferred, and to what extent even dispreferred readings
may be selected.
With this goal in mind, we turn to probabilistic computational pragmatics,
which aims to bridge classical formal pragmatic theory and the demands
of empirical data analysis. In particular, we look at a joint-inference model
in which the listener infers, not only the most likely world state that could
have triggered the speaker’s utterance, but also the speaker’s intended
meaning, modeled here as a topic proposition (a special kind of question
under discussion). In keeping with previous probabilistic pragmatics
models that build on Frank and Goodman's (2012) rational speech act
model, we define a chain of naive listener R0, Gricean speaker S1 and
pragmatic interpreter R2, where each component builds on the previous
one. The main innovation of this model is that the speaker’s choice of
utterance depends on a choice of topic proposition which in turn depends
on the actual world state. Speakers are assumed to select topic
propositions probabilistically, so that more informative (surprising)
propositions are more likely to be selected. Utterances should then make
the to-be-communicated topic proposition likely, given conventional
semantic meaning. Listeners then jointly infer world state and topic
proposition based on the utterance.
We show how this joint-inference model makes appealing predictions
about complex sentences with scalar implicature triggers in line with
recent empirical data about preference in disambiguation (Franke et al.,
2015). We also argue that the joint-inference model offers many
possibilities for linking model predictions to experimental conditions.
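For readers unfamiliar with the rational speech act chain the abstract builds on, here is a minimal textbook-style sketch of R0, S1 and R2 for the classic "some"/"all" scalar case. It deliberately omits the talk's key innovation, the jointly inferred topic proposition, and the rationality parameter is an arbitrary choice:

```python
# Minimal rational-speech-act chain R0 -> S1 -> R2 in the style of
# Frank & Goodman (2012), for the classic "some"/"all" scalar case.
# A sketch only: the talk's joint-inference model additionally infers
# a topic proposition. ALPHA is an arbitrary rationality parameter.

worlds = ["none", "some-not-all", "all"]
semantics = {                      # literal truth conditions
    "none": {"none"},
    "some": {"some-not-all", "all"},
    "all":  {"all"},
}
prior = {w: 1.0 / len(worlds) for w in worlds}
ALPHA = 4.0                        # speaker rationality parameter

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def R0(u):
    """Naive listener: condition the prior on literal truth."""
    return normalize({w: prior[w] * (w in semantics[u]) for w in worlds})

def S1(w):
    """Gricean speaker: soft-max over utterances true in w."""
    return normalize({u: R0(u)[w] ** ALPHA
                      for u in semantics if w in semantics[u]})

def R2(u):
    """Pragmatic interpreter: Bayesian inversion of the speaker."""
    return normalize({w: prior[w] * S1(w).get(u, 0.0) for w in worlds})

post = R2("some")
print(post)  # "some" is strengthened towards "some but not all"
```

In this chain the implicature emerges because a speaker in the "all" world would almost certainly have said "all", so R2 downweights that world on hearing "some".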

11:00 - 11:30

Coffee break

11:30 - 12:30

Probabilistic models of human cognition have been widely successful at
capturing the ways that people represent and reason with uncertain
knowledge. The Rational Speech Act framework uses probabilistic
modeling tools to formalize natural language understanding as social
reasoning: literal sentence meaning arises through probabilistic
conditioning, and pragmatic enrichment is the result of listeners reasoning
about cooperative speakers. I will consider how this framework provides a
theory of the role of context in language understanding. In particular I will
show that when uncertainty about the speaker is included in the pragmatic
inference several of the most subtle aspects of language emerge:
vagueness (in scalar adjectives and generics) and presupposition
accommodation.

Friday, 26.2.

11:30 - 12:00

We present an extension of LFG’s Abstract Knowledge Representation
(AKR) component that integrates a model of the Common Ground (CG) and
allows for the calculation of pragmatic inferences. The system uses a rule
set based on Gunlogson's (2002) discourse model. We illustrate our
implementation with respect to a set of German discourse particles. These
particles arguably contribute information that is pertinent for the CG (e.g.,
Zimmermann 2011).
Our pragmatic parser for dialogues uses the existing AKR framework built
on top of LFG’s syntactic architecture (e.g., Bobrow et al. (2007) and
Crouch & King (2006)) within the XLE grammar development platform. The
platform integrates an XFR rewriting system that allows for packed
rewriting of XLE’s syntactic output. It produces semantic representations
that allow for Entailment & Contradiction Detection (ECD; Bobrow et al. 2007).
We extend this component to produce a semantic/pragmatic
representation that is dynamically updatable for pragmatic reasoning. Our
system on the one hand enriches AKRs with pragmatically relevant
information, e.g. speaker, speech time, state of information in discourse.
On the other hand, we modified the ECD system such that it determines
discourse moves (conversational actions) and accordingly modifies the
AKR that represents the discourse.
To illustrate the system we use German discourse particles to demonstrate
how grammatical information interacts with pragmatic information.
Concretely, our pragmatic parser interprets the meaning that the German
discourse particles ja, doch and wohl add to utterances in discourse-like
structures. We show how dynamic pragmatic inferencing takes place
within the AKR system based on the information coming from the particles.
In sum, we present an extension of a meaning component that has been
used for information retrieval and reasoning in a question-answering
system. Our extension provides a model of the CG and allows for dynamic
reasoning about the information in the CG. Furthermore, our system
provides a treatment of German discourse particles that is computationally
elegant and linguistically well motivated.

12:00 - 12:30

The work in progress presented here is a contribution to argumentation
mining in the German legal text domain. The focus of this abstract is the
construction of a corpus of argumentative sequences and argumentation
structures in German legal decisions, which will later provide models of
these layers for conditional random field-based sequence labelling and
tree kernels for structure classification. The most closely related works
are Mochales and Moens (2011) and Stab et al. (2014). However, no corpus
of German legal decisions is available so far, and building a gold-standard
corpus of this genre will be an important addition to all related fields of
research. The data
collection has been compiled from a free online service and consists of 100
private law decisions. For pre-processing, a genre-specific sentence
tokenizer has been trained. The annotation framework chosen for the
study is WebAnno (Yimam et al. 2013).
The study divides into two subtasks: The first step is the labelling of all
argumentative sequences in the justification section of a decision
document on sentence level.
The second annotation task is to enrich each of the premises with
structural information on its local argumentative elements on word token
level.
Besides being part of the argumentation mining study, the corpus will
deliver valuable information for discourse related studies in the German
legal domain and can contribute to comparative studies among different
argumentative text genres.

12:30 - 13:00

Every morning, while reading the newspaper, we are faced with many
different omissions, which we often do not even notice. We read headlines
such as

(1) a. Größte Dürre seit einem halben Jahrhundert
       ('Biggest drought in half a century')

    b. Kampfjet in Bayern abgestürzt (zeit.de)
       ('Fighter jet crashed in Bavaria')

In the first case, we assume that a (definite or indefinite) article is
missing, as there is an obligatory article before singular nouns in German.
In (1-b), the structure additionally lacks a copula verb, as in "(Ein)
Kampfjet ist in Bayern abgestürzt".
These kinds of ellipsis are found not only in headlines. We claim that we
can obtain a profile of text types on the basis of their distribution of
ellipses. To this end, we built a corpus containing more than 10 different
text types (spoken and written language) to compare the patterns. A big
challenge was the reliable detection and annotation of the missing
elements. How can we more or less automatically find the missing article
(<art>)?

Since the STTS does not distinguish between singular and plural nouns, the
query for patterns as in (2-a) yields no satisfactory output. Cases like
(2-b) further complicate the task. In the end, we carried out the
annotation by hand to ensure a reliable annotation.
Secondly, I want to discuss the different forms of article omissions in the
light of Information Theory. A central claim is that such omissions are a
way to "densify" an utterance in order to reduce redundancy. For this
purpose we train language models on different text types and calculate
Information Density (i.e. -log2 P(w|c)), as in (3) for headlines.

In (3), the ID of the noun without a preceding article (11.6582) is higher,
and hence much more "dense", than in the case of article realization
(7.7953). The ID of the article itself is quite low (4.6328). Thus, a puzzle
I want to address in the talk is why the article is sometimes realized
rather than omitted -- and vice versa. Furthermore, what role does the
Uniform Information Density hypothesis (e.g. Jaeger 2010) play here?
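The quantity ID(w) = -log2 P(w|c) can be computed from simple n-gram counts; the following sketch uses an invented toy corpus under a bigram model, so the resulting numbers are illustrative only and not those reported in (3):

```python
import math
from collections import Counter

# Toy illustration (our numbers, not the study's) of information
# density ID(w) = -log2 P(w | context) under a bigram language model,
# comparing a noun with and without a preceding article.

corpus = ("der jet ist abgestürzt . ein jet ist gelandet . "
          "jet abgestürzt . der pilot ist gelandet .").split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def info_density(word, context):
    """-log2 of the bigram probability P(word | context)."""
    p = bigrams[(context, word)] / unigrams[context]
    return -math.log2(p)

# The same noun is less predictable (denser) sentence-initially,
# i.e. after ".", than after the article "der":
print(info_density("jet", "der"))  # article realized
print(info_density("jet", "."))    # article omitted, higher ID
```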
A further aim of this (ongoing) work is to compare the values across
different text types. The claim of a distinct "profile" for each text type
should be reflected in different probability values, i.e. different ID
profiles. The aim is to present first profiles and to discuss further
possibilities in CompPrag -- since there are further types of ellipsis in
store.

13:00 - 13:30

In this paper, we use the phenomenon of 'embarrassed laughter' as a case
study of one approach to corpus pragmatics. We construct a set of
interlinked ontologies by comparing the transcription practice of various
collections of data as summarised by Hepburn and Varney (2013), making
explicit the implied knowledge underlying those transcription practices
about the characteristics of laughter which have been treated as
interactionally relevant. These ontologies allow us to see the essentially
combinatorial nature of certain pragmatic phenomena and therefore also
allow us to develop strategies for searching for relevant data. We then
proceed to illustrate how such search strategies can work with the
example of 'embarrassed laughter'. Such laughter often occurs early in an
interaction (especially first encounters) and following long pauses. We can
therefore establish a set of search criteria (laughter AND (start of
interaction OR long pause)) to try to find possible instances of this
phenomenon in varied collections of data such as those which form part of
the Australian National Corpus. Our approach acknowledges the
complexity of the factors which may be relevant to the identification of
any pragmatic phenomenon without relying on the prior identification of
instances in any specific dataset and has the capability to generate
candidate sets of examples across varied data sets while relying on
features which are annotated in standard practice. We suggest that
looking for clusters of features which characterize pragmatic phenomena
and organizing our knowledge of the features with ontologies constitutes a
very promising approach in the field of corpus pragmatics.
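The proposed search criterion could be operationalized roughly as follows; the field names, turn cutoff and pause threshold are our assumptions rather than the schema of any particular corpus:

```python
# Hypothetical sketch of the proposed search strategy: flag utterances
# annotated with laughter that occur early in the interaction OR after
# a long pause. All field names and thresholds are invented for
# illustration.

EARLY_TURNS = 5        # "start of interaction": first few turns
LONG_PAUSE = 1.0       # seconds of preceding silence counted as long

utterances = [
    {"turn": 1,  "laughter": True,  "pause_before": 0.2},  # early
    {"turn": 12, "laughter": True,  "pause_before": 1.8},  # long pause
    {"turn": 20, "laughter": True,  "pause_before": 0.1},  # neither
    {"turn": 3,  "laughter": False, "pause_before": 2.0},  # no laughter
]

def candidate(u):
    """laughter AND (start of interaction OR long pause)"""
    return u["laughter"] and (u["turn"] <= EARLY_TURNS
                              or u["pause_before"] >= LONG_PAUSE)

hits = [u["turn"] for u in utterances if candidate(u)]
print(hits)  # → [1, 12]
```

Such a filter yields candidate instances only; identifying genuinely "embarrassed" laughter still requires manual inspection of the retrieved examples.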