Conversing with the User

As it converses with the user, the Dialogue Manager uses the
results of the Retrieval Engine's functions.
The system uses a frame containing a simple list of constraints
to support the interactive constraint-satisfaction search
(see Jurafsky et al., 1994, or Dowding et al., 1993,
for a similar formulation). As is usual in this type of system, the user
can respond to a system request to fill a constraint by ignoring
that attribute and instead specifying values for one or more different attributes
(Goddeau et al., 1996; Ward & Issar, 1996).
The
speech acts supported are listed in Table 2.

Table 2: Speech acts supported in the ADAPTIVE PLACE ADVISOR.

System Speech Acts
  ATTEMPT-CONSTRAIN: Asks a question to obtain a value for an attribute.
  SUGGEST-RELAX: Asks a question to remove all values for an attribute.
  RECOMMEND-ITEM: Recommends an item that satisfies the constraints.
  QUIT-START-MOD: States that no matching items remain and asks whether to modify the search, start over, or quit.
  PROVIDE-VALUES: Lists a small set of values for an attribute.
  CLARIFY: Asks a clarifying question.

User Speech Acts
  PROVIDE-CONSTRAIN: Provides a value for an attribute.
  ACCEPT: Accepts a relaxation suggestion or item generated by the system.
  REJECT: Rejects the system's proposed attribute, relaxation attempt, or item.
  PROVIDE-RELAX: Provides an attribute value for removal.
  START-OVER: Indicates a desire to reinitialize the constraints and begin again.
  QUIT: Indicates a desire to stop the conversation.
  QUERY-VALUES: Asks for information about possible values of an attribute.
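
For exposition only, the speech acts in Table 2 can be rendered as simple enumerations. The Python sketch below merely mirrors the table; it is not the system's actual implementation.

    from enum import Enum, auto

    class SystemAct(Enum):
        """System speech acts from Table 2 (illustrative rendering)."""
        ATTEMPT_CONSTRAIN = auto()  # ask for a value for an attribute
        SUGGEST_RELAX = auto()      # ask to remove all values for an attribute
        RECOMMEND_ITEM = auto()     # propose an item satisfying the constraints
        QUIT_START_MOD = auto()     # no items remain: modify, start over, or quit?
        PROVIDE_VALUES = auto()     # list a small set of values for an attribute
        CLARIFY = auto()            # ask a clarifying question

    class UserAct(Enum):
        """User speech acts from Table 2 (illustrative rendering)."""
        PROVIDE_CONSTRAIN = auto()  # supply a value for an attribute
        ACCEPT = auto()             # accept a relaxation suggestion or item
        REJECT = auto()             # reject an attribute, relaxation, or item
        PROVIDE_RELAX = auto()      # name an attribute value to remove
        START_OVER = auto()         # reinitialize the constraints
        QUIT = auto()               # end the conversation
        QUERY_VALUES = auto()       # ask about possible values of an attribute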

There are two main phases of the dialogue: the interactive
constraint-satisfaction portion and the item presentation portion.
The constraint-satisfaction portion is further divided into
over-constrained and under-constrained situations.
The dialogue state
(Table 3) determines the system's utterance
and the range of responses expected at each point. The
system updates the
dialogue state's variables as appropriate throughout the conversation.

Table 3: Dialogue State.

  Constrained: Attributes whose values have been specified.
  Rejected: Attributes whose value the user has declined to provide.
  Fixed: Constrained attributes that the user has indicated should not be relaxed.
  Constrain: The next attribute to constrain, if any.
  Relax: The next attribute to relax, if any.
  Query: Probability model of desired item constraints.
  Number-of-Items: Number of database items matching the query and exceeding the similarity threshold.
  Ranked-Items: The matching items, ranked in similarity order.
  Rejected-Items: Items that the user has rejected.
  User-Move: The user's most recently uttered speech act.
  System-Act: The system's most recently uttered speech act.
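
Again for exposition, the dialogue state of Table 3 can be viewed as a record with one field per variable. The types in this sketch are our own illustrative assumptions; the query field, for instance, merely stands in for the probability model described in Section 3.2.

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional, Set

    @dataclass
    class DialogueState:
        """Illustrative rendering of the dialogue state in Table 3."""
        constrained: Set[str] = field(default_factory=set)     # attributes with specified values
        rejected: Set[str] = field(default_factory=set)        # attributes the user declined to provide
        fixed: Set[str] = field(default_factory=set)           # constrained attributes not to be relaxed
        constrain: Optional[str] = None                        # next attribute to constrain, if any
        relax: Optional[str] = None                            # next attribute to relax, if any
        query: Dict[str, Dict[str, float]] = field(default_factory=dict)  # stand-in for the probability model
        number_of_items: int = 0                               # matches above the similarity threshold
        ranked_items: List[str] = field(default_factory=list)  # matching items in similarity order
        rejected_items: Set[str] = field(default_factory=set)  # items the user has rejected
        user_move: Optional[str] = None                        # most recent user speech act
        system_act: Optional[str] = None                       # most recent system speech act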

In more detail, the system's speech act (or move) during interactive
constraint satisfaction is determined by the Number-of-Items dialogue
state variable. The system's speech act, in turn, determines which
speech recognition grammar to employ to interpret the user's next
utterance. The most common situation is when many items (more than
some small threshold, here three) match the current constraints. In
this situation, the system makes an ATTEMPT-CONSTRAIN move, in which
it asks the user to fill in the value for an attribute. If the user
responds appropriately, this move reduces the number of items
considered satisfactory to the user. The attribute to Constrain is
the one ranked highest by the Retrieval Engine that has not already
been Constrained or Rejected. In our first sample conversation,
repeated in Table 4, utterances 2 and 6 illustrate ATTEMPT-CONSTRAIN
moves by the system.

Table 4: Sample Conversation.

One user response to an ATTEMPT-CONSTRAIN is a PROVIDE-CONSTRAIN, in
which he provides a value for the specified attribute or for
additional attributes, as in utterances 5 and 7. A second possible
response is a REJECT, in which the user indicates disinterest in or
dislike of an attribute, as in the first part of utterance 7. As some
of these examples illustrate, the user can combine more than one move
in a single utterance.

A second situation, an over-constrained query, occurs when no items
both satisfy the agreed-upon constraints and are similar enough to
the user's preferences, so the Retrieval Engine returns an empty set
(Number-of-Items = 0). In this case, the system performs a
SUGGEST-RELAX move that informs the user of the situation and asks if
he would like to relax a given constraint. The attribute to Relax is
the Retrieval Engine's highest-ranked attribute that has not already
been Fixed. This is illustrated in utterance 8 of the conversation in
Table 4. As in utterance 9 of that conversation, the user can respond
by rejecting the system's suggestion (REJECT) or by accepting it
(ACCEPT). In the former case, the attribute is Fixed so that the
system will not try to relax it again. In combination with either of
these speech acts, the user can specify other attributes to relax in
addition to, or instead of, the system-suggested attribute
(PROVIDE-RELAX).

When only a few items satisfy the constraints, the system ends the
interactive search and begins to suggest items to the user in order
of similarity (RECOMMEND-ITEM), as in utterances 10 and 12 of
Table 4. The user can either accept or reject an item. If the user
accepts an item (ACCEPT), the system ends the conversation, having
reached the goal state. If the user rejects an item (REJECT), the
system presents an alternative, if any remain. Note that there are
three "meanings" for the user's REJECT speech act, but only two for
ACCEPT, since the user accepts an ATTEMPT-CONSTRAIN only implicitly,
by providing an explicit value for the attribute being constrained.
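
This dependence of a REJECT or ACCEPT on the preceding system act can be summarized in a small sketch; the function and return labels below are purely illustrative and are not part of the system's code.

    def interpret_response(user_act: str, last_system_act: str) -> str:
        """Disambiguate ACCEPT/REJECT using the most recent system act."""
        if user_act == "REJECT":
            if last_system_act == "ATTEMPT-CONSTRAIN":
                return "reject-attribute"    # user declines to constrain this attribute
            if last_system_act == "SUGGEST-RELAX":
                return "reject-relaxation"   # user refuses to relax; attribute becomes Fixed
            if last_system_act == "RECOMMEND-ITEM":
                return "reject-item"         # user dislikes the proposed item
        if user_act == "ACCEPT":
            if last_system_act == "SUGGEST-RELAX":
                return "accept-relaxation"
            if last_system_act == "RECOMMEND-ITEM":
                return "accept-item"         # goal state: the conversation ends
            # An ATTEMPT-CONSTRAIN is "accepted" only implicitly,
            # via a PROVIDE-CONSTRAIN with an explicit value.
        return "unrecognized"

    print(interpret_response("REJECT", "SUGGEST-RELAX"))  # reject-relaxation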

There are three special situations not covered by the above. The
first arises when the query is over-constrained but the user has
Fixed all attributes that could be relaxed. The second arises when
the user has rejected all items that match the constraints. In both
cases, the system informs the user, asks whether he would like to
quit, start over, or modify the search (QUIT-START-MOD), and reacts
accordingly. The third special situation arises when Number-of-Items
exceeds the presentation threshold but all attributes have been
Constrained or Rejected. In that case, the PLACE ADVISOR begins to
present items to the user.
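
The move-selection policy just described can be summarized as follows. The threshold of three and the set-based bookkeeping follow the text, while the function signature, the ranked_attributes input, and the assumption that only Constrained attributes are candidates for relaxation are illustrative rather than the system's actual interface. The state argument is assumed to resemble the DialogueState sketch above.

    PRESENTATION_THRESHOLD = 3  # "more than some small threshold, here three"

    def select_system_move(state, ranked_attributes):
        """Choose the next system speech act from the dialogue state."""
        # Over-constrained: nothing matches the query and similarity threshold.
        if state.number_of_items == 0:
            relaxable = [a for a in ranked_attributes
                         if a in state.constrained and a not in state.fixed]
            if relaxable:
                state.relax = relaxable[0]       # highest-ranked, not yet Fixed
                return "SUGGEST-RELAX"
            return "QUIT-START-MOD"              # everything relaxable is Fixed

        # Items the user has not yet rejected, in similarity order.
        candidates = [i for i in state.ranked_items
                      if i not in state.rejected_items]

        # Few enough items: end the interactive search and present them.
        if state.number_of_items <= PRESENTATION_THRESHOLD:
            return "RECOMMEND-ITEM" if candidates else "QUIT-START-MOD"

        # Many items: try to constrain another attribute.
        unconstrained = [a for a in ranked_attributes
                         if a not in state.constrained and a not in state.rejected]
        if unconstrained:
            state.constrain = unconstrained[0]   # highest-ranked remaining attribute
            return "ATTEMPT-CONSTRAIN"

        # All attributes Constrained or Rejected: present items anyway.
        return "RECOMMEND-ITEM" if candidates else "QUIT-START-MOD"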

To support spoken natural language input and output, we use a speech
recognition package from Nuance Communications, Inc. This package
lets us write a different recognition grammar for each of the
situations described above and use human-recorded prompts (rather
than text-to-speech). The string of words recognized by the system is
parsed using recognition grammars that we wrote; these were used for
all users without adaptation. Future work could include personalized
recognition grammars as well as personalized information preferences.
The grammars use semantic tags to fill in each slot: besides slots
for each attribute, we define slots for rejection or acceptance of
the system's suggestions. In more complex domains, more sophisticated
parsing methods may be required, but this simple scheme gives the
user a reasonably diverse set of utterance options. The Nuance
modules also generate a response to user requests for help
(QUERY-VALUES) with a PROVIDE-VALUES speech act, and enter
clarification dialogues when the confidence in a recognized utterance
falls below a given threshold (CLARIFY). These are currently simple
interactions in which the system provides examples of answers to the
most recently uttered prompt, or asks the user to repeat himself.
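
As a hypothetical illustration of this slot-filling scheme, a recognition result might be mapped onto user speech acts roughly as follows; the slot names and dictionary structure are our own invention, not the Nuance package's API.

    def slots_to_user_moves(slots: dict) -> list:
        """Map a slot-filled recognition result to user speech acts (sketch)."""
        moves = []
        if slots.get("reject"):
            moves.append(("REJECT", None))
        if slots.get("accept"):
            moves.append(("ACCEPT", None))
        # One slot per attribute, e.g. {"cuisine": "Thai", "city": "Palo Alto"}.
        for attribute, value in slots.get("attributes", {}).items():
            moves.append(("PROVIDE-CONSTRAIN", (attribute, value)))
        return moves

    # Example: an utterance that both rejects the system's proposal and names a
    # cuisine yields a REJECT combined with a PROVIDE-CONSTRAIN.
    print(slots_to_user_moves({"reject": True, "attributes": {"cuisine": "Thai"}}))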

Finally, for the item presentation portion of the dialogue only, the system
displays the restaurant information (name, address, and phone number) on
the screen, and outputs a spoken prompt such as "How about this one?"
We chose this presentation modality because of our reluctance to use
text-to-speech generation and the large number of prompts we would
otherwise have had to record to produce spoken output for each restaurant.
However, note that the user still responds with a spoken reply, and we
do not feel that this presentation mode substantially influenced the
user-modeling behavior of the PLACE ADVISOR.

Each system-user interaction affects subsequent rounds of database
retrieval and similarity calculation via updates to the expanded query.
Table 5 shows the effects of relevant speech acts
on the query, which is in turn used in the similarity calculation as
described in Section 3.2. In the table, we have
shortened the names of some of the system moves for the sake of brevity.