From the author of

Telephones and cell phones are everywhere. Users call from wherever they are,
whenever they want to. But when calling businesses, callers are frequently
put on hold. To shorten long hold queues, many businesses have installed
interactive voice response (IVR) systems that collect data by asking callers
carefully worded questions. Some systems require callers to answer questions by
pressing the buttons on touchtone phones. These systems have several
disadvantages:

Callers must constantly move the phone between their ear and the front of
their face so they can see the buttons that they must press. Speech
recognition removes the need to keep moving the telephone.

Callers must translate options to numbers. For example, "For
accounting, press 1; for human resources, press 2; for sales, press 3..."
This requires callers to select the appropriate option and then translate that
option to a number before pressing the corresponding button on the telephone
keypad. Speech recognition simplifies this process. Callers simply speak the
answer rather than remembering the options, selecting the best one, and
finally translating the selection to a number.

Because of the limitations of human short-term memory, developers
structure menus to be deep and narrow (many levels with few options each)
rather than shallow and broad. Callers often "get lost" in these deep menu
hierarchies and cannot find their way to the desired option.

Using speech-recognition technology solves these problems but creates some
new ones. New technology always brings problems, and speech recognition is
no exception. This article discusses three problems with using telephones to
enter data by speaking, and what you can do about them.

Problem 1: Callers Don't Know When to Speak and What to Say

Currently, many callers do not have experience using a telephony application
with automatic speech recognition. They don't know that they can speak.
Often, they are "tongue-tied" about what to say. Here are some hints
to help callers say the right thing at the right time:

Inform the caller that they may speak. At the beginning of the
application, inform the caller that the application can understand human speech
and that they should respond to questions by speaking the answers. For
example:

"Welcome to the Ajax banking application. You may answer questions by
speaking directly into your phone."

Encourage the caller to respond to a prompt by speaking. Phrase
the prompt to encourage the caller to speak. Use words in the prompt such as
"say" or "speak" instead of "enter." For example:

"Say your name."

rather than

"Enter your name."

Tell the caller what to say. Prompts should lead the callers to
say words and phrases in the corresponding grammar. If the caller is not
familiar with the appropriate words and phrases, include them as part of the
prompt. For example:

"Which account? Savings or checking?"

If the caller is already familiar with the individual words, shorten the
prompt, for example by leaving out the list of options.
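
One way to keep prompts and grammars in step is to generate the spoken list of options from the same word list that defines the grammar. The following Python sketch is illustrative only: the function name build_prompt and the familiar flag are assumptions made for this example, not part of any IVR platform.

```python
def build_prompt(question, options, familiar=False):
    """Return prompt text for a question.

    question: the short question, e.g. "Which account?"
    options:  the words the grammar accepts (hypothetical word list)
    familiar: True when the caller is assumed to know the options already
    """
    if familiar or not options:
        # Experienced caller: play only the short question.
        return question
    if len(options) == 1:
        choices = options[0]
    elif len(options) == 2:
        choices = f"{options[0]} or {options[1]}"
    else:
        choices = ", ".join(options[:-1]) + f", or {options[-1]}"
    # Capitalize the first option so the list reads as its own sentence.
    return f"{question} {choices[0].upper()}{choices[1:]}?"


if __name__ == "__main__":
    print(build_prompt("Which account?", ["savings", "checking"]))
    print(build_prompt("Which account?", ["savings", "checking"], familiar=True))
```

Because the options are taken from the grammar's word list, the prompt cannot suggest a word that the recognizer would reject.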

Encourage experienced callers to barge in. Novice callers usually
listen to the entire prompt. They need to hear all of the instructions and
options before making their selection. However, experienced callers may resent
listening to complete prompts, especially if they use the application
frequently. In conversations between people, barging in may be rude. However,
computers are never insulted when callers barge in. Inform callers that they may
bypass lengthy prompts by "barging in," that is, by speaking before the prompt
ends. For example:

"You may speak at any time, even if the computer is speaking."

Insert pauses in the prompt wording where expert, average, and novice callers
may speak. A pause signals callers that they may speak. Callers with
different experience levels may barge in during different pauses.

"Color?"

(Pause, so experts can barge in here. They know the question and the
appropriate responses.)

"Say the color you want."

(Pause, so an average caller who already knows the allowable options can
barge in.)

"Green, red, or blue?"

(The novice caller responds after hearing the allowable options.)

Callers quickly learn that barging in speeds up the conversation and lets
them complete their tasks sooner.
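
The three-part color prompt can be modeled as a list of segments, each followed by a pause in which the caller may barge in. This is a minimal Python sketch under stated assumptions: the Segment class, the one-second default pause, and the caller_speaks_after parameter (a stand-in for a real barge-in event from a recognizer) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str             # what the application says
    pause_after_s: float  # silence in which the caller may barge in


def tapered_prompt(short_question, instruction, options_text, pause_s=1.0):
    """Build a prompt that gives experts, average callers, and novices
    their own point at which to start speaking. pause_s is an assumed value."""
    return [
        Segment(short_question, pause_s),  # experts answer here
        Segment(instruction, pause_s),     # average callers answer here
        Segment(options_text, 0.0),        # novices answer after the options
    ]


def play(segments, caller_speaks_after):
    """Simulate playback and return the text heard before the caller spoke.

    caller_speaks_after is the index of the segment whose pause the caller
    uses to barge in (a simulation input, not a real telephony event).
    """
    heard = []
    for index, segment in enumerate(segments):
        heard.append(segment.text)
        if index == caller_speaks_after:
            break  # the caller barged in, so stop playing the prompt
    return heard


if __name__ == "__main__":
    prompt = tapered_prompt("Color?", "Say the color you want.", "Green, red, or blue?")
    print(play(prompt, 0))  # an expert hears only the short question
    print(play(prompt, 2))  # a novice hears the whole prompt
```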

Continue to encourage the caller to speak. Sometimes the caller says
a word that is not covered by the grammar of allowable words. In these cases, a
useful strategy is to reveal more information and instruction each time the
caller is prompted again for the same information. For example:

Level 1. Present a short prompt, asking the caller to respond.

Level 2. Present a short description of what the caller should
say.

Level 3. Present an example of what the caller should do.

Level 4. Offer to present short segments of a verbal tutorial to the
caller, or transfer the caller to a human operator to resolve the caller's
problem.
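
The four levels can be sketched as a list of prompts indexed by the number of failed attempts so far. The Python below is only an illustration: the prompt wordings, the ESCALATION list, and the collect function are assumptions for this example, and the simulated responses stand in for real recognizer results.

```python
# Hypothetical prompt wordings, one per escalation level.
ESCALATION = [
    "Which account?",                                # Level 1: short prompt
    "Please say the name of the account you want.",  # Level 2: what to say
    "For example, say 'savings' or 'checking'.",     # Level 3: an example
    "Say 'help' for a short tutorial, or 'operator' "
    "to speak with a person.",                       # Level 4: tutorial or operator
]


def prompt_for_attempt(failed_attempts, prompts=ESCALATION):
    """Pick the prompt for this attempt; stay at the last level once reached."""
    level = min(failed_attempts, len(prompts) - 1)
    return prompts[level]


def collect(responses, grammar, prompts=ESCALATION):
    """Simulate a dialog turn by turn.

    responses: what the caller says on each attempt (simulated input)
    grammar:   the set of words the recognizer accepts
    Returns (recognized word or None, list of prompts that were played).
    """
    played = []
    for failures, said in enumerate(responses):
        played.append(prompt_for_attempt(failures, prompts))
        if said.lower() in grammar:
            return said.lower(), played
    return None, played


if __name__ == "__main__":
    value, played = collect(["my money", "uh", "savings"], {"savings", "checking"})
    print(value)
    for line in played:
        print(line)
```

Each failed attempt moves the caller one level down the list, so callers who answer correctly the first time never hear the longer instructions.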