Putting dialog systems through the customer experience test

For two out of three executives, offering a compelling customer experience (CX) is key to their enterprise strategy. That’s the result of a recent poll of 300 managers, and the number keeps growing. For good reason: only a focus on CX can convert satisfied customers into loyal ones and turn them into enthusiastic “brand ambassadors.”

“The concept of Total Customer Experience was coined back in 2002 by Berry/Carbone/Haeckel and describes a new way of looking at how companies provide services: as a customer’s experiential journey along the entire path to purchase and beyond (…) In order to do that one needs to look — from the perspective of the customer — at all their touch points with the enterprise. All ‘hints’ from the customer should be used to improve their experience of the corporate performance.” [1]

Deutsche Telekom has put creating an optimal customer experience front and center, making customer research a crucial component of the eLIZA scrum process. Since there are no long-term results to draw from in this field of AI, it’s particularly important to register the customers’ wishes, needs and expectations and incorporate their feedback.

Take the development of speech dialog or IVR (Interactive Voice Response) systems. We recently surveyed both customers and call center employees, the users whose daily lives these intelligent systems are supposed to improve.

Here’s how Susanne Lebkücher from the product innovation department describes the task: “Many calls are repetitive — for example, one out of three calls deals with billing questions. The average call lasts 290 seconds, so it takes almost five minutes until the customer’s problem is solved. There’s a lot of room for improvement. Ideally, the computer system should support a call center agent and make better use of the hold time to answer simple issues. That gives service team members the opportunity to spend more time on complex questions.”

What’s in it for the caller? Callers receive an instant answer to their question, without a wait, 24/7. Even better, the dialog system is always in a good mood and polite. “If the automated system can’t help you, it will escalate the call to an agent. He or she will get the relevant info the caller provided to the AI to find a quick solution to the problem,” says Lebkücher.
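The handoff Lebkücher describes, where the automated system answers what it can and passes the collected context to a human agent otherwise, could be sketched roughly as follows. All intents, names, and data fields below are illustrative assumptions, not the actual eLIZA or Tinka implementation:

```python
# Illustrative sketch of a bot-to-agent escalation with context handoff.
# All intents and class names are hypothetical examples, not the actual
# eLIZA/Tinka implementation.

from dataclasses import dataclass, field

# Topics the automated system is assumed to handle on its own.
SELF_SERVICE_INTENTS = {"billing_question", "data_balance", "tariff_info"}

@dataclass
class CallContext:
    """Everything the bot has learned so far about the caller."""
    customer_id: str
    intent: str
    utterances: list = field(default_factory=list)

def handle_call(ctx: CallContext) -> str:
    if ctx.intent in SELF_SERVICE_INTENTS:
        return f"bot answers '{ctx.intent}' for customer {ctx.customer_id}"
    # Escalate: the agent receives the collected context, so the
    # caller does not have to repeat themselves.
    return (f"escalated to agent with context: intent={ctx.intent}, "
            f"history={len(ctx.utterances)} utterances")
```

The key design point is that escalation never discards what the caller already said: the agent starts from the bot’s context instead of from zero.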

Putting it to the test

For a few years now, T-Mobile Austria has been using a digital assistant that customers have become familiar with. They can communicate with Tinka via www.t-mobile.at. But for callers to the T-Mobile hotline to interact with Tinka as well, she needs to learn how to talk.

The seven participants recruited through Mindtake for this real-world test are all T-Mobile Austria customers who have used the hotline during the past 12 months. They all have a mobile contract, are between 18 and 60 years of age, and are representative of the demographic distribution of their age groups. Most of them have used IVR systems before.

Since testing voice interfaces is a relatively new field, the first trial run happened back in May. The goal was to validate the technical setup and testing methods. In-house testing put three different ways of opening a customer conversation through their paces, as well as three versions of how Tinka asks why someone is calling.

Lebkücher’s summary: “The results gave us a clear picture of how customers’ wishes often contradict each other. On the one hand, they want personalized help, but they don’t want to wait too long for it. Some customers may feel embarrassed to come across as ‘ignorant’ when talking to an agent.” Thanks to the preliminary tests, Lebkücher and her team could also optimize the flow and adjust the call funnel to give better answers.

Here’s how the expert describes the process: “We test a recorded dialog and simulate a phone call. It’s about a real question-and-answer setting.” The simulation, in other words, doesn’t ask: “If you have questions about your bill, press one. If you want to order a new device, press two.” Instead, the system answers questions and solves problems. “Of course it would be too labor-intensive to use a completely programmed AI; that’s why we use a Wizard of Oz method. Testers assume they are communicating with an autonomous system while in reality an employee reads the text to them. Testers run through the entire dialog flow once before an interviewer asks them about their impressions and documents their answers and statements. If desired, we analyze the flow in detail.”
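A Wizard-of-Oz setup like the one described can be approximated in a minimal test harness. In a real study a human operator would supply the replies live; in this self-contained sketch the wizard is replaced by a scripted callable, and all prompts and replies are invented for illustration:

```python
# Minimal Wizard-of-Oz harness: the tester's utterances are logged and the
# "system" reply actually comes from a wizard. Here the wizard is a scripted
# function so the sketch is runnable; in a real study a researcher types the
# replies live. All dialog text below is invented for illustration.

def run_woz_session(tester_turns, wizard):
    """Run one simulated dialog and return the transcript for later analysis."""
    transcript = []
    for utterance in tester_turns:
        reply = wizard(utterance)  # human-in-the-loop in a real study
        transcript.append(("tester", utterance))
        transcript.append(("system", reply))
    return transcript

def scripted_wizard(utterance):
    # Stand-in for the human operator reading prepared text.
    if "bill" in utterance.lower():
        return "Your last invoice was issued on the 3rd. What would you like to know?"
    return "Could you tell me a bit more about your request?"

transcript = run_woz_session(
    ["Hi, I have a question about my bill."], scripted_wizard
)
```

The transcript collected this way is what the interviewer would then walk through with the tester, turn by turn, when analyzing the flow in detail.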

Armed with this knowledge, the scrum team can keep tweaking Tinka: for instance, that she ought to be polite, say “please” and “thank you,” and understand all Austrian dialects without speaking one herself. Those are the kinds of insights that help improve Tinka.

Calling on the call center agents

A focus group with hotline team members is another opportunity to make sure Tinka provides the best customer experience possible. They are the experts when it comes to talking with customers since they’ve been doing it for years day in and day out. A focus group has the advantage that team members can talk with each other, nudge the discussion along and cover as many topics as possible. “It’s fun and motivating to chat with your coworkers,” adds Lebkücher.

After an introductory round, the moderator explains the topic to the participants. A speech-enabled computer system will be used to reduce the call queue and help customers with the most common issues so agents can focus on the more difficult cases. Agents will then discuss questions such as: What topics would they like to hand over to an automated system? Which questions keep popping up and can be delegated?

Those discussions quickly drove home a few key points. What matters most to call center teams is giving customers the best advice. Billing questions are among their favorite topics because they can demonstrate their knowledge; such questions require a lot of consultation, according to the agents, suggesting it would be more helpful to delegate other topics. Another insight: the hold queue could be used more effectively to lighten the agents’ workload, say by better narrowing down the reason for the call and identifying the customer before an agent picks up.

The focus group also helped distill different customer types such as the Choleric, the Shy One and the Relaxed One. “It’s remarkable to me that agents sometimes use dialect to respond in a more personal way to a caller. That’s a clear case where a machine could not replace a human,” says Lebkücher.

“As with the other survey, the goal was to put the customer at the center and ask questions that highlight the customer needs. At the same time, we need the insights and perspective of the agents. Taken together, we can look at the same problem from two angles, much like the two sides of a coin.” It also makes a big difference what mood a caller is in. Do they want to vent, or are they matter-of-fact and serious about solving the problem at hand? How does the agent respond to them, and what strategies do they have to calm a caller down?

Honest machines

“We need to know how a call coming into a call center is handled if we want to design an AI-enabled dialog system that really works and makes sense,” Lebkücher explains. “That means we have to know what the ideal, simple as well as the most complex call looks like. Our AI will go to work somewhere between those two extremes. That’s why we ask our experts which billing questions are particularly labor intensive and which ones only require a quick fix.”

It’s hard to teach a machine how to intuitively adapt to almost any situation — a skill humans excel at. “We shouldn’t even suggest it. We can’t pretend the dialog system is a human,” Lebkücher is convinced. “No matter how well a machine copies human behavior, you’ll always feel that it’s non-human. That’s where we have to, and want to, stay authentic and honest.”