5 Trends of Voice UI Design

At its core, the concept of interaction was always about communication. Human-Computer Interaction has never been about graphical user interfaces, which is why Voice User Interfaces (VUIs) are the future of user interface design.

An interface is just a medium people use to interact with a system, whether it’s a GUI, a VUI or something else. So why is VUI so important? Two reasons:

Firstly, conversational interfaces are so fascinating because conversation is a form of communication everyone understands.

It’s a natural means of interaction. People associate voice communication with other people rather than with technology.

Users don’t need to learn to interpret any symbology or new terminology (the language of GUI); they can use English (or any other native language) to operate a system. This doesn’t mean that users don’t have to learn how to use a system, but the learning curve is reduced significantly.

Secondly, user expectations are changing. According to Statista, 39% of millennials use voice search. This audience is ready to be the early adopters of VUI systems.

Top 5 VUI Trends

Voice interaction represents the biggest UX challenge for designers since the birth of the original iPhone. But the great news is that the most fundamental principles of UI design that we use when creating products with a GUI are still applicable to VUI design. Below you can find a few trends that will shape VUI design in the next decades.

1. VUI That Builds Trust

Trust helps to build a bridge between a person and a machine. If trust is absent, users will be unlikely to interact with a particular voice user interface.

A valid outcome matters: the VUI should give the person confidence that they will receive exactly what they requested. It’s possible to achieve this goal by focusing on the following things:

Focusing on understanding the user’s intent (a reason for interacting in the first place). When users interact with a system, they have a particular problem they want to solve, and the goal of the designer is to understand what this problem is.

Providing meaningful error messages.

Crafting contextually driven flows. While it’s impossible to predict every command that users might give the system, designers need to at least design a user flow that is contextually driven. The system should anticipate users’ intent at each point of a conversation and tell users what they can do next. Take finding a restaurant near the user as an example: when users search for a restaurant, the system should match exactly what they are looking for.
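A contextually driven flow can be sketched as a small table of conversation states. The example below is only an illustration under assumed names: the states, intents, and prompts in FLOW are hypothetical and do not come from any real voice platform. Each state knows which intents make sense at that point, and an unexpected request produces an error message that tells the user what they can do next.

```python
# Hypothetical conversation states for a restaurant search.
# All state names, intent names, and prompts are illustrative assumptions.
FLOW = {
    "start": {
        "prompt": "You can search for a restaurant or check your reservations.",
        "next": {"find_restaurant": "choose_cuisine", "check_reservations": "done"},
    },
    "choose_cuisine": {
        "prompt": "Which cuisine would you like? For example, Italian or Thai.",
        "next": {"pick_cuisine": "done"},
    },
    "done": {"prompt": "Anything else?", "next": {}},
}


def respond(state: str, intent: str) -> tuple[str, str]:
    """Return (new_state, spoken_reply) for a recognized intent in a given state."""
    options = FLOW[state]["next"]
    if intent in options:
        new_state = options[intent]
        return new_state, FLOW[new_state]["prompt"]
    # Meaningful error: stay in the same state and remind the user
    # what is possible at this point of the conversation.
    return state, "Sorry, I can't do that right now. " + FLOW[state]["prompt"]
```

Calling respond("start", "find_restaurant") moves the conversation to the cuisine question, while an intent that does not fit the current state keeps the user where they are and repeats the available options.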

User control matters too: user control and freedom, one of Jakob Nielsen’s 10 Usability Heuristics for User Interface Design, is still applicable to VUI design.

The system should consider the natural limitations of the human brain, especially short-term memory. The information provided by the system should not be overwhelming. When people hear a system response, most remember only the last phrase. Thus, it’s better to stay away from long phrases and from offering a dozen different options when the user can remember just a couple of them at one time.
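One way to respect short-term memory is to cap how many options a single response reads aloud. The sketch below is a minimal illustration; the function name and the default limit of three items are assumptions chosen for the example, not a rule from any guideline.

```python
def spoken_options(options, max_items=3):
    """Build a short spoken list and offer the remaining options on request."""
    if not options:
        return "I couldn't find any options."
    head = options[:max_items]
    if len(head) == 1:
        listed = head[0]
    else:
        listed = ", ".join(head[:-1]) + " or " + head[-1]
    remaining = len(options) - len(head)
    # Mention the rest only as a count, so the response stays short.
    more = f" Say 'more' to hear {remaining} more." if remaining else ""
    return f"You can choose {listed}.{more}"
```

With five cuisines available, the response names only the first three and leaves the rest behind a follow-up command, instead of reading out the whole list.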

The system should react to a user request with appropriate feedback. This feedback should give users a full understanding of what the system is doing right now. For example, visual feedback can let the user know that the system is ready and listening, or it can take the form of a POD (Process of Doing). When a user sends a request to the system, the system shows a POD. A POD isn’t a loading animation: it doesn’t just state that users have to wait while the system is doing something, it provides valuable information about what the system is doing. For example, a POD for a command to pull a file from Dropbox might look like someone searching for the right file in storage.

2. Adaptive User Interface

An adaptive user interface (also known as AUI) is a user interface (UI) which adapts to the needs of the user or context. The VUI of the future will adapt to users: the system will analyze all the information it has about users (including information about their current mental state and health condition) and their current context to provide more relevant responses to user requests.

For example, if a user currently has high blood pressure and decides to set a meeting in 2 hours, a digital assistant might suggest avoiding that, or suggest lowering blood pressure with exercise before the meeting starts.
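The blood pressure scenario can be sketched as a response that depends on the user's current context. Everything here is hypothetical: the UserContext class, the meeting_reply function, and the threshold value are assumptions made for illustration, and the number is not medical guidance.

```python
from dataclasses import dataclass

# Illustrative threshold only; an assumption for this sketch, not medical guidance.
HIGH_SYSTOLIC_MMHG = 140


@dataclass
class UserContext:
    # Hypothetical reading supplied by a connected health device.
    systolic_mmhg: int


def meeting_reply(ctx: UserContext, hours_from_now: int) -> str:
    """Adapt the reply to a scheduling request based on the user's current state."""
    if ctx.systolic_mmhg >= HIGH_SYSTOLIC_MMHG:
        return (
            "Your blood pressure is high right now. "
            "Should I pick a later time, or would you like to exercise before the meeting?"
        )
    unit = "hour" if hours_from_now == 1 else "hours"
    return f"Done. Your meeting is set for {hours_from_now} {unit} from now."
```

The same request produces a different answer depending on context: a plain confirmation in the usual case, and a suggestion to reschedule or exercise first when the reading is high.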

3. VUI That Conveys Personality

Visual designers have a lot of options for introducing personality into graphical user interfaces: fonts, color, illustration, motion, just to name a few. But what about VUI? Designers can convey personality using language itself, by playing with words, voice, and tone. A voice is part of the persona and shapes its identity. Once we’ve associated a voice with something, it becomes part of its identity. And we experience emotions when we interact with such an interface, just like when we interact with real people. People want human-understandable voices: not a voice that merely sounds human, but a voice that speaks in a coherently human way!

Bad example: the Siri voice by Susan Bennett, a voice that sounds almost human, but people still know that it’s a machine. You can’t really have a dialogue with Siri. While you can ask Siri something like, “What is the weather like today?”, you can’t ask more sophisticated questions such as, “What should I wear today?” As a result, you don’t have deep feelings for Siri; you know it’s just a robot.

Good example: the Samantha voice from the film Her, a voice that sounds coherently human and that people can fall in love with.

4. From Narrow AI Towards General Intelligence

Human-computer interactions are shifting to conversation, but users expect more. Most AI systems available today are still limited to Narrow AI: such systems use Machine Learning to solve a clearly defined (and, in most cases, way too narrow) problem. Narrow AIs have zero knowledge outside of their training data. It means that when a user wants to solve a slightly different problem, or the problem itself evolves, the system won’t be able to solve it and will respond with something like, “I don’t understand.” As a result, you, as a user, hit a wall.

In comparison to Narrow AI, General Intelligence (GI) is not limited to narrow domains. The concept of learning is at the foundation of GI systems: the fundamental difference between Narrow AI and General AI is that General Intelligence systems learn without being expressly programmed (machines learn by themselves). A GI system uses two types of learning: reinforcement learning (when a system uses all available information to solve a particular user problem) and supervised learning (when a system needs user assistance to solve a problem for the first time). Another difference is that a General AI system can learn to utilize other AI for general and specific purposes. As a result, different Machine Learning models can be trained independently and work cooperatively. An advanced NLP GI system is able to learn from the first attempt by combining and processing information from multiple different data sources.

5. Impact on Society

VUI systems will gain widespread acceptance. Improving the quality of AI-based VUI systems will lead to better user engagement. The relationship between humans and computers will be interactive and collaborative: people and computers will work together. This will impact society. Just imagine that in ten years you’ll walk into your house and control all kinds of machines just by talking.

This future will be one of omnipresent AI. As users, we’ll trust AI even with the most important decisions, such as “What school should I choose for my children?” VUI will also improve the quality of life of older people and people with disabilities.

Conclusion

“The best interface is no interface” is a famous quote from Golden Krishna, the author of the book The Best Interface Is No Interface. He and many other designers believe that people don’t want more time with screens; in fact, they want less. Thus, technology should stop celebrating screen-based solutions. And it’ll happen relatively soon: the interactions of the future won’t be made of buttons.

With the rise of computer processing power, we’ll have more systems that will be able to calculate up to 1000 steps in 1 second. A user and a machine will work together, enabling General Intelligence.