Using Voice Interfaces to Make Products More Inclusive

Executive Summary

Whether you’re creating services, physical products, or software, making products accessible to, and usable by, as many people as possible is essential. Voice user interfaces (VUIs) are a terrific tool for accomplishing this goal. An estimated 62 million people in the U.S. have motor or mobility impairments. VUIs are now available on hundreds of millions of devices, and are already being used to assist people with managing communication such as emails and texting, and controlling home devices. It’s not just about physical impairments, however. For people with dementia, voice assistants are a non-judgmental way to help with medication reminders, as well as with answering the same questions over and over without having to ask their caregiver. The benefits of voice technology and inclusive designs, however, make sense for everyone. Many of us experience temporary impairments. For example, someone carrying groceries (or a baby), who is temporarily unable to use their hands, or whose speech is impaired during a heavy cold or flu, can be aided by voice-enabled AI. Even something as simple as forgetting your reading glasses can mean you can’t read your phone; being able to use voice in these cases can really make a difference. Voice technology truly can help all of us.

Whether you’re creating services, physical products, or software, inclusive design is essential. Inclusive design means making products accessible to, and usable by, as many people as possible. Voice user interfaces (VUIs) are a terrific tool for accomplishing this goal.

VUIs have been mainstream for more than 20 years, beginning with the first IVR (Interactive Voice Response) automated phone systems. Originally, this technology was designed to help companies save money, as human agents are more costly than automated systems. Its development was more about the bottom line than altruism.

We are now in what I call the “second era” of voice user interfaces, thanks to the advent of the smart speaker in 2014. VUIs are now used in many more places than just automated phone systems, and include voice assistants such as Google Assistant, Amazon Alexa, Bixby, Cortana, and Siri. Today’s VUIs use a variety of AI-powered techniques, such as improved speech recognition and natural language processing. At first, smart speakers offered only basic functionality, such as setting timers and playing music. In the years since, they’ve expanded to encompass more complex interactions such as turning on household devices, assisting with cooking, and providing entertainment.

Although these types of devices provide convenience to the average user by enabling a more frictionless and natural experience, the future of VUIs includes a much bigger stage: empowering the people who need it the most.

An estimated 62 million people in the U.S. have motor or mobility impairments. VUIs are now available on hundreds of millions of devices, including smart speakers, headphones, and watches, and are already being used to assist people with managing communication such as emails and texting, and controlling home devices. They are even helping people with muscular dystrophy use their voice to adjust their bed throughout the night to avoid bed sores.

For the 285 million people in the world who are visually impaired, VUIs can bring independence and dignity, as well. Using their voice can make it easier for visually impaired people to find their smartphone when it’s lost, to listen to music without having to keep trying various CDs in the stereo, and to find out how much time is left on a timer. As technologists, we often think about the big things (helping people get dressed, get fed, move around, etc.) but often forget about the little things many of us take for granted, such as channel surfing. VUIs can assist with the sorts of tasks that make people feel more engaged in their lives.

It’s not just about physical impairments, however. For people with dementia, voice assistants are a non-judgmental way to help with medication reminders, as well as with answering the same questions over and over without having to ask their caregiver, who may already be emotionally exhausted.

It’s also a wonderful way for older folks who may not own a smartphone or laptop (or don’t have a high level of comfort with them) to still have access to the internet and stay in touch with family. At a focus group in Carlsbad, California, the firm FrontPorch installed smart speakers at an assisted living facility and got great enthusiasm from many of the residents, who were able to send messages to their family members, play games, listen to their favorite music, and even message each other. As one participant put it:

Most fun of all was setting up messaging with two friends who have also started using this magical device. Yes, we could wait ’til we saw each other in the lobby. Yes, we could use the telephone. But there’s something so personal and private and fun about using [this]. I haven’t had this much fun since we were kids and strung a wire between two tin cans and played “telephone”.

VUIs can also be a way to overcome low literacy rates. In The Wall Street Journal article “The End of Typing: The Next Billion Mobile Users Will Rely on Video and Voice,” the author writes about a man who owns a smartphone but, due to his poor literacy, is unable to use many of its features. With voice, he can now access important information like train schedules, as well as do things like play his favorite songs. Voice can also build confidence in using the internet by allowing people to explore in a comfortable way.

The other side of this technology involves helping those who have either lost the ability to speak, or whose speech pattern is non-normative. For these folks, VUIs cannot always be used, as the speech models are trained on more standard speech. There are two approaches to solving these issues. First, for those who are losing the ability to speak due to diseases such as ALS, their voice can be recorded and then turned into a TTS (text-to-speech) voice. When they can no longer speak, they can use the TTS voice to communicate. Second, more than 100 million people in the U.S. and Europe have speech patterns that may not work with today’s VUIs, perhaps because they have a stutter, or because they’ve had a stroke and their speech is less intelligible. The solution is to create more speech models that can handle these differences.

The benefits of voice technology and inclusive design, however, make sense for everyone. Many of us experience temporary impairments. For example, someone carrying groceries (or a baby), who is temporarily unable to use their hands, or whose speech is impaired during a heavy cold or flu, can be aided by voice-enabled AI. Even something as simple as forgetting your reading glasses can mean you can’t read your phone; being able to use voice in these cases can really make a difference.

While voice AI has the potential to help improve products and services for groups that may be under-represented, we also need to think about the speech recognition models themselves. There are people for whom the standard recognition models do not perform as well, and our goal is to ensure our training data covers a diverse population of users so we can improve the quality of our speech recognition for everyone. One of the examples we highlighted at the recent Google I/O conference shows how we’re using AI to improve products for people with a speech impairment. I strongly encourage businesses to consider how inclusive design benefits many of their customers. Voice technology truly can help all of us.

Cathy Pearl is head of conversation design outreach at Google, and the author of the O’Reilly book Designing Voice User Interfaces. She’s been creating Voice User Interfaces for 20 years and has worked on everything from programming NASA helicopter pilot simulators to a conversational app in which Esquire’s style columnist advises what to wear on a first date.