In this week’s Design Podcast episode, I sit down with Tanya Kraljic, UX manager and principal designer at Nuance Communications. Kraljic recently spoke at O’Reilly’s inaugural Design Conference (you can find the complete video compilation of the event here). In this episode, we talk about the challenges of moving from graphical to voice interfaces, the voice tools ecosystem, and where she finds inspiration.

We’re seeing a renewed emphasis on design at Nuance—much like in the technology industry as a whole. We’ve always had great engineers who are building this very complex, very cutting-edge technology. Now, we’re augmenting that with a human-centered approach to product strategy and development, which I think is already accelerating innovation in our own company and, hopefully, will also help create better and more usable solutions as voice becomes available in all these different technologies.

How intimately we talk to our stuff depends on what it’s done for us lately.

In the first post in this series, I mentioned that we’re getting used to talking to technology. We talk to our cell phones, our cars; some of us talk to our TVs, and a lot of us talk to customer support systems. The field has yet to settle into a state of equilibrium, but I thought I would take a stab at defining some categories of conversational interfaces.

There is, of course, quite a range of intelligent assistants, but I want to consider specifically different types of conversational interactions with technology. You might have an intelligent agent that can arrange meetings, for example, figuring out attendees’ availability, and even sending meeting requests. Certainly, that’s a useful and intelligent agent, but working with it doesn’t necessarily require any conversational interaction.

Classifying conversational interfaces

As usual with these kinds of classifications, the boundaries can be fuzzy; a particular piece of technology can have aspects of multiple categories. With that caveat, here’s what I propose.

Voice interfaces: Understand a few set phrases

The most basic level of speech interaction is the simple voice interface, which lets you control devices or software by speaking commands. Generally, these systems have a fixed set of actions. Saying a word or phrase is akin to using a menu system, but instead of clicking on menu items, you speak them. You find these in cars with voice commands and Bluetooth interfaces for making phone calls or playing music. It’s the same kind of system when you call into a phone tree that routes you to a particular department or person. Some of these systems allow for variations in how you say something, but for the most part, they will only understand words or phrases from a predefined list.
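To make the "menu system" analogy concrete, here is a minimal sketch of how a fixed-phrase voice interface might map recognized speech to actions. The phrases, aliases, and action names below are hypothetical, and the speech recognizer itself is out of scope; this only shows the lookup step, where anything not on the predefined list is simply not understood.

```python
# Hypothetical command list for a fixed-phrase voice interface.
# The recognizer's text output is matched against these entries,
# much like selecting an item from a spoken menu.
COMMANDS = {
    "call home": "dial_home",
    "play music": "start_playback",
    "next track": "skip_track",
}

# Some systems allow a few predefined variations per command.
ALIASES = {
    "phone home": "call home",
    "skip song": "next track",
}

def handle_utterance(text: str):
    """Return the action for a recognized phrase, or None if unsupported."""
    phrase = text.strip().lower()
    phrase = ALIASES.get(phrase, phrase)   # normalize known variations
    return COMMANDS.get(phrase)            # off-list phrases yield None

print(handle_utterance("Play music"))      # start_playback
print(handle_utterance("phone home"))      # dial_home
print(handle_utterance("tell me a joke"))  # None
```

The key design point is that the system never interprets open-ended language: flexibility comes only from the variations the designer enumerated up front.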