Articles about design, invention, the future of the web, Firefox, and startups.

Down With Audio Interfaces

I often get asked about the future of interfaces: “Wouldn’t it be great”, people say, “if we could just talk to our computers like in Star Trek? Aren’t voice recognition and talking computers the interface of the future?” A lot of people seem to think that all interface problems can be solved via voice. But I have a one-word answer: Voicemail.

Everyone hates voicemail and voicemail systems. And with good reason. These days voicemail is getting pretty “smart”: you can now say “Yes” and “No” instead of pressing 1 or 2 in response to questions (unless you have an accent, in which case don’t bother). You can even say a person’s name to be connected to that person’s extension. But technical problems aside, these are patches on a fatally flawed medium.

Audio interfaces will always lack something that visual interfaces possess effortlessly: the ability to jump around at will. If you don’t care about the information in a paragraph, you skip to the next one. You don’t have to inform the piece of paper you are reading that you want to navigate, you just do it. You look here, then there. You scan. You find what’s interesting. Visual interfaces excel because they let you throw away the unneeded information-chaff and focus on what you want to know. There is no analog in the audible world. When you’re listening, it takes substantially longer to know where you are and what you’re listening to. When using an audio interface, you are forced to be linear: a word follows the word that came before it and precedes the word after it. There is no way to get to the last word without hearing the two words before it. There’s no getting around it. It sucks.

An example: imagine you are using a conventional voicemail implementation on a computer with a standard display. There would be a button for skipping, a button for replaying, a button for saving, a button for deleting, a button for hearing the time and date the message was left. In short, all of the normal voicemail actions. If the designer were ambitious, they could even have included a widget to let you scrub through the message. When you want to delete a message, you’d roam the interface with your eyes, reading each button label in a fraction of a second, find the delete button, and click it. Simple and quick. The important point is that you are effortlessly flitting your eyes past all of the information you don’t want, to find the information you do want. In fact, a very common phone system, the cell phone, has a display too, and if its voice mail interface had some hint of humanity to it, it would at least show a visual menu telling you what button performed what action.

But instead, you have to wait for the voicemail system to tell you what number to press to delete the message. Yet before it tells you how to delete the message, it will first tell you how to replay the message, skip the message, move to the previous message, save the message for later, and perhaps force you to listen to an advertisement from your service provider. And because it’s audio and linear, you can’t skip any of it. The more complex voicemail gets, the longer you’ll have to wait. With an audio interface, you have no way of moving past the information you don’t want.

The Candorville cartoon shown at the top of this post is funny because it illustrates a lose-lose situation in voicemail: if you have instructions read to you before every message, listening to your voicemail takes forever; if you don’t have the instructions read at all, you’ll never know what to do. It’s a Catch-22. And that’s the crux of the problem: there is no good way of providing instructions in purely audio interfaces. Sometimes a balance can be struck, but it will always be the best of a bad set of solutions. It will never even be good.

I know that I’ve made egregious mistakes because I didn’t want to wait for the instructions and I thought I remembered what button to push. But, do I press 7 to save a message and 9 to delete it? Or is it 9 to save and 7 to delete? Naturally, I remembered incorrectly. The moral of the story is that voice-based interfaces can cost you a date.

Pgan

That’s not fair. “Talk to our computers like in Star Trek” means to me that you would say “Delete this message” or “Computer, when is my appointment with Sara?”. The computer would understand spoken commands in context and summarize relevant information. Then the problems you point out do not arise. Talking and listening, like looking, are natural human activities.

We are at least 100 years away from such technology, but we are making progress. Please do not discredit audio interfaces in general.

Oh, I should add that a good audio interface is more fitting than a visual one in some situations, for example when asking for directions while driving, or asking your car to auto-drive you to the next whiskey bar.

I agree with Pgan on this one, and let me explain why. Audio interfaces as they are now are only useful in certain situations, but mainly in ubiquitous computing (make a computer do something for you anywhere you are, such as in your house, with “turn off all the lights and appliances”, or to your PDA with “show me the nearest pub”, and so on).

Audio interfaces excel not as systems for general computing but at doing very specific things very quickly–no menus or virtual paths, just straight-to-the-point tasks. In fact, it’s my personal belief that the apex of interfaces will be an AI that understands every word you say and does it immediately–everything else (sifting through websites, browsing virtual galleries, etc) will be done just like they are today, only because they’re virtual (using augmented reality), we can move through the information/content much faster and in a natural manner.

Oh man, someone was telling me about some research or developments going on in audio interface design right now. This research involved the acronym TTH (Time To Human) to talk about how easy or frustrating audio systems were. Apparently T-Mobile’s customer service has the lowest (quickest) TTH out of any major cell phone company. I don’t think I’ve ever actually been frustrated with T-Mobile’s customer service or anything. They’ve been surprisingly easy to deal with.

Audio interfaces seem to be inherently linear… until you realize that human-to-human conversation can be purely audio, yet be nonlinear and interactive. Perhaps this is what AI will be able to mimic some day? In a conversation, you can interrupt the person and say “but what about xyz? that’s what I want to know” or whatever, analogous to how you would skim a visual text. If audio interfaces became good enough, they would allow for this kind of interruption and be able to process your requests with greater understanding than dudeguybot or smarterchild.

Although I think you’re really on to something with this bit–
“and if its voice mail interface had some hint of humanity to it, it would at least show a visual menu telling you what button performed what action.”

This works for voice mail, but not necessarily for every audio interface. Unless this voice mail visual menu would be sent to your phone as a whole other type of information (if not, the menu would have to come with the phone, or be installed on it or whatever). In that case, you’d have an entirely different type of cell phone technology on your hands (I think?), and every call you make could send you things like menus and all other kinds of multimedia, and this will probably happen when blackberry-like devices replace cell phones.

And I just read your last sentence, ouch… was that the catalyst of making this post?

Thanks for all of the insightful comments. I think I wasn’t as clear as I should have been.

I do not mean to say that audio interfaces form a bad method of input. At this they excel, especially in specific domains where visual input is cumbersome or dangerous, such as in the car or on devices too small for keyboards. But they will always lack a benefit that visual interfaces give for output: audio is fundamentally linear.

I’d argue that the ability to interrupt an audio interface to ask for new information does not mean that audio output is not linear. In visual output, the same organ that perceives the information can change what information it is perceiving. That is, the eye plays an active role in information processing. The same cannot be said for audio output. In order to change what is being heard, one must first think about what one wants to be hearing, form it into a sentence, and say it (with possible modifications and corrections). That is, the ear plays a passive role in information processing.

Now, this linearity can be compensated for: if voice recognition were perfect and computers could flawlessly pass the Turing test (both challenges are scores of years away, if not more) then indeed one could interrupt the computer to ask for new pieces of information, as in human-to-human communication. But, for now and in the near future, the natural language processing required for such interaction is truly science fiction. Finally, remember that we can speak faster than we type, but we can read faster than we speak.

And, as a side note, humans augment their communication with gesticulations, intonation, and eyebrow squiggles. Anyone who has been stymied trying to speak a foreign language over the phone can attest to just how important those non-audio factors are to understanding.

Needless to say, audio interfaces also lack spatial information. You can’t design and you can’t plan with an audio interface. So there will always be a place for visual (even in Star Trek).

Current audio interfaces are so fallible that in order to use them effectively you need some form of feedback. As you are currently talking to the interface, that feedback really has to be visual to not be distracting. Voicemail systems and the like, while presumably necessary, are not really the ideal place to develop audio interfaces. Currently, the place to develop them is on desktop machines and the like, which can provide visual feedback.

I’d be interested to know what the current speech-to-text capabilities are for phone-line quality conversations. I would love for all my messages to be displayed in an email-style fashion with summaries I can skip through, and then listen to each as I please. But I would be willing to gamble that this, too, is a little way off.

I like voicemail. I like it, and I don’t make more mistakes with its menus than I do with visual menus. 7 is delete, 9 is save, 1 is hear messages. Hitting 7 twice while listening ends the message and deletes it, hitting 3 jumps ahead a few seconds in the message. Sure, I’ve accidentally deleted a message I wanted to keep, but I’ve also accidentally clicked “Okay” on a pop-up dialogue I meant to cancel. A lot of the ideas generated here sound cool, but I’m not really with you on the original post. Not everyone hates voicemail and voicemail systems. I for one am a living exception.

I have a desktop interface to voicemail that’s integrated into my email client. It has the visual control interface (play, skip forward, backward, delete, …) and unread messages are highlighted like new email messages. This completely eliminates login and listening to instructions so that you get straight to listening to the messages.

An indispensable feature is speed adjustment. You can speed up (or slow down) a message, which saves time.

Deleted messages go into a trash can enabling them to be restored if desired.

It’s tedious using the voicemail system directly, and I rarely have to anymore!

Also, audio can be quite good for output, for things that are fundamentally linear, such as music and listening to a speech. Of course, combining it with flexible audio input would help for other tasks, such as booking airline reservations, where the computer would ask you over the phone where you want to go, what time, etc. I read a book about one implemented at Stanford or something like that.

Call me old fashioned, (hum rest of Bob Seger tune here if you must) – but I will NEVER get used to a machine that speaks in first person and tries to pretend that it is a sentient being and wants me to play along!

Called an interface guru by publications like Wired and Fast Company, Aza is the co-founder of Massive Health, and was until recently Creative Lead for Firefox. Previously, he was a founding member of Mozilla Labs. Aza gave his first talk on user interface at age 10 and got hooked. At 17, he was talking and consulting internationally. Aza has founded and sold two companies, including Songza.com, a minimalist music search engine that had over a million song plays in its first week. He also creates modular cardboard furniture called Bloxes. In another life, Aza has done Dark Matter research at both Tokyo University and the University of Chicago, from where he graduated with honors in math and physics.