Experimenting with Webkit form speech input

Published: Dec 22, 2010 Read time: 2 mins

It’s been a few weeks now since I attended a Web Gaming event hosted by Mozilla and I’ve not yet had chance to write about one of the coolest features that was demonstrated: Speech input in forms. Now I must admit, I had no idea this was even possible; so it was quite a surprise when I saw the demonstration. With a few extra attributes added to a standard input box you can enable speech input:

It really is that simple! You may notice the “x-webkit-speech” attribute so will have guessed it currently only works in Webkit; and as far as I’m aware only in very recent versions of Google Chrome (note: “The Input Speech API will now be available by default on Chrome’s dev-channel”). The Speech API seems very new and is still a draft, so it will be a while before it’s fully adopted by other browser vendors.

A quick demonstration of the Speech API in action.

To demonstrate the speech input I’ve put together a very simple festivedemo that queries the Flickr API looking for images related to a value entered. So if you have a microphone and the dev channel of Chrome, why not give it a try. Chrome records the input of the microphone before sending it off to an external server to be processed, where a value is then sent back to the browser. At first I thought it was all processed inside the browser but I guess that’s a little to much to expect! It isn’t 100% accurate and the demo accepts Googles first guess so you may get a few strange results. The server returns multiple guesses which a user can select from, but this hasn’t been added (to my demo at least).

Since the API is very new it took a little trial and error to work out how to detect if the Speech API was available in the browser; but with a little help from Paul Irish I managed to crack it:

This function will return true if the API is available. It’s worth noting that since it’s very new this function is likely to break in the future as the properties always seem to be changing. It works at the moment, so get thinking about what this brilliant new API can be used for! It’s a big plus for accessibility if used correctly and I can also see some cool applications in web gaming too. View the demo.