The Google Chrome team has pushed out a new beta release of the Chrome web browser, which adds support for the nascent Speech Input API. Yes, now you can talk to the web; it just might not understand exactly what you’re saying.

The Speech Input API is designed to let developers write web apps with full speech recognition. The browser records your voice and sends the audio to a speech server, which handles the speech-to-text transcription.
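For a rough idea of what this looks like from the page's side, here is a minimal TypeScript sketch. It assumes Chrome's vendor-prefixed form of the API: an `x-webkit-speech` attribute on a text input and a `webkitspeechchange` event whose `results` list carries `utterance` and `confidence` fields. Treat those names as assumptions about this early implementation rather than settled API. The helpers `bestUtterance` and `enableSpeechInput`, and the two interfaces, are hypothetical names made up for illustration.

```typescript
// One recognition hypothesis, in the shape Chrome's prefixed
// implementation is assumed to report it.
interface SpeechHypothesis {
  utterance: string;  // the transcribed text
  confidence: number; // assumed range 0..1, higher is better
}

// The event payload this sketch relies on (assumed shape).
interface SpeechChangeEvent {
  results?: ArrayLike<SpeechHypothesis>;
}

// Minimal slice of an <input> element that the sketch needs.
// Declared here so the example does not depend on DOM typings.
interface SpeechCapableInput {
  setAttribute(name: string, value: string): void;
  addEventListener(type: string, listener: (event: SpeechChangeEvent) => void): void;
}

// Hypothetical helper: return the highest-confidence utterance,
// or null when the server sent back no hypotheses.
function bestUtterance(results: ArrayLike<SpeechHypothesis>): string | null {
  let best: SpeechHypothesis | null = null;
  for (let i = 0; i < results.length; i++) {
    const candidate = results[i];
    if (candidate === undefined) continue;
    if (best === null || candidate.confidence > best.confidence) {
      best = candidate;
    }
  }
  return best === null ? null : best.utterance;
}

// Hypothetical helper: switch on speech input for a text field and
// hand each transcription to the caller.
function enableSpeechInput(
  input: SpeechCapableInput,
  onText: (text: string) => void
): void {
  // Assumed attribute name from Chrome's vendor-prefixed implementation.
  input.setAttribute("x-webkit-speech", "");
  // Assumed event name; expected to fire once the speech server
  // has returned its transcription.
  input.addEventListener("webkitspeechchange", (event) => {
    const text = event.results ? bestUtterance(event.results) : null;
    if (text !== null) {
      onText(text);
    }
  });
}
```

In a real page, the first argument would be a text input element (a type cast may be needed to satisfy the compiler), and the callback would do something with the transcribed string, such as submitting a search. Picking the highest-confidence hypothesis is just one reasonable choice; an app could equally show the user every candidate.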

Chrome 11 beta is currently the only browser that supports the brand-new Speech Input API, and in my testing the results were mixed. So long as you raise your voice, Google’s sample app generally gets things right, though “Webmonkey” was interpreted as, ahem, “wet monkey.”

It’s worth noting that I did my testing with the built-in mic on my MacBook Pro, which is perhaps not the best sound source, especially since others seem to have had better luck. Still, as with most software that uses voice input, the transcription in Google’s sample app is clearly far from perfect.

However, as the Speech Input API gains more support it will open an entirely new set of possibilities for web apps, enabling everything from online speech-to-text services and realtime video transcriptions to voice chat logs and song lyric generators. Voice input could be particularly helpful on mobile devices and would go a long way toward making web-based apps as compelling as native apps. Voice input also opens up a whole new range of possibilities for creating a more accessible web: filling in forms via speech, browsing by voice and so on. Not all of these features are specifically addressed in the new API or Google’s demo, but it’s not hard to imagine creative developers finding a way to make them possible.

Unfortunately, based on this early, very experimental example of the Speech Input API, it’s going to be a while before you’re talking your way around the web.