Live Transcribe Open Sourced

Google recently open-sourced the speech engine behind Live Transcribe, an application that provides real-time automated captions for people who are deaf or hard of hearing.

By sharing the transcription engine with the world, Google wants to enable developers everywhere to build applications with robust transcription.

Live Transcribe's speech recognition is provided by Google's Cloud Speech API, which, according to the company, delivers impressive transcript accuracy under most conditions. But relying on the cloud brings its own complications, including network data usage, latency, and streaming timeouts.


By experimenting with the Opus audio codec, the company says it has achieved data rates many times lower than those of most music streaming services while still preserving the important details of the audio signal. In addition, with its custom Opus encoder, latency is visually indistinguishable from that of sending uncompressed audio.

Google has also taken steps to close and restart streaming requests before they hit the timeout. The engine restarts the session during long periods of silence and closes the request whenever it detects a pause in the speech.
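The restart logic described above can be sketched as a small decision function. This is only an illustration and not Google's implementation: the class and method names, every threshold value, and the rule that a session must reach a minimum age before it is cut at a pause are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"        # leave the current streaming request open
    RESTART = "restart"  # close the request and open a fresh one


@dataclass
class SessionPolicy:
    # All thresholds below are hypothetical values chosen for illustration.
    timeout_s: float = 300.0       # assumed server-side limit per request
    safety_margin_s: float = 30.0  # restart this long before the limit
    long_silence_s: float = 10.0   # silence long enough to restart safely
    pause_s: float = 0.5           # short pause treated as a safe cut point
    min_session_s: float = 60.0    # assumed: no pause-based cuts before this age

    def decide(self, session_age_s: float, silence_s: float) -> Action:
        # Hard rule: never let the request reach the timeout.
        if session_age_s >= self.timeout_s - self.safety_margin_s:
            return Action.RESTART
        # Long silence: nobody is speaking, so a restart loses no speech.
        # The age check avoids restarting a session that was just opened.
        if silence_s >= self.long_silence_s and session_age_s >= self.long_silence_s:
            return Action.RESTART
        # Older session plus a detected pause: cut between utterances.
        if session_age_s >= self.min_session_s and silence_s >= self.pause_s:
            return Action.RESTART
        return Action.KEEP


policy = SessionPolicy()
print(policy.decide(session_age_s=90.0, silence_s=0.6))  # pause in an older session
print(policy.decide(session_age_s=30.0, silence_s=0.6))  # same pause, young session
```

A caller would evaluate `decide` periodically with the current session age and the length of the current silence, and reset the session age whenever it restarts. Cutting at pauses instead of at a fixed interval reduces the chance of splitting a word across two requests.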