Around this time last year, Google updated Chrome with a feature then unique to the company’s web browser: speech recognition. Six months later, Tal Ater, a subject-matter expert (SME) in this field, discovered what he considered a serious security flaw in the Chrome web browser, and the culprit was speech recognition.

How Chrome’s speech recognition works

Google created a speech-recognition Application Programming Interface (API) that defines how websites can interact with Google Chrome and the computer’s microphone. Its purpose is to let visitors control their experience on a website with voice commands rather than by typing or clicking.

What makes the feature interesting is that Google transcribes the voice command into text. After transcription, Chrome sends the text to the website, where the web server deciphers the command and then executes it. Visiting this link will demonstrate the speech-recognition API.
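
The flow described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not Google’s or any site’s actual code: the command table, `matchCommand`, and `listenForCommands` are hypothetical names, and the small interfaces only mirror the Web Speech API properties the sketch touches (`continuous`, `interimResults`, `lang`, `onresult`, `start`). Here the page script interprets the transcript itself; a site could just as well forward the text to its server.

```typescript
// Minimal local types for the parts of the Web Speech API used below,
// declared here so the sketch compiles without browser typings.
interface Alternative { transcript: string; confidence: number; }
type Result = ArrayLike<Alternative> & { isFinal: boolean };
interface ResultEvent { resultIndex: number; results: ArrayLike<Result>; }
interface Recognizer {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  onresult: ((event: ResultEvent) => void) | null;
  start(): void;
}

// Hypothetical command table: spoken phrase -> action name.
const COMMANDS = new Map<string, string>([
  ["scroll down", "SCROLL_DOWN"],
  ["go back", "GO_BACK"],
]);

// Normalize a transcript and look it up; null means "not a known command".
function matchCommand(transcript: string): string | null {
  const phrase = transcript.trim().toLowerCase();
  return COMMANDS.get(phrase) ?? null;
}

// Configure the recognizer and forward each finalized, recognized command.
function listenForCommands(
  recognizer: Recognizer,
  onCommand: (command: string) => void,
): void {
  recognizer.continuous = true;      // keep listening after the first phrase
  recognizer.interimResults = false; // ask for finalized transcripts only
  recognizer.lang = "en-US";
  recognizer.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (!result || !result.isFinal) continue; // skip interim results
      const first = result[0];                  // take the first alternative
      if (first) {
        const command = matchCommand(first.transcript);
        if (command !== null) onCommand(command);
      }
    }
  };
  // In Chrome, calling start() is typically the point at which the
  // microphone permission prompt appears.
  recognizer.start();
}
```

In Chrome the recognizer object would typically come from `new webkitSpeechRecognition()`; passing it in as a parameter keeps the sketch independent of the browser.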

Ater’s contention

When visitors first arrive at a speech-recognition-enabled website, they are offered a choice: interact with the website normally, or give the website permission to use the microphone.

There should be an indication similar to the slide seen above, notifying the user that the microphone is active. Ater’s security concern centers on how a website can enable the microphone without any indication that it is active. One example is what he called a pop-under window:

“When you click the button to start or stop the speech recognition on the site, what you won’t notice is that the site may have also opened another hidden pop-under window. This window can wait until the main site is closed, and then start listening in without asking for permission. This can be done in a window that you never saw, never interacted with, and probably didn’t even know was there.”

This may be a bit difficult to visualize. To clarify the process, Ater created a YouTube video showing how the pop-under window works.

Bottom line: if Ater’s contention is valid, putting Chrome’s speech-recognition API in the hands of an ill-intentioned website developer could turn a remote computer’s Chrome web browser and built-in microphone into a listening device.

How the listening device works

Let’s say a bad guy creates a malicious website that uses speech recognition. The malicious website appears to be an exact duplicate of a user’s favorite website. That user receives an email saying a gift is waiting at the favorite website and that all it takes is a click on the link. Unknown to the user, it is a phishing email, and the link leads to the malicious website instead. The user is asked to try the new speech-recognition feature and says yes.

According to Ater, this computer is now a remote listening device. The malicious site will be able to monitor everything within range of the microphone, whether the user knows it or not.

Google or Ater, who is right?

Ater first reported his findings privately to Google in September 2013. Ater said Google engineers had a fix within weeks. Then a week ago, with no evidence of Google removing the bug from Chrome, Ater decided to go public:

“As of today, almost four months after learning about this issue, Google is still waiting for the standards group to agree on the best course of action, and your browser is still vulnerable.”

“[T]he web’s standards organization, the W3C, has already defined the correct behavior which would’ve prevented this… This was done in their specification for the Web Speech API, back in October 2012.”

Options to prevent eavesdropping

I want to reiterate: for speech recognition to work, the visitor must initially give the website permission to use the computer’s microphone. If permission is not given, the exploit falls apart.
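
To illustrate what a page sees when permission is withheld, here is a minimal TypeScript sketch. The error codes in `SpeechErrorCode` are those defined by the Web Speech API specification, and `onstart`/`onerror` are real recognizer events; the `MicStatus` labels and the functions `classifyError` and `reportMicStatus` are hypothetical names chosen for this example.

```typescript
// Error codes defined by the Web Speech API specification.
type SpeechErrorCode =
  | "no-speech" | "aborted" | "audio-capture" | "network"
  | "not-allowed" | "service-not-allowed"
  | "bad-grammar" | "language-not-supported";

// Hypothetical summary states a page might show to the user.
type MicStatus = "listening" | "blocked" | "capture-failed" | "other-error";

// Minimal local type for the recognizer events used here.
interface StatusRecognizer {
  onstart: (() => void) | null;
  onerror: ((event: { error: SpeechErrorCode }) => void) | null;
  start(): void;
}

// Map a specification error code onto one of the summary states.
function classifyError(code: SpeechErrorCode): MicStatus {
  switch (code) {
    case "not-allowed":         // speech input disallowed, e.g. permission denied
    case "service-not-allowed":
      return "blocked";
    case "audio-capture":       // audio capture failed, e.g. no usable microphone
      return "capture-failed";
    default:
      return "other-error";
  }
}

// Start recognition and report whether the page is actually listening.
function reportMicStatus(
  recognizer: StatusRecognizer,
  report: (status: MicStatus) => void,
): void {
  recognizer.onstart = () => report("listening");
  recognizer.onerror = (event) => report(classifyError(event.error));
  recognizer.start();
}
```

A page that receives `not-allowed` never gets any audio, which is why denying the initial prompt defeats the scenario described above.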

There are ways to prevent eavesdropping for those who want to use speech recognition. There are also ways to disable speech recognition completely. For example:

The default setting in Chrome is “Ask if a microphone requires access” (see slide below). One option is to trust that Chrome asking for permission, plus some kind of indication that the microphone is on, will be enough security.

Users who visit sites that use speech recognition and want to use it, but do not trust the software indicator, can toggle the microphone on and off as shown below.

Users who are more concerned about eavesdropping than about using speech recognition can click the setting circled in red (as seen below) and leave it selected.

One problem: all of the above options are software-based. There is no hard-wired switch to shut off the on-board microphone. For those concerned about this, there are two additional options:

Visit the Web Speech API demonstration website I mentioned earlier. If the microphone is off, you will get verification similar to the slide below.

For those who want to be absolutely sure, physically disable the on-board microphone, and when a microphone is required, plug an auxiliary microphone into the appropriate socket.