Teaching Alexa when not to respond

Real-time acoustic fingerprinting will prevent Alexa devices from waking when her name is called during our commercial in the big game.

By Day One Staff

on February 2, 2018

If history repeats itself, more than 100 million people around the globe will gather this Sunday to watch the New England Patriots and Philadelphia Eagles compete in Super Bowl LII. One of the “games within a game” is the contest for best Super Bowl commercial.

Three seconds into the company’s commercial, a woman in her bathroom asks: “Alexa, what’s the weather like today?”

When the 90-second advertisement airs during the game Sunday evening, millions of Echo devices won’t wake up unintentionally when the ad says “Alexa.” This is possible because of acoustic fingerprinting technology that can distinguish between the ad and actual customer utterances.

“The trick is to suppress the unintentional waking of a device while not incorrectly rejecting the millions of people engaging with Alexa every day,” said Shiv Vitaladevuni, a senior manager on the Alexa Machine Learning team in Cambridge, Massachusetts.

Our advertising, engineering, and science teams are able to anticipate major events like the Super Bowl, but what happens when someone like Tonight Show host Jimmy Fallon does a comedy routine about Alexa, which the team couldn’t anticipate?

Manoj Sindhwani, director for Speech Recognition, explains that our teams build acoustic fingerprints on the fly within our AWS cloud. When multiple devices start waking up simultaneously from a broadcast event, similar audio streams to Alexa’s cloud services. An algorithm within Amazon’s cloud detects matching audio from distinct devices and prevents additional devices from responding. The dynamic fingerprinting isn’t perfect, but it keeps as many as 80 to 90 percent of devices from responding to these broadcasts.
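
For readers who want a feel for the idea, here is a minimal Python sketch of cross-device matching. It is an illustration under stated assumptions, not Amazon's implementation: the energy-based `toy_fingerprint`, the `BroadcastDetector` class, and every threshold (`window_s`, `min_devices`, `max_distance`) are hypothetical names and values chosen for the example. It shows why the first few devices may still respond while later ones are suppressed.

```python
from collections import deque


def toy_fingerprint(samples, frame_size=4):
    """Hypothetical fingerprint: one bit per pair of adjacent frames,
    set to 1 when the frame energy rises. Real systems use far richer
    acoustic features; this is only a stand-in."""
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


def hamming(a, b):
    """Count differing bits; a length mismatch counts as extra differences."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


class BroadcastDetector:
    """Assumed design: suppress a wake-up once enough *other* devices
    have sent near-identical audio within a short time window."""

    def __init__(self, window_s=2.0, min_devices=3, max_distance=1):
        self.window_s = window_s          # how long a wake-up stays relevant
        self.min_devices = min_devices    # distinct matches needed to suppress
        self.max_distance = max_distance  # bit differences tolerated
        self.recent = deque()             # (timestamp, device_id, fingerprint)

    def should_suppress(self, device_id, samples, now):
        fp = toy_fingerprint(samples)
        # Forget wake-ups that fell out of the time window.
        while self.recent and now - self.recent[0][0] > self.window_s:
            self.recent.popleft()
        # Distinct other devices whose recent audio matches this one.
        matching = {
            dev for _, dev, other in self.recent
            if dev != device_id and hamming(fp, other) <= self.max_distance
        }
        self.recent.append((now, device_id, fp))
        return len(matching) >= self.min_devices


# Usage: four devices hear the same broadcast a fraction of a second apart.
broadcast = [0.1] * 4 + [0.9] * 4 + [0.2] * 4 + [0.8] * 4 + [0.1] * 4
detector = BroadcastDetector()
decisions = [
    detector.should_suppress(f"device-{i}", broadcast, now=0.1 * i)
    for i in range(4)
]
```

In this sketch the earliest devices are not suppressed because no matching audio has been seen yet, which is consistent with the article's point that dynamic fingerprinting catches most, but not all, devices.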