Hackers Can Use Ultrasounds to Take Control of Alexa, Siri, Cortana, Others

Six scientists from Zhejiang University in China have discovered that they could use ultrasound frequencies — inaudible to human ears — to send commands to speech recognition software and take over devices such as smartphones, smart home assistants, or even cars.

Researchers named their experiment DolphinAttack because the attack scenario was inspired by how dolphins sometimes communicate with each other using inaudible sounds.

Seven popular speech recognition products affected

According to the research team, an attacker can take normal voice commands, convert them to ultrasound frequencies, and use a cheap $3 rig built from off-the-shelf electronic components to send the commands to nearby electronic devices that run voice assistant software.
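The conversion step is, at its core, amplitude modulation: the audible voice command becomes the envelope of an ultrasonic carrier, so all of the transmitted energy sits above human hearing while a microphone's non-linearity demodulates the command back into the audible band. The sketch below illustrates the principle only; the researchers' actual hardware and parameters are not described here, and the 25 kHz carrier and sample rate are illustrative assumptions.

```python
import numpy as np

FS = 192_000  # assumed sample rate, high enough to represent a ~25 kHz carrier

def modulate_ultrasonic(voice, carrier_hz=25_000, fs=FS):
    """Amplitude-modulate a baseband voice signal onto an ultrasonic carrier.

    `voice` is a float array in [-1, 1] sampled at `fs`. The output is
    inaudible to humans (its energy sits near `carrier_hz`), yet a
    microphone's non-linearity can demodulate it back to the baseband command.
    """
    t = np.arange(len(voice)) / fs
    carrier = np.cos(2 * np.pi * carrier_hz * t)
    # Classic AM: the voice signal rides on the carrier as its envelope.
    return (1.0 + voice) * carrier / 2.0

# Toy "voice" signal: a 400 Hz tone standing in for a spoken command.
t = np.arange(FS) / FS
voice = 0.5 * np.sin(2 * np.pi * 400 * t)
signal = modulate_ultrasonic(voice)

# The dominant spectral component of the result is the 25 kHz carrier,
# with sidebands at 24.6 kHz and 25.4 kHz -- nothing in the audible band.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / FS)
peak = freqs[np.argmax(spectrum)]
print(round(peak))  # → 25000
```

In a real attack the modulated signal would be played through an ultrasonic transducer; the sketch stops at generating the waveform.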

The research team says it successfully tested the attack against seven popular speech recognition products, including Alexa, Cortana, Google Now, Huawei HiVoice, Samsung S Voice, and Siri.

They also tested the attack on 16 platforms/devices that use this software, such as smartphones, computers, smart home assistants, and even the voice assistant installed on some Audi smart car models.

Attackers can have near full control over devices

“Tested attacks include launching Facetime on iPhones, playing music on an Amazon Echo and manipulating the navigation system in an Audi automobile,” researchers said. Below is a video of a DolphinAttack.

According to researchers, other more intrusive attacks can also be carried out, such as instructing the user’s browser to visit malicious websites, installing malicious apps, subscribing users to premium-rate numbers, launching phone calls, and listening in on user conversations.

Researchers say their portable attack rig can transmit signals at frequencies of 23 kHz, 25 kHz, 33 kHz, 40 kHz, and 48 kHz, and that it works at distances of up to 1.75 meters (5.75 feet).

The language in which voice commands are spoken, as well as background noise, might affect the attack’s efficiency and the maximum distance at which it can be used in a real-life scenario.

No incentive to deploy mitigations

The research team recommends that speech recognition software makers add an upper limit to the frequencies they listen to and patch their software to ignore any signal above 20 kHz.
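The recommended mitigation amounts to low-pass filtering the microphone input before it reaches the recognizer. A minimal sketch, assuming a 96 kHz capture rate and using a brick-wall FFT filter for clarity (a production system would use a streaming filter over short audio chunks):

```python
import numpy as np

FS = 96_000  # assumed microphone sample rate

def suppress_ultrasound(samples, cutoff_hz=20_000, fs=FS):
    """Brick-wall low-pass: zero every FFT bin above the audible range.

    Applied before feeding audio to the recognizer, this discards the
    ultrasonic carrier that a DolphinAttack-style rig relies on.
    """
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), 1 / fs)
    spectrum[freqs > cutoff_hz] = 0
    return np.fft.irfft(spectrum, n=len(samples))

t = np.arange(FS) / FS
audible = np.sin(2 * np.pi * 1_000 * t)      # legitimate speech-band tone
ultrasonic = np.sin(2 * np.pi * 25_000 * t)  # attacker's inaudible carrier

filtered = suppress_ultrasound(audible + ultrasonic)

# After filtering, the 25 kHz component is gone; the 1 kHz tone survives.
spec = np.abs(np.fft.rfft(filtered))
freqs = np.fft.rfftfreq(len(filtered), 1 / FS)
print(spec[freqs == 25_000][0] < 1e-3, spec[freqs == 1_000][0] > 1_000)  # → True True
```

The trade-off is minimal: human speech carries essentially no useful energy above 20 kHz, so the cutoff costs the recognizer nothing.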

In practice, this recommendation might be ignored, as some of these software makers are also involved in online advertising and might be interested in using ultrasonic beacons to track users.

More technical details about DolphinAttack are available in a research paper entitled “DolphinAttack: Inaudible Voice Commands,” which the researchers will present at the ACM Conference on Computer and Communications Security, taking place in Dallas, USA, in late October.