Microsoft Research’s SoundWave: Gesture control using sound alone

[Image: Sound waving: SoundWave uses the Doppler effect and the microphone and speakers in your computer to sense and interpret gestures. Microsoft Research.]

Gesture Control System Uses Sound Alone

SoundWave lets an ordinary laptop function like a Kinect sensor.

Monday, May 7, 2012
By Rachel Metz

When you learned about the Doppler Effect in high school physics class—the wave frequency shift that occurs when the source of the wave is moving, easily illustrated by a passing ambulance—you probably didn’t envision it helping control your computer one day.

But that’s exactly what a group of researchers is doing at Microsoft Research, the software giant’s Redmond, Washington-based lab. Gesture control is becoming increasingly common and is even built into some TVs. While other motion-sensing technologies such as Microsoft’s own Kinect device use cameras to sense and interpret movement and gestures, SoundWave does this using only sound, relying on the Doppler Effect, some clever software, and the built-in speakers and microphone on a laptop.

Desney Tan, a Microsoft Research principal researcher and member of the SoundWave team, says the technology can already be used to sense a number of simple gestures, and with smart phones and laptops starting to include multiple speakers and microphones, the technology could become even more sensitive. SoundWave, a collaboration between Microsoft Research and the University of Washington, will be presented this week in a paper at the 2012 ACM SIGCHI Conference on Human Factors in Computing Systems in Austin, Texas.

The idea for SoundWave emerged last summer, when Tan and others were working on a project that involved the use of ultrasonic transducers to create haptic effects, and one researcher noticed a sound wave changing in a surprising way as he moved his body around. The transducers were emitting an ultrasonic sound wave that was bouncing off researchers’ bodies, and their movements changed the tone of the sound that was picked up, and with it the waveform the researchers saw on the back end.

The researchers quickly determined that this could be useful for gesture sensing. And since many devices already have microphones and speakers embedded, they experimented to see if they could use that existing hardware to detect movements. Tan says standard computer speakers and microphones can operate in the ultrasonic band (beyond what humans can hear), which means that making the technology work on a laptop or smart phone requires nothing more than installing the SoundWave software.

Chris Harrison, a graduate student at Carnegie Mellon University who studies sensing for user interfaces, calls SoundWave’s ability to operate with existing hardware and a software update “a huge win.”

“I think it has some interesting potential,” he says.

The speakers on a computer equipped with SoundWave software emit a constant ultrasonic tone of between 20 and 22 kilohertz. If nothing in the immediate environment is moving, the tone the computer’s microphone hears should also be constant. But if something is moving toward the computer, that tone will shift to a higher frequency. If it’s moving away, the tone will shift to a lower frequency.
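To put rough numbers on that shift: a tone reflected off a moving object is shifted twice, once on the way to the object and once on the way back. The short Python sketch below is illustrative only and is not taken from the SoundWave paper. The 343 m/s speed of sound, the 20 kHz tone (picked from the range quoted above), and the 0.5 m/s hand speed are all assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)


def reflected_frequency(tone_hz, velocity_mps, c=SPEED_OF_SOUND):
    """Frequency heard at the microphone for a tone bounced off a moving object.

    velocity_mps is positive when the object moves toward the device and
    negative when it moves away. Speaker and microphone are assumed to sit
    at the same spot and to be stationary.
    """
    return tone_hz * (c + velocity_mps) / (c - velocity_mps)


def doppler_shift(tone_hz, velocity_mps, c=SPEED_OF_SOUND):
    """Difference between the reflected tone and the emitted tone, in hertz."""
    return reflected_frequency(tone_hz, velocity_mps, c) - tone_hz


# A hand moving at an assumed 0.5 m/s in front of a 20 kHz tone:
print(f"{doppler_shift(20_000.0, 0.5):+.1f} Hz")   # +58.4 Hz (toward: higher pitch)
print(f"{doppler_shift(20_000.0, -0.5):+.1f} Hz")  # -58.2 Hz (away: lower pitch)
```

Under these assumptions the shift is only a few tens of hertz on a 20 kHz tone, well under one percent, so the software has to examine a narrow band of frequencies around the tone.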

This happens in predictable patterns, Tan says, so the frequencies can be analyzed to determine how big the moving object is, how fast it’s moving, and the direction it’s going. Based on all that, SoundWave can infer gestures.
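One way to picture that analysis is the minimal sketch below. It is not SoundWave's actual algorithm: it synthesizes a microphone signal (a 20 kHz tone plus a weaker, Doppler-shifted echo), measures the energy in a few narrow frequency bins on either side of the tone, and reads direction and approximate speed from the strongest one. The sample rate, echo strength, search span, guard band, and detection threshold are all hypothetical values chosen for illustration, and the function names are invented here.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0   # m/s (assumed)
SAMPLE_RATE = 48_000     # samples per second (assumed)
PILOT_HZ = 20_000.0      # emitted tone, from the range quoted in the article


def synth_mic_signal(velocity_mps, n_samples=4800, echo_gain=0.2):
    """Simulated microphone input: the tone plus a weaker reflected echo."""
    echo_hz = PILOT_HZ * (SPEED_OF_SOUND + velocity_mps) / (SPEED_OF_SOUND - velocity_mps)
    return [
        math.sin(2 * math.pi * PILOT_HZ * n / SAMPLE_RATE)
        + echo_gain * math.sin(2 * math.pi * echo_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def bin_power(samples, freq_hz):
    """Power of a single discrete Fourier transform bin at freq_hz."""
    w = -2j * math.pi * freq_hz / SAMPLE_RATE
    return abs(sum(s * cmath.exp(w * n) for n, s in enumerate(samples))) ** 2


def estimate_motion(samples, span_hz=200.0, guard_hz=20.0, min_ratio=1e-3):
    """Return (direction, speed in m/s) from energy on either side of the tone."""
    step = SAMPLE_RATE / len(samples)  # spacing between bins, in Hz
    pilot_power = bin_power(samples, PILOT_HZ)
    best_offset, best_power = 0.0, 0.0
    k_max = int(span_hz / step)
    for k in range(-k_max, k_max + 1):
        offset = k * step
        if abs(offset) <= guard_hz:    # skip bins dominated by the tone itself
            continue
        power = bin_power(samples, PILOT_HZ + offset)
        if power > best_power:
            best_offset, best_power = offset, power
    if best_power < min_ratio * pilot_power:
        return "still", 0.0
    # Invert the two-way Doppler formula to recover the reflector's speed.
    speed = SPEED_OF_SOUND * best_offset / (2 * PILOT_HZ + best_offset)
    return ("toward" if best_offset > 0 else "away"), abs(speed)


for v in (0.5, -0.5, 0.0):
    direction, speed = estimate_motion(synth_mic_signal(v))
    print(f"simulated {v:+.1f} m/s -> {direction}, about {speed:.2f} m/s")
```

The sketch works on a clean, synthetic signal whose tone falls exactly on a bin center. Real microphone input would likely need windowing, noise handling, and some way of tracking how the shifted energy changes over time before gestures could be told apart. With the guard band assumed here, movements slower than roughly 0.17 m/s go undetected.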

The software’s accuracy hovers in the 90 percent range, Tan says, and there isn’t a noticeable delay between when a user makes a gesture and the computer’s response. And SoundWave can operate while you’re using the speakers for other things, too.

So far, the SoundWave team has come up with a range of movements that its software can understand, including swiping your hand up or down, moving it toward or away from your body, flexing your limbs, or moving your entire body closer to or farther away from the computer. With these gestures, researchers are able to scroll through pages on a computer screen and control simple Web navigation. Sensing when a user approaches a computer or walks away from it could be used to automatically wake it up or put it to sleep, Tan says.

Harrison thinks that having a limited number of gestures is fine, especially since users will have to memorize them. The SoundWave team has also used its technology to control a game of Tetris, which, aside from being fun, provided a good test of the system’s accuracy and speed.

Tan envisions SoundWave working alongside other gesture-sensing technologies, saying that while it doesn’t face the lighting issues that vision-based technologies do, it’s not as good at sensing small gestures like a pinch of the fingers. “Ideally there are lots of sensors around the world, and the user doesn’t know or care what the sensors are, they’re just interacting with their tasks,” he says.