3-D gesture control breaks out of the game box

This could be the year 3-D gesture recognition proves it’s not just child’s play. Several years after its first consumer market appearance in the wireless gaming interface for Nintendo’s Wii, MEMS sensor-based gesture recognition is extending its reach to smartphones and is set to take hold of that most iconic of consumer interfaces: the TV remote.

Since the Wii’s 2006 release, Nintendo’s competitors have spun their own versions of 3-D gesture recognition and processing. Sony tuned the Move Playstation controller for hard-core gamers seeking pinpoint accuracy; Microsoft took the gaming interface hands-free with the Xbox Kinect.

Apple was the first to pick up on microelectromechanical sensors’ potential for building more intuitive smartphone interfaces; it added MEMS accelerometers to the iPhone in 2007 and a MEMS gyroscope in 2010. Its competitors have followed suit, and soon 3-D commands such as shake-to-undo, lift-to-answer and face-down-to-disconnect will be standard smartphone fare.

Today, consumer OEMs are adding 3-D gesture recognition across their product lines. Some are using camera-based techniques licensed from GestureTek Inc. (Sunnyvale, Calif.); others have licensed MEMS approaches from Hillcrest Laboratories Inc. (Rockville, Md.) or Movea Inc. (Grenoble, France). Movea holds more than 250 related patents, covering such techniques as the use of a gyroscope to control cursors; Hillcrest holds more than 100, including a patent on the use of an accelerometer with a gyroscope for tracking motion. Both companies also offer value-added software development tools for 3-D gesture designers (Movea’s Gesture Builder and Hillcrest’s Freespace MotionStudio).

Google, for its part, has added MEMS-based gesture recognition application programming interfaces to the Gingerbread release of the Android OS; the APIs recognize such gestures as tilt, spin, thrust and slice.
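Under the hood, an API of this kind typically reduces to thresholding and counting features in the raw sensor stream. As a rough illustration only (this is not Android's actual implementation; the function name, threshold and peak count are invented here), a shake detector over a stream of accelerometer magnitudes might look like:

```python
def detect_shake(magnitudes, threshold=2.5, min_peaks=3):
    """Flag a 'shake' when acceleration magnitude (in g) crosses a
    threshold at least min_peaks times. Constants are illustrative,
    not values from any real gesture API."""
    peaks = 0
    above = False
    for m in magnitudes:
        if m > threshold and not above:
            # Count each rising crossing of the threshold once.
            peaks += 1
            above = True
        elif m <= threshold:
            above = False
    return peaks >= min_peaks
```

A real implementation would also band-pass filter the signal and enforce timing windows between peaks, but the crossing-count idea is the same.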

“Motion processing has finally been accepted by the mainstream,” said Steve Nasiri, founder of InvenSense Inc. (Sunnyvale), the first MEMS chip maker to combine an accelerometer and gyroscope on one die. “We predict that the hardware for motion processing and gesture recognition will become as ubiquitous in smartphones as the camera module.”

InvenSense’s gyroscopes and accelerometer/gyroscope combo chips embed a motion processor to execute the complex sensor fusion algorithms necessary to recognize a user’s gestures, offloading the task from the application processor. The company plans to combine an accelerometer, gyroscope and magnetometer (e-compass) on a single die by next year.
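Sensor fusion of this kind is often introduced as a complementary filter: the gyroscope's angular rate is integrated for fast response, while the accelerometer's gravity vector corrects the gyro's slow drift. A minimal single-axis sketch (the function, sample format and blend constant are illustrative, not InvenSense's library):

```python
import math

def complementary_filter(samples, dt=0.01, alpha=0.98):
    """Estimate pitch angle (radians) by fusing a gyro rate with an
    accelerometer tilt reading. 'samples' is a list of
    (gyro_rate_rad_s, accel_x, accel_z) tuples; all names and
    constants here are assumptions for illustration."""
    pitch = 0.0
    for gyro_rate, ax, az in samples:
        # Integrate the gyro's angular rate (responsive, but drifts)...
        gyro_pitch = pitch + gyro_rate * dt
        # ...and correct with the accelerometer's gravity vector
        # (noisy, but drift-free over the long term).
        accel_pitch = math.atan2(ax, az)
        pitch = alpha * gyro_pitch + (1 - alpha) * accel_pitch
    return pitch
```

Production motion processors use richer fusion (e.g., quaternion-based filters across all nine axes), which is exactly the workload being offloaded from the application processor.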

The InvenSense Motion Processing Library turned up at the International Consumer Electronics Show in both the first television remote control to harness 3-D gesture recognition and the first smartphone to apply it to primary phone functions (such as answering the phone merely by lifting it to the ear). Both are LG Electronics products. The Magic Motion remote is used with LG’s Infinia line of 3-D TVs. LG’s 9.2-millimeter-thick Optimus Black smartphone, touted as the slimmest smartphone available, recognizes several unique gesture-based commands.

Other MEMS chip makers have likewise incorporated gesture recognition algorithms into their accelerometers and gyroscopes. Kionix Inc. (Ithaca, N.Y.), for example, offers dozens of models with built-in gesture-recognition algorithms, and its Gesture Designer software development suite lets OEMs design their own gesture-based controls.

“The TV remote-control guys are making a big push to bring gesture recognition into its own, requiring very sophisticated use of motion,” said Kionix CEO Greg Galvin. “The holy grail here is the convergence of audiovisual input into the TV, allowing you to change channels, download music, look at your library of photos, do texting or surf the Internet, all with a single controller.”

Running apps and navigating Web content on an IPTV require a remote control that is capable of mouse-like accuracy for both point-and-click and gesture-based control. “For these apps, you need MEMS,” Galvin said.
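The core of such mouse-like pointing is mapping the remote's angular rates, measured by the gyroscope, to on-screen cursor deltas. A toy sketch of that mapping (the sensitivity constant and axis conventions are assumptions, not any vendor's actual algorithm):

```python
def gyro_to_cursor(yaw_rate, pitch_rate, dt=0.01, sensitivity=800.0):
    """Map angular rates (rad/s) about the yaw and pitch axes to
    cursor deltas in pixels for one sample period. The sensitivity
    gain is an illustrative constant."""
    dx = yaw_rate * dt * sensitivity
    dy = -pitch_rate * dt * sensitivity  # screen y grows downward
    return dx, dy
```

Shipping implementations layer on tremor filtering, nonlinear gain and orientation compensation so the cursor stays steady when the user's hand is not.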

Apple users are already familiar with 2-D gesture control, and commands such as pinch-to-zoom have been copied by Apple's computing and smartphone rivals.

I think this research will help not only the gaming industry but also real-time training applications, such as learning to drive, play sports or dance. This research could bring down the cost of simulators.

Very interesting article. I think gesture recognition is an optimal input method because it is basically the gadget adapting to the human and not the other way around.
Once a user has experienced the ease of controlling a device with hand gestures, it's hard to go back to mice and keyboards.
Of course, there will always be places where a touch screen isn't viable or even needed, so keyboards, buttons and mice will not disappear.

Easy to solve... the point cloud provided by 3-D sensors can easily be processed to figure out which user is in control, even if their positions overlap. And there could be a simple gesture to 'take control' of the remote. At the end of the day, however, you can't stop people from fighting over the remote even today ;)

Gaming solutions, I suspect, will evolve into handheld controllers, but perhaps more life-like and prop-like, with buttons for low-latency event tracking, working in parallel with 3-D environmental sensors. The 3-D sensor will get on a Moore's-law-type improvement path and give you XGA resolution of the scene with less than 1-mm precision at 60 frames/second, etc... Kinect is a coarse (but brilliant) hack compared with what we will see in this space in three to five years.

The cool thing about using the gesture recognition APIs built into iOS and Android is that the devices already have the hardware: accelerometer, gyroscope, magnetometer and barometer. Those MEMS sensors can perform all the location-based functions for which they were intended while also giving app writers access to 3-D gestures for free!

Gestures are great when a single user interfaces with a machine, but what happens when three or four users share the same machine and try to overpower one another with firmer gestures to change channels, increase the volume, etc.? It's a fun UI, but is it useful?

During the midterm elections, I noticed that on CNN's "John King" show they had 3-D interactive statistical graphs displayed literally in thin air, and he could control the graphs with his gestures. I thought that was cool; now I know how they did it.