We present a novel machine-learning-based algorithm that extends the interaction
space around mobile devices. The technique uses only the RGB camera now
commonplace on off-the-shelf mobile devices. Our algorithm robustly recognizes
a wide range of in-air gestures and accommodates user variation and varying
lighting conditions. We demonstrate that our algorithm runs in real time on
unmodified mobile devices, including resource-constrained smartphones and
smartwatches. Our goal is not to replace the touchscreen as the primary input
device, but rather to augment and enrich the existing interaction vocabulary
using gestures. While touch input works well for many scenarios, we
demonstrate numerous interaction tasks such as mode switches, application and
task management, menu selection, and certain types of navigation, where such
input can be either complemented or better served by in-air gestures. This
alleviates screen real-estate constraints on small touchscreens and allows input to
be expanded to the 3D space around the device. We present results for
recognition accuracy (93% on test data and 98% on training data) and examine
the impact of memory footprint and other model parameters. Finally, we report
results from preliminary user evaluations, discuss advantages and limitations,
and conclude with directions
for future work.