Interfaces on the go

We have continually evolved computing not only to be more
efficient, but also to be more accessible: to more people, in more
places, and more of the time. We have progressed from batch
computing with punch cards, to interactive command line systems,
to mouse-based graphical user interfaces, and more recently to
mobile computing. Each of these paradigm shifts has drastically
changed the way we use technology for work and life, often in
unpredictable and profound ways.

With the latest move to mobile computing, we now carry devices
with significant computational power and capabilities on our
bodies. However, their small size typically leads to limited
interaction space (diminutive screens, buttons, and jog wheels)
and consequently diminishes their usability and functionality.
This presents a challenge and an opportunity for developing
interaction modalities that will open the door for novel uses of
computing.

Researchers have been exploring small device interaction
techniques that leverage every available part of the device. For
example, NanoTouch, developed by Patrick Baudisch and Gerry Chu
at Microsoft Research, utilizes the backside of devices so that
the fingers don't interfere with the display on the front
[2] (see also in this issue "My New PC is
a Mobile Phone," page 36). In more conceptual work, Ni and
Baudisch explore the advent of "disappearing mobile devices" (see
[7]).

Other researchers have proposed that devices should
opportunistically and temporarily "steal" capabilities from the
environment, making creative use of existing surfaces already
around us [9]. One example of this type of
interaction is Scratch Input, developed by Chris Harrison and
Scott Hudson of Carnegie Mellon's HCI Institute. This technique
allows users to place devices on ordinary surfaces, like tables,
and then use them as ad hoc gestural finger input canvases. This
is achieved with a microphone on the underside that allows the
device to sense audio signals transmitted through the material,
like taps and scratches [4]. These types
of solutions work well when the user is situated (in an office,
airport, or hotel room), but they are impractical when the user
is on the go.

This mobile scenario is particularly challenging because of
the stringent physical and cognitive constraints of interacting
on-the-go. In fact, Antti Oulasvirta and colleagues showed that
users could attend to mobile interaction bursts in chunks of
about 4 to 6 seconds before having to refocus attentional
resources on their real-world activity (see
[8] for the full write-up). At this point,
the dual task becomes cognitively taxing as users are constantly
interrupted by having to move focus back and forth. In a separate
line of work, Daniel Ashbrook of Georgia Institute of Technology
measured the overhead associated with mobile interactions and
found that just getting a phone out of the pocket or hip holster
takes about 4 seconds and initiating interaction with the device
takes another second or so [1]. Ashbrook and
colleagues propose the concept of micro-interactions: interactions
that take less than 4 seconds to initiate and complete, so that the
user can quickly return to the task at hand. An example of this
type of interaction is Whack Gestures [6],
created by Carnegie Mellon and Intel Labs researchers, where,
quite simply, you whack the phone in your pocket to silence an
incoming call.

"Micro-interactions could significantly
expand the set of tasks we could perform on-the-go and
fundamentally alter the way we view mobile computing."

We believe that such micro-interactions could significantly
expand the set of tasks we could perform on-the-go and
fundamentally alter the way we view mobile computing. We assert
that, while seemingly subtle, augmenting users with
always-available micro-interactions could have an impact of the same
magnitude as mobile computing itself, which enabled a set of tasks
that were never before possible. After all, who would have
imagined mobile phones would make the previously onerous task of
arranging to meet a group of friends for a movie a breeze? Who
would have imagined when mobile data access became prevalent that
we'd be able to price shop on-the-fly? Or resolve a bar debate on
sports statistics with a quick Wikipedia search? Imagine what we
could enable with seamless and even greater access to information
and computing power.

To realize this vision, we've been looking at ways to enable
micro-interactions. Often, this involves developing novel input
modalities that take advantage of the unique properties of the
human body. In this article, we describe two such technologies:
one that senses electrical muscle activity to infer finger
gestures, and the other that monitors bio-acoustic transmissions
through the body, allowing the skin to be turned into a
finger-tap-sensitive interaction surface. We conclude with some
of the challenges and lessons learned in our work using
physiological sensing for interaction.

Muscle-Computer Interfaces

Removing the need to manipulate physical transducers does not
necessarily preclude leveraging the full bandwidth available with
finger and hand gestures. To date, most efforts at enabling
implement-free interaction have focused on speech and computer
vision. Both have made significant strides in recent years, but
they remain prone to interference from environmental noise, and
they require that the user make motions or sounds that can be
sensed externally and cannot easily be concealed from the people
around them.

Advances in muscular sensing and processing technologies
provide us with the unprecedented opportunity to interface
directly with human muscle activity in order to infer body
gestures. To contract a muscle, the brain sends an electrical
signal through the nervous system to motor neurons, which then
transmit electrical impulses to adjoining muscle fibers, causing
them to contract and the body to move. Electromyography (EMG)
senses this muscle activity by measuring the electrical potential
between ground and a sensor electrode.

In our work, we focus on a band of sensors placed on the upper
forearm that senses finger gestures on surfaces and in free space
(see Figures 1 and 2). We have
recently built a small, low-powered wireless prototype EMG unit
that uses dry electrodes and that can be placed in an armband
form factor, making it continuously wearable as an
always-available input device. The signals from this device are
streamed to a nearby computer, where features are extracted and
machine learning is used to model and classify gestures. However,
this could also be done entirely on a mobile device.
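
As a rough illustration of this pipeline (a sketch of the general
approach, not the exact implementation in our prototype), the
following Python code extracts simple per-channel features from a
windowed EMG stream and trains an off-the-shelf classifier. The
specific features (RMS, mean absolute value, zero crossings) and the
SVM classifier are illustrative assumptions chosen for clarity.

    import numpy as np
    from sklearn.svm import SVC

    # Hypothetical sketch: classify finger gestures from windowed,
    # multi-channel EMG. The features and classifier are illustrative
    # choices, not the authors' exact pipeline.

    def emg_features(window):
        """window: (n_samples, n_channels) array of raw EMG for one window."""
        rms = np.sqrt(np.mean(window ** 2, axis=0))        # per-channel energy
        mav = np.mean(np.abs(window), axis=0)              # mean absolute value
        zc = np.sum(window[:-1] * window[1:] < 0, axis=0)  # zero crossings
        return np.concatenate([rms, mav, zc])

    def train_gesture_model(windows, labels):
        """windows: list of (n_samples, n_channels) arrays; labels: gesture ids."""
        X = np.vstack([emg_features(w) for w in windows])
        model = SVC(kernel="rbf", probability=True)
        model.fit(X, labels)
        return model

    def classify_window(model, window):
        """Return the predicted gesture label for a single EMG window."""
        return model.predict(emg_features(window).reshape(1, -1))[0]

In practice the classifier would be retrained each time the armband
is donned, a point we return to in the final section.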

Reasonably high accuracies can be achieved for gestures
performed on flat surfaces. In one experiment with 13 novice
users, we attained an average of 78 percent accuracy for sensing
whether each of two fingers is curled, 84 percent for which of
several pressure levels are being exerted on the surface, 78
percent for which of the five fingers have tapped the surface,
and 95 percent for which of the five have lifted off the
surface.

"With the latest move to mobile computing,
we now carry devices with significant computational power and
capabilities on our bodies."

Similarly, in a separate test with 12 different novice users,
we attained 79 percent classification accuracy for pinching the
thumb to fingers in free space, 85 percent when squeezing
different fingers on a coffee mug, and 88 percent when carrying a
bag. These results demonstrate the feasibility of detecting
finger gestures in multiple scenarios, even when the hands
are occupied with other objects.

Skinput

To further expand the range of sensing modalities for
always-available input systems, we developed Skinput (see
Figure 3), a novel input technique that allows
the skin to be used as a finger input surface. When a finger taps
the skin, several distinct forms of acoustic energy are produced
and transmitted through the body. We chose to focus on the arm,
although the technique could be applied elsewhere. This is an
attractive area to "steal" for input as it provides considerable
surface area for interaction, including a contiguous and flat
area for projection.
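
To make the sensing step concrete, here is a simplified Python sketch
of how tap events might be detected and summarized from the armband's
vibration channels. The onset threshold, window length, and spectral
features below are illustrative assumptions, not the measured
parameters of our prototype.

    import numpy as np

    # Illustrative sketch of bio-acoustic tap sensing: find tap onsets in
    # the multi-channel vibration signal, then summarize each burst so a
    # trained classifier can label the tap location. All constants are
    # assumptions for illustration.

    ONSET_THRESHOLD = 0.05   # assumed amplitude threshold marking a tap
    BURST_SAMPLES = 512      # assumed analysis window following each onset

    def find_tap_onsets(signal):
        """signal: (n_samples, n_channels). Return indices where the summed
        channel amplitude first rises above the onset threshold."""
        energy = np.sum(np.abs(signal), axis=1)
        above = energy > ONSET_THRESHOLD
        rising = np.flatnonzero(above[1:] & ~above[:-1]) + 1
        onsets, last = [], -BURST_SAMPLES
        for i in rising:
            if i - last >= BURST_SAMPLES:   # keep onsets far enough apart
                onsets.append(i)
                last = i
        return onsets

    def tap_features(signal, onset):
        """Per-channel peak amplitude plus coarse low/high spectral energy
        for the burst that follows one detected onset."""
        burst = signal[onset:onset + BURST_SAMPLES]
        peak = np.max(np.abs(burst), axis=0)
        spectrum = np.abs(np.fft.rfft(burst, axis=0))
        low = spectrum[:16].mean(axis=0)     # coarse low-frequency energy
        high = spectrum[16:].mean(axis=0)    # coarse high-frequency energy
        return np.concatenate([peak, low, high])

The resulting feature vector can then be fed to a classifier of the
same kind sketched above for the muscle-sensing armband.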

Using our prototype, we've conducted several experiments that
demonstrate high classification accuracies even with a large
number of tap locations. This held true even when the sensing
armband was placed above the elbow (where taps are separated both
by distance and by numerous joints). For example, in a setup in
which we sought to distinguish between taps on each of the five
fingers, we attained an average accuracy of 88 percent across our
13 novice participants. If we spread the five locations out
across the whole arm, the average accuracy goes up to 95 percent.
The technique remains fairly accurate even when users are walking
or jogging. Although classification is not perfect (nor will
it likely ever be), we believe the accuracy of our
proof-of-concept system clearly demonstrates that real-life
interfaces could be developed on top of the technique.

While our bio-acoustic input approach is not strictly tethered
to a particular output modality, we believe the sensor form
factors we explored could be readily coupled with a small digital
projector. There are two nice properties of wearing such a
projection device on the arm: 1) the arm is a relatively rigid
structurethe projector, when attached appropriately, will
naturally track with the arm; 2) since we have fine-grained
control of the arm, making minute adjustments to align the
projected image with the arm is trivial (e.g., projecting
horizontal stripes to align with the wrist and elbow).

Challenges and Opportunities

Using the human body as the interaction platform has several
obvious advantages. Foremost, it gives us a
consistent, reliable, and always-available surface. We take our
bodies everywhere we go (or rather, they take us). Furthermore, we
are intimately familiar with our bodies, and proprioceptive
senses allow us to interact even in harsh circumstances (like a
moving bus). We can quickly and easily make finger gestures or
tap on a part of our body, even when we cannot see it and are on
the move.

"Who would have imagined when mobile data
access became prevalent that we'd be able to price shop
on-the-fly? Or resolve a bar debate on sports statistics with a
quick Wikipedia search? Imagine what we could enable with
seamless and even greater access to information and computing
power."

That said, using the signals generated by or transmitted
through the body as a means of intentional control comes with
various new challenges and opportunities for innovation. From a
technical perspective, building models of these signals that work
across multiple users and multiple sessions with minimal
calibration is often challenging. Most of our current work is
calibrated and trained each time the user dons the device, and
while these individual models work surprisingly well across
different body types, we recognize that this overhead of training
is not acceptable for real-world use. Furthermore, regardless of
the universality of the models, processing the often-noisy signals
coming from these sensors is not trivial and will likely never
yield perfect results. This is because the noise patterns are
complex: users move through different environments and perform
different tasks, and their physiological signals change
throughout the course of their normal activities. Hence,
interaction techniques must be carefully designed to tolerate or
even take advantage of imperfect interaction input.
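
One simple pattern for tolerating imperfect input (an illustrative
design choice, not a description of our deployed systems) is to act
on a recognized gesture only when the classifier has been both
confident and consistent over several consecutive windows. The
thresholds below are assumptions.

    from collections import deque

    # Hypothetical "debouncing" of imperfect recognition: only fire a
    # gesture when the classifier has been confident and consistent over
    # the last few windows. Thresholds are illustrative assumptions.

    class GestureDebouncer:
        def __init__(self, min_confidence=0.85, required_agreement=3):
            self.min_confidence = min_confidence
            self.required_agreement = required_agreement
            self.recent = deque(maxlen=required_agreement)

        def update(self, label, confidence):
            """Feed one per-window prediction; return a label only when it
            is safe to act on, otherwise None."""
            if confidence < self.min_confidence:
                self.recent.clear()   # a low-confidence window resets the streak
                return None
            self.recent.append(label)
            if (len(self.recent) == self.required_agreement
                    and len(set(self.recent)) == 1):
                self.recent.clear()   # fire once, then start over
                return label
            return None

Such gating trades a little responsiveness for a large reduction in
spurious activations, which is usually the right trade-off for
always-available input.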

On the interaction design front, there are many problems that
must be addressed. For example, the system must provide enough
affordances that the user can learn it. This is not
specific to physiological sensing, though the level of indirect
interpretation of signals can sometimes make end-user debugging
difficult, especially when the system does not act as it is
expected to. The interface must also be designed to handle the
"midas touch" problem, in which interaction is unintentionally
triggered when the user performs everyday tasks like turning a
doorknob. We have purposely designed our gesture sets in order to
minimize this, but we imagine there are more graceful
solutions.
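
One such approach, sketched below purely as an illustration, is a
software "clutch": the system ignores all gestures until the user
performs a deliberate, hard-to-trigger engagement gesture, and it
disengages again after a few seconds. The engagement gesture name and
timeout here are hypothetical.

    import time

    # Possible guard against the "Midas touch" problem (an illustrative
    # design, not the gesture set we actually used): require a deliberate
    # engagement gesture before commands are accepted, then time out.

    ENGAGE_GESTURE = "double_fist_clench"   # hypothetical, unlikely by accident
    ACTIVE_WINDOW_SECONDS = 4.0             # fits the micro-interaction budget

    class ClutchedInput:
        def __init__(self):
            self.engaged_until = 0.0

        def on_gesture(self, gesture, now=None):
            """Return the gesture to act on, or None if it should be ignored."""
            now = time.monotonic() if now is None else now
            if gesture == ENGAGE_GESTURE:
                self.engaged_until = now + ACTIVE_WINDOW_SECONDS
                return None           # engaging is not itself a command
            if now < self.engaged_until:
                return gesture        # engaged: pass the command through
            return None               # disengaged: ignore to avoid false triggers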

In fact, with many new interaction modalities, our first instinct
is often to emulate existing ones (e.g., mouse and
keyboard) and use them to control existing interfaces. However, the
special affordances found in the mobile scenario bring with them
enough deviations from our traditional assumptions that we must
be diligent in designing for it. We should also emphasize the
importance of designing these systems so that they operate
seamlessly with other modalities and devices that the user
carries with them.

Desney Tan is a senior researcher at Microsoft Research, where
he manages the Computational User Experiences group in Redmond,
Washington and the Human-Computer Interaction group in Beijing,
China. He has won awards for his work on physiological computing
and healthcare, including a 2007 MIT TR35 Young Innovators award
and SciFi Channel's Young Visionaries at TED 2009, and he was named
to Forbes' Revolutionaries list in 2009. He will chair the CHI 2011
Conference, which will be held in Vancouver, BC.

Dan Morris is a researcher in the Computational User
Experiences group in Microsoft Research. His research interests
include computer support for musical composition, using
physiological signals for input, and improving within-visit
information accessibility for hospital patients. Dan received his
PhD in Computer Science from Stanford University in 2006.

T. Scott Saponas is a PhD candidate in the Computer Science
and Engineering department at the University of Washington. His
research interests include Human-Computer Interaction (HCI),
Ubiquitous Computing (UbiComp), and Physiological Computing.
Scott received his B.S. in Computer Science from the Georgia
Institute of Technology in 2004.
