CHI 1999-05-15 Volume 1

Groupware

In this paper, we report the results of an empirical study of how people, as
part of their daily work activities, go about establishing collaboration. We
examine the empirical findings and relate them to existing research on CSCW
session management models, i.e., the mechanisms in CSCW systems that define the
way in which people can join together in collaboration. Existing models leave
a lot to be desired, in particular because they tend to assume that indexical
elements of interaction management can be substituted by objective
representations of artifacts. Based on the empirical findings, we derive three
principles to consider in the design of CSCW session management models.

Although current online chat environments provide new opportunities for
communication, they are quite constrained in their ability to convey many
important pieces of social information, ranging from the number of participants
in a conversation to the subtle nuances of expression that enrich face to face
speech. In this paper we present Chat Circles, an abstract graphical interface
for synchronous conversation. Here, presence and activity are made manifest by
changes in color and form, proximity-based filtering intuitively breaks large
groups into conversational clusters, and the archives of a conversation are
made visible through an integrated history interface. Our goal in this work is
to create a richer environment for online discussions.
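The proximity-based filtering idea can be sketched in a few lines, assuming a simple Euclidean hearing radius; the radius and coordinates below are illustrative, not values from the Chat Circles system.

```python
import math

# Illustrative sketch of proximity-based filtering: a participant "hears"
# (sees legible text from) only those circles within a fixed radius of
# their own. Radius and coordinates are assumptions, not system values.
HEARING_RADIUS = 150.0

def conversational_cluster(me, others):
    """Return the positions close enough to `me` to form one cluster."""
    return [p for p in others if math.dist(me, p) <= HEARING_RADIUS]

print(conversational_cluster((0, 0), [(40, 30), (400, 10)]))  # → [(40, 30)]
```

Circles outside the radius would be rendered only as abstract activity rather than readable text, which is what breaks a large group into conversational clusters.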

Designing and deploying groupware is difficult. Groupware evaluation and
design are often approached from a single perspective, with a technologically-,
individually-, or socially-centered focus. A study of Groupware Calendar
Systems (GCSs) highlights the need for a synthesis of these multiple
perspectives to fully understand the adoption challenges these systems face.
First, GCSs often replace existing calendar artifacts, which can impact users'
calendaring habits and in turn influence technology adoption decisions.
Second, electronic calendars have the potential to easily share contextualized
information publicly over the computer network, creating opportunities for peer
judgment about time allocation and raising concerns about privacy regulation.
However, this situation may also support coordination by allowing others to
make useful inferences about one's schedule. Third, the technology and the
social environment are in a reciprocal, co-evolutionary relationship: the use
context is affected by the constraints and affordances of the technology, and
the technology also co-adapts to the environment in important ways. Finally,
GCSs, despite being below the horizon of everyday notice, can affect the nature
of temporal coordination beyond the expected meeting scheduling practice.

Alternatives to QWERTY

The design and evaluation of a high performance soft keyboard for mobile
systems are described. Using a model to predict the upper-bound text entry
rate for soft keyboards, we designed a keyboard layout with a predicted
upper-bound entry rate of 58.2 wpm. This is about 35% faster than the
predicted rate for a QWERTY layout. We compared our design ("OPTI") with a
QWERTY layout in a longitudinal evaluation using five participants and 20
45-minute sessions of text entry. Average entry rates for OPTI increased from
17.0 wpm initially to 44.3 wpm at session 20. The average rates exceeded those
for the QWERTY layout after the 10th session (about 4 hours of practice). A
regression equation (R² = .997) in the form of the power law of learning
predicts that our upper-bound prediction would be reached at about session 50.
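The power-law prediction can be roughly reproduced from the reported figures. The sketch below fits the two endpoint rates (17.0 wpm at session 1, 44.3 wpm at session 20) rather than the paper's full regression over all sessions, so the coefficients are an approximation.

```python
import math

# Power law of learning: rate = a * session**b.
# Fit a and b from the two reported endpoints (an approximation; the
# paper fits a regression over all 20 sessions with R^2 = .997).
r1, s1 = 17.0, 1   # wpm at session 1
r2, s2 = 44.3, 20  # wpm at session 20

b = math.log(r2 / r1) / math.log(s2 / s1)
a = r1 / s1 ** b

def predicted_rate(session):
    """Predicted entry rate (wpm) at a given practice session."""
    return a * session ** b

def session_reaching(target_wpm):
    """Session at which the fitted power law reaches target_wpm."""
    return (target_wpm / a) ** (1 / b)

print(round(session_reaching(58.2)))  # → 47, near the paper's "about session 50"
```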

Using traditional mobile input devices results in decreased effectiveness
and efficiency. To address these usability issues, a portable Non-Keyboard QWERTY
touch-typing paradigm that supports the mobile touch-typing user is presented
and investigated. It requires negligible training time. Pressure sensors
strapped to the fingertips of gloves detect which finger is depressed. A
language model based on lexical and syntactic knowledge transforms the
depressed finger stroke sequence into real words and sentences. Different
mobile input QWERTY paradigms (miniaturised, floating and Non-Keyboard) have
been compared with full-size QWERTY. Among the mobile input paradigms, the
Non-Keyboard fared significantly better in terms of both character error rate
and subjective ratings.

Interest in pen-based user interfaces is growing rapidly. One potentially
useful feature of pen-based user interfaces is gestures, that is, a mark or
stroke that causes a command to execute. Unfortunately, it is difficult to
design gestures that are easy 1) for computers to recognize and 2) for humans
to learn and remember. To investigate these problems, we built a prototype
tool typical of those used for designing gesture sets. An experiment was then
performed to gain insight into the gesture design process and to evaluate this
style of tool. The experiment confirmed that gesture design is very difficult
and suggested several ways in which current tools can be improved. The most
important improvement is to make the tools more active and provide more
guidance for designers. This paper describes the gesture design tool, the
experiment, and its results.

Object Manipulation Studies in Virtual Environments

An experiment was conducted to systematically investigate combined effects
of controller, cursor and target size on multidimensional object manipulation
in a virtual environment. It was found that it was the relative size of
controller, cursor and target that significantly affected object transportation
and orientation processes. There were significant interactions between
controller size and cursor size as well as between cursor size and target size
on the total task completion time, transportation time, orientation time and
spatial errors. The same size of controller and cursor improved object
manipulation speed, and the same size of cursor and target generally
facilitated object manipulation accuracy, regardless of their absolute sizes.
Implications of these findings for human-computer interaction design are
discussed.

We explore the use of the non-dominant hand to control a virtual camera
while the dominant hand performs other tasks in a virtual 3D scene. Two
experiments and an informal study are presented which evaluate this interaction
style by comparing it to the status-quo unimanual interaction. In the first
experiment, we find that for a target selection task, performance using the
bimanual technique was 20% faster. Experiment 2 compared performance in a more
complicated object docking task. Performance advantages are shown, however,
only after practice. Free-form 3D painting was explored in the user study. In
both experiments and in the user study participants strongly preferred the
bimanual technique. The results also indicate that user preferences concerning
bimanual interaction may be driven by factors other than simple time-motion
performance advantages.

This paper reports empirical results from a study into the use of 2D widgets
in 3D immersive virtual environments. Several researchers have proposed the
use of 2D interaction techniques in 3D environments, however little empirical
work has been done to test the usability of such approaches. We present the
results of two experiments conducted on low-level 2D manipulation tasks within
an immersive virtual environment. We empirically show that the addition of
passive-haptic feedback for use in precise UI manipulation tasks can
significantly increase user performance. Furthermore, users prefer interfaces
that provide a physical surface, and that allow them to work with interface
widgets in the same visual field of view as the objects they are modifying.

We take as our premise that it is possible and desirable to design systems
that support social processes. We describe Loops, a project which takes this
approach to supporting computer-mediated communication (CMC) through structural
and interactive properties such as persistence and a minimalist graphical
representation of users and their activities that we call a social proxy. We
discuss a prototype called "Babble" that has been used by our group for over a
year, and has been deployed to six other groups at the Watson labs for about
two months. We describe usage experiences, lessons learned, and next steps.

Despite the importance of credibility in computing products, research on
computer credibility remains relatively sparse. To enhance knowledge about computers
and credibility, we define key terms relating to computer credibility,
synthesize the literature in this domain, and propose three new conceptual
frameworks for better understanding the elements of computer credibility. To
promote further research, we then offer two perspectives on what computer users
evaluate when assessing credibility. We conclude by presenting a set of
credibility-related terms that can serve in future research and evaluation
endeavors.

The past decades have seen huge improvements in computer systems, but these
have proved difficult to translate into comparable improvements in the
usability and social integration of computers. We believe that the problem is
a deeply rooted set of assumptions about how computer systems should be
designed, and about who should be doing that design.
Human organizations are continually evolving to meet changing circumstances
of resource and need. In contrast, computers are quite rigid, incapable of
adaptation on their own. Therefore when computer systems are incorporated into
human organizations, those organizations must adapt the computers to changing
circumstances. This adaptation is another human activity that technology
should support, but our design philosophies are oddly silent about it.
This paper explores the origins of these problems in the norms developed for
managing human organizations, proposes partial solutions that can be
implemented with current systems technology, and speculates about the long-term
potential for radical improvements in system design.

We developed four widely different interfaces for users of Somewire, a
prototype audio-only media space. We informally studied users' experiences
with the two screen-based interfaces. We prototyped a non-screen-based
interface as an example of a novel tangible interface for a communication
system. We explored the conflict between privacy and simplicity of
representation, and identified two unresolved topics: the role of audio quality
and the prospects for scaling audio spaces beyond a single workgroup. Finally,
we formulated a set of design guidelines for control and representation in
audio spaces, as follows: GUIs are not well-suited to audio spaces, users do
not require control over localization or other audio attributes, and awareness
of other users' presence is desirable.

"Whisper" is a new wrist-worn handset, which is used by inserting the
fingertip into the ear canal. A received signal is conveyed from a
wrist-mounted actuator to the ear canal via the hand and a finger by bone
conduction. The user's voice is captured by a microphone mounted on the inside
of the wrist. All components of Whisper can be mounted on the wrist, and
usability does not decrease if the size of components is miniaturized. So,
both wearability and usability can be achieved together. The way Whisper is
operated is similar to that of an ordinary telephone handset. Thus, onlookers
may not look upon Whisper's operation as "talking to oneself", even if the
associated PDA is controlled by voice commands. Whisper is especially
effective in a noisy environment. Signals received via bone conduction can be
heard clearly in the presence of noise without raising the volume (-12 dB at
noise = 90 dB(A) in comparison to cellular phone handset). Whisper is also
effective in avoiding the annoying problem of the user's voice being raised in
a noisy situation. Feedback of the user's utterance is boosted by bone
conduction when the ear canal is covered with a fingertip, so the user's voice
does not need to be raised in the presence of noise (-6 dB at noise = 90 dB(A) in
comparison to cellular phone handset). Whisper is useful as a voice interface
for a wrist-worn PDA and cellular phone.

We describe the i-LAND environment which constitutes an example of our
vision of the workspaces of the future, in this case supporting cooperative
work of dynamic teams with changing needs. i-LAND requires and provides new
forms of human-computer interaction and new forms of computer-supported
cooperative work. Its design is based on an integration of information and
architectural spaces, implications of new work practices and an empirical
requirements study informing our design. i-LAND consists of several 'roomware'
components, i.e. computer-augmented objects integrating room elements with
information technology. We present the current realization of i-LAND in terms
of an interactive electronic wall, an interactive table, two computer-enhanced
chairs, and two "bridges" for the Passage-mechanism. This is complemented by
the description of the creativity support application and the technological
infrastructure. The paper is accompanied by a video figure in the CHI'99 video
program.

This paper describes the evolution, implementation, and use of logjam, a
system for video logging. The system features a game-board that senses the
location and identities of pieces placed upon it. The board is the interface
that enables a group of people to log video footage together. We report on
some of the surprising physical and social dynamics that we have observed in
multi-person logging sessions using the system.

With the proliferation of online multimedia content and the popularity of
multimedia streaming systems, it is increasingly useful to be able to skim and
browse multimedia quickly. A key technique that enables quick browsing of
multimedia is time-compression. Prior research has described how speech can be
time-compressed (shortened in duration) while preserving the pitch of the
audio. However, client-server systems providing this functionality have not
been available.
In this paper, we first describe the key tradeoffs faced by designers of
streaming multimedia systems deploying time-compression. The implementation
tradeoffs primarily impact the granularity of time-compression supported
(discrete vs. continuous) and the latency (wait-time) experienced by users
after adjusting the degree of time-compression. We report results of user
studies showing the impact of these factors on the average compression rate achieved. We
also present data on the usage patterns and benefits of time compression.
Overall, we show significant time-savings for users and that considerable
flexibility is available to the designers of client-server streaming systems
with time compression.
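The time savings behind such results follow from simple arithmetic on the compression rate; a minimal sketch, with hypothetical numbers rather than figures from the study:

```python
def time_saved(duration_s, rate):
    """Seconds saved playing content of the given length at a
    time-compression rate (1.0 = normal speed, 1.4 = 40% faster)."""
    return duration_s - duration_s / rate

# Hypothetical example: a 60-minute recording played at 1.4x
# saves a little over 17 minutes of listening time.
print(time_saved(3600, 1.4) / 60)
```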

Characters and Agents

Programmable Embodied Agents are portable, wireless, interactive devices
embodying specific, differentiable, interactive characteristics. They take the
form of identifiable characters who reside in the physical world and interact
directly with users. They can act as an out-of-band communication channel
between users, as proxies for system components or other users, or in a variety
of other roles. Traditionally, research into such devices has been based on
costly custom hardware. In this paper, we report on our explorations of the
space of physical character-based interfaces built on recently available stock
consumer hardware platforms, structured around an initial framework of
applications.

We introduce the concept of a sympathetic interface for controlling an
animated synthetic character in a 3D virtual environment. A plush doll
embedded with wireless sensors is used to manipulate the virtual character in
an iconic and intentional manner. The interface extends from the novel
physical input device through interpretation of sensor data to the behavioral
"brain" of the virtual character. We discuss the design of the interface and
focus on its latest instantiation in the Swamped! exhibit at SIGGRAPH '98. We
also present what we learned from hundreds of casual users, who ranged from
young children to adults.

Recent debate has centered on the relative promise of focusing
user-interface research on developing new metaphors and tools that enhance
users' abilities to directly manipulate objects versus directing effort toward
developing interface agents that provide automation. In this paper, we review
principles that show promise for allowing engineers to enhance human-computer
interaction through an elegant coupling of automated services with direct
manipulation. Key ideas will be highlighted in terms of the LookOut system for
scheduling and meeting management.

Progress in Drawing and CAD

Rotating a piece of paper while drawing is an integral and almost
subconscious part of drawing with pencil and paper. In a similar manner, the
advent of lightweight pen-based computers allows digital artwork to be rotated
while drawing by rotating the entire computer. Given this type of manipulation,
we explore the implications for the user interface to support artwork
orientation. First we describe an exploratory study to further motivate our
work and characterize how artwork is manipulated while drawing. After
presenting some possible UI approaches to support artwork orientation, we
define a new solution called a rotating user interface (RUI). We then discuss
design issues and requirements for RUIs based on our exploratory study.

Current object-oriented drawing programs have an established way of drawing
in which the shape of an object is controlled by manipulating control points.
While the control points are intuitive in their basic use, it is not clear
whether they make more complex drawing tasks manageable for the average user.
In this paper we describe an alternative way of drawing and editing a drawing
using new direct manipulation tools. Our approach resembles sculpting in two
dimensions: the user begins with a large block and uses different tools to give
it the desired shape. We also present a user evaluation in which the users
could try our new tools and compare them to their previous experience of
control points. The users claimed to understand the operations better with our
tools than if they had needed to use curves and control points. However, our
tools were better suited for sketching the artwork than for making very
detailed drawings.

The inefficient use of complex computer systems has been widely reported.
These studies show the persistence of inefficient methods despite many years of
experience and formal training. To counteract this phenomenon, we present the
design of a new course, called the Strategic Use of CAD. The course aims at
teaching students efficient strategies to use a computer-aided drafting system
through a two-pronged approach. Learning to See teaches students to recognize
opportunities to use efficient strategies by studying the nature of the task,
and Learning to Do teaches students to implement the strategies. Results from
a pilot experiment show that this approach had a positive effect on the
strategic behavior of students who did not exhibit knowledge of efficient
strategies before the class, and had no effect on the strategic behavior of
those who did. Strategic training can thus assist users in recognizing
opportunities to use efficient strategies. We present the ramifications of
these results on the design of training and future experiments.

Keywords: CAD, Strategy, Training, GOMS, Learning, Efficiency

Programming Techniques and Issues

This paper describes an architecture for supporting interface attachments --
small interactive programs which are designed to augment the functionality of
other applications. This architecture is designed to work with a diverse set
of conventional applications, but requires only a minimal set of "hooks" into
those applications. In order to achieve this, the work described here
concentrates on what we will call observational attachments, a subclass of
attachments that operate primarily by observing and manipulating the surface
representations of applications -- that is, the visual information that
applications would normally display on the screen or print. These attachments
can be thought of as "looking over the shoulder of the user" to assist with
various tasks. By requiring very little modification to, or help from, the
applications they augment, this approach supports the creation of a set of
uniform services that can be applied across a more diverse set of applications
than traditional approaches.

The VisMap system provides for "visual manipulation" of arbitrary
off-the-shelf applications, through an application's graphical user interface.
VisMap's API-independent control has advantages for tasks that can benefit from
direct access to the functions of the user interface. We describe the design
goals and architecture of the system, and we discuss two applications, a
user-controlled visual scripting program and an autonomous solitaire-playing
program, which together demonstrate some of the capabilities and limitations of
the approach.

Should programming languages use natural-language-like syntax? Under what
circumstances? What sorts of errors do novice programmers make? Does using a
natural-language-like programming language lead to user errors? In this study,
we read the entire online interactions of sixteen children who issued a total
of 35,047 commands on MOOSE Crossing, an educational MUD for children. We
counted and categorized the errors made. A total of 2,970 errors were
observed. We define "natural-language errors" as those errors in which the
user failed to distinguish between English and code, issuing an incorrect
command that was more English-like than the correct one. A total of 314
natural-language errors were observed. In most of those errors, the child was
able to correct the problem either easily (41.1% of the time) or with some
effort (20.7%). Natural-language errors were divided into five categories. In
order from most to least frequent, they are: syntax errors, guessing a command
name by supplying an arbitrary English word, literal interpretation of
metaphor, assuming the system is keeping more state information than is
actually the case, and errors of operator precedence and combination. We
believe that these error rates are within acceptable limits, and conclude that
leveraging users' natural-language knowledge is for many applications an
effective strategy for designing end-user-programming languages.

Touching, Pointing, and Choosing

The ISO 9241, Part 9 Draft International Standard for testing computer
pointing devices proposes an evaluation of performance and comfort. In this
paper we evaluate the scientific validity and practicality of these dimensions
for two pointing devices for laptop computers, a finger-controlled isometric
joystick and a touchpad. Using a between-subjects design, evaluation of
performance using the measure of throughput was done for one-direction and
multi-directional pointing and selecting. Results show a significant
difference in throughput for the multi-directional task, with the joystick 27%
higher; results from the one-direction task were non-significant. After the
experiment, participants rated the device for comfort, including operation,
fatigue, and usability. The questionnaire showed no overall difference in the
responses, and a statistically significant difference only in the question
concerning the force required to operate the device -- the joystick requiring
slightly more force. The paper concludes with a discussion of problems in
implementing the ISO standard and recommendations for improvement.
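Throughput in such evaluations is conventionally computed from the effective index of difficulty; a sketch under that assumption, using made-up trial data rather than numbers from the study:

```python
import math
import statistics

def throughput(distances, selection_x, movement_times):
    """Fitts' throughput (bits/s) via the effective-width method
    associated with ISO 9241-9: We = 4.133 * SD of selection
    coordinates along the movement axis, IDe = log2(De/We + 1)."""
    De = statistics.mean(distances)               # effective distance
    We = 4.133 * statistics.stdev(selection_x)    # effective width
    IDe = math.log2(De / We + 1)                  # effective ID (bits)
    return IDe / statistics.mean(movement_times)

# Hypothetical trials: movement distances (px), selection endpoints
# relative to target centre (px), and movement times (s).
tp = throughput([256, 250, 260, 248], [3.0, -4.0, 1.5, -0.5],
                [0.9, 1.0, 0.95, 1.05])
print(tp)  # bits per second
```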

We can touch things, and our senses tell us when our hands are touching
something. But most computer input devices cannot detect when the user touches
or releases the device or some portion of the device. Thus, adding touch
sensors to input devices offers many possibilities for novel interaction
techniques. We demonstrate the TouchTrackball and the Scrolling TouchMouse,
which use unobtrusive capacitance sensors to detect contact from the user's
hand without requiring pressure or mechanical actuation of a switch. We
further demonstrate how the capabilities of these devices can be matched to an
implicit interaction technique, the On-Demand Interface, which uses the passive
information captured by touch sensors to fade in or fade out portions of a
display depending on what the user is doing; a second technique uses explicit,
intentional interaction with touch sensors for enhanced scrolling. We present
our new devices in the context of a simple taxonomy of tactile input
technologies. Finally, we discuss the properties of touch-sensing as an input
channel in general.

The proliferation of multiple toolbars and UI widgets around the perimeter
of application windows is an indication that the traditional GUI design of a
single menubar is not sufficient to support large scale applications with
numerous functions. In this paper we describe a new widget which is an
enhancement of the traditional menubar which dramatically increases menu-item
capacity. This widget, called the "Hotbox", combines several GUI techniques
which are generally used independently: accelerator keys, modal dialogs,
pop-up/pull down menus, radial menus, marking menus and menubars. These
techniques are fitted together to create a single, easy to learn yet fast to
operate GUI widget which can handle significantly more menu-items than the
traditional GUI menubar. We describe the design rationale of the Hotbox and
its effectiveness in a large scale commercial application. While the Hotbox
was developed for a particular application domain, the widget itself and the
design rationale are potentially useful in other domains.

Gaze and Purpose

Human interfaces are usually designed to respond only to intentional human
behaviors. However, humans show unintentional behaviors as well. They can
convey useful information for realizing user-friendly human interfaces. This
paper shows how to combine observations of both types of behaviors, using two
human-machine systems as examples: a gesture-based interface and an intelligent
wheelchair. In the first system, intentional hand gestures are chosen using
unintentional behaviors. In the second system, near unintentional behaviors
following intentional behaviors can be used to control the wheelchair motion.
Experimental systems working in real time have been developed, and operational
experiments show our approach to be promising.

This work explores a new direction in utilizing eye gaze for computer input.
Gaze tracking has long been considered as an alternative or potentially
superior pointing method for computer input. We believe that many fundamental
limitations exist with traditional gaze pointing. In particular, it is
unnatural to overload a perceptual channel such as vision with a motor control
task. We therefore propose an alternative approach, dubbed MAGIC (Manual And
Gaze Input Cascaded) pointing. With such an approach, pointing appears to the
user to be a manual task, used for fine manipulation and selection. However, a
large portion of the cursor movement is eliminated by warping the cursor to the
eye gaze area, which encompasses the target. Two specific MAGIC pointing
techniques, one conservative and one liberal, were designed, analyzed, and
implemented with an eye tracker we developed. They were then tested in a pilot
study. This early-stage exploration showed that the MAGIC pointing techniques
might offer many advantages, including reduced physical effort and fatigue as
compared to traditional manual pointing, greater accuracy and naturalness than
traditional gaze pointing, and possibly faster speed than manual pointing. The
pros and cons of the two techniques are discussed in light of both performance
data and subjective reports.
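One reading of the "liberal" warping strategy can be sketched as follows; the threshold value is an arbitrary assumption for illustration, not a parameter from the paper.

```python
import math

# Sketch of gaze-triggered cursor warping in the spirit of MAGIC
# pointing: when the gaze has moved far from the cursor, warp the
# cursor to the gaze area and leave fine positioning to the hand.
WARP_THRESHOLD = 120.0  # px; arbitrary illustrative value

def maybe_warp(cursor, gaze):
    """Return the new cursor position given the current gaze point."""
    if math.dist(cursor, gaze) > WARP_THRESHOLD:
        return gaze      # warp: eliminate the long ballistic movement
    return cursor        # nearby target: manual fine adjustment only

print(maybe_warp((0, 0), (500, 300)))  # → (500, 300)
```

A "conservative" variant would additionally wait for the hand to start moving before warping, trading responsiveness for fewer spurious jumps.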

While current eye-based interfaces offer enormous potential for efficient
human-computer interaction, they also manifest the difficulty of inferring
intent from user eye movements. This paper describes how fixation tracing
facilitates the interpretation of eye movements and improves the flexibility
and usability of eye-based interfaces. Fixation tracing uses hidden Markov
models to map user actions to the sequential predictions of a cognitive process
model. In a study of eye typing, results show that fixation tracing generates
significantly more accurate interpretations than simpler methods and allows for
more flexibility in designing usable interfaces. Implications for future
research in eye-based interfaces and multimodal interfaces are discussed.
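The core machinery of fixation tracing, mapping a fixation sequence to the most likely sequence of intended targets, can be sketched with a toy Viterbi decoder over a two-state HMM. All probabilities below are illustrative, not the paper's model.

```python
# Toy sketch of HMM decoding as used in fixation tracing: hidden
# states are intended targets, observations are noisy fixation
# regions, and Viterbi recovers the most likely target sequence.
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for an observation sequence."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        V.append({s: max(((V[-1][p][0] * trans_p[p][s] * emit_p[s][o], p)
                          for p in states), key=lambda t: t[0])
                  for s in states})
    best = max(states, key=lambda s: V[-1][s][0])   # best final state
    path = [best]
    for layer in reversed(V[1:]):                   # follow back-pointers
        path.append(layer[path[-1]][1])
    return path[::-1]

# Two intended targets A and B; fixations land in region 'a' or 'b'.
states = ["A", "B"]
start = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.3, "B": 0.7}}
emit = {"A": {"a": 0.8, "b": 0.2}, "B": {"a": 0.2, "b": 0.8}}
print(viterbi(["a", "a", "b"], states, start, trans, emit))  # → ['A', 'A', 'B']
```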

Foundations for Navigation

This paper reports on Direct Combination, a new user interaction technique.
Direct Combination may be viewed variously as: a systematic extension to Direct
Manipulation; a concise navigational framework to help users find the
operations they need; and as a framework to make a greater range and variety of
operations available to the user, without overburdening user or interface
designer. While Direct Combination may be seen as an extension of Direct
Manipulation, it may also be applied to a wide range of user interaction
styles, including even command line interfaces. Examples from various
hypothetical systems and from an implemented system are presented. This paper
argues that Direct Combination is applicable not just to problem seeking or
design oriented domains (where the technique originated) but is generally
applicable. A variety of new interaction styles for Direct Combination are
presented, and the generalisation of Direct Combination to the n-dimensional
case is described.

Inspired by Hill and Hollan's original work [7], we have been developing a
theory of interaction history and building tools to apply this theory to
navigation in a complex information space. We have built a series of tools --
map, paths, annotations and signposts -- based on a physical-world navigation
metaphor. These tools have been in use for over a year. Our user study
involved a controlled browse task and showed that users were able to get the
same amount of work done with significantly less effort.

Unfamiliar, large-scale virtual environments are difficult to navigate.
This paper presents design guidelines to ease navigation in such virtual
environments. The guidelines presented here focus on the design and placement
of landmarks in virtual environments. Moreover, the guidelines are based
primarily on the extensive empirical literature on navigation in the real
world. A rationale for this approach is provided by the similarities between
navigational behavior in real and virtual environments.

Working with People Near and Far

We introduce a model for supporting collaborative work between people who
are physically close to each other. We call this model Single Display
Groupware (SDG). In this paper, we describe the model, comparing it to more
traditional remote collaboration. We describe the requirements that SDG places
on computer technology, and our understanding of the benefits and costs of SDG
systems. Finally, we describe a prototype SDG system that we built and the
results of a usability test we ran with 60 elementary school children.

In this paper, we discuss why, in designing multiparty mediated systems, we
should focus first on providing non-verbal cues which are less redundantly
coded in speech than those normally conveyed by video. We show how conveying
one such cue, gaze direction, may solve two problems in multiparty mediated
communication and collaboration: knowing who is talking to whom, and who is
talking about what. As a candidate solution, we present the GAZE Groupware
System, which combines support for gaze awareness in multiparty mediated
communication and collaboration with small and linear bandwidth requirements.
The system uses an advanced, desk-mounted eyetracker to metaphorically convey
gaze awareness in a 3D virtual meeting room and within shared documents.

More and more organizations are forming teams that are not co-located.
These teams communicate via email, fax, telephone and audio conferences, and
sometimes video. The question often arises whether the cost of video is worth
it. Previous research has shown that video makes people more satisfied with
the work, but it does not improve the quality of the work itself. There is one
exception: negotiation tasks are measurably better with video. In this study,
we show that the same effect holds for a more subtle form of negotiation, when
people have to negotiate meaning in a conversation. We compared the
performance and communication of people explaining a map route to each other.
Half the pairs had video and audio connections; half had only audio. Half of
the pairs were native speakers of English; the other half were non-native
speakers, who presumably have to negotiate meaning more. The results showed that
non-native speaker pairs did benefit from the video; native speakers did not.
Detailed analysis of the conversational strategies showed that with video, the
non-native speaker pairs spent proportionately more effort negotiating common
ground.

Stories and Narratives

Narrative is fundamental to the ways we make sense of texts of all kinds
because it provides structure and coherence, but it is difficult to see how
this works in the context of multimedia interactive learning environments
(MILEs). We tested our hypotheses about the form and function of narrative in
MILEs by developing three versions of material on CD-ROM which had different
narrative structures and analysed the impact of the different versions on
learner behaviour. We present a theoretical framework in which we explain the
concepts of narrative guidance and narrative construction and their application
to the design of MILEs.

Interactive software is widely used for learning and entertainment. Such
software is uncommon among blind children, however, because most computer games
and electronic toys lack interfaces that are accessible without visual cues.
This study introduces interactive hyperstories set in a 3D acoustic virtual
world for blind children. We have developed a model for designing
hyperstories, and with AudioDoom we have an application for testing cognitive
tasks with blind children. The main research question underlying this work is
how audio-based entertainment and navigable spatial-sound experiences can
create cognitive spatial structures in the minds of blind children.
AudioDoom presents first person experiences through exploration of
interactive virtual worlds by using only 3D aural representations of the space.

We have begun the development of a new robotic pet that can support children
in the storytelling process. Children can build their own pet by snapping
together the modular animal parts of the PETS robot. After their pet is built,
children can tell stories using the My Pets software. These stories can then
be acted out by their robotic pet. This video paper describes the motivation
for this research and the design process of our intergenerational design team
in building the first PETS prototypes. We will discuss our progress to date
and our focus for the future.

NotePals is a lightweight note sharing system that gives group members easy
access to each other's experiences through their personal notes. The system
allows notes taken by group members in any context to be uploaded to a shared
repository. Group members view these notes with browsers that allow them to
retrieve all notes taken in a given context or to access notes from other
related notes or documents. This is possible because NotePals records the
context in which each note is created (e.g., its author, subject, and creation
time). The system is "lightweight" because it fits easily into group members'
regular note-taking practices, and uses informal, ink-based user interfaces
that run on portable, inexpensive hardware. In this paper we describe
NotePals, show how we have used it to share our notes, and present our
evaluations of the system.
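
The context record at the heart of NotePals can be sketched as a small data
structure plus a retrieval function. The field names and the query below are
illustrative assumptions, not NotePals' actual schema:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class Note:
    author: str        # who took the note
    subject: str       # the context, e.g. a meeting or course name
    created: datetime  # creation time
    ink: bytes         # raw ink strokes from the handheld device

def notes_in_context(repository: List[Note], subject: str) -> List[Note]:
    """Retrieve every group member's notes taken in a given context."""
    return [n for n in repository if n.subject == subject]

repo = [
    Note("alice", "chi-planning", datetime(1999, 5, 15), b"..."),
    Note("bob", "chi-planning", datetime(1999, 5, 15), b"..."),
    Note("alice", "budget", datetime(1999, 5, 16), b"..."),
]
print(len(notes_in_context(repo, "chi-planning")))  # 2
```

Because each note carries its context, a browser can pivot from any note to all
related notes without the author doing extra filing work.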

Flatland is an augmented whiteboard interface designed for informal office
work. Our research investigates approaches to building an augmented whiteboard
in the context of continuous, long term office use. In particular, we pursued
three avenues of research based on input from user studies: techniques for the
management of space on the board, the ability to flexibly apply behaviors to
support varied application semantics, and mechanisms for managing history on
the board. Unlike some previously reported whiteboard systems, our design
choices have been influenced by a desire to support long-term, informal use in
an individual office setting.

Tagging and Tracking Objects in Physical UIs

The Palette is a digital appliance designed for intuitive control of
electronic slide shows. Current interfaces demand too much of our attention to
permit effective computer use in situations where we cannot give the
technology our fullest concentration. The Palette uses index cards that are
printed with slide content that is easily identified by both humans and
computers. The presenter controls the presentation by directly manipulating
the cards. The Palette design is based on our observation of presentations
given in a real work setting. Our experiences using the system are described,
including new practices (e.g., collaborative presentation, enhanced notetaking)
that arise from the affordances of this new approach. This system is an
example of a new interaction paradigm, called tacit interaction, that supports
users who can devote very little attention to a computer interface.

We present TouchCounters, an integrated system of electronic modules,
physical storage containers, and shelving surfaces for the support of
collaborative physical work. Through physical sensors and local displays,
TouchCounters record and display usage history information upon physical
storage containers, thus allowing access to this information during the
performance of real-world tasks. A distributed communications network allows
this data to be exchanged with a server, such that users can access this
information from remote locations as well.
Based upon prior work in ubiquitous computing and tangible interfaces,
TouchCounters incorporate new techniques, including usage history tracking for
physical objects and multi-display visualization. This paper describes the
components, interactions, implementation, and conceptual approach of the
TouchCounters system.

The role of computers in the modern office has divided our activities
between virtual interactions in the realm of the computer and physical
interactions with real objects within the traditional office infrastructure.
This paper extends previous work that has attempted to bridge this gap, to
connect physical objects with virtual representations or computational
functionality, via various types of tags. We discuss a variety of scenarios we
have implemented using a novel combination of inexpensive, unobtrusive and easy
to use RFID tags, tag readers, portable computers and wireless networking.
This novel combination demonstrates the utility of invisibly, seamlessly and
portably linking physical objects to networked electronic services and actions
that are naturally associated with their form.

Augmented Surfaces

This paper describes our design and implementation of a computer augmented
environment that allows users to smoothly interchange digital information among
their portable computers, table and wall displays, and other physical objects.
Supported by a camera-based object recognition system, users can easily
integrate their portable computers with the pre-installed ones in the
environment. Users can use displays projected on tables and walls as a
spatially continuous extension of their portable computers. Using an
interaction technique called hyperdragging, users can transfer information from
one computer to another by knowing only the physical relationship between
them. We also provide a mechanism for attaching digital data to physical
objects, such as a videotape or a document folder, to link physical and digital
spaces.

We introduce a system for urban planning -- called Urp -- that integrates
functions addressing a broad range of the field's concerns into a single,
physically based workbench setting. The I/O Bulb infrastructure on which the
application is based allows physical architectural models placed on an ordinary
table surface to cast shadows accurate for arbitrary times of day; to throw
reflections off glass facade surfaces; to affect a real-time and visually
coincident simulation of pedestrian-level windflow; and so on.
We then use comparisons among Urp and several earlier I/O Bulb applications
as the basis for an understanding of luminous-tangible interactions, which
result whenever an interface distributes meaning and functionality between
physical objects and visual information projectively coupled to those objects.
Finally, we briefly discuss two issues common to all such systems, offering
them as informal thought-tools for the design and analysis of luminous-tangible
interfaces.

This paper introduces a novel interface for digitally-augmented cooperative
play. We present the concept of the "athletic-tangible interface," a new class
of interaction which uses tangible objects and full-body motion in physical
spaces with digital augmentation. We detail the implementation of
PingPongPlus, a "reactive ping-pong table", which features a novel sound-based
ball tracking technology. The game is augmented and transformed with dynamic
graphics and sound, determined by the position of impact, and the rhythm and
style of play. A variety of different modes of play and initial experiences
with PingPongPlus are also described.

Cognitive Models of Screen Interaction

Click-down (or pull-down) menus have long been a key component of graphical
user interfaces, yet we know surprisingly little about how users actually
interact with such menus. Nilsen's [8] study on menu selection has led to the
development of a number of models of how users perform the task [6, 2].
However, the validity of these models has not been empirically assessed with
respect to eye movements (though [1] presents some interesting data that bear
on these models). The present study is an attempt to provide data that can
help refine our understanding of how users interact with such menus.

This research presents cognitive models of a person selecting an item from a
familiar, ordered, pull-down menu. Two different models provide a good fit
with human data and thus two different possible explanations for the low-level
cognitive processes involved in the task. Both models assert that people make
an initial eye and hand movement to an anticipated target location without
waiting for the menu to appear. The first model asserts that a person knows
the exact location of the target item before the menu appears, but the model
uses nonstandard Fitts' law coefficients to predict mouse pointing time. The
second model asserts that a person would only know the approximate location of
the target item, and the model uses Fitts' law coefficients better supported by
the literature. This research demonstrates that people can develop
considerable knowledge of locations in a visual task environment, and that more
work regarding Fitts' law is needed.
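
Both models rest on Fitts' law, which in its Shannon formulation predicts
movement time as MT = a + b * log2(D/W + 1); the two explanations differ only
in the coefficients a and b they assume. A minimal sketch, with illustrative
coefficients rather than the paper's fitted values:

```python
import math

def fitts_time(a: float, b: float, distance: float, width: float) -> float:
    """Fitts' law (Shannon formulation): MT = a + b * log2(D/W + 1)."""
    return a + b * math.log2(distance / width + 1)

# Illustrative coefficients only -- not the values fitted in the paper.
standard = fitts_time(a=0.1, b=0.15, distance=120, width=20)
nonstandard = fitts_time(a=0.1, b=0.05, distance=120, width=20)
print(round(standard, 3), round(nonstandard, 3))
```

The first model's tight "exact location" fit corresponds to an unusually small
slope b; the second keeps a literature-supported slope but widens the assumed
target region.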

Models of learning and performing by exploration assume that the semantic
similarity between task descriptions and labels on display objects (e.g.,
menus, tool bars) controls in part the users' search strategies. Nevertheless,
none of the models has an objective way to compute semantic similarity. In
this study, Latent Semantic Analysis (LSA) was used to compute semantic
similarity between task descriptions and labels in an application's menu
system. Participants performed twelve tasks by exploration and they were
tested for recall after a 1-week delay. When the labels in the menu system
were semantically similar to the task descriptions, subjects performed the
tasks faster. LSA could be incorporated into any of the current models, and it
could be used to automate the evaluation of computer applications for ease of
learning and performing by exploration.
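
LSA represents a task description and a menu label as vectors in a reduced
semantic space and scores their closeness by the cosine of the angle between
the vectors. A sketch of that final scoring step, with toy vectors standing in
for ones a real LSA pipeline would derive by SVD of a term-document matrix:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two semantic-space vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy vectors; the example labels are hypothetical.
task_desc = [0.8, 0.1, 0.3]   # e.g. "center the title on the page"
label_a   = [0.7, 0.2, 0.4]   # e.g. menu label "Alignment"
label_b   = [0.1, 0.9, 0.0]   # e.g. menu label "Print Setup"
print(cosine_similarity(task_desc, label_a) >
      cosine_similarity(task_desc, label_b))  # True
```

A model of exploration could then predict that users try the menu whose label
scores highest against the task description.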

Tools for Building Interfaces and Applications

Interface builders are popular tools for designing and developing graphical
user interfaces. These tools, however, are engineering-centered; they operate
mainly on windows and widgets. A typical interface builder does not offer any
specific support for user-centered interface design, a methodology recognized
as critical for effective user interface design. We present MOBILE
(Model-Based Interface Layout Editor), an interface-building tool that fully
supports user-centered design and that guides the interface building process by
using user-task models and a knowledge base of interface design guidelines.
The approach in MOBILE has the important added benefit of being useful in both
top-down and bottom-up interface design strategies.

Context-enabled applications are just emerging and promise richer
interaction by taking environmental context into account. However, they are
difficult to build due to their distributed nature and the use of
unconventional sensors. The concepts of toolkits and widget libraries in
graphical user interfaces have been tremendously successful, allowing
programmers to leverage off existing building blocks to build interactive
systems more easily. We introduce the concept of context widgets that mediate
between the environment and the application in the same way graphical widgets
mediate between the user and the application. We illustrate the concept of
context widgets with the beginnings of a widget library we have developed for
sensing presence, identity and activity of people and things. We assess the
success of our approach with two example context-enabled applications we have
built and an existing application to which we have added context-sensing
capabilities.
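
The parallel between context widgets and graphical widgets can be sketched as a
callback interface that hides the sensor behind it; the class and method names
here are illustrative, not the toolkit's actual API:

```python
class PresenceWidget:
    """A context widget: mediates between an environment sensor and the
    application, as a GUI widget mediates between user and application."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        # The application registers interest without knowing which
        # sensor (badge reader, camera, floor mat, ...) is underneath.
        self._callbacks.append(callback)

    def sensor_reading(self, person_id, location):
        # Called by the arbitrary underlying sensor driver.
        for cb in self._callbacks:
            cb(person_id, location)

widget = PresenceWidget()
seen = []
widget.subscribe(lambda who, where: seen.append((who, where)))
widget.sensor_reading("anind", "room-383")
print(seen)  # [('anind', 'room-383')]
```

Swapping the sensor changes only what calls sensor_reading; the application's
subscription code is untouched, which is the point of the widget analogy.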

Programming-by-demonstration (PBD) can be used to create tools and methods
that eliminate the need to learn difficult computer languages. Gamut is a PBD
tool that nonprogrammers can use to create a broader range of interactive
software, including games, simulations, and educational software, than they can
with other PBD tools. To do this, Gamut provides advanced interaction
techniques that make it easier for a developer to express all aspects of an
application. These techniques include a simplified way to demonstrate new
examples, called "nudges," and a way to highlight objects to show they are
important. Also, Gamut includes new objects and metaphors like the
deck-of-cards metaphor for demonstrating collections of objects and randomness,
guide objects for demonstrating relationships that the system would find too
difficult to guess, and temporal ghosts which simplify showing relationships
with the recent past. These techniques were tested in a formal setting with
nonprogrammers to evaluate their effectiveness.

Vision and Fitts' Law

Fitts' pointing model has proven extremely useful for understanding basic
selection in WIMP user interfaces. Yet today's interfaces involve more complex
navigation within electronic environments. As navigation amounts to a form of
multi-scale pointing, Fitts' model can be applied to these more complex tasks.
We report the results of a preliminary pointing experiment that shows that
users can handle higher levels of task difficulty with two-scale rather than
traditional one-scale pointing control. Also, in tasks with very
high-precision hand movements, performance is higher with a stylus than with a
mouse.

This paper explores how 'contact points' or co-references between an
animation and text should be designed in web pages. Guidelines are derived
from an eye tracking study. A dynamic HTML authoring tool is described which
supports these requirements. An evaluation study is reported in which four
designs of animation in web pages were tested.

Keywords: Web page design, Authoring tools

Performance Evaluation of Input Devices in Trajectory-Based Tasks: An
Application of The Steering Law

Choosing input devices for interactive systems that best suit users' needs
remains a challenge, especially considering the increasing number of devices
available. The choice often has to be made through empirical evaluations. The
most frequently used evaluation task hitherto is target acquisition, a task
that can be accurately modeled by Fitts' law. However, today's use of computer
input devices has gone beyond target acquisition alone. In particular, we
often need to perform trajectory-based tasks, such as drawing, writing, and
navigation. This paper illustrates how a recently discovered model, the
steering law, can be applied as an evaluation paradigm complementary to Fitts'
law. We tested five commonly used computer input devices in two steering
tasks, one linear and one circular. Results showed that subjects' performance
with the five devices could be generally classified into three groups in the
following order: 1. the tablet and the mouse, 2. the trackpoint, 3. the
touchpad and the trackball. The steering law proved to hold for all five
devices with greater than 0.98 correlation. The ability to generalize the
experimental results and the limitations of the steering law are also
discussed.
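
The steering law predicts the time to steer through a straight tunnel of length
A and constant width W as T = a + b * (A/W), the trajectory analogue of Fitts'
law. A sketch with illustrative coefficients (the paper fits per-device values,
with correlations above 0.98):

```python
def steering_time(a: float, b: float, length: float, width: float) -> float:
    """Steering law for a straight tunnel of constant width: T = a + b * A/W."""
    return a + b * (length / width)

# Illustrative coefficients only; a slower device has a larger slope b.
for device, b in [("tablet", 0.07), ("trackpoint", 0.12), ("trackball", 0.20)]:
    print(device, round(steering_time(a=0.2, b=b, length=250, width=25), 2))
```

Narrowing the tunnel raises A/W, so the model predicts steering time grows
linearly with required precision, unlike the logarithmic growth in Fitts' law.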

Learning and Reading

We are exploring a new class of tools for learners: scaffolded integrated
tool environments (or SITEs), which address the needs of learners trying to
engage in new, complex work processes. A crucial phase within a
learner-centered design approach for SITE design involves analyzing the work
process to identify areas where learners need support to engage in the process.
Here we discuss the design of Symphony, a SITE for high-school science
students. Specifically, we discuss how the process-space model helped us
analyze the science inquiry process to help us identify a detailed set of
learner needs, leading to a full set of process scaffolding strategies for
Symphony.

Over the last two centuries, reading styles have shifted away from the
reading of documents from beginning to end and toward the skimming of documents
in search of relevant information. This trend continues today where readers,
often confronted with an insurmountable amount of text, seek more efficient
methods of extracting relevant information from documents. In this paper, a
new document reading environment is introduced called the Reader's Helper,
which supports the reading of electronic and paper documents. The Reader's
Helper analyzes documents and produces a relevance score for each of the
reader's topics of interest, thereby helping the reader decide whether the
document is actually worth skimming or reading. Moreover, during the analysis
process, topic of interest phrases are automatically annotated to help the
reader quickly locate relevant information. A new information visualization
tool, called the Thumbar, is used in conjunction with relevancy scoring and
automatic annotation to portray a continuous, dynamic thumbnail representation
of the document. This further supports rapid navigation of the text.

This paper describes a research study that investigated how designers can
use frames of reference (FORs: egocentric, exocentric, and a combination of the two)
to support the mastery of abstract multidimensional information. The primary
focus of this study was the relationship between FORs and mastery; the
secondary focus was on other factors (individual characteristics and
interaction experience) that were likely to influence the relationship between
FORs and mastery. This study's outcomes (1) clarify how FORs work in
conjunction with other factors in shaping mastery, (2) highlight strengths and
weaknesses of different FORs, (3) demonstrate the benefits of providing
multiple FORs, and (4) provide the basis for our recommendations to HCI
researchers and designers.

Navigation and Visualization

FotoFile is an experimental system for multimedia organization and
retrieval, based upon the design goal of making multimedia content accessible
to non-expert users. Search and retrieval are done in terms that are natural
to the task. The system blends human and automatic annotation methods. It
extends textual search, browsing, and retrieval technologies to support
multimedia data types.

Multi-focus distortion-oriented views are useful for viewing large
information spaces on a small screen, but they still pose problems for managing
multiple foci during editing. The user may have to navigate the information
space by focusing and defocusing multiple parts to obtain multi-focus layouts
that change according to the editing situation. As a result, navigating and
editing large nested networks such as hypertexts becomes cumbersome. We
propose a user interface for quickly obtaining desirable layouts. The
interface uses two techniques: focus size prediction and predictive focus
selection. These techniques are based on a user test and experiences in
applications. We also describe two example applications.

The widespread use of information visualization is hampered by the lack of
effective labeling techniques. An informal taxonomy of labeling methods is
proposed. We then describe "excentric labeling", a new dynamic technique to
label a neighborhood of objects located around the cursor. The technique does
not intrude on the existing interaction, is not computationally intensive, and
was easily applied to several visualization applications. A pilot study
with eight subjects indicates a strong speed benefit over a zoom interface for
tasks that involve the exploration of large numbers of objects. Observations
and comments from users are presented.

Keywords: Visualization, Label, Dynamic labeling, Evaluation
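
The core of excentric labeling can be sketched as: collect the objects within a
focus radius of the cursor, then lay their labels out beside the focus region.
The layout below is a much-simplified illustration of the idea, not the paper's
algorithm:

```python
import math

def excentric_labels(objects, cursor, radius, line_height=14):
    """Pick objects within `radius` of the cursor and stack their labels
    in a column beside the focus circle (a much-simplified layout)."""
    cx, cy = cursor
    nearby = [o for o in objects
              if math.hypot(o["x"] - cx, o["y"] - cy) <= radius]
    nearby.sort(key=lambda o: o["y"])  # keep label order close to spatial order
    return [(o["label"], (cx + radius + 10, cy + i * line_height))
            for i, o in enumerate(nearby)]

points = [{"label": "A", "x": 10, "y": 10},
          {"label": "B", "x": 14, "y": 12},
          {"label": "C", "x": 90, "y": 90}]
print(excentric_labels(points, cursor=(12, 11), radius=20))
```

Only the neighborhood under the cursor is ever labeled, which is why the
technique stays cheap even over visualizations with thousands of objects.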

Virtual Reality and Embodiment

In this paper, we argue for embodied conversational characters as the
logical extension of the metaphor of human-computer interaction as a
conversation. We argue that the only way to fully model the richness of human
face-to-face communication is to rely on conversational analysis that describes
sets of conversational behaviors as fulfilling conversational functions, both
interactional and propositional. We demonstrate how to implement this approach
in Rea, an embodied conversational agent that is capable of both multimodal
input understanding and output generation in a limited application domain. Rea
supports both social and task-oriented dialogue. We discuss issues that need
to be addressed in creating embodied conversational agents, and describe the
architecture of the Rea interface.

Character-based social interfaces present a unique opportunity to integrate
emotion into technology interactions. The present paper reports on the use of
three emotional interactions (humor, praise, and affection) in the audio
interfaces for two character-based interactive learning toys. The reasons for
selecting the emotions used, the design rationale for their application, and
findings from usability testing are reviewed. It is suggested that as a form
of pretend play-acting akin to puppetry, social interfaces can engage the
emotions of users in a variety of beneficial ways.

A distributed immersive virtual environment was deployed as a component of a
pedagogical strategy for teaching third grade children that the Earth is round.
The displacement strategy is based on the theory that fundamental conceptual
change requires an alternative cognitive starting point which doesn't invoke
the features of pre-existing models. While the VR apparatus helped to
establish that alternative framework, conceptual change was strongly influenced
by the bridging activities which related that experience to the target domain.
Simple declarations of relevance proved ineffective. A more articulated
bridging process involving physical models was effective for some children, but
the multiple representations employed required too much model-matching for
others.

Organizing Information on the Web

A prerequisite to the effective design of user interfaces is an
understanding of the tasks for which those interfaces will actually be used.
Surprisingly little task analysis has appeared for one of the most discussed
and fastest-growing computer applications, browsing the World-Wide Web (WWW).
Based on naturally-collected verbal protocol data, we present a taxonomy of
tasks undertaken on the WWW. The data reveal that several previous claims
about browsing behavior are questionable, and suggest that widget-centered
approaches to interface design and evaluation may be incomplete as a basis for
good user interfaces for the Web.

Keywords: World-Wide Web, Task analysis, Video protocols

An Empirical Evaluation of User Interfaces for Topic Management of Web Sites

Topic management is the task of gathering, evaluating, organizing, and
sharing a set of web sites for a specific topic. Current web tools do not
provide adequate support for this task. We created the TopicShop system to
address this need. TopicShop includes (1) a webcrawler that discovers relevant
web sites and builds site profiles, and (2) user interfaces for exploring and
organizing sites. We conducted an empirical study comparing user performance
with TopicShop vs. Yahoo. TopicShop subjects found over 80% more high-quality
sites (where quality was determined by independent expert judgements) while
browsing only 81% as many sites and completing their task in 89% of the time.
The site profile data that TopicShop provides -- in particular, the number of
pages on a site and the number of other sites that link to it -- was the key to
these results, as users exploited it to identify the most promising sites
quickly and easily.

In this paper, we describe the use of similarity metrics in a novel visual
environment for storing and retrieving favorite web pages. The similarity
metrics, called Implicit Queries, are used to automatically highlight stored
web pages that are related to the currently selected web page. Two experiments
explored how users manage their personal web information space with and without
the Implicit Query highlighting and later retrieve their stored web pages.
When storing and organizing web pages, users with Implicit Query highlighting
generated slightly more categories. Implicit Queries also led to faster web
page retrieval time, although the results were not statistically significant.

Speech and Multimodal Interfaces

Patterns of Entry and Correction in Large Vocabulary Continuous Speech
Recognition Systems

A study was conducted to evaluate user performance and satisfaction in
completion of a set of text creation tasks using three commercially available
continuous speech recognition systems. The study also compared user
performance on similar tasks using keyboard input. One part of the study
(Initial Use) involved 24 users who enrolled, received training and carried out
practice tasks, and then completed a set of transcription and composition tasks
in a single session. In a parallel effort (Extended Use), four researchers
used speech recognition to carry out real work tasks over 10 sessions with each
of the three speech recognition software products. This paper presents results
from the Initial Use phase of the study along with some preliminary results
from the Extended Use phase. We present details of the kinds of usability and
system design problems likely in current systems and several common patterns of
error correction that we found.

As a new generation of multimodal/media systems begins to define itself,
researchers are attempting to learn how to combine different modes into
strategically integrated whole systems. In theory, well designed multimodal
systems should be able to integrate complementary modalities in a manner that
supports mutual disambiguation (MD) of errors and leads to more robust
performance. In this study, over 2,000 multimodal utterances by both native
and accented speakers of English were processed by a multimodal system, and
then logged and analyzed. The results confirmed that multimodal systems can
indeed support significant levels of MD, and also higher levels of MD for the
more challenging accented users. As a result, although speech recognition as a
stand-alone performed far more poorly for accented speakers, their multimodal
recognition rates did not differ from those of native speakers. Implications
are discussed for the development of future multimodal architectures that can
perform in a more robust and stable manner than individual recognition
technologies. Also discussed is the design of interfaces that support
diversity in tangible ways, and that function well under challenging real-world
usage conditions.
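
Mutual disambiguation can be sketched as re-ranking across modalities: the
system picks the joint interpretation with the best combined score among
semantically compatible pairs, so a correct but lower-ranked speech hypothesis
can be pulled up by the gesture. A toy illustration, in which the scoring and
compatibility test are stand-ins rather than the system's actual architecture:

```python
def best_joint(speech_hyps, gesture_hyps, compatible):
    """Pick the (speech, gesture) pair with the highest combined score
    among semantically compatible pairs -- a toy mutual-disambiguation step."""
    pairs = [(s_score * g_score, s, g)
             for s, s_score in speech_hyps
             for g, g_score in gesture_hyps
             if compatible(s, g)]
    return max(pairs)[1:]

speech = [("move the vote", 0.6), ("move the boat", 0.4)]  # top guess is wrong
gesture = [("points at boat", 0.9)]
ok = lambda s, g: ("boat" in s) == ("boat" in g)
print(best_joint(speech, gesture, ok))  # ('move the boat', 'points at boat')
```

The misrecognized top speech hypothesis is discarded because no gesture is
compatible with it, which is how accented speakers recover accuracy in a
multimodal system despite poorer stand-alone speech recognition.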

Our research addresses the problem of error correction in speech user
interfaces. Previous work hypothesized that switching modality could speed up
interactive correction of recognition errors (so-called multimodal error
correction). We present a user study that compares, on a dictation task,
multimodal error correction with conventional interactive correction, such as
speaking again, choosing from a list, and keyboard input. Results show that
multimodal correction is faster than conventional correction without keyboard
input, but slower than correction by typing for users with good typing skills.
Furthermore, while users initially prefer speech, they learn to avoid
ineffective correction modalities with experience. To generalize the results of
this user study, we developed a performance model of multimodal interaction that
predicts input speed including time needed for error correction. We apply the
model to estimate the impact of recognition technology improvements on
correction speeds and the influence of recognition accuracy and correction
method on the productivity of dictation systems. Our model is a first step
towards formalizing multimodal (recognition-based) interaction.
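
A performance model of this kind can be sketched as charging each word its raw
entry time plus the expected cost of correcting it; the formula and the numbers
below are illustrative assumptions, not the paper's fitted model:

```python
def effective_wpm(raw_wpm: float, error_rate: float, secs_per_fix: float) -> float:
    """Effective throughput once per-word correction time is charged.

    raw_wpm: recognition/dictation speed, in words per minute
    error_rate: fraction of words misrecognized
    secs_per_fix: average time to correct one misrecognized word
    """
    secs_per_word = 60.0 / raw_wpm + error_rate * secs_per_fix
    return 60.0 / secs_per_word

# Better recognition, or a faster correction modality, raises throughput.
print(round(effective_wpm(100, 0.10, 8.0), 1))  # 10% errors, 8 s per fix
print(round(effective_wpm(100, 0.05, 8.0), 1))  # halving the error rate
```

Such a model makes it possible to estimate, before building anything, how much
a given improvement in recognition accuracy or correction speed would raise the
productivity of a dictation system.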

Advances in User Participation

In today's homes and schools, children are emerging as frequent and
experienced users of technology [3, 14]. As this trend continues, it becomes
increasingly important to ask if we are fulfilling the technology needs of our
children. To answer this question, I have developed a research approach that
enables young children to have a voice throughout the technology development
process. In this paper, the techniques of cooperative inquiry will be
described along with a theoretical framework that situates this work in the HCI
literature. Two examples of technology resulting from this approach will be
presented, along with a brief discussion on the design-centered learning of
team researchers using cooperative inquiry.

As a part of a European Union sponsored project, we have proposed a system
which aggregates people's expressions over a widening network of public
electronic displays in a massive Dutch housing development. Reflecting ideas
from contemporary arts as well as from research on media spaces, this is an
example of a conceptual design intended to produce meaningful effects on a
local culture. In this paper, we describe the methods and ideas that led to
this proposal, as an example of research on technologies from the traditions of
artist-designers.

Qualitative user-centered design processes such as contextual inquiry can
generate huge amounts of data to be organized, analyzed, and represented. When
you add the goal of spreading the resultant understanding to the far reaches of
a large, multi-site organization, many practical barriers emerge.
In this paper we describe our experience creating and communicating
representations of contextually derived user data in a large, multi-site
product development organization. We describe how we involved a distributed
team in data collection and analysis and how we made the data representations
portable. We then describe how we have engaged over 200 people from five sites
in thinking through the user data and its implications for product design.