Evaluation at Design (InfoVis): How do we learn from users at the design
stage to correct mistakes before building a full prototype?

Involving analysts in visualisation design has obvious benefits, but the
knowledge gap between domain experts ('analysts') and visualisation designers
('designers') often means that the degree of their involvement falls short of
what is aspired to. By promoting a culture of mutual learning, understanding and
contribution between analysts and designers from the outset, participants
can be raised to a level at which all can usefully contribute to both
requirement definition and design. We describe the process we use to do this
for tightly-scoped and short design exercises -- with meetings/workshops,
iterative bursts of design/prototyping over relatively short periods of time,
and workplace-based evaluation -- illustrating this with examples from our own
recent work with bird ecologists.

An integrated approach for evaluating the visualization of intensional and
extensional levels of ontologies

Visualization of ontologies relies on effective graphical representations and
interaction techniques that support users' tasks related to different entities
and aspects. Ontologies can be very large and complex due to the depth of their
class hierarchies as well as their attributes. In this work we present a study
that investigates and proposes guidelines to support the design and evaluation
of visualization techniques for both the intensional and extensional levels of
ontologies.

Evaluation at Design (SciVis): How do we learn from users at the design
stage to correct mistakes before building a full prototype?

Which visualizations work, for what purpose, for whom?: evaluating
visualizations of terrestrial and aquatic systems

A need for better ecology visualization tools is well documented, and
development of these is underway, including our own NSF funded Visualization of
Terrestrial and Aquatic Systems (VISTAS) project, now beginning its second of
four years. VISTAS' goal is not only to devise visualizations that help
ecologists in research and in communicating that research, but also to evaluate
the visualizations and software. Thus, we ask "which visualizations work, for
what purpose, and for which audiences," and our project involves equal
participation of ecologists, computer scientists, and social scientists. We
have begun to study visualization use by ecologists, assessed some existing
software products, and implemented a prototype. This position paper reports how
we apply social science methods in establishing context for VISTAS' evaluation
and development. We describe our initial surveys of ecologists and ecology
journals to determine current visualization use, outline our visualization
evaluation strategies, and in conclusion pose questions critical to the
evaluation, deployment, and adoption of VISTAS and VISTAS-like visualizations
and software.

In this position paper we discuss successes and limitations of current
evaluation strategies for scientific visualizations and argue for embracing a
mixed methods strategy of evaluation. The most novel contribution of the
approach that we advocate is a new emphasis on employing design processes as
practiced in related fields (e.g., graphic design, illustration, architecture)
as a formalized mode of evaluation for data visualizations. To motivate this
position we describe a series of recent evaluations of scientific visualization
interfaces and computer graphics strategies conducted within our research
group. Complementing these more traditional evaluations, our visualization
research group also regularly employs sketching, critique, and other design
methods that have been formalized over years of practice in design fields. Our
experience has convinced us that these activities are invaluable, often
providing much more detailed evaluative feedback about our visualization
systems than that obtained via more traditional user studies and the like. We
believe that if design-based evaluation methodologies (e.g., ideation,
sketching, critique) can be taught and embraced within the visualization
community then these may become one of the most effective future strategies for
both formative and summative evaluations.

Cognition and evaluation (new metrics / measures): How can we measure user
cognition?

In this position paper, we discuss the problems and advantages of using
physiological measurements to estimate cognitive load in order to evaluate
scientific visualization methods. We present various techniques and
technologies designed to measure cognitive load and how they may be leveraged
in the context of user evaluation studies for scientific visualization. We also
discuss the challenges of experiments designed to use these physiological
measurements.

Towards a 3-dimensional model of individual cognitive differences: position
paper

The effect of individual differences on user interaction has been explored in
HCI for the last 25 years. Recently, the importance of
this subject has been carried into the field of information visualization and
consequently, there has been a wide range of research conducted in this area.
However, there has been no consensus on which evaluation methods best answer
the unique needs of information visualization. In this position paper we
propose that individual differences be evaluated along three dominant dimensions:
cognitive traits, cognitive states and experience/bias. We believe that this is
a first step in systematically evaluating the effects of users' individual
differences on information visualization and visual analytics.

With the growing need for visualization to aid users in understanding large,
complex datasets, the ability for users to interact and explore these datasets
is critical. As visual analytic systems have advanced to leverage powerful
computational models and data analytics capabilities, the modes by which users
engage and interact with the information have remained limited. Often, users are taxed
with directly manipulating parameters of these models through traditional GUIs
(e.g., using sliders to directly manipulate the value of a parameter). However,
the purpose of user interaction in visual analytic systems is to enable visual
data exploration -- where users can focus on their task, as opposed to the tool
or system. As a result, users can engage freely in data exploration and
decision-making, for the purpose of gaining insight. In this position paper, we
discuss how evaluating visual analytic systems can be approached through user
interaction analysis, where the goal is to minimize the cognitive translation
between the visual metaphor and the mode of interaction (i.e., reducing the
"interaction junk"). We motivate this concept through a discussion of
traditional GUIs used in visual analytics for direct manipulation of model
parameters, and the importance of designing interactions that support visual
data exploration.

Evaluating visualizations: How can we measure visualization?

A data set can be represented in any number of ways. For example,
hierarchical data can be presented as a radial node-link diagram, dendrogram,
force-directed layout, or tree map. Alternatively, point observations can be
shown with scatter plots, parallel coordinates, or bar charts. Each technique
has different capabilities for representing relationships. These capabilities
are further modified by projection and presentation decisions within the
technique category. Evaluating the many options is an essential task in
visualization development. Currently, evaluation is largely based on
heuristics, prior experience, and indefinable aesthetic considerations. This
paper presents initial work towards an evaluation technique based on spatial
autocorrelation. We find that spatial autocorrelation can be used to construct
a separator between visualizations and other image types. Furthermore, this can
be done with parameters amenable to interactive use and in a fashion that does
not need to take plot schema characteristics as parameters.
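
The abstract does not give the exact autocorrelation statistic used to separate
visualizations from other image types. As one rough, assumed illustration of the
idea, the sketch below computes a global Moran's I over the pixel intensities of
a rendered chart (treated as a grayscale NumPy array) with 4-neighbour
adjacency; the example images and the interpretation of the scores are
illustrative only.

```python
import numpy as np

def morans_i(img):
    """Global Moran's I over pixel intensities of a grayscale image,
    using rook (4-connected) adjacency. Values near 0 indicate spatial
    randomness; values near 1 indicate strong local similarity."""
    z = img.astype(float) - img.mean()
    # Cross-products between each pixel and its right/down neighbour
    # (each unordered neighbour pair counted once).
    pair_sum = (z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum()
    rows, cols = img.shape
    w_total = 2 * (rows * (cols - 1) + (rows - 1) * cols)  # ordered pairs
    return (img.size / w_total) * (2 * pair_sum) / (z ** 2).sum()

# Spatially random noise scores near 0, while a smooth gradient scores near 1,
# hinting at how autocorrelation can separate structured plots from noise.
rng = np.random.default_rng(0)
print(morans_i(rng.random((200, 200))))
print(morans_i(np.tile(np.linspace(0, 1, 200), (200, 1))))
```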

Visualization research focuses either on the transformation steps necessary
to create a visualization from data, or on the perception of structures after
they have been shown on the screen. We argue that an end-to-end approach is
needed, one that tracks the data all the way through the required steps and
provides ways of measuring the impact of any of the transformations. By feeding
that information back into the pipeline, visualization systems will be able to
adapt the display to the data being shown, the parameters of the output device,
and even the user.
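
As a purely hypothetical sketch of this feedback idea (none of the names below
come from the paper), each transformation step could be paired with a measure of
its impact, so that stages whose measured impact is low can be flagged for
adaptation of the display or of the stage's parameters.

```python
from typing import Any, Callable, NamedTuple

class Stage(NamedTuple):
    """One transformation step plus a measure of its impact on the data
    (e.g. how much variance or how many records survive the step)."""
    name: str
    transform: Callable[[Any], Any]
    impact: Callable[[Any, Any], float]   # (input, output) -> score in [0, 1]

def run_with_feedback(data: Any, stages: list[Stage], threshold: float = 0.5):
    """Run the pipeline end to end, recording each stage's measured impact.
    Stages scoring below the threshold are returned so the system can adapt
    the display, the stage's parameters, or warn the user."""
    scores = {}
    for stage in stages:
        out = stage.transform(data)
        scores[stage.name] = stage.impact(data, out)
        data = out
    needs_adaptation = [name for name, s in scores.items() if s < threshold]
    return data, scores, needs_adaptation
```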

Why evaluate?: What are the goals and motivations of evaluations? How should
these be conveyed in reporting evaluation?

My position is that improving evaluation for visualization requires more
than developing more sophisticated evaluation methods. It also requires
improving the efficacy of evaluations, which involves issues such as how
evaluations are applied, reported, and assessed. Considering the motivations
for evaluation in visualization offers a way to explore these issues, but it
requires us to develop a vocabulary for discussion. This paper proposes some
initial terminology for discussing the motivations of evaluation. Specifically,
the scales of actionability and persuasiveness can provide a framework for
understanding the motivations of evaluation, and how these relate to the
interests of various stakeholders in visualizations. This framework can help keep issues
such as audience, reporting and assessment in focus as evaluation expands to
new methods.

We propose an extension to the four-level nested model of design and
validation of visualization systems that defines the term "guidelines" in terms
of blocks at each level. Blocks are the outcomes of the design process at a
specific level, and guidelines discuss relationships between these blocks.
Within-level guidelines provide comparisons for blocks within the same level,
while between-level guidelines provide mappings between adjacent levels of
design. These guidelines help a designer choose which abstractions, techniques,
and algorithms are reasonable to combine when building a visualization system.
This definition of guideline allows analysis of how the validation efforts in
different kinds of papers typically lead to different kinds of guidelines.
Analysis through the lens of blocks and guidelines also led us to identify four
major needs: a definition of the meaning of block at the problem level;
mid-level task taxonomies to fill in the blocks at the abstraction level;
refinement of the model itself at the abstraction level; and a more complete
set of mappings up from the algorithm level to the technique level. These gaps
in visualization knowledge present rich opportunities for future work.
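
As an illustrative, assumed encoding rather than the authors' own formalism,
blocks and guidelines might be represented as follows, with a guideline
classified as within-level when its two blocks share a level and between-level
otherwise.

```python
from dataclasses import dataclass

LEVELS = ("problem", "abstraction", "technique", "algorithm")

@dataclass(frozen=True)
class Block:
    """An outcome of the design process at one level of the nested model."""
    name: str
    level: str  # one of LEVELS

@dataclass(frozen=True)
class Guideline:
    """A relationship between two blocks: a comparison (within-level)
    or a mapping between adjacent levels (between-level)."""
    a: Block
    b: Block
    statement: str

    @property
    def kind(self) -> str:
        return "within-level" if self.a.level == self.b.level else "between-level"

# Example: a mapping from an algorithm-level block up to a technique-level block.
g = Guideline(Block("squarified layout", "algorithm"),
              Block("treemap", "technique"),
              "squarified layouts keep treemap cells close to square")
print(g.kind)  # between-level
```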

New evaluation framework: What can we learn from patterns and templates and
apply to visualization evaluation?

We propose a patterns-based approach to evaluating data visualization: a set
of general and reusable solutions to commonly occurring problems in evaluating
tools, techniques, and systems for visual sensemaking. Patterns have had
significant impact in a wide array of disciplines, particularly software
engineering, and we believe that they provide a powerful lens for looking at
visualization evaluation by offering practical, tried-and-tested tips and
tricks that can be adopted immediately. The 12 patterns presented here have
also been added to a freely editable Wiki repository. The motivation for
creating this evaluation pattern language is to (a) disseminate hard-won
experience on visualization evaluation to researchers and practitioners alike;
(b) provide a standardized vocabulary for designing visualization evaluation;
and (c) invite the community to add new evaluation patterns to a growing
repository.

We describe the evolution of the IEEE Visual Analytics Science and
Technology (VAST) Challenge from its origin in 2006 to the present (2012). The VAST
Challenge has provided an opportunity for visual analytics researchers to test
their innovative thoughts on approaching problems in a wide range of subject
domains against realistic datasets and problem scenarios. Over time, the
Challenge has changed to correspond to the needs of researchers and users. We
describe those changes and the impacts they have had on topics selected, data
and questions offered, submissions received, and the Challenge format.

Improving existing methods

Crowdsourcing-based user studies have become increasingly popular in
information visualization (InfoVis) and visual analytics (VA). However, it is
still unclear how to deal with some undesired crowdsourcing workers, especially
those who submit random responses simply to gain wages (random clickers,
henceforth). In order to mitigate the impacts of random clickers, several
studies simply exclude outliers, but this approach has a potential risk of
losing data from participants whose performances are extreme even though they
participated faithfully. In this paper, we evaluated the degree of randomness
in responses from a crowdsourcing worker to infer whether the worker is a
random clicker. In this way, we could reliably filter out random clickers and
found that the resulting data from crowdsourcing-based user studies were comparable with
those of a controlled lab study. We also tested three representative reward
schemes (piece-rate, quota, and punishment schemes) with four different levels
of compensations ($0.00, $0.20, $1.00, and $4.00) on a crowdsourcing platform
with a total of 1,500 crowdsourcing workers to investigate the influences that
different payment conditions have on the number of random clickers. The results
show that higher compensation decreases the proportion of random clickers, but
this gain in participation quality does not justify the associated additional
costs. A detailed discussion of how to optimize the payment scheme and amount
to obtain high-quality data economically is provided.
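
The randomness measure itself is not detailed in the abstract; one minimal,
assumed stand-in is to flag a worker whose accuracy on multiple-choice trials is
statistically indistinguishable from uniform guessing, for example with an exact
binomial test. The function name, trial counts, and the 0.05 cut-off below are
illustrative.

```python
from scipy.stats import binomtest

def looks_like_random_clicker(n_correct, n_trials, n_options, alpha=0.05):
    """Return True when a worker's accuracy cannot be distinguished from
    uniform guessing among n_options alternatives (one-sided binomial test)."""
    chance = 1.0 / n_options
    p = binomtest(n_correct, n_trials, chance, alternative="greater").pvalue
    return p >= alpha

print(looks_like_random_clicker(20, 30, 4))  # well above chance -> False
print(looks_like_random_clicker(9, 30, 4))   # near chance (7.5 expected) -> True
```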

The position taken in this paper is that the availability of standardized
questionnaires specifically developed for measuring users' perception of
usability in evaluation studies in information visualization would provide the
community with an excellent additional instrument. The need for such an
instrument is evident for several important reasons. Pursuing the development,
validation and use of questionnaires will add significantly to the evidence
base necessary for the community to guide the production of high-quality
visualization techniques, facilitate adoption by users, promote successful
commercialization and guide future research tasks.

Methodologies for the analysis of usage patterns in information
visualization

In this position paper, we describe two methods for the analysis of
sequences of interaction with information visualization tools -- log file
analysis and thinking aloud. Such an analysis is valuable because it can help
designers to understand users' cognitive processes and, as a consequence,
to improve the design of information visualizations. In this context, we also
discuss the issue of categorization of user activities. Categorization helps
researchers to generalize results and compare different information
visualization tools.
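
The abstract does not reproduce the authors' activity taxonomy; the sketch below
merely illustrates the log-file side of the approach with a hypothetical
event-to-category mapping and assumed CSV columns (timestamp, user, event),
counting how often each activity category occurs per user.

```python
import csv
from collections import Counter

# Hypothetical mapping from low-level logged events to activity categories.
EVENT_CATEGORY = {
    "zoom_in": "navigation", "zoom_out": "navigation", "pan": "navigation",
    "select_node": "selection", "brush": "selection",
    "apply_filter": "filtering", "clear_filter": "filtering",
    "open_details": "inspection", "hover_tooltip": "inspection",
}

def summarize_log(path):
    """Read a CSV log with columns (timestamp, user, event) and count
    activity categories per user, mapping unknown events to 'other'."""
    per_user = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            category = EVENT_CATEGORY.get(row["event"], "other")
            per_user.setdefault(row["user"], Counter())[category] += 1
    return per_user
```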