Abstract

This paper provides a suite of datasets from standard multivariate
distributions and simple high-dimensional geometric shapes that can be
used to familiarize new users with grand tour visualizations. It contains
QuickTime and GIF animations of 1-D, 2-D, 3-D, 4-D and 5-D grand tours,
links for starting XGobi or XLispStat on the calibration datasets, and C
code for generating a grand tour.

The purpose of the paper is two-fold: first, to provide code for
the grand tour that others can pick up and modify (this version is not
easy to code, which is why very few implementations are currently
available), and second, to provide a variety of training
datasets to help new users get a visual sense for high-dimensional
data.

The grand tour is a method for viewing multivariate data "from all
sides". As originally proposed by Asimov (1985) it is a movie of data
projections, where the viewer is shown a continuous sequence of
d-dimensional projections of the p-dimensional data.
The dimension of the projection can be 1, 2, 3, ... , p.
Currently there are implementations of grand tours available in XGobi
(Swayne, Cook and Buja, 1997), XLispStat (Tierney, 1991) and ExplorN
(Carr, Wegman and Luo, 1996).
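
As a rough illustration of a single frame of such a movie, the sketch below draws one random orthonormal d-frame in p dimensions and projects a dataset onto it. It is not the C code that accompanies this paper; the function names, the simulated data and the choice p = 5, d = 2 are assumptions made for the example. A grand tour would string many such frames together with smooth interpolation between them.

```python
import math
import random

def random_frame(p, d, rng):
    """Return d orthonormal p-vectors (Gram-Schmidt applied to Gaussian draws)."""
    frame = []
    while len(frame) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # Remove the components along the vectors already in the frame.
        for u in frame:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:
            frame.append([a / norm for a in v])
    return frame

def project(data, frame):
    """Project each p-dimensional row onto the d vectors of the frame."""
    return [[sum(x * a for x, a in zip(row, axis)) for axis in frame]
            for row in data]

rng = random.Random(0)
p, d = 5, 2                      # assumed dimensions for the example
data = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(100)]
frame = random_frame(p, d, rng)
view = project(data, frame)      # 100 points, each with d coordinates
```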

Grand tour examples

Here are some examples of a grand tour running on a small seven-dimensional
dataset. This is the primeval form of the grand tour, a la Asimov
(1985): the examples are purely movies with a fixed play speed and no user
interaction. GIF animations of points at the corners of a nine-dimensional
cube are available through the links if you are viewing this on a platform that
doesn't support QuickTime.

Note: The animated GIFs run through the
grand tour sequence once. They should show smooth changes to the image
as the animation runs, but they may appear jerky over the
net. To re-run an animation you need to reload the page. The QuickTime
movies used throughout this paper allow better control of each animation.

These examples illustrate tours implemented using the algorithm in
Buja, Cook, Asimov and Hurley (1997). They are geodesic tours that
contain no "within-projection-plane" spin, which is optimal for
viewing tours where d is less than p. This is
the type of tour implemented in XGobi,
with the main difference being that XGobi is capable of 2-D
projections only.
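
For intuition, the simplest case is d = 1, where a projection is a single unit vector and the geodesic between two projections is the great-circle arc joining them. The sketch below interpolates along that arc. It is only the 1-D special case, written for illustration; the general algorithm of Buja, Cook, Asimov and Hurley (1997) for d > 1 is not reproduced here, and the function name and example vectors are assumptions.

```python
import math

def geodesic_1d(a, b, t):
    """Point at fraction t along the great-circle arc from unit vector a to b."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)           # angle between the two directions
    s = math.sin(theta)
    if s < 1e-12:
        if dot > 0:
            return list(a)           # the two directions coincide
        raise ValueError("antipodal vectors: the geodesic is not unique")
    wa = math.sin((1.0 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

start = [1.0, 0.0, 0.0]              # hypothetical starting projection
target = [0.0, 1.0, 0.0]             # hypothetical target projection
path = [geodesic_1d(start, target, k / 10) for k in range(11)]
```

Every vector in path has unit length, and equal steps in t correspond to equal angular steps, which is what gives the animation a constant speed.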

If you have your web browser set
up to recognize QuickTime movies then you can simply click the
animation image to start downloading and viewing the movies.

If you have your web browser set
up to recognize files with a .xgobi extension then you can simply
click the XGobi button beside the data explanations
below. (You'll need the latest version of XGobi, at least the Oct 1997
beta release for this to work correctly.)

If you have your web browser
set up to recognize files with a .xli extension as XLispStat,
then you can simply click the XLispStat button beside the
data explanations below. This will start up a tour in XLispStat on the
dataset.

Samples from a normal distribution with different
variances but no correlation also look mostly elliptical, but you see
a shrinking-expanding effect in a tour that results from variables
with small variances being toured in and then out again.
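
The shrinking-expanding effect can be checked with a small calculation. For uncorrelated variables, a 1-D projection onto a unit vector a has variance equal to the sum of a_i^2 sigma_i^2, so the spread of the projected points changes as the tour shifts weight between variables. The sketch below evaluates this for two variables; the standard deviations 1.0 and 0.1 are assumed values chosen for the example, not those of the datasets in this paper.

```python
import math

def projected_sd(direction, sds):
    """Standard deviation of a.X for uncorrelated variables with the given sds."""
    return math.sqrt(sum((a * s) ** 2 for a, s in zip(direction, sds)))

sds = [1.0, 0.1]                     # assumed: one wide and one narrow variable
spreads = []
for k in range(5):
    angle = k * math.pi / 8          # rotate from variable 1 towards variable 2
    direction = [math.cos(angle), math.sin(angle)]
    spreads.append(projected_sd(direction, sds))
```

The values in spreads fall steadily from 1.0 to 0.1 as the narrow variable is toured in, and they would rise again as it is toured out.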

Note: Variables need to be scaled together (the min/max over all
measurements is used) in the viewing transformation so that variance
differences are reflected. In XGobi, this is achieved by creating a
file with the extension .vgroups with each row having a 1 in the
first place and nothing else on the line. The number of rows should
match the number of variables. To maintain the scale differences in
the latter two datasets we have used a trick: two points are added to
the top of the data files which delimit the min/max values of the
variables with the largest variances. These appear as two anomalous
data points floating far from the other points in the grand tour. They
are visually distracting, but they force XLispStat, and XGobi initiated
from the web browser, to keep the variable scales relative to each other.
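
The difference between the two kinds of scaling can be seen on a toy example. The sketch below assumes that the alternative to scaling together is a per-variable min/max scaling, and compares the two; the function names and the three-point dataset are made up for the illustration and are not taken from XGobi or XLispStat.

```python
def scale_per_variable(data):
    """Map each column to [0, 1] using that column's own min and max."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) for x, l, h in zip(row, lo, hi)] for row in data]

def scale_together(data):
    """Map all columns to [0, 1] using one min and max over every measurement."""
    lo = min(min(row) for row in data)
    hi = max(max(row) for row in data)
    return [[(x - lo) / (hi - lo) for x in row] for row in data]

def column_ranges(data):
    """Return max - min for each column."""
    return [max(c) - min(c) for c in zip(*data)]

# Hypothetical data: variable 1 spans [-10, 10], variable 2 spans [-1, 1].
data = [[-10.0, -1.0], [0.0, 0.0], [10.0, 1.0]]
separate = column_ranges(scale_per_variable(data))   # both columns fill [0, 1]
together = column_ranges(scale_together(data))       # column 2 stays 10x narrower
```

With per-variable scaling the two variables look equally spread out, so the variance difference is lost; scaling together keeps the second variable one tenth as wide as the first.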

Samples from Long-Tailed Distributions

5-D Standard Cauchy

Samples from a standard Cauchy distribution in any dimension look
like a mass of points in one location and a few very extreme
points. If you remove the extreme points and rescale, it still
looks like a mass of points in one location and a few very extreme
points.
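
This persistence of extremes can be checked numerically in one dimension. The sketch below draws standard Cauchy values by the inverse-CDF method, removes the 1% of points with the largest absolute values, and compares the largest absolute value to the median absolute value before and after trimming. The sample size, seed, trimming fraction and function names are assumptions chosen for the example.

```python
import math
import random

def cauchy_sample(n, rng):
    """Standard Cauchy draws via the inverse CDF tan(pi * (u - 1/2))."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def max_to_median(values):
    """Largest absolute value divided by the median absolute value."""
    mags = sorted(abs(v) for v in values)
    return mags[-1] / mags[len(mags) // 2]

rng = random.Random(1)
sample = cauchy_sample(10000, rng)

# Drop the 1% of points that are furthest from zero.
mags = sorted(abs(v) for v in sample)
cutoff = mags[len(sample) * 99 // 100]
trimmed = [v for v in sample if abs(v) < cutoff]

ratio_before = max_to_median(sample)
ratio_after = max_to_median(trimmed)
```

Even after trimming, the most extreme remaining point is still many times further out than a typical point, so a rescaled plot again shows a central mass with a few outliers.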

In samples from a standard exponential distribution (lambda=1) in any
dimension, most projections exhibit skewness. In the
pairwise plot the points pile up at the (0,0) location in each
plot. The grand tour views are more interesting: (1) it is clear
that there is one point in 5-D that is a vertex where 5 edges
meet, and (2) in many projections (when all variables contribute to
the projection in an averaging manner) the data look somewhat
like a sample from a normal distribution.
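
The second observation can be quantified with the skewness of a 1-D projection. For p independent standard exponential variables, a single variable has skewness 2, while the equal-weight projection onto (1, ..., 1)/sqrt(p) has skewness 2/sqrt(p), about 0.89 for p = 5. The sketch below estimates both from simulated data; the sample size, seed and function name are assumptions made for the example.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over the 3/2 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(2)
p, n = 5, 20000
data = [[rng.expovariate(1.0) for _ in range(p)] for _ in range(n)]

single = [row[0] for row in data]            # projection onto one variable
w = 1.0 / math.sqrt(p)
averaged = [w * sum(row) for row in data]    # equal-weight projection

skew_single = skewness(single)
skew_avg = skewness(averaged)
```

The averaged projection is still skewed, but noticeably less so than any single variable, which is why such views look somewhat normal.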

Acknowledgements

This work began with the writing of code to run a grand tour with
arbitrary dimensional projections for use in the C2 Virtual Reality
Lab at Iowa State University. It was made possible by the work
in Buja, Cook, Asimov and Hurley (1997), which describes the
algorithm. The work here can be viewed as an adjunct to that paper.

Thanks to Dr Sigbert Klinke for valuable feedback on the
material in this paper.

The author was supported by National Science Foundation grants
DMS9632662 and DMS9214497.