HCIL-2009-38

Analyzing multivariate datasets requires users to understand distributions of single variables and at least the two-way
relationships between the variables. Lower-dimensional projection techniques may assist users in finding interesting combinations. To
explore the 2D relationships in a systematic way, we suggest ranking such relationships according to some measure of interestingness.
This approach was shown to be valuable for continuous data by Seo and Shneiderman [22]; however, metrics for categorical
data are a novel contribution. We propose CateRank, a tool for analyzing categorical datasets that visualizes one-dimensional
relationships as histograms and uses the reorderable matrix described by Siirtola [23] for two-dimensional relationships. CateRank implements
several metrics based on histogram and matrix properties that enable users to discover relationships between pairs of
categorical variables. User controls support filtering to remove extreme and uninteresting values.