Chapter 8: Statistical Sonification for Exploratory Data Analysis

by Sam Ferguson, William Martens and Densil Cabrera

Description

Exploring datasets can be important for discovering new ideas or hypotheses. Sonification research is reviewed, and methods are presented for performing exploratory data analysis. Statistical methods used for visualization are transformed (based on previous research) to the auditory domain to produce statistical sonifications, with example sonifications presented that are based on the iris dataset.

Media Examples

Example S8.1: Auditory Dotplot
This is a very simple example of presenting one axis of data as a sonification. Each data point is rendered as a simple click mapped to time. Because the data were quantised to some degree, a small amount of noise was also added to the data.

media file S8.1
download: SHB-S8.1 (mp3, 384k)
source: created by the authors
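
The click-per-point mapping described for Example S8.1 can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function names, the 5-second duration, the jitter width and the input values are all assumptions made for the sketch. One plausible reading of the added noise is that it keeps points sharing a quantised value from collapsing onto a single click.

```python
import random

def dotplot_click_times(values, duration_s=5.0, jitter=0.05, seed=0):
    """Map each value to a click onset time in [0, duration_s].

    jitter is the half-width of uniform noise, in data units, added so that
    points sharing a quantised value do not coincide (an assumed amount).
    """
    rng = random.Random(seed)
    jittered = [v + rng.uniform(-jitter, jitter) for v in values]
    lo, hi = min(jittered), max(jittered)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return sorted((v - lo) / span * duration_s for v in jittered)

def render_clicks(onsets, duration_s=5.0, sample_rate=8000):
    """Render each onset as a single-sample click in a mono buffer."""
    n = int(duration_s * sample_rate) + 1
    samples = [0.0] * n
    for t in onsets:
        samples[min(int(t * sample_rate), n - 1)] += 1.0
    return samples

# Made-up values quantised to 0.1, standing in for one column of data.
values = [1.4, 1.4, 1.5, 1.3, 4.7, 4.5, 4.7, 6.0]
onsets = dotplot_click_times(values)
samples = render_clicks(onsets)
```

The resulting buffer could be written to a sound file with any audio library; dense regions of the data are heard as bursts of closely spaced clicks.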

Example S8.2: Auditory Kernel Density Plot
In this example we present the Petal Length measurements of the irises as an auditory kernel density plot. The measured values are mapped to the pitch of short FM tones, and are also mapped to the time axis (which sorts them into ascending order as well). The first grouping is heard clearly separated from the other data points, and we can also make a rough guess of the number of data points we can hear. This first group happens to be the setosa species of iris, while the larger, longer group is made up of the two other iris species, which overlap each other.

media file S8.2
download: SHB-S8.2 (mp3, 358k)
source: created by the authors
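
A minimal sketch of the double mapping in Example S8.2, where each value sets both the onset time and the pitch of a tone. The function name, the 4-second duration, the 220 to 880 Hz pitch range and the exponential pitch curve are assumptions for illustration; the text does not specify them, and the input values are made up.

```python
def density_plot_events(values, duration_s=4.0, f_lo=220.0, f_hi=880.0):
    """Return (onset_seconds, frequency_hz) pairs, one per data value.

    Each value is mapped to time and to pitch, so the events come out in
    ascending order and dense regions produce many overlapping tones.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    events = []
    for v in sorted(values):
        u = (v - lo) / span                # normalised position in [0, 1]
        onset = u * duration_s             # value -> time axis
        freq = f_lo * (f_hi / f_lo) ** u   # value -> pitch (exponential map)
        events.append((onset, freq))
    return events

# Made-up values with one well-separated low group, for illustration only.
values = [1.3, 1.4, 1.5, 1.4, 4.2, 4.5, 4.9, 5.1, 5.6, 6.0]
events = density_plot_events(values)
```

With data like these, the gap between the low group and the rest appears as a silence between two clusters of tones, which matches the kind of separation described above.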

Example S8.3: Auditory Boxplot
Another way of representing a distribution is to use predefined ranges (as a boxplot does) to give an indication of the 5th to 95th percentile range, the interquartile range and the median value of the distribution. We present these ranges progressively for each group, so as to narrow in on the centre.

media file S8.3
download: SHB-S8.3 (mp3, 184k)
source: created by the authors
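
The progressive narrowing used in Example S8.3 could be computed along these lines. This is a sketch under assumptions: the linear-interpolation percentile rule and the function names are choices made here, and the text does not say how the authors computed the ranges or rendered each stage.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a sorted list."""
    k = (len(sorted_vals) - 1) * p / 100.0
    i = int(k)
    j = min(i + 1, len(sorted_vals) - 1)
    return sorted_vals[i] + (sorted_vals[j] - sorted_vals[i]) * (k - i)

def boxplot_stages(values):
    """Return (label, low, high) ranges from widest to narrowest."""
    s = sorted(values)
    return [
        ("5th to 95th percentile range", percentile(s, 5), percentile(s, 95)),
        ("interquartile range", percentile(s, 25), percentile(s, 75)),
        ("median", percentile(s, 50), percentile(s, 50)),
    ]

def values_in_stage(values, low, high):
    """Select the data points that would be sounded for one stage."""
    return [v for v in values if low <= v <= high]

# Illustrative data only: the integers 1..101.
values = list(range(1, 102))
stages = boxplot_stages(values)
```

Playing the stages in order, each one mapped to pitch as in the earlier examples, gives a sequence that contracts toward the median.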

Example S8.4: Auditory Bivariate Scatterplot
This example is a bivariate sonification: an auditory scatterplot of the Petal-Width and Petal-Length measurements available for each of the 150 items in the iris dataset. The petal length parameter is mapped to the pitch of the tone, while the petal width parameter is mapped to the modulation index for the tone.

media file S8.4
download: SHB-S8.4 (mp3, 303k)
source: created by the authors
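
To make the two mappings in Example S8.4 concrete, here is a minimal FM synthesis sketch in which one variable sets the carrier frequency and the other sets the modulation index. The output ranges (220 to 880 Hz, index 0 to 5), the linear scaling, the 1:1 carrier-to-modulator ratio, the tone duration and the data points are all assumptions for illustration, not values taken from the chapter.

```python
import math

def scale(v, lo, hi, out_lo, out_hi):
    """Linearly rescale v from [lo, hi] to [out_lo, out_hi]."""
    u = (v - lo) / ((hi - lo) or 1.0)
    return out_lo + u * (out_hi - out_lo)

def fm_tone(carrier_hz, mod_index, dur_s=0.1, sample_rate=8000, ratio=1.0):
    """Simple FM tone: sin(2*pi*fc*t + I * sin(2*pi*fm*t))."""
    mod_hz = carrier_hz * ratio
    out = []
    for i in range(int(dur_s * sample_rate)):
        t = i / sample_rate
        phase = (2.0 * math.pi * carrier_hz * t
                 + mod_index * math.sin(2.0 * math.pi * mod_hz * t))
        out.append(math.sin(phase))
    return out

# Made-up (length, width) pairs standing in for the two measured variables.
points = [(1.4, 0.2), (4.5, 1.5), (6.0, 2.5)]
lengths = [p[0] for p in points]
widths = [p[1] for p in points]

tones = []
for length, width in points:
    hz = scale(length, min(lengths), max(lengths), 220.0, 880.0)  # -> pitch
    idx = scale(width, min(widths), max(widths), 0.0, 5.0)        # -> index
    tones.append(fm_tone(hz, idx))
```

A higher modulation index adds sidebands and so brightens the timbre, which means the second variable is heard as a change in tone colour rather than in pitch.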

Example S8.5: Auditory PCA Scatterplot
This example is a bivariate sonification which plots scores for each item on the first two principal components that were found when the iris dataset was submitted to Principal Component Analysis. The two PC scores are mapped to the same parameters as in the previous example, for comparison.

media file S8.5
download: SHB-S8.5 (mp3, 303k)
source: created by the authors
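
The principal component scores used in Example S8.5 could be obtained as sketched below, using NumPy's singular value decomposition. This is an illustration under assumptions: the function names and output ranges are hypothetical, the data are random stand-ins rather than the iris measurements, and the chapter does not say which PCA routine the authors used.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Project mean-centred rows of data onto the leading principal axes."""
    centred = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def scale_column(col, out_lo, out_hi):
    """Linearly rescale a 1-D array to [out_lo, out_hi]."""
    lo, hi = col.min(), col.max()
    span = (hi - lo) if hi > lo else 1.0
    return out_lo + (col - lo) / span * (out_hi - out_lo)

# Random stand-in data with four columns (not the iris measurements).
rng = np.random.default_rng(0)
data = rng.normal(size=(30, 4)) * np.array([3.0, 1.0, 0.5, 0.2])

scores = pca_scores(data)
carrier_hz = scale_column(scores[:, 0], 220.0, 880.0)  # PC1 -> pitch
mod_index = scale_column(scores[:, 1], 0.0, 5.0)       # PC2 -> mod. index
```

Note that the sign of each principal component is arbitrary, so whether high scores map to high or low pitch is a convention that has to be fixed when comparing sonifications.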

Example S8.6: 4-parameter sonification
This example plots four parameters in a single sonification. The dataset values were mapped so as to move the synthesized tones through the vowel space defined by the first two formants of the human vocal tract, as follows: The measured Sepal-Length values modulated the resonant frequency of a lower-frequency formant filter, while Sepal-Width values were mapped to control the resonant frequency of a higher-frequency formant filter. Applying this co-ordinated pair of filters to the input signals that varied in pitch and duration resulted in tones that could be heard as perceptually rich and yet not overly complex, perhaps due to their speech-like character.

media file S8.6
download: SHB-S8.6 (mp3, 184k)
source: created by the authors
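
The formant mapping in Example S8.6 can be approximated with two simple resonant filters whose centre frequencies follow the two sepal measurements. Everything specific in this sketch is an assumption: the two-pole resonator design, the parallel combination of the filters, the pulse-train source, the formant ranges (300 to 800 Hz and 900 to 2300 Hz), the input bounds and the bandwidth are illustrative choices, not details given in the chapter.

```python
import math

SAMPLE_RATE = 16000

def scale(v, lo, hi, out_lo, out_hi):
    """Linearly rescale v from [lo, hi] to [out_lo, out_hi]."""
    u = (v - lo) / ((hi - lo) or 1.0)
    return out_lo + u * (out_hi - out_lo)

def pulse_train(f0_hz, dur_s):
    """Crude glottal-like source: one unit impulse per pitch period."""
    period = max(1, round(SAMPLE_RATE / f0_hz))
    return [1.0 if i % period == 0 else 0.0
            for i in range(int(dur_s * SAMPLE_RATE))]

def resonator(signal, centre_hz, bandwidth_hz=80.0):
    """Two-pole resonant filter: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / SAMPLE_RATE)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def vowel_like_tone(sepal_length, sepal_width, f0_hz=120.0, dur_s=0.1,
                    length_range=(4.0, 8.0), width_range=(2.0, 4.5)):
    """Sepal length -> lower formant, sepal width -> higher formant."""
    f1 = scale(sepal_length, *length_range, 300.0, 800.0)
    f2 = scale(sepal_width, *width_range, 900.0, 2300.0)
    source = pulse_train(f0_hz, dur_s)
    low = resonator(source, f1)
    high = resonator(source, f2)
    return [a + b for a, b in zip(low, high)]  # parallel sum (assumed)

tone = vowel_like_tone(6.0, 3.0)
```

The `f0_hz` and `dur_s` arguments are where the pitch and duration of the input signal would be varied.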

Example S8.7: Auditory Histogram
(This is an additional example not discussed in the book chapter. On correction of an error in S8.2, it was moved here.)
In this example we present the three groupings of irises as auditory histograms presented in succession. The measured values are mapped to the pitch of short FM tones, and presented by using a rapid random selection from the values within one of the three groups. To compare the measurements in the groups against each other we present three histograms of 1 second each, separated by a short silence.

media file S8.7
download: SHB-S8.7 (mp3, 482k)
source: created by the authors
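
The random-selection scheme of Example S8.7 might be scheduled as in the sketch below. The tone rate of 50 per second, the 0.25-second gap, the pitch range, the group labels and the data values are assumptions for illustration; only the 1-second duration per group and the separating silence come from the description above.

```python
import random

def histogram_events(groups, rate_hz=50, group_dur_s=1.0, gap_s=0.25,
                     f_lo=220.0, f_hi=880.0, seed=0):
    """Return (onset_seconds, group_name, frequency_hz) tone events.

    For each group in turn, values are drawn at random (with replacement)
    and mapped to pitch on a scale shared by all groups, so that the
    groups can be compared by ear.
    """
    rng = random.Random(seed)
    all_vals = [v for vals in groups.values() for v in vals]
    lo, hi = min(all_vals), max(all_vals)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    events = []
    start = 0.0
    for name, vals in groups.items():
        for k in range(int(rate_hz * group_dur_s)):
            v = rng.choice(vals)
            freq = f_lo * (f_hi / f_lo) ** ((v - lo) / span)
            events.append((start + k / rate_hz, name, freq))
        start += group_dur_s + gap_s  # silence before the next group
    return events

# Made-up values for three hypothetical groups.
groups = {
    "group A": [1.3, 1.4, 1.5, 1.4, 1.6],
    "group B": [4.0, 4.3, 4.5, 4.7, 4.2],
    "group C": [5.1, 5.6, 5.9, 6.0, 5.5],
}
events = histogram_events(groups)
```

Because values are drawn at random, frequently occurring values are heard more often, so the pitch distribution within each second reflects the shape of that group's histogram.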