Reflections on statistics as a catalyst for engineering and scientific discoveries

Tuesday, December 7, 2010

Music and Data Visualization (What It's All About)

A couple of weeks ago, mashup artist Girl Talk (Gregg Gillis) released his new album, All Day. If you are not familiar with Girl Talk's music, each track is a mashup of samples from different artists. How do you visualize one of these musical mashups? Benjamin Rahn has come up with an effective and clever way of doing it. If you love music and data visualization, check out what Benjamin has done: All Day by Girl Talk - Mashup Breakdown (be aware that the lyrics of some of the sampled tracks are explicit).

Amazing, isn't it? You can not only hear the music but also see, in real time, which samples are playing within the mashup. This "little side project", as Benjamin put it on his Twitter feed, takes visualization to another dimension. Hats off to Benjamin! This mashup breakdown got me thinking about getting a copy of the data, and about the possible visualizations I could do with JMP. Fortunately, the data for All Day, as well as the data for Girl Talk's previous album, Feed the Animals, is available from Wikipedia (huge thanks to all the people who have contributed to these pages).

Feed the Animals was released on June 19, 2008, and there have been a few creative visualizations of the music and information in the mashups. Angela Watercutter (Wired 16.09) deconstructed track #4, What It's All About (you can listen to Feed the Animals tracks here), displaying in a circular graph the 35 sample names, their durations, and pictures of the artists that are part of this 4:14-minute track. Although it is a nice visual of all the artists and songs, it is not easy to see how the sampled tracks line up within the mashup.

Bunny Greenhouse (Chris Beckman) has created video mashups for each of the tracks in Feed the Animals. Here is the one for What It's All About. You get flashes of Beyoncé at the beginning of the video, and around 0:21 you see Busta Rhymes with The Police's "Every Little Thing She Does Is Magic" in the background. By 0:43 The Police drummer Stewart Copeland appears playing his unmistakable driving drum beat, followed by Sting, Andy and Stewart dancing. At 3:33 we see a very young Michael Jackson singing "ABC", blending, around the 4-minute mark, with Queen's "Bohemian Rhapsody". Now you've got me: I can enjoy the music, see which artists are part of the mashup, and see when their tracks appear.

In order to contribute to this collection of visualizations I decided to use the Feed the Animals data and concentrate on track 4, What It's All About. For each of the 14 tracks in Feed the Animals, what does the distribution of sampled track lengths look like? Are they similar to each other? Did Girl Talk use mostly short samples? How long are the longest samples? The chart below shows the lengths of sampled tracks, as jittered points, for each of the 14 tracks, along with a boxplot to get a sense for the distribution. I've added a red line to show the overall median (23 seconds) of all the 329 sampled tracks. The distribution of each of the 14 tracks is skewed to the right, and about 2/3 of the samples are 30 seconds or less. The color of the points shows how many times a sample of a given length was used (1 to 4). For example, in "Like This" (track #7), he used four 1-second and four 16-second sampled tracks. Note that most of the red points are in tracks 6, 7 and 8. The outlier for Track 1, "Gimme Some Lovin'" (Spencer Davis Group), shows that Girl Talk favored this sampled track by giving it 2:11 minutes out of the total 4:45 minutes. What It's All About (Track #4) also has a long sample (Busta Rhymes' Woo Hah!! Got You All in Check) lasting 1:15 minutes, or about 30% of the total track.
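The shares quoted above are easy to check by hand. Here is a minimal Python sketch of that arithmetic, using only the durations mentioned in this post; the helper names `to_seconds` and `share` are made up for this illustration and are not part of JMP or of the original analysis.

```python
def to_seconds(clock: str) -> int:
    """Convert an 'm:ss' string such as '4:14' to whole seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def share(sample: str, track: str) -> float:
    """Fraction of a track's running time taken up by one sample."""
    return to_seconds(sample) / to_seconds(track)


# Durations as quoted in the text above.
woo_hah = share("1:15", "4:14")   # "Woo Hah!!" within What It's All About
gimme = share("2:11", "4:45")     # "Gimme Some Lovin'" within Track 1

print(f"Woo Hah!!: {woo_hah:.0%}")          # prints "Woo Hah!!: 30%"
print(f"Gimme Some Lovin': {gimme:.0%}")    # prints "Gimme Some Lovin': 46%"
```

So the Track 1 outlier takes up an even larger share of its track (a bit under half) than the longest sample does in What It's All About.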

A nice visualization tool from the genomics world, the cell plot, gives another perspective on the density of sample lengths within a track. A cell plot is a visual representation of a data table, with each cell in the plot representing a data point. The cell plot for What It's All About shows 35 cells, one for each sampled track, with a color shade, from white (short) to dark blue (long), denoting the length of the sampled track. What It's All About kicks in with sampled tracks of lengths between 10 and 20 seconds, followed by the longest sampled track (Woo Hah!! Got You All in Check), the darkest blue cell. Starting with "Every Little Thing She Does Is Magic" (remember second 21 in Bunny Greenhouse's video mashup?), there is a sequence of 6 sampled tracks with lengths between 30 and 55 seconds, the exception (white cell) being "Memory Band" with only 3 seconds. Towards the end we see a sequence of very short sampled tracks, the almost white strip between "What Up Gangsta?" and "Ms. Jackson", followed by the last 4 sampled tracks with lengths around 30 seconds.

What is missing in these plots is the time dimension. One of the nice things about Benjamin's visualization is that one can see where the sampled tracks fall in the overall time sequence of the track, and with respect to each other. We can use the Graph Builder in JMP to create a plot for What It's All About, with sampled track length on the x-axis, the song name on the y-axis, and the start and stop times in a stock-style bar chart. Now it is easier to see that the first 4 sampled tracks have similar lengths and that they occur around the same time. The longest sampled track, "Woo Hah!! Got You All in Check", starts around 0:15 together with "Every Little Thing She Does Is Magic" but it lasts almost twice as long. In the middle of the track there is another long sampled track, "Go!", lasting about 65 seconds. We also see the run of very short sampled tracks towards the end of the track, as we saw in the cell plot. The track ends with about 20 seconds (3:53 to 4:14) of Queen's "Bohemian Rhapsody".
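The idea behind the start/stop view can be sketched outside JMP as well: store each sampled track as an interval and ask which intervals cover a given moment. The Python below is only an illustration, not the Graph Builder plot itself. The `Sample` type and `active_at` function are hypothetical names, and the three intervals are approximations pieced together from the times mentioned in this post (for instance, the stop time for "Woo Hah!!" is assumed to be its 0:15 start plus its 1:15 length), not the full Wikipedia data.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    title: str
    start: int  # seconds from the beginning of the track
    stop: int   # seconds from the beginning of the track


def to_seconds(clock: str) -> int:
    """Convert an 'm:ss' string such as '3:53' to whole seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


# Approximate intervals assumed from the times quoted in the post.
SAMPLES = [
    Sample("Woo Hah!! Got You All in Check", to_seconds("0:15"), to_seconds("1:30")),
    Sample("Every Little Thing She Does Is Magic", to_seconds("0:15"), to_seconds("1:04")),
    Sample("Bohemian Rhapsody", to_seconds("3:53"), to_seconds("4:14")),
]


def active_at(samples: List[Sample], t: int) -> List[str]:
    """Titles of the samples playing at second t (start inclusive, stop exclusive)."""
    return [s.title for s in samples if s.start <= t < s.stop]


print(active_at(SAMPLES, to_seconds("0:21")))
print(active_at(SAMPLES, to_seconds("4:00")))
```

With these assumed intervals, asking for 0:21 returns both "Woo Hah!!" and "Every Little Thing She Does Is Magic", which matches what the video mashup shows at that moment.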

The previous plot is an improvement but music happens over time, dynamically. In order to show the dynamic dimension of time, we can use a bubble plot with bubble trails as I illustrated in my previous post, Visualizing Change with Bubble Plots. Since this visualization does not include the music, I decided to speed things up so you don't have to watch it for the 4:14 minutes that What It's All About lasts. Ready? Hit play.

Now you can see bubbles appearing and disappearing in the order they show up in the track. At 0:21 you can see the red and cyan bubbles corresponding to "Woo Hah!! Got You All in Check" and "Every Little Thing She Does Is Magic". At 0:40 The Cure's "Close to Me" comes in, and at 1:04, when "Every Little Thing She Does Is Magic" drops out, two more sampled tracks appear: "Here Comes the Hotstepper" and "Land of a Thousand Dances". We can also see the 6 very short sampled tracks from the previous plots starting at 3:17. "Bohemian Rhapsody" enters at 3:53, riding the last 21 seconds of the track. Who knows? Maybe for the next version of JMP we'll be able to add sound to the bubble plot mix.

Brenda & José

We are industrial statisticians, each with more than fifteen years of experience working closely with engineers and scientists to help them "make sense of data". We view statistics, combined with powerful visualization software, as a catalyst for discoveries and insights that help bring new products to market, sustain manufacturing operations, and guide process improvements. We are avid users of JMP and SAS.