Anita Lillie's Masters Thesis at the MIT Media Lab

Well, the first passes at visualizing my test music libraries are promising. But it’s really hard to test when I can’t play the tracks from within the visualization (“Why is that track all the way out there?” ). So I’m just going to jump in and start building the UI, including the 3-D interface.

I am thinking that I will use Java 3D, with the special bonus of it already having spatial sound support. I’m imagining an interface somewhat like this virtual solar system, but with songs/playlists instead of planets/stars.

[Paul, you used Java 3D (well, jplot3d) for SITM, right? Do you think this is a good way for me to proceed? I know you’ve given me some code, but I think it will be faster for me to just build my own at this point, especially since the interface will be really different from SITM anyway.]

Any advice or cautions about Java 3D, or 3D interfaces in general, is most welcome.

Ok, this is pretty ugly… but I’m very excited to see it. It is the first test image coming out of the PCA engine I’ve built. It’s mapping test tracks based on timbre values, number of sections, duration, and time signature. JFreeChart was a great recommendation, thanks Paul :)

(The “null”s are Bach.)

I’ll use this kind of interface as I clean up the code and test out new features. Then, on to 3-D!

I’ve got a working PCA engine, and a few simple features implemented. I also have acquired a large set of MP3s with accompanying Echo Nest data, to create test libraries. So now I have code that takes a small test library, computes/retrieves a handful of audio features for that library, and remaps the tracks onto a smaller-dimensional space.

Now I’m working on a way to show the PCA results visually, since it’s much harder to test without that. I’m also developing new Feature classes focusing on general timbral descriptions of a track.

I’m getting more into my coding, and now am trying to answer the question, “What is a feature?” Specifically, what are the features that I can glean from the audio to get a meaningful distinction between one song and the next, and what is the general description of this thing I call a “feature”.

My project is not intended to be focused on figuring out or implementing these features; I am focused on bringing them together into a navigable representation. But in the design of the Feature class I’m finding myself wondering:

Do features operate at the song level, or at the section level? I should have the ability to deal with either type, but then, I am sometimes mapping song sections, and sometimes whole songs. What do I show to the user in the interface if I’m only really mapping a section of a song?

Should I try to choose one section to be representative of the piece as a whole, and just do my analysis on that section?

What are the must-have features people have already written code for, that can be easily adapted and plugged in to my engine?

What kind of rhythm-based features can I pull out? (I mention this because I am sorely lacking in the rhythm arena.)

I will start with features like this (for each track):

number of sections

number of types of sections (counted by timbre type)

number of types of sections (counted by pitch pattern type)

mode of timbres

mean level of specific timbre coefficients (coefficients shown visually at the bottom of this page)