Thursday, 22 May 2008

ISMIR 2008 seems extremely well organized. I've only been watching from the sidelines this year (for the first time since 2002 I haven't submitted a paper myself), but the bits and pieces I've seen look great.

I like how reviews are double blind this year.

I like how authors get a chance to respond to reviews. And reviewers get a chance to reconsider their ratings after seeing the response of the authors, and what their fellow reviewers wrote.

However, and this strikes me as fascinating: I get to see the names of my fellow reviewers! (To be more accurate: I only get to see the reviews and names of reviewers who reviewed the same content I did.) At first I thought it was a bug in the system (I even felt the urge to instantly report it to the program chairs). But it seems the whole system is designed around exposing the real names of the reviewers to their fellow reviewers: I was just sent an email with the email addresses of all my fellow reviewers.

I think I've never experienced such openness in any of the review processes I've been involved in. It's fascinating, but it makes me wonder whether it might lead to reviewers being more reluctant to write critical remarks in the future. Especially in a community as small as ISMIR, one of the fellow reviewers might be a colleague of one of the authors, etc. While I think it's a good idea to publish reviews and a response to them (I did so for one of my publications last year here), I don't think it's necessarily a good idea to expose the reviewers without their consent.

Wednesday, 21 May 2008

On August 19th there will be a Hadoop workshop somewhere in East London which some of my colleagues are helping to organize. Details can be found here. Further announcements will follow. The speakers include Doug Cutting (who works at Yahoo and leads the Hadoop project). Johan and Martin will be talking about how we use Hadoop at Last.fm. Applications at Last.fm range from counting how many users listened to an artist to learning from skipping behaviour to improve our radio streams.
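To give a feel for the simplest kind of job mentioned above, counting how many users listened to an artist, here is a rough sketch in the style of a Hadoop streaming mapper/reducer pair. The input format (tab-separated user/artist scrobble lines) and the sample data are invented for the example; this isn't our actual job code.

```python
from itertools import groupby

def mapper(lines):
    # Each input line: "user\tartist" (hypothetical scrobble log format).
    # Emit one (artist, user) pair per listening event.
    for line in lines:
        user, artist = line.rstrip("\n").split("\t")
        yield artist, user

def reducer(pairs):
    # Hadoop delivers pairs to the reducer sorted by key;
    # count the distinct users per artist.
    for artist, group in groupby(pairs, key=lambda kv: kv[0]):
        yield artist, len({user for _, user in group})

# In a real streaming job, mapper and reducer each read stdin in separate
# processes; here we chain them on a tiny sample to show the data flow.
sample = ["alice\tRadiohead", "bob\tRadiohead", "alice\tRadiohead", "alice\tAutechre"]
counts = dict(reducer(sorted(mapper(sample))))
print(counts)  # {'Autechre': 1, 'Radiohead': 2}
```

The nice property is that both functions only ever see a stream of records, so the same logic scales from a four-line sample to billions of scrobbles.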

Tuesday, 20 May 2008

It all started about 7 years ago when I was an intern working in Aberdeen (Scotland). Aberdeen is a rather quiet town. Its beach was a wonderful place to sit and think about my MSc thesis, which I was working on at the time.

At first I was very enthusiastic about the idea of using a metaphor of islands to visualize a music collection. However, a few years and many not-so-perfect Islands of Music maps later, my enthusiasm had faded and I had more or less given up trying to map the entire music world.

Recently at Last.fm we launched our playground and one response we heard a few times was that it wasn't very colorful. (All three initial demonstrations were basically simple lists of artists or tracks.) At the same time I had a dataset of users described in terms of tags loaded in Matlab. Previous results using simple clustering algorithms had shown that it was very easy to extract meaningful clusters.
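As a sketch of what such "simple clustering" on users-described-by-tags could look like: represent each user as a vector of tag counts, normalize, and run plain k-means. The tags, counts, and initialization below are invented for illustration (the original experiments were done in Matlab on real data).

```python
import numpy as np

# Hypothetical data: each user is a vector of tag counts.
tags = ["rock", "indie", "jazz", "electronic"]
users = np.array([
    [9, 7, 0, 1],   # rock/indie listener
    [8, 6, 1, 0],   # rock/indie listener
    [0, 1, 9, 2],   # jazz listener
    [1, 0, 8, 3],   # jazz listener
], dtype=float)

# Normalize so heavy listeners don't dominate the distances.
X = users / np.linalg.norm(users, axis=1, keepdims=True)

def kmeans(X, k, iters=20):
    # Seed with two users known to differ (in practice: random / k-means++).
    centers = X[[0, 2]].copy()
    for _ in range(iters):
        # Assign each user to the nearest cluster center.
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned users.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

labels, centers = kmeans(X, k=2)
print(labels)  # [0 0 1 1]: rock/indie users vs. jazz users
```

On well-separated toy data like this the clusters fall out immediately, which mirrors the observation above that meaningful clusters were easy to extract.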

Wednesday, 14 May 2008

At Last.fm we launched a public version of our playground. One of the motivations was to gather more user feedback on some of our ideas; in particular, to find out whether they are sticky or just nice ideas. Eventually, all sticky ideas will (hopefully) find their way into the main site.

I'm very excited that we now have this platform. So far we've been running most of our experiments and evaluations in the background, but now we can directly interact with users and expose possibilities even if there is no straightforward way to integrate them into the main site.

Special thanks to Klaas Bosteels, who's done some amazing work during his 3-month internship at Last.fm building up the playground. Btw, if you know someone who might be a good fit for an internship at Last.fm, let me know! (Java/Hadoop, PHP, Python, and the ability to work efficiently on Linux systems are more or less a requirement.)

Btw, from an MIR research perspective I believe the multi-tag search is the most interesting, primarily because it indicates how much potential tags have.
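One simple way to think about a multi-tag search (purely an illustrative sketch with made-up tag weights, not how the playground actually implements it): score each track by the weakest of its per-tag weights across the query, so a track has to match every tag to rank at all.

```python
# Hypothetical per-track tag weights, normalized to [0, 1].
track_tags = {
    "Track A": {"mellow": 0.9, "electronic": 0.8},
    "Track B": {"mellow": 0.2, "electronic": 0.9},
    "Track C": {"mellow": 0.7, "guitar": 0.6},
}

def multi_tag_search(query_tags, track_tags):
    # Score = minimum per-tag weight, so a track must match every query tag;
    # tracks missing any tag score 0 and are filtered out.
    results = []
    for track, weights in track_tags.items():
        score = min(weights.get(tag, 0.0) for tag in query_tags)
        if score > 0:
            results.append((track, score))
    return sorted(results, key=lambda ts: ts[1], reverse=True)

print(multi_tag_search(["mellow", "electronic"], track_tags))
# [('Track A', 0.8), ('Track B', 0.2)]
```

The min is a deliberately strict conjunction; a softer combination (e.g. a product or weighted sum) would trade precision for recall.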

Tuesday, 13 May 2008

Having read Paul's post yesterday, I instantly applied for a beta invite. In fact, while writing this blog post I'm listening to a very pleasant stream of music from the echotron. I'm impressed! It couldn't be much easier to get started: adding a bunch of artists I like to my profile was quick, and I even get to listen to any search results. The recommendations are pretty good, definitely a lot better than many others I've seen out there... and they are sometimes rather different from the ones I get at Last.fm (in a good way). However, despite being biased, I'd still argue that Last.fm's recommendations are better ;-)

Anyway, it seems the echotron could easily turn into more than just a site built to showcase the echo nest's APIs.

Monday, 12 May 2008

We’re looking for someone who can help us improve our recommendation engine, someone who knows how to handle large amounts of data, and knows C++ really well. You’d be joining a small, enthusiastic, and flexible team.

Crunching terabytes, dealing with computational complexity on a daily basis, developing applications where you can hear the difference (and measure the feedback from millions of users), working on some of the most elegant C++ code ever written, designing and building the next generation of Last.fm's recommendation engine, ...

Thursday, 8 May 2008

Here's a great video demonstrating how much fun automatic accompaniment systems can be:

The video demonstrates Andrew Robertson's (C4DM) live drumming controlled automatic accompaniment system: "James Sedwards (guitar) and Jeremy Doulton (drums) improvise a rock track by recording loops of bass and guitar. This is done in Ableton Live with B-Keeper controlling the tempo so they stay in time even when speeding up."

Tuesday, 6 May 2008

Applications of audio-based similarity still seem rather rare nowadays. Once in a while a startup claims to be using it for recommendations, but a look at their results might suggest they are just using metadata instead (for example, see this example, which Paul recently blogged about).

Anyway, here is an example where 100% pure audio similarity is being used: FM4 Soundpark. The feature launched today. For anyone who doesn't read German: Soundpark is the number one place for new Austrian artists to expose themselves online. It has been around for ages, long before Myspace or similar sites became popular. Soundpark has a devoted community and is well integrated into one of the most popular radio stations in Austria (FM4, which reaches out to a younger demographic favoring alternative/indie music).

For each song on Soundpark, users get 3 acoustically similar songs, and there seems to be a feature that allows creating a playlist by defining a start and end song (though I haven't found that feature yet).
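As a rough sketch of how "3 acoustically similar songs" might be retrieved: summarize each track's audio as a feature vector and rank the rest by cosine similarity to the seed. The feature vectors below are invented; real audio-similarity systems typically build such vectors from spectral descriptors like MFCCs.

```python
import numpy as np

# Hypothetical audio feature vectors (e.g. summarized spectral features).
features = {
    "Song A": np.array([0.9, 0.1, 0.3]),
    "Song B": np.array([0.8, 0.2, 0.4]),
    "Song C": np.array([0.1, 0.9, 0.2]),
    "Song D": np.array([0.85, 0.15, 0.35]),
    "Song E": np.array([0.2, 0.8, 0.1]),
}

def most_similar(seed, features, n=3):
    # Rank all other songs by cosine similarity to the seed's features.
    s = features[seed]
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(
        ((name, cos(s, v)) for name, v in features.items() if name != seed),
        key=lambda nv: nv[1],
        reverse=True,
    )
    return [name for name, _ in ranked[:n]]

print(most_similar("Song A", features))  # ['Song D', 'Song B', 'Song E']
```

The playlist-between-two-songs feature mentioned above could plausibly be built on the same similarity measure, as a path from the start song to the end song through this feature space.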

Soundpark hosts about 5000 artists with a few tracks each on average. (If I'm not mistaken, they have about 8800 tracks.) They are constantly growing. According to the press releases they have a number of plans to add features that make it easier to navigate their content and discover interesting artists. Great news! It makes me especially happy knowing that it's former colleagues who are building this.