November 8th, 2008

I presented MajorMiner to dorkbot nyc this month. Thanks to Douglas for giving me the opportunity to present and thanks to everyone who came out. There are some pictures linked from the page of abstracts.

If you missed it, you can see most of what I presented on the page here. Start with the game, then the human tags, then the autotags. The autotags on the website cover a different set of music than the one I showed at dorkbot, and they are trained on less data because I had to split the artists into training and test sets. At dorkbot I was able to show the result of autotagging 30,000 clips from 15,000 mp3s crawled from music blogs.

Anyone who wants to play the game will have to contend with jelenathegreat, who is climbing the high scorer table and should continue to for a while now that new players will be agreeing with her.

October 27th, 2008

Last week I gave a demo at the Columbia Venture Community meetup. It was a great chance to talk to people about the majorminer search and show it off a little bit. A lot of people were interested in it and I met some cool folks. Thanks to Fadi, who was a huge help with the demo, and the CVC for putting the meetup together and giving me the opportunity to present.

If you missed it, or want a closer look, this is basically how it went: We set up the majorminer game to collect tags describing 10-second clips of songs. For example, here are the clips the players tagged saxophone. With those labels, we were able to train automatic taggers (autotaggers) to identify the same tags in new clips. When run over a database of 40,000 clips, the saxophone autotagger can identify the most saxophone-y examples. There are other types of tags besides instruments: descriptions like soft and genres like hip hop. You can see more of these examples, broken down into these categories, on the very basic cvc demo page I set up.

August 13th, 2008

Every time I write about MajorMiner, people ask when I’m going to make the data publicly available. Well, I’m starting to do that by building a MIREX task around it. The task is officially called the Audio Tag Classification task and you can take a look at the details on its MIREX wiki page. As emails and conversations bounced around, it became not just a classification task, but also a retrieval task. Doug Turnbull formulated it well by breaking it down into three related tasks:

Clip-Tag classification: determine whether each tag applies to each clip or not

Clip retrieval: for each tag, rank the clips by their relevance

Tag retrieval: for each clip, rank the tags by their relevance
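To make the difference between the three tasks concrete, here is a minimal sketch in Python. The clip names, the scores, and the 0.5 threshold are all made up for illustration; they are not MIREX data or the official evaluation code. The point is only that all three tasks can be read off the same clip-by-tag score table.

```python
# Hypothetical relevance scores, scores[clip][tag]; higher means more relevant.
scores = {
    "clip_a": {"saxophone": 0.9, "rap": 0.1, "soft": 0.6},
    "clip_b": {"saxophone": 0.2, "rap": 0.8, "soft": 0.3},
    "clip_c": {"saxophone": 0.7, "rap": 0.4, "soft": 0.5},
}


def classify(scores, threshold=0.5):
    """Clip-tag classification: a yes/no decision for every (clip, tag) pair."""
    return {
        clip: {tag: score >= threshold for tag, score in tags.items()}
        for clip, tags in scores.items()
    }


def retrieve_clips(scores, tag):
    """Clip retrieval: for one tag, rank the clips from most to least relevant."""
    return sorted(scores, key=lambda clip: scores[clip][tag], reverse=True)


def retrieve_tags(scores, clip):
    """Tag retrieval: for one clip, rank the tags from most to least relevant."""
    return sorted(scores[clip], key=scores[clip].get, reverse=True)


decisions = classify(scores)
sax_ranking = retrieve_clips(scores, "saxophone")
clip_b_tags = retrieve_tags(scores, "clip_b")
```

Classification needs a threshold, while the two retrieval tasks only need the ordering, which is why a system can do well on one and poorly on another.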

There are only a few days left until the submission deadline, but if you want to throw something together, more submissions would be great. In case you’re wondering, the main contributors to the design of this task have been Kris West, Thierry Bertin-Mahieux, Doug Turnbull, and Greg Tsoumakas. Mert Bay is running things at IMIRSEL.

April 28th, 2008

We’ve started using the data that we collected through the MajorMiner game. We’re using it in two ways: making it searchable directly, and training autotaggers with it. The human search finds all of the clips that have had a particular tag applied to them by at least two people, sorted by the number of times it’s been applied. You can type a search directly into the search box, or browse through the top few. People are pretty good at finding things in music, as it turns out. Check out british, u2, tambourine, and scratch. This search also takes advantage of the newly introduced canonicalization of tags, so that funk matches funky. But there are always ambiguity issues, e.g. club as a lyric vs. club as a genre.
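For the curious, the human search rule (at least two people, sorted by count) can be sketched in a few lines of Python. This is not the site's actual code: the `(player, clip, tag)` triple format, the function names, and the sample data are assumptions for illustration, and the tags are assumed to be canonicalized already.

```python
from collections import Counter, defaultdict


def build_index(applications):
    """Count how many distinct players applied each tag to each clip.

    `applications` is an iterable of (player, clip, tag) triples
    (a hypothetical format, not the real MajorMiner schema).
    """
    seen = set()
    counts = defaultdict(Counter)
    for player, clip, tag in applications:
        if (player, clip, tag) in seen:
            continue  # the same player repeating a tag does not count twice
        seen.add((player, clip, tag))
        counts[tag][clip] += 1
    return counts


def search(index, tag, min_players=2):
    """Return (clip, count) pairs for a tag, most-agreed-upon clips first."""
    hits = [(clip, n) for clip, n in index.get(tag, {}).items() if n >= min_players]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Made-up example data.
applications = [
    ("ann", "clip1", "tambourine"),
    ("bob", "clip1", "tambourine"),
    ("cat", "clip1", "tambourine"),
    ("ann", "clip2", "tambourine"),
    ("bob", "clip2", "tambourine"),
    ("ann", "clip3", "tambourine"),  # only one player, so it is filtered out
]
index = build_index(applications)
results = search(index, "tambourine")
```

Requiring two players is what keeps one person's typo or joke tag out of the results.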

The machine search is a little more involved. We took all of the tags that had been applied to enough (35) clips and used them to train classifiers. Actually, we only used clips from half of the artists in our collection to train the classifiers, then we ranked all of the clips from the rest of the artists by each classifier’s output. This means we can look at all of those clips sorted by how much they appeal to the rap classifier, the saxophone classifier, the house classifier, and so on. I like how the guitar classifier catches Outkast’s acoustic guitar (!), but also the Jesus and Mary Chain’s fuzzed out guitar. For those of you interested in the details, we have a couple of papers that we’ve submitted recently describing them, but the gist is that we’re using the features from last MIREX and the usual SVM classifier.
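Here is a rough Python sketch of the artist-level split and the ranking step. It is only an illustration under assumptions: the `(clip_id, artist)` pairs, the seed, and the `score` function standing in for a trained classifier are all hypothetical, and the SVM training itself is left out.

```python
import random


def split_by_artist(clips, seed=0):
    """Split clips so that no artist appears in both the training and test sets.

    `clips` is a list of (clip_id, artist) pairs (a hypothetical format).
    """
    artists = sorted({artist for _, artist in clips})
    random.Random(seed).shuffle(artists)
    train_artists = set(artists[: len(artists) // 2])
    train = [clip for clip in clips if clip[1] in train_artists]
    test = [clip for clip in clips if clip[1] not in train_artists]
    return train, test


def rank_by_classifier(test_clips, score):
    """Sort held-out clips by a classifier's output, highest first.

    `score` stands in for a trained classifier's decision function.
    """
    return sorted(test_clips, key=score, reverse=True)


# Made-up clips: two per artist.
clips = [(f"clip{i}", f"artist{i // 2}") for i in range(8)]
train, test = split_by_artist(clips)
```

Splitting by artist rather than by clip means a classifier can't score well just by recognizing an artist it has already heard.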

Some thought went into the ranking of the tags on the main autotag page as well. Since we know the answers for some of the clips in the test set, we ranked the tags by how well their classifiers were able to learn them. Actually, we used a Bayesian estimate of the classification accuracy from the beta-binomial model to do the ranking more intelligently. The basic idea is that test accuracy is measured more precisely for tags with a lot of test examples, and less precisely for tags with few test examples. The measured accuracy of each tag is then shrunk towards the overall mean accuracy, and the less reliably the model thinks it is estimated, the more it is shrunk. So even though club has a better raw accuracy than rap, it was tested on many fewer examples, so it ends up below rap in the final ranking, i.e. its raw accuracy is more likely a random fluctuation than a meaningful result.
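The shrinkage idea is easier to see with numbers, so here is a small Python sketch. Everything specific in it is an assumption: the counts are invented (they are not our real test results), and the prior strength of 20 pseudo-examples is picked by hand, whereas a proper beta-binomial fit would estimate it from the data. The estimate is the posterior mean under a Beta prior centered on the pooled accuracy.

```python
def shrunk_accuracies(results, prior_strength=20.0):
    """Posterior-mean accuracy per tag under a Beta prior.

    `results` maps tag -> (correct, total). The prior is Beta(a, b) with
    a / (a + b) equal to the pooled accuracy and a + b = prior_strength
    (a hand-picked value here, not a fitted one).
    """
    total_correct = sum(correct for correct, _ in results.values())
    total_tested = sum(total for _, total in results.values())
    pooled = total_correct / total_tested
    a = prior_strength * pooled
    b = prior_strength * (1.0 - pooled)
    return {
        tag: (correct + a) / (total + a + b)
        for tag, (correct, total) in results.items()
    }


# Invented counts: (correctly classified test clips, total test clips).
results = {"club": (9, 10), "rap": (170, 200), "jazz": (60, 100)}

raw = {tag: correct / total for tag, (correct, total) in results.items()}
shrunk = shrunk_accuracies(results)
ranking = sorted(shrunk, key=shrunk.get, reverse=True)
```

With these made-up counts, club has the better raw accuracy, but its ten test clips get pulled strongly towards the pooled mean while rap's two hundred barely move, so rap comes out on top.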

So go check out some of the creative ways our players have found to describe music, and describe some music yourself!

April 13th, 2008

We’ve updated MajorMiner to match tags more intelligently. Now “hip hop” will match “hip-hop”, “folky” will match “folk”, and “synth” will match “synthesizer”. You wouldn’t believe how many different ways people can spell “drum and bass” (drum n bass, drum & bass, drum ‘n bass, drum+bass, drum ‘n’ bass, etc.), but they should all match each other now. With these changes, it’s easier to score points and more fun to play.
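If you're wondering what this kind of matching looks like, here is a minimal Python sketch that handles the examples above. It is not the code running on the site: the alias table, the connector list, and the crude "drop a trailing y" rule are all assumptions made for illustration, and a real system would probably want a proper stemmer.

```python
import re

ALIASES = {"synth": "synthesizer"}  # hypothetical alias table
CONNECTORS = {"n", "&", "+"}        # spellings treated as "and" (assumed list)


def canonicalize(tag):
    """Map a raw tag to a canonical key so that spelling variants compare equal."""
    text = tag.lower()
    text = re.sub(r"[‘’'`]", "", text)        # drop quote marks: "'n'" becomes "n"
    text = re.sub(r"([&+])", r" \1 ", text)   # split "drum+bass" into three words
    text = text.replace("-", " ")             # "hip-hop" becomes "hip hop"
    words = []
    for word in text.split():
        if word in CONNECTORS:
            word = "and"
        word = ALIASES.get(word, word)
        if len(word) > 4 and word.endswith("y"):
            word = word[:-1]                  # crude rule: "folky" becomes "folk"
        words.append(word)
    return " ".join(words)


variants = [
    "drum and bass", "drum n bass", "drum & bass",
    "drum ‘n bass", "drum+bass", "drum ‘n’ bass",
]
keys = {canonicalize(v) for v in variants}
```

The canonical key is only used for comparing tags, so it doesn't matter that the trailing-y rule produces non-words for tags like country, as long as every spelling lands on the same key.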

December 20th, 2007

August 5th, 2007

We’ve just added 1333 new tracks to the database, and you should notice them start to filter into your games. This infusion will boost our levels of indie rock, hip-hop, classic rock, pop, jazz, and country.

July 4th, 2007

MajorMiner now has a blog! We’ll use it to announce new features of our game and other related sites and events. It’s also a great place for you to tell us how we’re doing, suggest any improvements you’re dying to see us make to the game, or ask us any questions you might have. Happy listening.