Description

The text of this section was copied from the 2012 Wiki. Please add your comments and discussion at the bottom of this page.

The aim of the automatic beat tracking task is to track each beat location in a collection of sound files. Unlike the Audio Tempo Extraction task, whose aim is to detect the tempo of each file, the beat tracking task aims at detecting all beat locations in each recording. The algorithms will be evaluated in terms of their accuracy in predicting beat locations annotated by a group of listeners.

Data

Collections

The original 2006 dataset contains 160 30-second excerpts (WAV format) used for the Audio Tempo and Beat contests in 2006. Beat locations have been annotated in each excerpt by 40 different listeners (39 listeners for a few excerpts). These audio recordings were selected to provide a stable tempo value, a wide distribution of tempo values, and a large variety of instrumentation and musical styles. About 20% of the files contain non-binary meters, and a small number of examples contain changing meters. One disadvantage of using this set for beat tracking is that the tempi are rather stable, so this set will not test the ability of beat-tracking algorithms to track tempo changes.

The second collection comprises 367 Chopin Mazurkas, represented as full audio tracks (WAV format). The Mazurka dataset contains tempo changes, so it will evaluate the ability of algorithms to track them.

The third collection was assembled and donated in 2012. This dataset contains 217 excerpts of around 40 s each, of which 19 are "easy" and the remaining 198 are "hard". The harder excerpts were drawn from the following musical styles: Romantic music, film soundtracks, blues, chanson and solo guitar.

This dataset has been designed for radically new techniques that can contend with challenging beat tracking situations such as quiet accompaniment, expressive timing, changes in time signature, slow tempo and poor sound quality. So, if your beat tracker likes a 4/4 time signature with a steady tempo and needs clear percussive onsets, don't expect it to do very well!
But don't be deterred: this is for the good of beat tracking.

Audio Formats

The data are monophonic sound files, with the associated onset times and data about the annotation robustness.

CD-quality (PCM, 16-bit, 44100 Hz)

single channel (mono)

file length between 2 and 36 seconds (total time: 14 minutes)

Submission Format

Submissions to this task must conform to the format detailed below. Submissions should be packaged and contain at least two files: the algorithm itself and a README containing contact information and detailing, in full, the use of the algorithm.

Input Data

Participating algorithms will have to read audio in the following format:

Sample rate: 44.1 kHz

Sample size: 16 bit

Number of channels: 1 (mono)

Encoding: WAV
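
As an illustration only, a submission written in Python could verify that a file matches the input format listed above before processing it. This is a minimal sketch using the standard-library wave module; the function name is_expected_format is hypothetical and not part of the task specification.

```python
import wave


def is_expected_format(path):
    """Return True if the WAV file is 44.1 kHz, 16-bit, mono.

    Hypothetical helper: the task description only lists the format,
    it does not require submissions to perform this check.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 44100   # sample rate: 44.1 kHz
            and wav.getsampwidth() == 2   # sample size: 16 bit = 2 bytes
            and wav.getnchannels() == 1   # number of channels: mono
        )
```

A check like this can help catch a wrongly converted file early, rather than producing beat times at the wrong time scale.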

Output Data

The beat tracking algorithms will return beat-times in an ASCII text file for each input .wav audio file. The specification of this output file is immediately below.

Output File Format (Audio Beat tracking)

The Beat Tracking output file format is an ASCII text format. Each beat time is specified, in seconds, on its own line. Specifically,

<beat time(in seconds)>\n

where \n denotes the end of line. The < and > characters are not included. An example output file would look something like:

0.243
0.486
0.729
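
The output format above can be produced with a few lines of code. The following Python sketch is one possible way to do it; the function names and the choice of three decimal places (matching the example file) are assumptions, since the specification does not state a required precision.

```python
def format_beat_times(beat_times, decimals=3):
    """Return the file contents: one beat time in seconds per line.

    The number of decimals is an assumption taken from the example file.
    """
    return "".join(f"{t:.{decimals}f}\n" for t in beat_times)


def write_beat_file(output_path, beat_times, decimals=3):
    """Write the beat times to output_path as ASCII text with \\n line ends."""
    with open(output_path, "w", encoding="ascii", newline="\n") as f:
        f.write(format_beat_times(beat_times, decimals))
```

For example, write_beat_file("out.txt", [0.243, 0.486, 0.729]) would write the three lines shown in the example above.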

Algorithm Calling Format

The submitted algorithm must take as arguments a SINGLE .wav file on which to perform beat tracking, as well as the full path and filename of the output file. The ability to specify the output path and file name is essential. Denoting the input .wav file path and name as %input and the output file path and name as %output, a program called foobar could be called from the command line as follows:

foobar %input %output
foobar -i %input -o %output

Moreover, if your submission takes additional parameters, such as a detection threshold, foobar could be called like:

foobar .1 %input %output
foobar -param1 .1 -i %input -o %output
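
For a submission written in Python, the flag-based calling form above could be parsed along the following lines. This is only a sketch: the default value of -param1 and the help texts are assumptions, and foobar is the placeholder program name used in the examples.

```python
import argparse


def parse_args(argv=None):
    """Parse a call such as: foobar -param1 .1 -i %input -o %output"""
    parser = argparse.ArgumentParser(prog="foobar")
    # Optional algorithm parameter; the default of 0.1 is an assumption.
    parser.add_argument("-param1", type=float, default=0.1,
                        help="example parameter, e.g. a detection threshold")
    # Full path and name of the single input .wav file.
    parser.add_argument("-i", dest="input", required=True,
                        help="input .wav file")
    # Full path and name of the output text file.
    parser.add_argument("-o", dest="output", required=True,
                        help="output beat-time file")
    return parser.parse_args(argv)
```

Calling parse_args() with no argument reads the real command line; passing a list of strings is convenient for testing.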

If your submission is in MATLAB, it should be submitted as a function. Once again, the function must contain String inputs for the full path and names of the input and output files. Parameters could also be specified as input arguments of the function. For example:

foobar('%input','%output')
foobar(.1,'%input','%output')

README File

A README file accompanying each submission should contain explicit instructions on how to run the program (as well as contact information, etc.). In particular, each command line to run should be specified, using %input for the input sound file and %output for the resulting text file.

For instance, to test the program foobar with different values for the parameter param1, the README file would look like:

The different command lines to evaluate the performance of each parameter set over the whole database will be generated automatically from each line in the README file containing both '%input' and '%output' strings.
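
The actual script used to generate the command lines is not described here. As a rough illustration of the idea, the following Python sketch keeps every README line that contains both placeholders and substitutes concrete paths; the function name expand_readme_commands is hypothetical.

```python
def expand_readme_commands(readme_text, input_path, output_path):
    """Build one command per README line containing %input and %output.

    Illustrative sketch only, not the organisers' actual tooling.
    """
    commands = []
    for line in readme_text.splitlines():
        # Lines lacking either placeholder are treated as plain documentation.
        if "%input" in line and "%output" in line:
            command = line.strip()
            command = command.replace("%input", input_path)
            command = command.replace("%output", output_path)
            commands.append(command)
    return commands
```

Running this once per audio file in a collection would yield one command per parameter set and file, which is why each README command should appear on its own line.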

Evaluation Procedures

The evaluation methods are taken from the beat evaluation toolbox and
are described in the following technical report: