Simon Dixon

Österreichisches Forschungsinstitut für Artificial Intelligence Wien

on the topic:

New Time Warping Algorithms for Tracking and Aligning Musical Performances

Time: Thu 30.6.2005, 15:15, 60 minutes
Location: HS 13

Abstract

Music information retrieval and particularly content-based processing of audio signals are rapidly expanding research fields, with applications as diverse as music search engines, music recommendation, automatic transcription, intelligent musical accompaniment, intelligent tutoring, and the analysis, visualisation and modification of performance expression. Many of these applications need fine-grained metadata from musical recordings, such as the times of individual notes or phrases, but no reliable algorithm exists for extracting this information directly. We present an indirect method of extracting musical content: a recording or a live stream is aligned with an annotated version of the same piece, using an algorithm based on dynamic time warping.
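To illustrate the indirect approach, the following minimal sketch shows how annotations might be carried over once an alignment is available. It is not the speaker's implementation: the function name transfer_times, the frame hop of 0.02 s, and the representation of the alignment as a list of (reference frame, new frame) pairs are all assumptions made for illustration.

```python
import numpy as np

def transfer_times(path, ref_times, hop=0.02):
    """Map annotation times (seconds) from a reference onto a new recording.

    path      -- hypothetical alignment: (ref_frame, new_frame) index pairs,
                 non-decreasing in both coordinates
    ref_times -- annotated times in the reference, in seconds
    hop       -- assumed frame spacing in seconds
    """
    path = np.asarray(path, dtype=float)
    ref_t = path[:, 0] * hop
    new_t = path[:, 1] * hop
    # A reference frame may be matched to several new frames; average them
    # so that the mapping has strictly increasing x values for np.interp.
    uniq, inverse = np.unique(ref_t, return_inverse=True)
    mean_new = np.bincount(inverse, weights=new_t) / np.bincount(inverse)
    # Linear interpolation between aligned frames.
    return np.interp(ref_times, uniq, mean_new)

# Example: the new performance is played at half the speed of the reference.
path = [(0, 0), (1, 2), (2, 4), (3, 6)]
note_onsets = transfer_times(path, [0.02, 0.03])
```

In this example a note annotated at 0.03 s in the reference is placed at about 0.06 s in the slower performance.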
Dynamic time warping is a method for finding the optimal alignment of a pair of time series, but it is unsuitable for on-line applications because it requires complete knowledge of both series before the alignment of the first elements can be computed. Further, the quadratic time and space requirements are limiting factors even for off-line applications. We present a novel on-line time warping algorithm which has linear time and space costs, and performs incremental alignment of two series as one is received in real time. This algorithm is applied to the alignment of audio signals in order to track musical performances of arbitrary length. We demonstrate one application of this algorithm, analysis and visualisation of musical expression in real time, and then describe modifications of the system for off-line alignment, showing an off-line matching tool which generates precise metadata for musical recordings.
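For readers unfamiliar with the technique, here is a minimal sketch of standard (off-line) dynamic time warping on two one-dimensional series. It is the textbook formulation, not the on-line algorithm presented in the talk, and the function name dtw and the absolute-difference local cost are assumptions chosen for brevity; audio alignment would compare feature vectors instead.

```python
import numpy as np

def dtw(x, y):
    """Textbook DTW: return (total cost, alignment path) for 1-D series."""
    n, m = len(x), len(y)
    # cost[i, j] = minimal cumulative cost of aligning x[:i] with y[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local distance (assumed)
            cost[i, j] = d + min(cost[i - 1, j - 1],  # both advance
                                 cost[i - 1, j],      # only x advances
                                 cost[i, j - 1])      # only y advances
    # Backtrack from the end; this needs the complete matrix.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda c: cost[c])
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n, m], path

total, path = dtw([1, 2, 3, 4], [1, 2, 2, 3, 4])
```

The sketch makes both limitations visible: the full n-by-m matrix is filled, which gives the quadratic time and space cost, and the path is only recovered by backtracking from the final cell, so no part of the alignment is known until both series are complete.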
In quantitative tests on a variety of classical piano music, the on-line algorithm had a median error of 40 ms, whereas the off-line system had a median error of 20 ms. Qualitative testing on orchestral and pop music gave generally positive results, with errors being rare even when the performances were significantly different.