Ensemble recognition in folk song recordings

Abstract

A growing number of researchers are exploring automatic recognition of musical instruments in audio recordings, but the solutions presented so far cannot compete with the human ability to recognize instruments. This is especially true for polyphonic recordings. Algorithms participating in the MIREX competition usually achieve 70 to 75 percent recognition accuracy. In this thesis I present automatic recognition of musical instrument groups (ensembles), a problem closely related to instrument recognition. The problem was simplified by limiting the recordings to Slovene folk music. Audio recordings are first divided into 10-second segments. For each segment, nine audio features are calculated: MFCC, tempo, frequency of note onsets, zero-crossing rate, spectral roll-off, sound brightness, spectral irregularity, spectral centroid and spectral flatness. Feature extraction was done with MIRToolbox (a MATLAB toolbox), in which the most commonly used algorithms are already implemented. The logistic model tree (LMT) machine learning algorithm, as implemented in Weka, is then applied to these features to classify audio segments into five classes (solo accordion, Bela krajina, Prekmurje, Resian music and Resian singing). The method was evaluated in three tests. In 10-fold cross-validation on the training data, 94% of recordings were classified correctly. The next test used recordings that each belonged to one of the five classes, and the classification accuracy was 83%. In the last test, unedited field recordings were used, and 86% of segments were classified correctly. To conclude, I suggest a few possible improvements to the algorithm that could increase its accuracy and robustness.
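The first two steps of the pipeline described above (cutting a recording into 10-second segments and computing a per-segment feature) can be illustrated with a minimal sketch. This is not the thesis implementation, which relies on MIRToolbox in MATLAB. It is a pure-Python illustration of one of the nine features, the zero-crossing rate, under stated assumptions: the function names `split_into_segments` and `zero_crossing_rate` are hypothetical, the signal is mono, a trailing remainder shorter than 10 seconds is dropped, and the rate is normalized as the fraction of adjacent sample pairs that change sign (MIRToolbox may normalize differently).

```python
import math


def split_into_segments(samples, sample_rate, segment_seconds=10.0):
    """Split a mono signal into consecutive fixed-length segments.

    Assumption: a trailing remainder shorter than one full segment is
    dropped; the abstract does not say how such a remainder is handled.
    """
    segment_length = int(round(segment_seconds * sample_rate))
    if segment_length <= 0:
        raise ValueError("segment length must be positive")
    count = len(samples) // segment_length
    return [
        samples[i * segment_length:(i + 1) * segment_length]
        for i in range(count)
    ]


def zero_crossing_rate(segment):
    """Return the fraction of adjacent sample pairs whose signs differ.

    Assumption: zero counts as non-negative, and the result is a value
    between 0 and 1 rather than crossings per second.
    """
    if len(segment) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(segment) - 1)


# Usage on a synthetic signal: 25 seconds of a 440 Hz tone at 8 kHz.
sample_rate = 8000
signal = [
    math.sin(2 * math.pi * 440 * n / sample_rate)
    for n in range(25 * sample_rate)
]
segments = split_into_segments(signal, sample_rate)
rates = [zero_crossing_rate(segment) for segment in segments]
```

A pure tone crosses zero twice per period, so the rate for each segment here should be close to 2 * 440 / 8000 = 0.11. In the pipeline described above, a value like this would form one entry of the feature vector passed to the classifier for each segment.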