Abstract

The goal of SALSA is to bridge the semantic gap in music information research (MIR) by using adaptive
and structured signal representations. The semantic gap is the difference in information content between
signal representations or models used in MIR and high-level semantic descriptions used by musicians and
audiences. Examples include mapping a signal representation to concrete content, such as
instrumentation, or to more abstract tags, such as the emotional experience of music.

Recently developed methods from applied harmonic analysis make it possible to go beyond the standard
time-frequency analysis prevalent in MIR by using signal representations that adapt to the inherent
characteristics of musical signals. Such representations yield sparse descriptions of a signal in dictionaries
of basic building blocks. The sparsity paradigm will, however, be complemented by assumptions on the
representation coefficients that incorporate knowledge about the structures specific to the musical signals
under consideration.

The central questions of SALSA are (i) whether adaptive signal representations and structured sparse
coefficient estimation improve learned mappings to high-level semantic concepts and (ii)
how high-level descriptions can guide the adaptation step in harmonic analysis. Answering these
questions will allow for an innovative form of musical signal analysis that is informed by and adapts to the
rich semantic content music has for human listeners.