AES E-Library

AES E-Library

Harmonic Representation and Auditory Model-Based Parametric Matching and Its Application in Speech/Audio Analysis

The paper presents new methods for the selection of sinusoids and transients components in hybrid sinusoidal modeling of speech/audio. The instantaneous harmonic parameters (magnitude, frequency and phase) are calculated as the result of the narrow band filtering of speech/audio. The frequency-modulated filters synthesis with the closed form impulse response has been proposed. The filter frequency bounds can be determined during the components frequency tracking and can be adjusted according to the fundamental frequency modulations. It can be implemented speech/audio harmonic/noise decomposition. The transient components modeling are presented by matching pursuit with frame-based psychoacoustic optimized wavelet packet dictionary. The choice of most relevant coefficients is based on maximizing the matching between the auditory excitation scalograms of original and modeled signals.