Speech processing using the empirical mode decomposition and the Hilbert transform

Abstract

Huang et al. (1998) [1] proposed a new method for analyzing nonlinear and non-stationary data, based on the empirical mode decomposition (EMD), which generates a collection of intrinsic mode functions (IMFs). These IMFs have well-behaved Hilbert transforms, from which the instantaneous frequencies can be calculated. Events can thus be localized on both the time and the frequency axes. Speech signals, as typical nonlinear and non-stationary data, are tested with the new method. Experiments in pitch and formant analysis are performed using various definitions derived from the EMD and the Hilbert transform. The experimental results are compared with those of the conventional method, and some remarks are made. Furthermore, the marginal spectrum and the new pitch detection algorithm are also tested in a text-independent speaker identification system.
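To make the step from a Hilbert transform to an instantaneous frequency concrete, the following is a minimal sketch, not the implementation used in this work. It assumes NumPy is available and uses a pure 200 Hz tone as a stand-in for a single IMF; the EMD sifting that would produce real IMFs is not shown. The function names analytic_signal and instantaneous_frequency, the sampling rate, and the test tone are all hypothetical choices made for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Return x + i*H[x], where H is the Hilbert transform.

    Uses the standard FFT construction: keep the DC (and Nyquist) bins,
    double the positive-frequency bins, and zero the negative ones.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz: derivative of the unwrapped phase
    of the analytic signal, approximated by a first difference."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)


# Illustrative input (assumption): a mono-component tone standing in for one IMF.
fs = 8000.0                      # sampling rate in Hz
t = np.arange(800) / fs          # 0.1 s of samples
tone = np.cos(2.0 * np.pi * 200.0 * t)

freq = instantaneous_frequency(tone, fs)
print(float(np.median(freq)))
```

For this tone the estimate should sit at roughly 200 Hz at every sample. On a real speech IMF the same calculation would yield a frequency track that varies over time, which is what allows events to be localized on both axes.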