Abstract

Although breathy voice is typically characterized by an increase in spectral noise, it is notoriously difficult to devise a computational method that distinguishes breathy from clear (modal) voice. The present study applies an algorithm originally developed to quantify aspects of pathological voice quality [G. de Krom, J. Speech Hear. Res. 36, 254–266 (1993)]. The algorithm computes a harmonics-to-noise ratio (HNR), using a comb filter defined in the cepstrum domain to separate the harmonics from the noise. Its performance was tested on three speakers of Javanese (two male, one female) producing a word list of 31 minimal breathy/clear word pairs. The algorithm reliably distinguished breathy from clear tokens for all three speakers, with higher HNRs for clear than for breathy tokens. Moreover, accurate performance was obtained for nearly all frequency ranges investigated (60–2000 Hz, 2000–3000 Hz, 3000–5000 Hz). A comparison to other methods (such as H1–H2) will also be presented. Finally, perceptual rating experiments will be conducted to determine whether the algorithm's performance correlates with perceived degree of breathiness.
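The cepstrum-domain comb filtering mentioned above can be illustrated with a minimal sketch. It is a simplification under stated assumptions and not de Krom's published procedure: the function name `cepstral_hnr`, the parameter `half_width`, and the requirement that the fundamental frequency `f0` be supplied by the caller are all hypothetical choices made for illustration. The baseline correction of the noise estimate is omitted, and the harmonic energy is approximated by the total energy in the band.

```python
import numpy as np

def cepstral_hnr(frame, fs, f0, band=(60.0, 2000.0), half_width=3):
    """Rough HNR estimate (dB) in a frequency band via cepstral comb liftering.

    Assumptions (not from the source): f0 is known, the frame spans
    several pitch periods, and no baseline correction is applied.
    """
    n = len(frame)
    spectrum = np.fft.fft(frame * np.hanning(n))
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # log magnitude spectrum
    cepstrum = np.fft.ifft(log_mag).real         # real cepstrum

    # Comb lifter: zero the rahmonics, i.e. cepstral peaks at multiples
    # of the pitch period, together with their mirrored counterparts.
    period = fs / f0                             # pitch period in samples
    liftered = cepstrum.copy()
    k = 1
    while k * period <= n / 2:
        q = int(round(k * period))
        lo, hi = max(q - half_width, 1), min(q + half_width, n // 2)
        liftered[lo:hi + 1] = 0.0
        liftered[n - hi:n - lo + 1] = 0.0
        k += 1

    # Back to the spectral domain: log spectrum with harmonic ripple removed,
    # used here as an (uncorrected) estimate of the noise spectrum.
    noise_log_mag = np.fft.fft(liftered).real

    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    total_energy = np.sum(np.exp(2.0 * log_mag[mask]))
    noise_energy = np.sum(np.exp(2.0 * noise_log_mag[mask]))
    return 10.0 * np.log10(total_energy / noise_energy)

# Usage on a synthetic frame (illustrative values only).
fs, f0, n = 16000, 125.0, 2048
t = np.arange(n) / fs
rng = np.random.default_rng(0)
harmonics = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 41))
frame = harmonics + 0.05 * rng.standard_normal(n)
hnr_db = cepstral_hnr(frame, fs, f0, band=(60.0, 2000.0))
```

Under these assumptions, a frame with more aspiration-like noise should yield a lower value than a cleaner frame with the same harmonic content, which is the direction of the breathy/clear difference reported above.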