A Measure of Smoothness in Synthesized Speech

Abstract

The articulators typically move smoothly during speech production, so the features of natural speech are generally smooth. Over-smoothing, however, makes synthesized speech sound muffled and makes emotions, expressions, and speaking styles harder to identify, which can reduce its perceived naturalness. In the literature, the statistical variances of static spectral features have been used as a measure of smoothness in synthesized speech, but they are not sufficient. This paper proposes another measure that can be applied efficiently to evaluate the smoothness of synthesized speech. Experiments showed that the proposed measure is reliable and efficient for measuring the smoothness of different kinds of synthesized speech.
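
As an illustration of the variance-based baseline mentioned above (not the measure proposed in this paper), the following minimal sketch computes the per-dimension variance of a sequence of static spectral features. The function name `per_dimension_variance`, the (frames x dimensions) array layout, and the random data standing in for real features are assumptions made only for this example.

```python
import numpy as np


def per_dimension_variance(features):
    """Variance over time of each static feature dimension.

    `features` is assumed to be a (num_frames, num_dims) array, e.g. one
    row of spectral coefficients per analysis frame (hypothetical layout).
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2 or features.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two frames")
    # Population variance (ddof=0) along the time axis.
    return features.var(axis=0)


# Synthetic stand-in for real features: white noise, then a moving-average
# smoothed copy that mimics an over-smoothed trajectory.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 4))
kernel = np.ones(9) / 9.0
smoothed = np.apply_along_axis(
    lambda x: np.convolve(x, kernel, mode="same"), 0, reference
)

var_reference = per_dimension_variance(reference)
var_smoothed = per_dimension_variance(smoothed)
print("reference:", var_reference)
print("smoothed: ", var_smoothed)
```

In this toy setting the smoothed trajectories have lower variance in every dimension, which is the intuition behind using variance as a smoothness indicator.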