Description

SwiftTalker with 24 Voices

These are Version 4.20 (Windows)

Info:

Cepstral (www.cepstral.com) have recently released their high-quality text-to-speech (TTS) voices.
The voices have a fraction of the footprint that AT&T NaturalVoices do (roughly 30MB for Cepstral's voices compared to around 500MB for AT&T's), and can run on a variety of platforms.

Quality:

The quality of the voices is good, but not quite as good as NaturalVoices.
The voices sometimes sound choppy, and occasionally get pronunciation wrong.
The system also doesn't seem to account for exclamation marks or questions, sometimes causing long streams of text to sound unnatural.

This isn't to say that Cepstral's voices aren't excellent - voice 'character' comes across well, and there are definite distinctions between the US and UK English voices, as well as between sexes and age groups.
Here are three examples of the voices I reviewed:

There seems to be quite a difference between voices at times, especially on certain words.
For example, the word 'wrong' is pronounced perfectly by Emily and Lawrence and slightly strangely by Millie.

It is easy to point out problems or mis-pronunciations when looking for them,
as one does when reviewing a speech product.

Having said this though, if you load a text file and have your favourite Cepstral voice (Millie, in my case) read it out loud, everything seems to work very nicely.

Mis-pronunciations are lost in the flow, choppy speech seems to smooth out over sentences, and generally large portions of text can be listened to without straining your ears!

SwiftTalker:

All Cepstral voices come with a little utility called 'SwiftTalker'.
This is a simple plain text editor, augmented for Cepstral voices.

Windows' Speech Control Panel applet.

This meant that any Cepstral voice used system-wide would use the default settings only.

Conclusion:

While in terms of raw quality Cepstral's voices do not quite match counterparts such as AT&T's NaturalVoices, their small footprint makes them easily downloadable and readily portable to mobile platforms.

Furthermore, any imperfections are often diluted when large portions of text are read out by the variety of distinctive voices.

Overall, the impressive technology powering the variety of voices should make realistic text-to-speech available to anyone on a tight budget.
*********************************************************************

24 Cepstral Voices

All voices listed here have a native audio format of 16kHz, 16bit, PCM, mono,
except for "Callie," which has a native audio format of 22kHz, 16bit, PCM, mono.
The output can be reformatted at runtime to suit your needs.
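
To give a feel for what these native formats mean in practice, here is a rough Python sketch that works out the uncompressed data rate for each and writes a matching WAV container using only the standard library. It does not use any Cepstral API; the function names are made up for illustration, and "22kHz" is assumed to mean the usual 22050 Hz rate.

```python
import io
import wave


def pcm_bytes_per_second(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def silent_wav(sample_rate_hz, seconds=1, bits_per_sample=16, channels=1):
    """Build an in-memory WAV file of silence in the given PCM format."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)               # mono
        w.setsampwidth(bits_per_sample // 8)   # 16 bit -> 2 bytes per sample
        w.setframerate(sample_rate_hz)
        rate = pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels)
        w.writeframes(b"\x00" * rate * seconds)
    return buf.getvalue()


if __name__ == "__main__":
    # 16 kHz for most voices; 22050 Hz is an assumption for "22kHz" (Callie).
    for label, rate in (("16kHz voices", 16000), ("Callie", 22050)):
        print(label, pcm_bytes_per_second(rate), "bytes per second")
```

At these settings a minute of speech comes to a little under 2MB for the 16kHz voices, which is worth knowing if you plan to save the output to disk.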