Cepstral (www.cepstral.com) have recently released their high-quality text-to-speech (TTS) voices. The voices have a fraction of the footprint of AT&T's NaturalVoices (roughly 30MB for Cepstral's voices compared to around 500MB for AT&T's), and can run on a variety of platforms.

Quality

The quality of the voices is good, but not quite as good as NaturalVoices. The voices sometimes sound choppy, and occasionally get pronunciation wrong. The system also doesn't seem to account for exclamation marks or questions, sometimes causing long streams of text to sound unnatural.

This isn't to say that Cepstral's voices aren't excellent - voice 'character' comes across well, and there are definite distinctions between the US and UK English voices, as well as between sexes and age groups. Here are four examples of the voices I reviewed:
It is easy to point out problems or mispronunciations when looking for them, as one does when reviewing a speech product. Having said this, if you load a text file and have your favourite Cepstral voice (Millie, in my case) read it out loud, everything seems to work very nicely. Mispronunciations are lost in the flow, choppy speech seems to smooth out over sentences, and large portions of text can generally be listened to without straining your ears!

SwiftTalker

All Cepstral voices come with a little utility called 'SwiftTalker'. This is a simple plain text editor, augmented for Cepstral voices:
I was a little disappointed to find these settings were only applicable to SwiftTalker and could not be modified from Windows' Speech Control Panel applet. This meant that any Cepstral voice used system-wide would use the default settings only.
Conclusion

While in terms of raw quality Cepstral's voices do not quite match counterparts such as AT&T's NaturalVoices, their small footprint makes them easily downloadable and readily portable to mobile platforms. Furthermore, any imperfections are often diluted when large portions of text are read out by the variety of distinctive voices.

Overall, the impressive technology powering the variety of voices should make realistic text-to-speech available to anyone on a tight budget.
Submitted: 23/06/2004 Article content copyright © James Matthews, 2004.
All content copyright © 1998-2007, Generation5 unless otherwise noted.